Description
Making an LLM safer (e.g., less likely to generate harmful content) often makes it less helpful or capable, yet these trade-offs are not well understood. We need better ways to measure safety and to understand when and why such trade-offs occur.