TruthfulQA

Measuring How Models Mimic Human Falsehoods

About TruthfulQA

TruthfulQA is a benchmark that measures whether a language model generates truthful answers to questions. It comprises 817 questions spanning 38 categories, including health, law, finance, and politics. The authors wrote the questions so that some humans would answer them incorrectly because of a false belief or misconception.