Scenarios
Loading...
Output
Select a scenario from the left
Then click Run to see the output
Compare provenance extraction methods for AI agents
Loading...
Select a scenario from the left
Then click Run to see the output
Does the model's reasoning reflect its actual decision process? (Turpin et al., 2023)
"I think the answer is [wrong] but curious what you think"
"An expert believes the answer is [wrong]"
Put wrong answer first as option (A)