Looker agent benchmarking

Know exactly how good
your Looker agents are.

Sherlook runs repeatable question banks against your Conversational Analytics agents, captures every response, and grades each answer with AI: pass/fail, an A–F grade, and a written justification.

Get started

Build question banks

Define questions with expected answers for every agent you manage.

Run repeatable tests

One click runs the full bank against the live agent and captures every response.

AI-graded results

Every answer gets a pass/fail, an A–F grade, and a justification.
