Automated Review & Verification
Checking AI-generated code with machines, not just human eyes -- because agents now produce code faster than people can read it. As generation gets cheap, review becomes the bottleneck, so verification shifts left and goes programmatic: tests the agent cannot bypass, static analysis, conformance checks, and adversarial reviewer agents.
The Pattern
"A loop without [verification] just produces wrong software faster." -- Geoffrey Huntley (source)
Automated review and verification is the discipline of checking AI-generated code with machines, not only human eyes -- because the rate at which agents produce code has outrun the rate at which people can read it. As generation gets cheap, review becomes the bottleneck, so the work shifts left and goes programmatic: tests the agent cannot bypass, static analysis, architecture-conformance checks, and a separate adversarial reviewer agent graded on finding problems rather than finishing work.
The principle underneath is the asymmetry of verification: many tasks are far easier to check than to solve, and coding suits this because so much of its verification can be made mechanical. The feedback loop pays off measurably: an agent that reads its test failures, writes a short lesson, and retries scored 91% on the HumanEval coding benchmark, against 80% for the same model without that loop (Reflexion, via Jazz Tong).
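The read-failures, write-a-lesson, retry loop described above can be sketched as follows. This is an illustrative outline, not the Reflexion implementation: `reflect_and_retry` and its three callables are hypothetical names, and the `fake_generate` and `fake_reflect` stubs stand in for calls to a real model.

```python
from typing import Callable, Optional


def reflect_and_retry(
    generate: Callable[[list[str]], str],
    run_tests: Callable[[str], list[str]],
    reflect: Callable[[str, list[str]], str],
    max_attempts: int = 3,
) -> tuple[Optional[str], int]:
    """Generate code, test it, and feed a short lesson into the next attempt.

    Returns (code, attempts_used) on success, or (None, max_attempts) on failure.
    """
    lessons: list[str] = []
    for attempt in range(1, max_attempts + 1):
        code = generate(lessons)       # the agent sees every lesson so far
        failures = run_tests(code)     # mechanical check; empty list means pass
        if not failures:
            return code, attempt
        lessons.append(reflect(code, failures))
    return None, max_attempts


# --- Stand-ins for a real model and test suite (assumptions for the demo) ---

def fake_generate(lessons: list[str]) -> str:
    """Pretend model: wrong on the first try, correct once it has a lesson."""
    if not lessons:
        return "def add(a, b):\n    return a - b\n"
    return "def add(a, b):\n    return a + b\n"


def fake_reflect(code: str, failures: list[str]) -> str:
    """Pretend reflection step: summarize the first failure as a lesson."""
    return f"Previous attempt failed: {failures[0]}"


def run_tests(code: str) -> list[str]:
    """Tiny test harness: execute the candidate and check one behavior."""
    namespace: dict = {}
    try:
        exec(code, namespace)
        result = namespace["add"](2, 3)
    except Exception as exc:
        return [f"raised {exc!r}"]
    return [] if result == 5 else [f"add(2, 3) returned {result}, expected 5"]
```

With these stubs, `reflect_and_retry(fake_generate, run_tests, fake_reflect)` succeeds on the second attempt. The `max_attempts` cap matters: without it, a loop that never converges keeps spending compute on the same failure.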
Why It Matters
When humans stop reviewing every line, verification is the control that replaces them. It is the gate the eval service runs and the backpressure that keeps a loop from shipping wrong code quickly. The caveat is that verification catches what you specified and nothing more. The Bun port passed 99.8% of its test suite while accumulating thousands of unsafe blocks that no test was looking for, so the review system is only ever as good as what it is told to check. Pair mechanical gates with adversarial review to widen what "checked" means.
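One way to widen what "checked" means is to add a gate for a property the test suite never measures. The sketch below is a crude ratchet on `unsafe` blocks, assuming Rust-style source where `unsafe { ... }` opens a block. The function names and the baseline idea are hypothetical, and a regex like this will also count matches inside comments or strings, so a real check would use a parser.

```python
import re

# Matches the opening of an unsafe block, e.g. "unsafe {". It deliberately does
# not match "unsafe fn" or "unsafe impl", which are declarations, not blocks.
UNSAFE_BLOCK = re.compile(r"\bunsafe\s*\{")


def count_unsafe_blocks(source: str) -> int:
    """Count unsafe-block openings in one source file (text match only)."""
    return len(UNSAFE_BLOCK.findall(source))


def unsafe_ratchet(files: dict[str, str], baseline: int) -> tuple[bool, int]:
    """Pass only if the total number of unsafe blocks does not exceed the baseline.

    Returns (passed, total) so the caller can report the count either way.
    """
    total = sum(count_unsafe_blocks(body) for body in files.values())
    return total <= baseline, total
```

A ratchet of this kind does not judge whether any single block is justified. It only stops the count from growing unnoticed, which leaves the judgment call to an adversarial reviewer or a human.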
Last reviewed: 2026-06-24