Verification Is a Ladder

Why incremental verification can work, with a few nuclear verification anecdotes.

    Some people default to thinking that the prospects for verification are bleak. How could we possibly get labs to trust one another? Is there any way that the US and China can each believe the other has paused AI training? Is it at all possible to make adversarially robust verification technologies?

    Whilst a perfect, formally verified, zero-trust verification regime may seem like a heavy lift, we believe there is a ladder of technical mechanisms that help to build trust and enable high-stakes coordination over AI development.

    We’re already on the first rung of the ladder: dangerous capability evaluations. Each of the major labs includes dangerous capability evaluations in their system cards and works with the UK AISI to test their models. This outcome was not guaranteed: we are fortunate to have a fairly functional, voluntary dangerous capability evaluation ecosystem. That gives us some hope that, given effort and investment, we can continue to move up the verification ladder.

    This is the clear start of a system of monitoring, evaluation, and coordination of advanced AI. We are ultimately working towards technologies that enable high-trust coordination (lab-to-lab, government-to-government). This will be an incremental process where every additional step makes the development and deployment of advanced AI a little safer.

    Figure: the two main axes of development. Tech: simple confidence-building measures → mutually trusted verification technologies. Coordination: unilateral declarations to build trust → multilateral regimes with verification.

    This internal doc stopped there; writing out the full “verification ladder” that we expect to emerge is on our backlog. To land the point that verification is a process, not a one-shot solution, we like to refer to a few nuclear verification anecdotes. Nuclear analogies do break down, but we find them helpful as a common framing for thinking about AI verification.

    Low-cost (and easy to cheat) cooperative measures can create common evidence

    Under Article XII of START I, each party agrees to move its heavy bombers and road-mobile ICBM launchers into the open upon request, so that the counterparty can observe them by satellite.

    This is easy to cheat on, but when combined with National Technical Means (aka spying), it can be the basis for cooperation. When we worry that AI verification technologies are not completely trustworthy, we should remember that even small measures, like moving your bombers into the open, enabled early-stage international cooperation.

    Unilateral commitments may precede reciprocal coordination

    Whilst we may want lab-to-lab and government-to-government coordination, there may need to be a unilateral first mover. A strong example is the 1991–92 Presidential Nuclear Initiatives: President George H. W. Bush unilaterally pledged to withdraw, eliminate, or de-alert large categories of nuclear weapons, especially tactical nuclear weapons; Gorbachev, and then Yeltsin, responded with reciprocal unilateral pledges.

    These were not treaties, and they lacked the full verification architecture of later arms-control agreements, but they still produced major reductions, showing how unilateral moves can create reciprocal restraint before a formal regime exists.

    Test ban verification had an ARPA programme

    In test ban treaties, parties agree not to carry out certain nuclear weapons tests. At the time these treaties were negotiated, the technology to verify them was not yet good enough, so governments had to run R&D programmes to develop it.

    Project Vela Uniform was the ARPA programme on seismic detection of underground and underwater tests. We need to launch ambitious R&D projects for AI verification, building the robust technologies that can ladder on top of simpler, less robust declarations and verification techniques.

    Trust will take test runs

    We should be preparing to run cooperative verification experiments — for example, AI labs partnering with us to red-team different verification technologies. Here is the nuclear precedent:

    Joint Verification Experiment (1988): The US and USSR each detonated one underground nuclear test on their own soil — Kearsarge in Nevada, Shagan at Semipalatinsk — with the other side’s scientists physically present using their own instruments to measure the yield. This broke a 14-year deadlock over the Threshold Test Ban Treaty by letting both sides directly compare measurement methods rather than argue about them on paper, and the treaty was ratified 98-0 in 1990.

    Black Sea Experiment (1989): US scientists from the NRDC and Soviet scientists from the Academy of Sciences jointly used gamma-ray and neutron detectors on the Soviet cruiser Slava off Yalta to detect a real nuclear-armed cruise missile in its launch tube. It was an NGO-led demonstration that warhead detection was technically feasible — directly contradicting the Bush administration’s position in INF hearings — and it both opened up warhead verification as a serious field and exposed the core dilemma that good detectors leak weapon-design information.

    Start with monitoring, head towards treaties

    The Comprehensive Nuclear-Test-Ban Treaty Organization (CTBTO) is a >$100M a year international body set up to verify a ban on nuclear tests, but that ban is not yet in force.

    The CTBT was signed in 1996, but has not formally entered into force. The CTBTO runs an International Monitoring System (>300 monitoring facilities) to watch for tests, without enforcing anything about them.

    Mirroring the CTBTO, we might want to implement verification technologies ahead of ratifying a treaty. The existence of a monitoring system both buys optionality and acts as the basis for international discussions.

    On-site sensors and inspections are ultimately possible

    Nuclear agreements started with satellites but have ultimately moved to on-site sensors and inspections. There is precedent for quite involved international collaboration where it is needed. The IAEA have 1250 cameras installed in 250 facilities in 33 countries and ran over 3000 on-site verifications in 2024.

    In the fullness of time, it is possible to get significant cooperation from nations.