Defence AI Playbook live since February 2024

UK Defence AI Playbook

Show MoD you’ve got AI under control.

The UK MoD’s Defence AI Playbook (January 2024) sets the obligation pattern every defence supplier is expected to follow on AI: a TEVV gate before a model influences an operational decision, a meaningful human in the loop on every AI-assisted output, an audit trail covering provenance and confidence, and an “ambitious, safe and responsible” review across the broad range of harms a Defence AI capability can cause. Secruna discovers the AI tools across your stack, classifies them against the six Playbook obligation patterns, and keeps the evidence pack ready for the next bid gate or MoD assurance question.

Get a 30-min defence AI scope call

Why this matters

Three exposures bite — in this order.

The Defence AI Playbook is already in force. The pressure on an MoD supplier comes from three directions at once — and each of them gets harder to answer the longer the AI register sits unstarted.

Procurement gate

MoD bid documentation increasingly requires Playbook-aligned governance evidence — a documented AI register, named human decision-owners on AI-assisted outputs, and a TEVV pack ready to share with the contracting authority. The question now appears on Defence pre-qualification questionnaires, on capability-area RFIs, and on supplier-assurance reviews held by primes on behalf of MoD. A bid that cannot answer it will not progress beyond the assurance gate, regardless of how strong the technical proposal is.

Supply-chain propagation

The Playbook’s obligations cascade through the supplier chain. Even small AI use — a CoStar AVM run on a property valuation for an MoD estate, a Copilot draft of a technical narrative inside a contract deliverable, an ML anomaly check on a piece of supply-chain telemetry — sits inside scope once the deliverable touches MoD. Tier-2 suppliers and chartered firms with a single MoD-portfolio instruction inherit the same obligation pattern as a defence prime; the inheritance is not optional and not negotiable.

Reputation and national security

A serious AI incident in a Defence context carries national-security weight on top of the normal regulatory and reputational risk a private-sector incident would carry. The Playbook is explicit that Defence AI must be “ambitious, safe and responsible” across a broad range of harms (p.3, p.8). A supplier that cannot show its working — TEVV, oversight, audit trail — when an incident lands is the one the prime de-risks first. The cost of that de-risking event is measured in framework positions, not retainers.

The five-step path

What you have to do, in order.

The same five gates apply to every framework Secruna covers, including the Defence AI Playbook. Start at step one — the rest only make sense once you know which AI tools your programme actually touches.

1
Inventory
Discover every AI tool in the supplier’s stack — the M365 / Azure side (Copilot seats, Azure OpenAI deployments, Power Platform AI Builder), the programme-specific side (bespoke ML pipelines, MODCloud workloads, predictive-maintenance engines), and the third-party SaaS tooling (analyst-triage assistants, document discovery, threat-intel platforms). The first scan almost always surfaces three to five AI tools the programme management team did not know were in use.
2
Classify
Map each tool to the relevant Defence AI Playbook obligation pattern — TEVV gate, human oversight, audit trail, safe-and-responsible review, or out of scope. Tools commonly land in more than one pattern — a satellite-imagery triage tool typically requires meaningful human oversight (p.9) and an audit trail preserving its confidence scores (p.10) at the same time.
3
Document
Generate the Defence AI Use Statement per AI system — the artefact a prime contractor or an MoD assurance reviewer expects to read. Secruna pre-fills the statement from the supplier register so a programme lead edits a draft, not a blank page, and the contracting authority sees a consistent statement across every system the supplier ships into Defence.
4
Review
Maintain the supplier-side AI register that an MoD assurance reviewer will ask for: tool name, vendor, Playbook obligation pattern, named human decision-owner, TEVV evidence reference, last-reviewed date, supplier context (which contract or programme the AI supports). Secruna keeps the register live as the programme stack evolves and flags entries that have aged past their review date.
5
Audit trail
Retain the evidence in line with MoD record-retention expectations: every classification change, every TEVV re-run, every human-override entry, every safe-and-responsible review note. When the prime contractor or the contracting authority opens a question, the pack is one click away rather than three weeks of email archaeology across the programme team.
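The register fields listed in step 4 can be sketched as a small data structure. This is a minimal illustration, not Secruna's actual schema: the class name, field names, and the 180-day review window are all assumptions made for the example. The category strings reuse the identifiers that appear in the dashboard URLs on this page.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List


@dataclass
class RegisterEntry:
    """One row of a supplier-side AI register (fields taken from step 4)."""
    tool_name: str
    vendor: str
    obligation_patterns: List[str]  # a tool can land in more than one pattern
    decision_owner: str             # named human decision-owner, by role
    tevv_evidence_ref: str
    last_reviewed: date
    supplier_context: str           # contract or programme the AI supports


def overdue_entries(entries, today, max_age_days=180):
    """Return entries whose last review is older than max_age_days.

    The 180-day default is an illustrative assumption, not a Playbook figure.
    """
    cutoff = today - timedelta(days=max_age_days)
    return [e for e in entries if e.last_reviewed < cutoff]
```

A scheduled job could call `overdue_entries` with the current date and raise a flag for each entry it returns.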

Decision support

When AI helps a staff officer plan, who owns the decision?

The buyer’s question. A programme ships an AI tool that helps a staff officer cross-correlate terrain, meteorological and force-structure data to produce a course-of-action option. Or an imagery-analyst triage system flags satellite images for review. Or an LLM-driven assistant drafts staff-officer prose. The AI shapes the output the human signs. What does the Playbook require?

Defence AI Playbook reference. p.11 frames AI Assisted Operational Planning as “decision support tools to automate elements of operational planning” — not decision-making tools. p.9 lands the canonical phrasing on Object Detection in Satellite Imagery: “the analyst remains responsible for interpreting what is seen to deliver actionable insight to the frontline.” p.10 (Analysis of RF Signals) requires the model’s confidence measure to flow through to the human reviewer. Together these anchor the meaningful-human-in-the-loop obligation.

What counts as compliant. A named human decision-owner per AI tool — staff officer, imagery analyst, signals analyst by role. The model’s confidence / ranking metadata preserved end-to-end so the reviewer is not flying blind. A record of where the AI was relied upon, where it was overridden, and the basis of the override. Operator training on the AI’s known failure modes before the tool is integrated into the workflow. Without an override record, the human-in-the-loop control is unauditable after the fact.

What Secruna ships for decision support. Detection of bespoke ML / NLP planning assistants, Microsoft Copilot and Azure OpenAI workflows used by staff officers, MODCloud-hosted decision-support pipelines, object-detection ML on imagery, and RF signal classifiers. A pre-filled decision-support obligation block in the Defence AI Use Statement. A confidence-preservation check that fires when an AI score is dropped before it reaches the human reviewer’s screen.
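A confidence-preservation check of the kind described above could look roughly like the following. It is a sketch under stated assumptions: the field names `id`, `confidence` and `rank`, and the idea of comparing the model's raw output with the payload sent to the reviewer's screen, are hypothetical and chosen only to illustrate the control.

```python
CONFIDENCE_KEYS = ("confidence", "rank")  # hypothetical metadata field names


def dropped_confidence(model_outputs, reviewer_payloads):
    """Return ids of items whose confidence metadata was lost or altered
    between the model output and what the human reviewer is shown."""
    shown = {p["id"]: p for p in reviewer_payloads}
    problems = []
    for out in model_outputs:
        payload = shown.get(out["id"])
        if payload is None:
            continue  # never surfaced to a reviewer; outside this check
        for key in CONFIDENCE_KEYS:
            if key in out and payload.get(key) != out[key]:
                problems.append(out["id"])
                break
    return problems
```

Any id returned marks an output where the reviewer would be deciding without the score the model actually produced.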

See this in your dashboard at: /inventory?framework=uk_defence_ai_playbook&category=requires_human_oversight filtered to decision-support tools, with the obligation statement preview on each card.

Intelligence processing

When a model classifies a signal, who keeps the receipts?

The buyer’s question. A capability uses NLP and graph visualisation to expose hidden relationships across a classified document corpus. Or an ML classifier labels RF waveforms with a confidence score. Or change-detection runs across satellite imagery to flag activity to a Defence Intelligence analyst. The AI’s output shapes an intelligence product. What does the Playbook require the supplier to retain?

Defence AI Playbook reference. p.8 (Intelligent Search & Document Discovery, Public Perception passage) requires “a broad range of potential harms from AI in Defence” to be considered, with the assessment held under “constant review” as data sources and models evolve. p.10 (Analysis of RF Signals) is explicit that the AI emits a confidence measure on its classification — the obligation is to preserve that measure end-to-end so downstream consumers can weight the AI’s output rather than treat it as ground truth.

What counts as compliant. A provenance record per AI classification or extraction that flows into an intelligence product — model name, version, training-data caveats, runtime confidence. Confidence scores preserved end-to-end through the analyst’s workflow. The broad-range-harms assessment kept under constant review and re-run as new classified data sources or model revisions land. Analyst overrides logged with the basis of the override, so the audit trail captures both the AI’s and the human’s contribution.

What Secruna ships for intelligence. Detection of NLP / vector search systems on classified corpora, ML signal classifiers, change-detection pipelines on imagery, and ML-assisted HUMINT triage. A provenance-record schema that snaps onto the intelligence product and preserves the AI’s confidence score. A constant-review schedule that fires the broad-range-harms re-assessment when a model or data source changes.
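The provenance fields named in this section (model name, version, training-data caveats, runtime confidence, plus analyst overrides with their basis) can be sketched as a record type. The class and method names below are illustrative assumptions and do not describe Secruna's actual schema.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class Override:
    analyst_role: str
    basis: str  # why the analyst departed from the AI output


@dataclass
class ProvenanceRecord:
    model_name: str
    model_version: str
    training_data_caveats: List[str]
    runtime_confidence: float
    overrides: List[Override] = field(default_factory=list)

    def log_override(self, analyst_role, basis):
        """Record an analyst override; an empty basis is rejected."""
        if not basis.strip():
            raise ValueError("an override needs a recorded basis")
        self.overrides.append(Override(analyst_role, basis))

    def to_json(self):
        """Serialise the record so it can travel with the intelligence product."""
        return json.dumps(asdict(self), sort_keys=True)
```

Rejecting an override without a basis keeps both the AI's and the human's contribution recoverable from the audit trail.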

See this in your dashboard at: /inventory?framework=uk_defence_ai_playbook&category=requires_audit_trail filtered to intelligence-processing tools, with the provenance-record template available for one-click export.

Logistics and predictive maintenance

Before a model influences procurement, where is the TEVV pack?

The buyer’s question. A predictive algorithm forecasts spare-parts failure across a legacy fleet using a decade of historical service records. Or an ML pipeline optimises store holdings against operational demand. Or an autonomous-resupply RAS uses AI navigation to cross unknown terrain. The AI’s output drives a procurement, deployment, or operational decision. What gate does the Playbook impose before the model is allowed to do that?

Defence AI Playbook reference. p.13 (Last-Mile Resupply) is explicit, in capitals: “Test, Evaluation, Verification and Validation (TEVV) and assurance” — the canonical TEVV anchor in the Playbook. p.6 (Spare Parts Failure Prediction) flags the data-quality and accuracy review obligation when ML runs on legacy maintenance data. p.7 (AI at the Edge) sets the expected pipeline shape: model training, compression and testing before deployment to the platform.

What counts as compliant. A TEVV gate run before the model influences a procurement, deployment or operational decision. Documented data-quality limits — particularly acute on legacy platforms with manually recorded service records (p.6) and on edge platforms with constrained compute (p.7). Testing against the deployment environment, not just the training environment — the Playbook is explicit that deployment may differ significantly from training (p.13). A TEVV evidence pack kept alongside the capability so MoD assurance can re-run the gate at any point in the system’s life.

What Secruna ships for logistics. Detection of predictive-maintenance pipelines, autonomous-navigation stacks, and store-optimisation ML in MODCloud and supplier Azure / AWS accounts. A TEVV evidence-pack template aligned to the Playbook’s canonical anchors. A re-run trigger that fires when training data, model version, or deployment environment changes.
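One way to implement a re-run trigger like the one described is to fingerprint the three inputs and compare against the fingerprint stored with the last passed TEVV gate. The sketch below assumes each input can be reduced to a string identifier; the function names and identifiers are hypothetical.

```python
import hashlib
import json


def tevv_fingerprint(training_data_id, model_version, deployment_env):
    """Hash the three inputs whose change should trigger a TEVV re-run."""
    payload = json.dumps(
        {"data": training_data_id, "model": model_version, "env": deployment_env},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def needs_tevv_rerun(last_passed_fingerprint, training_data_id,
                     model_version, deployment_env):
    """True when any input differs from the state that last passed the gate."""
    current = tevv_fingerprint(training_data_id, model_version, deployment_env)
    return current != last_passed_fingerprint
```

Storing the fingerprint inside the evidence pack means the comparison can be repeated at any point in the system's life.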

See this in your dashboard at: /inventory?framework=uk_defence_ai_playbook&category=requires_tevv filtered to logistics-and-predictive tools, with the TEVV pack template attached to each capability record.

Personnel systems

When AI ranks a candidate or a posting, who decides?

The buyer’s question. A capability ranks candidates for a recruitment shortlist, scores service personnel for a posting decision, or recommends promotion candidates from workforce data. The Playbook lists Career Management in the AI Now band on p.4 — the tooling is in service today. Defence personnel are stakeholders. What does the supplier owe the affected individual, the chain of command, and MoD assurance?

Defence AI Playbook reference. p.3 (Cdre Singleton introduction) sets the “ambitious, safe and responsible” principle that is the spine of the Playbook. p.8 (Public Perception) is explicit: “we carefully consider a broad range of potential harms from AI in Defence, in order to maintain confidence of all stakeholders, including the public.” Defence personnel fall squarely within “all stakeholders”. The substantive obligation that follows is meaningful human oversight of every personnel-affecting decision, with disclosure to the affected individual.

What counts as compliant. A meaningful human decision-owner on every posting, promotion, or recruitment outcome — AI ranks, shortlists, or flags; the chain of command decides. Disclosure of AI involvement to the affected personnel through the relevant grievance or appeal channel. A documented training-data, model-version, and fitness-for-purpose record (with particular attention to representative coverage of protected characteristics across the Defence workforce). Recurring adverse-impact audits, not just a one-shot deployment review. Where the EU AI Act Annex III §4 (employment / HR) overlaps, the Secruna matcher fires both rules in parallel.

What Secruna ships for personnel. Detection of HR / personnel ML pipelines, recruitment shortlisting tools, career-management analytics, and workforce-planning models that touch Defence personnel data. A disclosure block tailored for the chain-of-command context. A recurring adverse-impact audit schedule attached to the system record.
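A recurring adverse-impact audit could, as one possible approach, compare selection rates across groups on each run. The sketch below is an assumption-heavy illustration: the ratio-to-highest-rate method and the 0.8 threshold are a common rule of thumb chosen for the example, not something the Playbook specifies, and the function names are hypothetical.

```python
from collections import Counter


def selection_rates(outcomes):
    """outcomes: iterable of (group, selected) pairs. Returns rate per group."""
    totals, selected = Counter(), Counter()
    for group, was_selected in outcomes:
        totals[group] += 1
        if was_selected:
            selected[group] += 1
    return {g: selected[g] / totals[g] for g in totals}


def impact_ratios(outcomes):
    """Each group's selection rate divided by the highest group's rate."""
    rates = selection_rates(outcomes)
    best = max(rates.values())
    if best == 0:
        return {g: 1.0 for g in rates}  # nobody selected: no disparity to report
    return {g: r / best for g, r in rates.items()}


def flagged_groups(outcomes, threshold=0.8):
    """Groups whose ratio falls below the (assumed) threshold, for human review."""
    return sorted(g for g, ratio in impact_ratios(outcomes).items() if ratio < threshold)
```

A flag here is a prompt for the human decision-owner to investigate, not a finding in itself.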

See this in your dashboard at: /inventory?framework=uk_defence_ai_playbook&category=requires_human_oversight filtered to personnel systems, with the disclosure block ready for the human-resources policy team to adopt.

Cyber defence

Before an AI blocks a packet, where is the test pack?

The buyer’s question. A cyber capability runs ML threat detection on network telemetry, automates incident response, classifies malware samples, or uses an LLM to triage threat-intelligence feeds. The model’s output drives a control decision — block / allow / quarantine / escalate. A wrong call has operational consequences inside the Defence enterprise. What gate does the Playbook impose?

Defence AI Playbook reference. p.13 lands the TEVV anchor that applies symmetrically here — a cyber-defence model influencing a control decision sits in the same slot as a safety-critical autonomous system. p.7 (AI at the Edge) sets the training, compression, testing pipeline expected before deployment. p.12 (LLMs for Defence — Unknown Risks) flags the information-security, IP, and reputational risks of cloud-hosted AI on cyber-relevant data.

What counts as compliant. A TEVV gate before the model influences a control decision. Testing against adversarial and evasion conditions, not just clean test traffic. The model’s confidence or score preserved on every decision so a cyber analyst can weight the automated recommendation. Cleared-environment data kept inside cleared environments — the Playbook is explicit on the information-security risk of cloud-hosted AI on Defence data (p.12). A re-test cadence that fires when the threat surface or the model changes.

What Secruna ships for cyber. Detection of ML threat-detection platforms, automated incident-response pipelines, malware classifiers, and LLM-augmented threat-intel triage tools. A cyber-specific TEVV evidence-pack template covering the adversarial-test cases the Playbook anchors require. A data-residency check that flags cloud-hosted AI calls against the supplier’s declared cleared-environment boundary.
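A data-residency check along these lines can be sketched as a host allowlist comparison. It assumes the supplier declares its cleared-environment boundary as a list of hostnames and that AI calls are observable as URLs; both are assumptions for illustration, and the hostnames in the usage note are placeholders.

```python
from urllib.parse import urlparse


def outside_boundary(call_urls, cleared_hosts):
    """Return the AI call URLs whose host is not in the declared boundary."""
    allowed = {h.lower() for h in cleared_hosts}
    flagged = []
    for url in call_urls:
        host = (urlparse(url).hostname or "").lower()
        if host not in allowed:
            flagged.append(url)
    return flagged
```

For example, with `cleared_hosts=["ml.cleared.example"]`, a call to `https://api.public-llm.example/v1/chat` would be returned for review while a call to `https://ml.cleared.example/score` would not.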

See this in your dashboard at: /inventory?framework=uk_defence_ai_playbook&category=requires_tevv filtered to cyber-defence tools, with the adversarial-testing checklist attached to each capability record.

Supplier-chain propagation

A chartered firm runs an AVM on MoD estate. Now what?

The buyer’s question. A chartered firm uses CoStar AI on a valuation of MoD estate. A tier-2 supplier uses Copilot to draft technical narrative inside an MoD deliverable. An SME partner runs an ML anomaly check on telemetry feeding a prime’s MoD-funded programme. The AI is a commercial product the supplier already owned; the deliverable touches MoD. What inherits down the chain?

Defence AI Playbook reference. p.3 (Cdre Singleton introduction) is explicit that the Defence AI ecosystem extends to industry, non-traditional suppliers, SMEs, and AUKUS partners: “We are building an AI ecosystem to strengthen our relationship with industry, establish new links with non-traditional suppliers and small and medium enterprises”. The Playbook’s substantive obligations propagate down the supplier chain accordingly. p.8 (Public Perception) reinforces that broad-range-harms review applies wherever Defence AI lands, including via supplier deliverables.

What counts as compliant. Register the AI tool against the MoD contract or programme it supports, with artifact-level metadata that lets the prime audit downstream AI involvement. Retain a provenance record — model, version, training-data caveats, runtime confidence — for every AI-assisted deliverable that flows into MoD. Comply with the relevant Playbook obligation pattern of the underlying use case (decision support, intelligence, logistics, personnel, cyber); the Secruna matcher fires the relevant rules in parallel where the supplier-side AI activity matches. Preserve the audit trail through the chain — the prime’s assurance question can land on a sub-supplier’s AI tool at any point.

What Secruna ships for suppliers. MoD-context tagging on any AI tool used on a Defence deliverable (so a CoStar AVM on MoD estate is tagged differently from a CoStar AVM on a private-sector job). A supplier-side provenance-record export designed to satisfy a prime’s assurance question. Inheritance logic that fires the substantive Playbook rules in parallel where the supplier-side AI activity matches.
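The inheritance logic can be pictured as a lookup from use case to obligation category, applied only when a tool is tagged against an MoD contract. The mapping below is read off the dashboard categories shown in each section of this page; the function name, dictionary keys, and tool structure are hypothetical and not Secruna's actual matcher.

```python
# Use case -> dashboard category, as listed in the sections above.
USE_CASE_RULES = {
    "decision_support": {"requires_human_oversight"},
    "intelligence": {"requires_audit_trail"},
    "logistics": {"requires_tevv"},
    "personnel": {"requires_human_oversight"},
    "cyber": {"requires_tevv"},
}


def inherited_obligations(tool):
    """Return the obligation categories a supplier-side tool inherits.

    A tool with no MoD contract tag inherits nothing; a tagged tool always
    carries the audit-trail obligation plus those of its use cases.
    """
    if not tool.get("mod_contract"):
        return set()
    obligations = {"requires_audit_trail"}  # supplier-context baseline
    for use_case in tool.get("use_cases", []):
        obligations |= USE_CASE_RULES.get(use_case, set())
    return obligations
```

The same commercial tool therefore yields different results depending on whether the instruction it supports is tagged to an MoD contract.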

See this in your dashboard at: /inventory?framework=uk_defence_ai_playbook&category=requires_audit_trail filtered to supplier-context tools, with the MoD-context tag visible per instruction record.

See where your firm stands.
In 30 minutes.

A 30-minute scope call gives you a concrete answer for each of the six Defence AI Playbook obligation patterns above — which of the AI tools your programme uses today fall under which obligation, and what evidence is missing before the next bid gate or MoD assurance question.

Get a 30-min defence AI scope call

See the RICS page

Or call our UK lead on +44 20 0000 0000.
