Agentic Continuous Delivery (ACD)

Extend continuous delivery with constraints, first-class artifacts, and practices for AI agent-generated changes.

Agentic continuous delivery (ACD) defines the additional constraints and artifacts needed when AI agents contribute to the delivery pipeline. The pipeline must handle agent-generated work with the same rigor applied to human-generated work, and in some cases, more rigor. These constraints assume the team already practices continuous delivery. Without that foundation, the agentic extensions have nothing to extend.

Don't put the AI cart before the CI horse - Integrating AI is software engineering. To be great at this, you need to be great at DevOps and CI.

What Is ACD?

An agent-generated change must meet or exceed the same quality bar as a human-generated change. The pipeline does not care who wrote the code. It cares whether the code is correct, tested, and safe to deploy.

ACD is the application of continuous delivery in environments where software changes are proposed by agents. It exists to reliably constrain agent autonomy without slowing delivery.

Without additional artifacts beyond what human-driven CD requires, agent-generated code accumulates drift, quality issues, and technical debt faster than teams can detect it. By the time test coverage gaps or architectural drift surface in production incidents, the accumulated debt is too large to address incrementally. Six first-class artifacts and eight constraints address this.

Agents introduce unique challenges that require these additional constraints:

  • Agents can generate changes faster than humans can review them
  • Agents may lack context about organizational norms, business rules, or unstated constraints
  • Agents cannot exercise judgment about risk in the same way humans can
  • Agents may introduce subtle correctness issues that pass automated tests but violate intent

Before jumping into agentic workflows, ensure your team has the prerequisite delivery practices in place. The AI Adoption Roadmap provides a step-by-step sequence: quality tools, clear requirements, hardened guardrails, and reduced delivery friction, all before accelerating with AI coding.

What You’ll Find in This Section

  1. AI Adoption Roadmap - the prerequisite sequence before adopting agentic workflows
  2. Agent-Assisted Specification - how agents help sharpen intent, draft BDD scenarios, and surface gaps in the specification stages
  3. The Six First-Class Artifacts - detailed definitions with examples for each artifact that agents and humans must maintain
  4. Pipeline Enforcement and Expert Agents - how quality gates and expert validation agents enforce ACD constraints automatically
  5. Pitfalls and Metrics - common failure modes and how to measure whether ACD is working
  6. Tokenomics - how to architect agents and code to minimize unnecessary token consumption without sacrificing quality
  7. Small-Batch Sessions - how to structure agent sessions so context stays manageable and commits stay small

ACD Extensions to MinimumCD

ACD extends MinimumCD with the following constraints:

  1. Explicit, human-owned intent exists for every change
  2. Intent and architecture are represented as first-class artifacts
  3. All first-class artifacts are versioned and delivered together with the change
  4. Intended behavior is represented independently of implementation
  5. Consistency between intent, tests, implementation, and architecture is enforced
  6. Agent-generated changes must comply with all documented constraints
  7. Agents implementing changes must not be able to promote those changes to production
  8. While the pipeline is red, agents may only generate changes restoring pipeline health

These constraints are not prescriptive practices. They describe the minimum conditions required to sustain delivery pace once agents are making changes to the system.
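Constraints 7 and 8 in particular are mechanically checkable at the pipeline boundary. Below is a minimal sketch in Python; the names (`Change`, `can_promote`, `can_accept_change`) are illustrative assumptions, not part of any specific pipeline tool:

```python
# Illustrative sketch of ACD constraints 7 and 8 as promotion-gate checks.
# All names here are hypothetical.
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    author_kind: str         # "human" or "agent"
    restores_pipeline: bool  # True if the change exists only to restore pipeline health


def can_promote(author_kind: str) -> bool:
    """Constraint 7: an agent that implements a change may not promote it."""
    return author_kind != "agent"


def can_accept_change(change: Change, pipeline_green: bool) -> bool:
    """Constraint 8: while the pipeline is red, agents may only
    generate changes that restore pipeline health."""
    if change.author_kind == "agent" and not pipeline_green:
        return change.restores_pipeline
    return True
```

In practice these checks would live in the pipeline itself (for example, as a branch-protection or promotion rule), so neither agents nor humans can bypass them by convention alone.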

The Six First-Class Artifacts

Every first-class artifact is part of the delivery contract, not a convenience. Agents may read any or all artifacts. Agents may generate some artifacts. Agents may not redefine the authority of any artifact. Humans own the accountability.

  1. Intent Description - why the change exists (human-owned)
  2. User-Facing Behavior - what users experience (externally observable)
  3. Feature Description - architectural trade-offs and constraints (engineering-owned)
  4. Executable Truth - automated tests that make intent falsifiable (pipeline-enforced)
  5. Implementation - the code (fully constrained by other artifacts)
  6. System Constraints - global invariants (system-level rules)

These artifacts are intentionally overlapping in content but non-overlapping in authority. When an agent detects a conflict between artifacts, it cannot resolve that conflict by modifying the artifact it does not own. See The Six First-Class Artifacts for the authority hierarchy, detailed definitions, and examples.
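Constraint 3 (artifacts versioned and delivered together with the change) reduces to a simple manifest gate. A minimal sketch; the artifact file names are hypothetical conventions, not prescribed by ACD:

```python
# Illustrative sketch: verify a change delivers all six first-class artifacts.
# The paths are hypothetical repository conventions.
REQUIRED_ARTIFACTS = {
    "intent.md",         # Intent Description (human-owned)
    "behavior.feature",  # User-Facing Behavior (BDD scenarios)
    "feature.md",        # Feature Description (architecture, trade-offs)
    "tests/",            # Executable Truth (automated tests)
    "src/",              # Implementation
    "constraints.md",    # System Constraints (global invariants)
}


def missing_artifacts(delivered: set[str]) -> set[str]:
    """Return required artifacts the change failed to deliver.

    An empty set means the change passes the gate."""
    return REQUIRED_ARTIFACTS - delivered
```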

The ACD Workflow

When an AI agent contributes to a CD pipeline, the workflow extends the standard pipeline:

| Stage | Actor | Activity |
| --- | --- | --- |
| Intent Definition | Human | Define Intent Description (why the change exists) |
| Behavior Specification | Human | Define User-Facing Behavior (BDD scenarios, the functional tests) |
| Architecture Specification | Human | Define Feature Description (architecture, constraints, performance budgets) |
| Acceptance Criteria | Human | Define acceptance criteria for non-functional tests (latency thresholds, security requirements, resource limits) |
| Test Generation | Agent | Generate test code from Behavior Specification, Architecture Specification, and Acceptance Criteria |
| Test Validation | Human → Agent | Validate test code is decoupled from implementation and faithful to specs |
| Implementation | Agent | Generate implementation |
| Pipeline Verification | Pipeline | Validate implementation against executable truth (automated tests) |
| Code Review | Human → Agent | Review implementation (code review) |
| Deployment | Pipeline | Deploy (same pipeline as any other change) |

Behavior Specification, Architecture Specification, and Acceptance Criteria together define the complete Executable Truth specification. Behavior Specification covers what the user experiences (BDD scenarios become the functional tests). Architecture Specification and Acceptance Criteria cover what the system must satisfy beyond user-visible behavior: performance budgets, security constraints, architectural boundaries, and operational requirements.

Key differences from standard CD:

  • The four specification stages (Intent Definition through Acceptance Criteria) happen before any code generation. Specification-first should already be standard practice without agents. Every downstream stage - Test Generation, Implementation, Code Review, and Deployment - depends on the quality of these specifications. With agents, that dependency becomes absolute: an agent cannot compensate for missing or ambiguous specifications the way a human sometimes can. This is not big upfront design. You specify the next small step, not the entire feature set. See Agent-Assisted Specification for how agents make this work fast.
  • Test Generation and Test Validation separate test definition from test code. Teams often conflate the two because they happen at the same time, but they are distinct activities. Defining tests means deciding what scenarios, edge cases, and acceptance criteria to verify. Test code is the machine-runnable implementation of those decisions. Humans define the tests before development begins. Agents generate the test code, which must be validated for behavior focus and spec fidelity before implementation starts.
  • System constraints are checked automatically in the pipeline during Pipeline Verification. This is standard CD practice. The difference is that agents require these constraints to be stated explicitly as artifacts rather than carried as team knowledge.
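One concrete form of Test Validation is a mechanical decoupling check: generated test code should exercise the specified behavior through public interfaces, not reach into implementation internals. A minimal sketch, where the off-limits module prefixes are a hypothetical project convention:

```python
# Illustrative sketch: flag generated test code that imports implementation
# internals instead of testing through the public interface.
# The module prefixes are hypothetical project conventions.
import ast

INTERNAL_PREFIXES = ("myapp.internal", "myapp.db")  # assumed off-limits to tests


def coupled_imports(test_source: str) -> list[str]:
    """Return imports in test code that reach into implementation internals."""
    tree = ast.parse(test_source)
    hits = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            hits += [a.name for a in node.names
                     if a.name.startswith(INTERNAL_PREFIXES)]
        elif isinstance(node, ast.ImportFrom) and node.module:
            if node.module.startswith(INTERNAL_PREFIXES):
                hits.append(node.module)
    return hits
```

A check like this catches only structural coupling; fidelity to the specification still needs review, by a human or an expert validation agent.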

Migrating Test Validation and Code Review to expert agents

Manual review at Test Validation and Code Review is a deliberate interim state, not the design. Every manual validation creates a batching point - failures become harder to trace, feedback loops extend, and unvalidated changes accumulate. When agents generate changes faster than humans review them, wait time dominates the delivery cycle.

The target state replaces manual review with expert validation agents using the same replacement cycle used throughout the CD migration:

| Stage | Starting State | Target State |
| --- | --- | --- |
| Test Validation | Human validates test code | Expert agent validates test code is decoupled from implementation and faithful to specs; human reviews exceptions |
| Code Review | Human reviews implementation | Expert agent validates architectural conformance and intent alignment; human reviews agent-flagged concerns |

  1. Start with human validation only (the workflow as shown above)
  2. Deploy an expert agent that runs in parallel with the human reviewer
  3. Compare results until you are confident the agent matches or exceeds human judgment
  4. Shift the human role from “review everything” to “review what the agent flags and spot-check according to risk”
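Step 3 needs a concrete agreement measure for the parallel-run phase. A minimal sketch, assuming paired accept/reject verdicts from the human reviewer and the candidate expert agent; the 0.95 threshold is a hypothetical team policy, not a standard:

```python
# Illustrative sketch: measure agent/human agreement during the parallel run.
# The threshold is a hypothetical team policy.
def agreement_rate(paired_verdicts: list[tuple[bool, bool]]) -> float:
    """Fraction of reviews where agent and human reached the same verdict.

    Each pair is (human_verdict, agent_verdict)."""
    if not paired_verdicts:
        return 0.0
    agree = sum(1 for human, agent in paired_verdicts if human == agent)
    return agree / len(paired_verdicts)


def ready_to_shift(paired_verdicts: list[tuple[bool, bool]],
                   threshold: float = 0.95) -> bool:
    """Shift humans to exception review once agreement is sustained."""
    return agreement_rate(paired_verdicts) >= threshold
```

Disagreements are as informative as the rate itself: each one is either a gap in the agent's validation rules or a specification ambiguity worth fixing upstream.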

What does not migrate: The four specification stages (Intent Definition through Acceptance Criteria) remain human responsibilities. Defining intent, specifying behavior, documenting architecture, and setting acceptance criteria require judgment about what matters to the business and the user. Agents validate whether specifications are met. Humans decide what the specifications should be.

See Pipeline Enforcement and Expert Agents for the full set of expert agents and how to adopt them.


Content contributed by Michael Kusters and Bryan Finster. Image contributed by Scott Prugh.


AI Adoption Roadmap

A prescriptive guide for incorporating AI into your delivery process safely - remove friction and add safety before accelerating with AI coding.

Agent-Assisted Specification

How to use agents as collaborators during specification and why small-scope specification is not big upfront design.

The Six First-Class Artifacts

Detailed definitions and examples for the six artifacts that agents and humans must maintain in an ACD pipeline.

Pipeline Enforcement and Expert Agents

How quality gates enforce ACD constraints and how expert validation agents extend the pipeline beyond standard tooling.

Pitfalls and Metrics

Common failure modes when adopting ACD and the metrics that tell you whether it is working.

Tokenomics: Optimizing Token Usage in Agent Architecture

How to architect agents and code to minimize unnecessary token consumption without sacrificing quality or capability.

Small-Batch Agent Sessions

How to structure agent sessions so context stays manageable, commits stay small, and the pipeline stays green.

Recommended Agent Configuration for Coding and Review

A recommended orchestrator, agent, and sub-agent configuration for coding and pre-commit review, with rules, skills, and hooks mapped to the defect sources catalog.