AI Adoption Roadmap
A prescriptive guide for incorporating AI into your delivery process safely - remove friction and add safety before accelerating with AI coding.
Agentic continuous delivery (ACD) defines the additional constraints and artifacts needed when AI agents contribute to the delivery pipeline. The pipeline must handle agent-generated work with the same rigor applied to human-generated work, and in some cases, more rigor. These constraints assume the team already practices continuous delivery. Without that foundation, the agentic extensions have nothing to extend.
An agent-generated change must meet or exceed the same quality bar as a human-generated change. The pipeline does not care who wrote the code. It cares whether the code is correct, tested, and safe to deploy.
ACD is the application of continuous delivery in environments where software changes are proposed by agents. It exists to reliably constrain agent autonomy without slowing delivery.
Without additional artifacts beyond what human-driven CD requires, agent-generated code accumulates drift, quality issues, and technical debt faster than teams can detect it. By the time test coverage gaps or architectural drift surface in production incidents, the accumulated debt is too large to address incrementally. Six first-class artifacts and eight constraints address this.
Agents introduce unique challenges that require additional constraints beyond what standard CD demands.
Before jumping into agentic workflows, ensure your team has the prerequisite delivery practices in place. The AI Adoption Roadmap provides a step-by-step sequence: quality tools, clear requirements, hardened guardrails, and reduced delivery friction, all before accelerating with AI coding.
ACD extends MinimumCD with a set of additional constraints.
These constraints are not optional practices to adopt piecemeal. They describe the minimum conditions required to sustain delivery pace once agents are making changes to the system.
Every first-class artifact is part of the delivery contract, not a convenience. Agents may read any or all artifacts. Agents may generate some artifacts. Agents may not redefine the authority of any artifact. Humans own the accountability.
These artifacts are intentionally overlapping in content but non-overlapping in authority. When an agent detects a conflict between artifacts, it cannot resolve that conflict by modifying the artifact it does not own. See The Six First-Class Artifacts for the authority hierarchy, detailed definitions, and examples.
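The ownership rule can be sketched as a small lookup. The artifact names, owners, and conflict-resolution logic below are illustrative assumptions, not the canonical ACD hierarchy:

```python
# Illustrative sketch: each artifact has a single owner. Agents may read
# anything but may only modify artifacts they own; names are hypothetical.
ARTIFACT_OWNERS = {
    "intent_description": "human",
    "behavior_specification": "human",
    "architecture_specification": "human",
    "acceptance_criteria": "human",
    "test_code": "agent",
    "implementation": "agent",
}

def may_modify(actor: str, artifact: str) -> bool:
    """An actor may only modify an artifact it owns."""
    return ARTIFACT_OWNERS[artifact] == actor

def resolve_conflict(actor: str, a: str, b: str) -> str:
    """On a conflict between two artifacts, the agent may only revise the
    one it owns. If it owns neither, the conflict escalates to a human."""
    modifiable = [art for art in (a, b) if may_modify(actor, art)]
    if modifiable:
        return f"revise {modifiable[0]}"
    return "escalate to human"
```

For example, an agent that finds its test code contradicting the Behavior Specification may revise only the test code; a contradiction between two human-owned specifications escalates.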
When an AI agent contributes to a CD pipeline, the workflow extends the standard pipeline:
| Stage | Actor | Activity |
|---|---|---|
| Intent Definition | Human | Define Intent Description (why the change exists) |
| Behavior Specification | Human | Define User-Facing Behavior (BDD scenarios, the functional tests) |
| Architecture Specification | Human | Define Feature Description (architecture, constraints, performance budgets) |
| Acceptance Criteria | Human | Define acceptance criteria for non-functional tests (latency thresholds, security requirements, resource limits) |
| Test Generation | Agent | Generate test code from Behavior Specification, Architecture Specification, and Acceptance Criteria |
| Test Validation | Human → Agent | Validate test code is decoupled from implementation and faithful to specs |
| Implementation | Agent | Generate implementation |
| Pipeline Verification | Pipeline | Validate implementation against executable truth (automated tests) |
| Code Review | Human → Agent | Review implementation (code review) |
| Deployment | Pipeline | Deploy (same pipeline as any other change) |
Behavior Specification, Architecture Specification, and Acceptance Criteria together define the complete Executable Truth specification. Behavior Specification covers what the user experiences (BDD scenarios become the functional tests). Architecture Specification and Acceptance Criteria cover what the system must satisfy beyond user-visible behavior: performance budgets, security constraints, architectural boundaries, and operational requirements.
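As an illustration of how these specifications become executable truth, the sketch below pairs a functional test derived from a BDD scenario with a non-functional test derived from acceptance criteria. The `checkout_total` function, the scenario, and the latency budget are all hypothetical:

```python
import time

# Hypothetical system under test.
def checkout_total(prices, discount=0.0):
    return round(sum(prices) * (1 - discount), 2)

# Functional test from a Behavior Specification scenario:
# "Given a cart of 10.00 and 5.00 with a 10% discount, the total is 13.50."
def test_discount_applied():
    assert checkout_total([10.00, 5.00], discount=0.10) == 13.50

# Non-functional test from Acceptance Criteria (assumed budget):
# "checkout_total must complete within 50 ms for a 1,000-item cart."
def test_latency_budget():
    start = time.perf_counter()
    checkout_total([1.00] * 1000)
    elapsed_ms = (time.perf_counter() - start) * 1000
    assert elapsed_ms < 50

test_discount_applied()
test_latency_budget()
```

Both tests assert against the specifications, not the implementation, so an agent can regenerate the implementation freely as long as the pipeline stays green.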
Key differences from standard CD:
Manual review at Test Validation and Code Review is a deliberate interim state, not the design. Every manual validation creates a batching point - failures become harder to trace, feedback loops extend, and unvalidated changes accumulate. When agents generate changes faster than humans review them, wait time dominates the delivery cycle.
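The arithmetic behind that bottleneck can be illustrated with assumed throughput numbers (these are not measurements):

```python
# Assumed rates: agents propose changes faster than humans review them.
agent_changes_per_day = 20
human_reviews_per_day = 8

backlog = 0
for day in range(1, 6):
    backlog += agent_changes_per_day - human_reviews_per_day
    # By Little's Law, average wait ~= backlog / review throughput.
    wait_days = backlog / human_reviews_per_day
    print(f"day {day}: backlog={backlog}, avg wait ~= {wait_days:.1f} days")
```

With these rates the backlog grows by 12 changes per day; after one week of this, unreviewed work dominates cycle time, which is why the review step is the first place automation pays off.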
The target state replaces manual review with expert validation agents using the same replacement cycle used throughout the CD migration:
| Stage | Starting State | Target State |
|---|---|---|
| Test Validation | Human validates test code | Expert agent validates test code is decoupled from implementation and faithful to specs; human reviews exceptions |
| Code Review | Human reviews implementation | Expert agent validates architectural conformance and intent alignment; human reviews agent-flagged concerns |
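A minimal sketch of the target-state gate, assuming hypothetical check names and change metadata: the expert agent runs its checks, and only flagged concerns route to a human.

```python
# Hypothetical review gate: the expert agent evaluates the change and a
# human sees only the exceptions. Check names are illustrative.
def expert_agent_review(change: dict) -> list:
    """Return a list of concerns; an empty list means the change passes."""
    concerns = []
    if change.get("tests_reference_internals", False):
        concerns.append("test code is coupled to implementation details")
    if not change.get("matches_architecture", True):
        concerns.append("change violates an architectural boundary")
    if not change.get("intent_linked", True):
        concerns.append("change is not traceable to an Intent Description")
    return concerns

def review_gate(change: dict) -> str:
    concerns = expert_agent_review(change)
    if concerns:
        return "human review: " + "; ".join(concerns)
    return "auto-approve"
```

The key property is that the human workload scales with the exception rate, not with the change rate.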
What does not migrate: The four specification stages (Intent Definition through Acceptance Criteria) remain human responsibilities. Defining intent, specifying behavior, documenting architecture, and setting acceptance criteria require judgment about what matters to the business and the user. Agents validate whether specifications are met. Humans decide what the specifications should be.
See Pipeline Enforcement and Expert Agents for the full set of expert agents and how to adopt them.
Content contributed by Michael Kusters and Bryan Finster. Image contributed by Scott Prugh.
- How to use agents as collaborators during specification and why small-scope specification is not big upfront design.
- Detailed definitions and examples for the six artifacts that agents and humans must maintain in an ACD pipeline.
- How quality gates enforce ACD constraints and how expert validation agents extend the pipeline beyond standard tooling.
- Common failure modes when adopting ACD and the metrics that tell you whether it is working.
- How to architect agents and code to minimize unnecessary token consumption without sacrificing quality or capability.
- How to structure agent sessions so context stays manageable, commits stay small, and the pipeline stays green.
- A recommended orchestrator, agent, and sub-agent configuration for coding and pre-commit review, with rules, skills, and hooks mapped to the defect sources catalog.