Pipeline Enforcement and Expert Agents
The pipeline is the enforcement mechanism for agentic continuous delivery (ACD). Standard quality gates handle mechanical checks. Expert validation agents handle the judgment calls that standard tools cannot make.
For the framework overview, see ACD. For the artifacts the pipeline enforces, see The Six First-Class Artifacts.
How Quality Gates Enforce ACD
The Pipeline Verification and Deployment stages of the ACD workflow are where the Pipeline Reference Architecture does the heavy lifting. Each pipeline stage enforces a specific ACD constraint:
- Pre-commit gates (linting, type checking, secret scanning, SAST) catch the mechanical errors agents produce most often: style violations, type mismatches, and accidentally embedded secrets. These run in seconds and give the agent immediate feedback.
- CI Stage 1 (build + unit tests) validates the executable truth artifact. If human-defined tests fail, the agent’s implementation is wrong regardless of how plausible the code looks.
- CI Stage 2 (contract + schema tests) enforces the system constraints artifact at integration boundaries. Agent-generated code is particularly prone to breaking implicit contracts between modules or services.
- CI Stage 3 (mutation testing, performance benchmarks, security integration tests) catches the subtle correctness issues that agents introduce: code that passes tests but violates non-functional requirements or leaves untested edge cases.
- Acceptance tests validate the user-facing behavior artifact in a production-like environment. This is where the BDD scenarios from Behavior Specification become automated verification.
- Production verification (canary deployment, health checks, SLO monitors with auto-rollback) provides the final safety net. If agent-generated code degrades production metrics, it rolls back automatically.
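The gate sequence above can be sketched as an ordered, fail-fast pipeline. This is an illustrative model, not part of the framework; the stage names follow the article, but the `Gate` type and the toy check functions are assumptions for demonstration.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Gate:
    name: str
    check: Callable[[str], bool]  # returns True if the change passes this gate

def run_pipeline(change: str, gates: list[Gate]) -> tuple[bool, list[str]]:
    """Run gates in order; stop at the first failure (fail fast)."""
    log: list[str] = []
    for gate in gates:
        if not gate.check(change):
            return False, log + [f"FAILED: {gate.name}"]
        log.append(gate.name)
    return True, log

# Gates in the order the article describes; checks are stand-ins.
gates = [
    Gate("pre-commit (lint, types, secrets, SAST)", lambda c: "secret" not in c),
    Gate("ci-1 (build + unit tests)", lambda c: True),
    Gate("ci-2 (contract + schema tests)", lambda c: True),
    Gate("ci-3 (mutation, perf, security)", lambda c: True),
    Gate("acceptance (BDD scenarios)", lambda c: True),
    Gate("production verification (canary, SLOs)", lambda c: True),
]

ok, log = run_pipeline("agent-generated change", gates)
```

The fail-fast ordering matters for agent feedback loops: the cheapest gates run first, so an agent gets a failure signal in seconds rather than after a full deployment cycle.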
The Pre-Feature Baseline
The pre-feature baseline lists nine gates that must be active before any feature work begins; they are a prerequisite for ACD. Unless all nine pass on every commit, agent-generated changes bypass the minimum safety net.
See the pipeline patterns for concrete architectures that implement these gates.
Expert Validation Agents
Standard quality gates cover what conventional tooling can verify: linting, type checking, test execution, vulnerability scanning. But ACD introduces validation needs that standard tools cannot address. No conventional tool can verify that test code faithfully implements a human-defined test specification. No conventional tool can verify that an agent-generated implementation matches the architectural intent in a feature description.
Expert validation agents fill this gap. These are AI agents dedicated to a specific validation concern, running as pipeline gates alongside standard tools:
| Expert Agent | What It Validates | Artifact It Enforces |
|---|---|---|
| Test fidelity agent | Test code exercises the scenarios, edge cases, and assertions defined in the test specification | Executable Truth |
| Implementation coupling agent | Test code verifies observable behavior, not internal implementation details | Executable Truth |
| Architectural conformance agent | Implementation follows the constraints in the feature description | Feature Description |
| Intent alignment agent | The combined change addresses the problem stated in the intent description | Intent Description |
| Constraint compliance agent | Code respects system constraints that static analysis cannot check | System Constraints |
Adopting Expert Agents: The Same Replacement Cycle
Expert validation agents are new automated checks. Adopt them using the same replacement cycle that drives every brownfield CD migration:
- Identify a manual validation currently performed by a human reviewer. For example, checking whether test code actually tests what the specification requires.
- Automate the check by deploying an expert agent as a pipeline gate. The agent runs on every change and produces a pass/fail result with reasoning.
- Validate by running the expert agent in parallel with the existing human review. Compare results across at least 20 review cycles. If the agent matches human decisions on 90%+ of cases and catches at least one issue the human missed, proceed to the removal step.
- Remove the manual check once the expert agent has proven at least as effective as the human review it replaces.
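The promotion decision in the validate step is simple enough to encode. A minimal sketch, assuming each review cycle yields a (human, agent) verdict pair and using the article's thresholds of 20 cycles and 90% agreement:

```python
def calibration_decision(pairs, min_cycles=20, min_agreement=0.9):
    """Decide whether an expert agent may replace the human review it shadows.

    `pairs` is a list of (human_passed, agent_passed) booleans, one per
    parallel review cycle. Promotion requires enough cycles, sufficient
    agreement, and at least one issue the agent caught (False) that the
    human review passed (True).
    """
    if len(pairs) < min_cycles:
        return False
    agreement = sum(h == a for h, a in pairs) / len(pairs)
    caught_extra = any(h and not a for h, a in pairs)
    return agreement >= min_agreement and caught_extra

# 19 matching verdicts plus one issue the agent caught that the human missed:
# 19/20 = 95% agreement, so the agent qualifies for promotion.
history = [(True, True)] * 19 + [(True, False)]
promote = calibration_decision(history)
```

Note that perfect agreement alone is not sufficient under this rule: an agent that only ever mirrors the human review has demonstrated parity, not the added value the removal step requires.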
Do not skip the parallel run. Expert validation agents need calibration. An agent that flags too many false positives trains the team to ignore it. An agent that misses real issues creates false confidence. The parallel run is where you tune the agent’s prompts, context, and thresholds until its judgment matches or exceeds human review.
Expert validation agents run on every change, immediately, eliminating the batching that manual review imposes. Humans steer; agents validate at pipeline speed.
With the pipeline and expert agents in place, the next question is what goes wrong and how to measure progress. See Pitfalls and Metrics.
Related Content
- ACD - the framework overview, eight constraints, and workflow
- The Six First-Class Artifacts - the artifacts the pipeline enforces
- Pipeline Reference Architecture - the full quality gate sequence
- Replacing Manual Validations - the replacement cycle for adopting automated checks
- Pitfalls and Metrics - what goes wrong and how to measure progress
- AI Adoption Roadmap - the prerequisite sequence, especially Harden Guardrails and Reduce Delivery Friction