If you’ve ever used AI to write code, you’ve probably been through this: you ask for something, the AI delivers something that looks correct, but when you test it… it’s not quite right. The code compiles, the variable names make sense, but the behavior isn’t what you expected. This has a name: vibe coding.
In this article I’ll introduce Spec-Driven Development (SDD), an approach that changes the way we work with AI agents, moving from “just go and do it” to a process where specifications are the contract the AI follows.
What is Vibe Coding?#
Vibe coding is when you give a vague instruction to the AI and hope it understands what you want. It works for simple things, but in real projects it snowballs:
- Code that looks correct but doesn’t meet the requirement
- Architecture decisions made randomly
- No tests, no validation, no traceability
- Each session with the AI starts from scratch, without context
The reality is that devs already use AI in about 60% of their work, but fully delegate only 0-20% of tasks, according to Anthropic’s 2026 Agentic Coding Trends Report. This happens precisely because there’s no clear contract between what we want and what the AI produces.
SDD: Specifications as the Source of Truth#
Spec-Driven Development changes the game. Instead of “code is the source of truth”, we shift to: “intent is the source of truth, with AI making specifications executable.”
In practice, this means producing artifacts (PRD, architecture, stories) before writing code. These artifacts form a contract that the AI agent follows.
The Golden Rules#
These are principles that, regardless of project or stack, make a difference when working with AI:
| # | Rule | Why |
|---|---|---|
| 1 | Specs before code | AI without specs produces plausible but wrong code |
| 2 | Research before implementing | Agent memory can be outdated. Official docs are current |
| 3 | Quality gates before commit | Code review + tests + build prevent silent regressions |
| 4 | Living specs | Specs updated after implementation, not just before |
| 5 | Decisions become rules | A finding that lives only in a document is wasted. It must become a rule or a story |
| 6 | Tests are AC contracts | Each acceptance criterion maps to at least one test |
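Rule 6 is the easiest one to check mechanically. Here is a minimal sketch, assuming each test declares which acceptance criteria it covers. The id scheme (`FR-001/AC-1`) and the function name `uncovered_criteria` are my own illustration, not something any of the frameworks prescribe.

```python
from typing import Dict, List, Set


def uncovered_criteria(
    criteria: List[str], tests: Dict[str, Set[str]]
) -> List[str]:
    """Return acceptance-criterion ids that no test claims to cover.

    `criteria` lists AC ids from the spec; `tests` maps a test name to the
    set of AC ids it exercises. Both shapes are illustrative assumptions.
    """
    covered: Set[str] = set()
    for ac_ids in tests.values():
        covered |= ac_ids
    # Keep spec order so the report reads like the PRD.
    return [ac for ac in criteria if ac not in covered]


if __name__ == "__main__":
    criteria = ["FR-001/AC-1", "FR-001/AC-2", "FR-001/AC-3"]
    tests = {
        "test_register_valid_user": {"FR-001/AC-1"},
        "test_register_existing_email": {"FR-001/AC-2"},
    }
    print(uncovered_criteria(criteria, tests))
```

A non-empty result means some criterion has no test yet, which is a reasonable condition for failing a quality gate.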
The Lifecycle: 6 Phases#
SDD is not a waterfall. It’s a cycle where each phase feeds the next and feedback loops back to update the specs.
Let’s go through each phase:
Phase 1: Analysis (Understand the Problem)#
Here we define vision, personas, success metrics, and domain constraints. The output includes documents like the Product Brief and research reports.
Sounds like bureaucracy? It’s actually the opposite: without this, the AI will make up requirements on its own.
Phase 2: Planning (Define the Solution)#
The PM agent produces the PRD (Product Requirements Document) with functional and non-functional requirements. Each functional requirement has acceptance criteria in Given/When/Then format:
## FR-001: User Registration
**Priority:** P0 (Critical)
**Acceptance Criteria:**
- GIVEN a new user, WHEN they submit valid email + password,
THEN account is created and verification email sent
- GIVEN an existing email, WHEN registration attempted,
THEN generic error shown (anti-enumeration)
- GIVEN OAuth provider selected, WHEN authorization succeeds,
THEN account created/linked automatically
This format is great because each AC translates directly into an automated test.
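To make that concrete, here is a sketch of the first two criteria of FR-001 written as pytest-style tests. `RegistrationService` is a hypothetical in-memory stand-in that I made up so the example runs on its own; a real project would test its actual registration code.

```python
from typing import Dict, List


class RegistrationService:
    """In-memory stand-in for the real registration code (hypothetical)."""

    GENERIC_ERROR = "Unable to complete registration."

    def __init__(self) -> None:
        self.accounts: Dict[str, str] = {}
        self.sent_emails: List[str] = []

    def register(self, email: str, password: str) -> str:
        if email in self.accounts:
            # Anti-enumeration: never reveal that the email already exists.
            return self.GENERIC_ERROR
        self.accounts[email] = password  # a real system would hash this
        self.sent_emails.append(email)
        return "created"


def test_new_user_gets_account_and_verification_email() -> None:
    # GIVEN a new user
    service = RegistrationService()
    # WHEN they submit valid email + password
    result = service.register("ana@example.com", "s3cret!")
    # THEN account is created and verification email sent
    assert result == "created"
    assert "ana@example.com" in service.accounts
    assert service.sent_emails == ["ana@example.com"]


def test_existing_email_gets_generic_error() -> None:
    # GIVEN an existing email
    service = RegistrationService()
    service.register("ana@example.com", "s3cret!")
    # WHEN registration attempted
    result = service.register("ana@example.com", "other")
    # THEN generic error shown (anti-enumeration)
    assert result == RegistrationService.GENERIC_ERROR
    assert "ana@example.com" not in result
    assert service.sent_emails == ["ana@example.com"]
```

The Given/When/Then comments mirror the wording of the spec, so a reviewer can match each test to its criterion at a glance.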
Phase 3: Solutioning (Design the Architecture)#
The Architect agent makes technical decisions documented as ADRs (Architecture Decision Records), a format proposed by Michael Nygard in 2011. Each ADR records the context, options considered, decision, rationale, and consequences:
## ADR-001: Authentication Strategy
**Status:** Accepted
**Context:** Need stateless auth for multi-platform
**Options:** JWT + refresh tokens | Session-based | OAuth-only
**Decision:** Session cookies (web) + Bearer tokens (mobile/API)
**Rationale:** Cookies are more secure for web; Bearer tokens for mobile
**Consequences:** Dual auth flow, CORS configuration, token rotation
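If you want agents to produce ADRs in a consistent shape, one option is to model the record as data and render it. The sketch below assumes the field names from the example above; the `ADR` class and the list of allowed statuses are my own illustration, not part of BMAD or of Nygard's original format.

```python
from dataclasses import dataclass
from typing import List

# Assumed status vocabulary; adjust to whatever your project uses.
ALLOWED_STATUSES = {"Proposed", "Accepted", "Deprecated", "Superseded"}


@dataclass
class ADR:
    number: int
    title: str
    status: str
    context: str
    options: List[str]
    decision: str
    rationale: str
    consequences: List[str]

    def __post_init__(self) -> None:
        # Reject unknown statuses early so typos don't reach the docs.
        if self.status not in ALLOWED_STATUSES:
            raise ValueError(f"unknown ADR status: {self.status!r}")

    def to_markdown(self) -> str:
        """Render the record with the same fields as the example above."""
        return "\n".join(
            [
                f"## ADR-{self.number:03d}: {self.title}",
                f"**Status:** {self.status}",
                f"**Context:** {self.context}",
                f"**Options:** {' | '.join(self.options)}",
                f"**Decision:** {self.decision}",
                f"**Rationale:** {self.rationale}",
                f"**Consequences:** {', '.join(self.consequences)}",
            ]
        )
```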
Before coding starts, a readiness validation checks that PRD, UX, architecture, and stories are aligned.
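Part of that alignment check can be automated. The sketch below covers only one slice of it, the link between PRD requirements and stories, and assumes you can extract requirement ids and a story-to-requirement mapping from your artifacts. The function name `readiness_gaps` and the id formats are hypothetical.

```python
from typing import Dict, List


def readiness_gaps(
    requirement_ids: List[str], stories: Dict[str, List[str]]
) -> List[str]:
    """List mismatches between PRD requirements and stories.

    `stories` maps a story id to the requirement ids it implements.
    An empty result means this slice of the readiness check passes.
    """
    known = set(requirement_ids)
    referenced = {fr for frs in stories.values() for fr in frs}
    gaps: List[str] = []
    # Every requirement should be picked up by at least one story.
    for fr in requirement_ids:
        if fr not in referenced:
            gaps.append(f"{fr} has no story")
    # Every story should point at requirements that exist in the PRD.
    for story_id, frs in stories.items():
        for fr in frs:
            if fr not in known:
                gaps.append(f"{story_id} references unknown {fr}")
    return gaps
```

UX and architecture alignment still need a human or an agent to read the documents; this only catches missing or dangling references.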
Phase 4: Implementation (Hands on Code)#
This is where the code comes in, but in a structured way:
- The Scrum Master agent creates the story file with full context
- The Architect agent validates the Definition of Ready (mandatory)
- The Dev agent implements using TDD: RED -> GREEN -> REFACTOR per AC
- Adversarial, multi-perspective code review
- Quality gates: tests + build + type-check
- Commit, update status, sync artifacts
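The quality-gate step is simple to script. Here is a minimal sketch of a gate runner that stops at the first failure; the `run_gates` name and the idea of passing each gate as a callable are my assumptions, chosen so the example doesn't depend on any particular build tool.

```python
from typing import Callable, List, Tuple

# A gate is a label plus a zero-argument check that returns True on success.
Gate = Tuple[str, Callable[[], bool]]


def run_gates(gates: List[Gate]) -> Tuple[bool, List[str]]:
    """Run gates in order and stop at the first failure.

    Returns (all_passed, log) where log has one line per gate that ran.
    """
    log: List[str] = []
    for name, check in gates:
        if check():
            log.append(f"PASS {name}")
        else:
            log.append(f"FAIL {name}")
            # Fail fast: later gates are skipped so the commit is blocked.
            return False, log
    return True, log


if __name__ == "__main__":
    # Placeholder checks; a real project would run its own commands here.
    gates: List[Gate] = [
        ("tests", lambda: True),
        ("build", lambda: True),
        ("type-check", lambda: True),
    ]
    passed, log = run_gates(gates)
    print("\n".join(log))
    print("ready to commit" if passed else "commit blocked")
```

In practice each check would wrap something like `subprocess.run(...)` on your project's test, build, and type-check commands and return whether the exit code was zero.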
Phase 5: Review (Verify Quality)#
Code review from multiple perspectives: architecture, security, logic, performance, style, accessibility. This is not a quick glance but a proper adversarial analysis.
Phase 6: Sync (Close the Loop)#
This is the phase everyone ignores, but it’s crucial: update sprint status, route maps, architecture docs, and the specs themselves. Specs are living documents: they evolve with the implementation.
SDD Frameworks: Who Does What#
Several frameworks implement SDD in practice. The main ones are:
- BMAD (Breakthrough Method for Agile AI-Driven Development) — simulates a full agile team with multiple specialized AI agents. The most comprehensive, covering the entire lifecycle.
- GitHub Spec-Kit — GitHub’s toolkit with a 4-phase process (Specify → Plan → Tasks → Implement). Great for greenfield projects.
- OpenSpec — lightweight framework focused on brownfield projects. Uses a delta approach (ADDED, MODIFIED, REMOVED) to manage incremental changes.
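To give a feel for the delta idea mentioned for OpenSpec, here is a rough sketch of applying ADDED, MODIFIED, and REMOVED sections to a spec. This is not OpenSpec's actual file format or tooling: the dictionary shapes and the `apply_delta` function are assumptions I made purely to illustrate the concept.

```python
from typing import Dict


def apply_delta(
    spec: Dict[str, str], delta: Dict[str, Dict[str, str]]
) -> Dict[str, str]:
    """Return a new spec with a delta applied; the input is not mutated.

    `spec` maps requirement id to text. `delta` may contain the keys
    ADDED and MODIFIED (id -> new text) and REMOVED (id -> reason).
    """
    updated = dict(spec)
    for req_id, text in delta.get("ADDED", {}).items():
        if req_id in updated:
            raise ValueError(f"{req_id} already exists, use MODIFIED")
        updated[req_id] = text
    for req_id, text in delta.get("MODIFIED", {}).items():
        if req_id not in updated:
            raise ValueError(f"{req_id} does not exist, use ADDED")
        updated[req_id] = text
    for req_id in delta.get("REMOVED", {}):
        if req_id not in updated:
            raise ValueError(f"{req_id} does not exist, cannot remove")
        del updated[req_id]
    return updated
```

The appeal for brownfield work is that each change describes only what differs, so the existing spec never has to be rewritten wholesale.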
I mention all three because they help frame the space. But to keep the explanation concrete, I’ll use BMAD as the main example from here on. It’s the easiest one for showing roles, handoffs, and the full delivery flow. The same ideas around specs, readiness, guided implementation, and review still apply when you use Spec-Kit, OpenSpec, or a similar workflow.
BMAD as a Concrete Example#
BMAD simulates a full agile team using specialized AI agents. We’re not talking about real people: each “team member” is an AI agent that takes on a specific role. When I say “the Architect validates”, it’s the AI agent in the Architect role doing that.
Each task type routes to the right AI agent:
| Task | AI Agent | When to use |
|---|---|---|
| Research, discovery | Analyst agent | New domain, market analysis |
| Requirements, PRD | PM agent | Feature definition, FR/NFR creation |
| Technical design | Architect agent | ADRs, system design, DB schema |
| UX/Design | UX Designer agent | Wireframes, components, interactions |
| Sprint planning | Scrum Master agent | Story creation, sequencing |
| Implementation | Dev agent | Story execution, coding, tests |
| Quality | QA agent | Test strategy, automation |
| Small fix (< 3 files) | Quick Flow | Bug fixes, one-liners |
| Code review | Reviewer agent | Pre-commit adversarial review |
The routing works intuitively: find the task type in the first column and hand the work to the agent next to it.
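The table can be read as a plain lookup. Here is a small sketch of that idea; the task-type labels used as keys and the `route` function are hypothetical, and BMAD itself does the routing through its agents and commands rather than through code like this.

```python
from typing import Dict

# Task type -> agent, taken from the table above (keys are my own labels).
ROUTES: Dict[str, str] = {
    "research": "Analyst agent",
    "requirements": "PM agent",
    "technical design": "Architect agent",
    "ux": "UX Designer agent",
    "sprint planning": "Scrum Master agent",
    "implementation": "Dev agent",
    "quality": "QA agent",
    "code review": "Reviewer agent",
}

QUICK_FLOW_MAX_FILES = 3  # the table's "< 3 files" threshold


def route(task_type: str, files_touched: int = 0) -> str:
    """Pick the agent for a task; small fixes take the Quick Flow path."""
    if task_type == "fix":
        # Assumption: fixes that grow past the threshold go to the Dev agent.
        if files_touched < QUICK_FLOW_MAX_FILES:
            return "Quick Flow"
        return "Dev agent"
    try:
        return ROUTES[task_type]
    except KeyError:
        raise ValueError(f"no agent for task type: {task_type!r}") from None
```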
Greenfield vs. Brownfield#
BMAD has two main flows:
- Greenfield (new project): goes through all 6 phases, starting from analysis and discovery
- Brownfield (existing code): enters directly at Phase 2 or 3, since the code already exists and the project context can be extracted from the codebase itself
For small changes (bugs, simple features), there’s the Quick Flow: a simplified path where a single AI agent handles spec + dev + review, without going through all phases. This avoids unnecessary overhead for things that don’t need a PRD or architecture.
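The difference between the two main flows comes down to where you enter the lifecycle. A tiny sketch, assuming the six phase names used in this article; the `phases_for` function is hypothetical, and it leaves the choice between Phase 2 and Phase 3 for brownfield work to the caller, since that depends on the change.

```python
from typing import List

PHASES: List[str] = [
    "Analysis",
    "Planning",
    "Solutioning",
    "Implementation",
    "Review",
    "Sync",
]


def phases_for(kind: str, brownfield_entry: int = 2) -> List[str]:
    """Return the phases a project goes through, in order."""
    if kind == "greenfield":
        return list(PHASES)
    if kind == "brownfield":
        if brownfield_entry not in (2, 3):
            raise ValueError("brownfield work enters at Phase 2 or 3")
        # Phase numbers are 1-based, list indexes are 0-based.
        return PHASES[brownfield_entry - 1:]
    raise ValueError(f"unknown project kind: {kind!r}")
```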
Inside BMAD, the rule is: every code change uses a BMAD agent. In other frameworks, that same discipline shows up under different names and commands, but the core idea is the same: don’t jump into implementation without context, criteria, and validation.
The Project Constitution#
At the heart of SDD is a project constitution: a file such as AGENTS.md, CLAUDE.md, or the equivalent in whatever tool you use, holding the set of rules that is loaded into every agent session.
AGENTS.md
├── Project overview (what, why, tech stack)
├── Common commands (build, test, lint, dev)
├── Architecture overview (layers, patterns, constraints)
├── Golden rules (non-negotiable principles)
├── Design system rules (if UI project)
├── Language rules (code vs UI)
├── BMAD integration (phases, agents, commands)
└── References to detailed rules (progressive disclosure)
Some important tips for that constitution:
- Size limit: Keep it under 200 instructions (~40KB). LLMs tend to follow only about 150-200 instructions consistently
- Progressive disclosure: Task-specific knowledge goes into scoped rule files, not into the main constitution
- Positive instructions: Say what TO DO, not what to avoid. In general, rewriting negative rules as positive guidance reduces instruction drift
- Primacy/recency effect: Most frequently violated rules go at the very top and very bottom of the file
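Two of these tips are easy to lint for. The sketch below checks the size limit and flags negatively phrased rules. It rests on two loud assumptions of mine: that each bullet line counts as one instruction, and that a rule starting with "don't", "do not", "never", or "avoid" is a negative one. The `lint_constitution` function is hypothetical and the heuristic is crude.

```python
import re
from typing import List

MAX_INSTRUCTIONS = 200
MAX_BYTES = 40 * 1024  # the ~40KB limit from the tip above

BULLET = re.compile(r"^\s*[-*]\s+")
NEGATIVE = re.compile(
    r"^\s*[-*]\s+(don['’]t|do not|never|avoid)\b", re.IGNORECASE
)


def lint_constitution(text: str) -> List[str]:
    """Return warnings for an AGENTS.md-style file passed in as a string."""
    warnings: List[str] = []
    # Assumption: one bullet line equals one instruction.
    bullets = [line for line in text.splitlines() if BULLET.match(line)]
    if len(bullets) > MAX_INSTRUCTIONS:
        warnings.append(
            f"too many instructions: {len(bullets)} > {MAX_INSTRUCTIONS}"
        )
    size = len(text.encode("utf-8"))
    if size > MAX_BYTES:
        warnings.append(f"file too large: {size} bytes > {MAX_BYTES}")
    for line in bullets:
        if NEGATIVE.match(line):
            warnings.append(f"negative phrasing: {line.strip()}")
    return warnings
```

A check like this fits naturally next to the other quality gates, so the constitution doesn't quietly grow past the point where agents stop following it.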
Final thoughts#
SDD might seem like a lot of structure for someone used to just asking the AI to “do the thing”. But the truth is that the more complex the project, the more this structure pays off. According to the TDAD (Test-Driven Agentic Development) paper, TDD with agents reduces regressions by 70%. And according to Scrum.org, teams with a well-defined Definition of Ready/Done transition to AI workflows much more smoothly.
In the next article of this series, we’ll dive into the practical side: quality gates, testing strategy, code review, and the complete checklist to set up SDD on a new project.
If you’re getting started with AI-assisted development, my advice is: don’t try to implement everything at once. Start with the project constitution (AGENTS.md, CLAUDE.md, or equivalent), add basic quality gates (tests + build before each commit), and evolve as you feel the need.
References#
BMAD Method — Official repository.
GitHub Spec-Kit — Open source SDD toolkit.
OpenSpec — Lightweight SDD framework for brownfield projects.
Anthropic — 2026 Agentic Coding Trends Report.
TDAD: Test-Driven Agentic Development (arXiv 2603.17973).
