Spec-Driven Development with AI
- Write a structured task spec using the four-element format before prompting
- Break a multi-part feature into AI-sized tasks appropriate for individual sessions
- Recognise when a task is too large or ambiguous and needs further decomposition
- Use spec writing as a design discipline to surface underspecified requirements early
Why Most AI Code Falls Short
When AI generates code that misses the mark, the cause is almost never the model itself. It is the prompt. Specifically: the prompt was ambiguous, missing constraints, or trying to do too much at once. The AI did what was asked; it just did not have enough information to answer correctly.
The fix is not a better model. It is a better spec.
Spec-driven development is a pattern popularised by developers like Addy Osmani (engineering lead at Google): before you write a single prompt asking AI to generate code, you write a short structured specification that defines exactly what you want built, what inputs and outputs look like, what constraints must be preserved, and what is out of scope. This two-minute investment consistently produces better output than an unstructured prompt — and it often reveals underspecified requirements before any code is touched.
What a Good Spec Contains
A spec for AI does not need to be formal or lengthy. It needs to answer four questions:
- What are we building? — A one-sentence description of the function, feature, or change. "A function that accepts a user ID and returns their order history from the last 30 days, paginated."
- What are the inputs and outputs? — Types, formats, edge cases. "Input: integer user ID and optional page number (default 1). Output: array of order objects, each with order_id, total, status, and created_at. Returns empty array if no orders."
- What constraints must be preserved? — Backward compatibility, performance requirements, architectural conventions the AI must respect. "Must use the existing OrderRepository pattern. No raw SQL. Must handle user IDs that do not exist without throwing."
- What is out of scope? — Explicitly stating what you do not want prevents AI from "helpfully" adding things you did not ask for. "Do not implement caching. Do not add new endpoints. Just the repository method."
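As a rough illustration, the four answers can be captured in a small structure and rendered as plain text to paste at the top of a prompt. The sketch below reuses the order-history example from the list above; the `TaskSpec` class, its field names, and the rendered layout are assumptions made for this example, not a standard format.

```python
from dataclasses import dataclass, field


@dataclass
class TaskSpec:
    """The four spec elements. Field names are illustrative, not a standard."""

    what: str
    inputs_outputs: str
    constraints: list[str] = field(default_factory=list)
    out_of_scope: list[str] = field(default_factory=list)

    def render(self) -> str:
        """Format the spec as plain text to paste at the top of a prompt."""
        lines = [
            f"What we are building: {self.what}",
            f"Inputs and outputs: {self.inputs_outputs}",
            "Constraints:",
            *[f"- {item}" for item in self.constraints],
            "Out of scope:",
            *[f"- {item}" for item in self.out_of_scope],
        ]
        return "\n".join(lines)


order_history_spec = TaskSpec(
    what=(
        "A function that accepts a user ID and returns their order history "
        "from the last 30 days, paginated."
    ),
    inputs_outputs=(
        "Input: integer user ID and optional page number (default 1). "
        "Output: array of order objects, each with order_id, total, status, "
        "and created_at. Returns empty array if no orders."
    ),
    constraints=[
        "Must use the existing OrderRepository pattern.",
        "No raw SQL.",
        "Must handle user IDs that do not exist without throwing.",
    ],
    out_of_scope=[
        "Do not implement caching.",
        "Do not add new endpoints.",
    ],
)

print(order_history_spec.render())
```

A plain text file works just as well; the structure only makes it harder to forget one of the four elements.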
Breaking a Feature Into AI-Sized Tasks
LLMs perform best when given one tight, well-scoped task at a time. The rule of thumb: if your spec would require explaining two unrelated things, split it into two prompts.
Here is how to break a realistic feature down. Suppose you are building a user notification preference system. A single AI session cannot do all of this reliably — the scope is too large and the parts are too interdependent. Break it into:
- Task 1: Database schema for notification preferences table (columns, constraints, indexes)
- Task 2: Repository methods — save preferences, load preferences for a user, reset to defaults
- Task 3: Service layer — validate input, call repository, emit change event
- Task 4: API endpoint — route, request validation, call service, return response
Each task can be spec'd independently and executed in its own clean session. The output of Task 2 informs the spec for Task 3, which informs Task 4. This sequential approach also means that if Task 2 produces something you want to change, you only need to redo Tasks 3 and 4, which depend on it, not the whole feature.
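The "redo only what depends on it" idea can be sketched as a small dependency map. This is a minimal sketch: the task labels are shortened from the list above, the `tasks_to_revisit` helper is hypothetical, and the link from Task 2 back to Task 1 is an assumption (the text only states that Task 2 informs Task 3, which informs Task 4).

```python
# Each task maps to the tasks whose output its spec depends on.
# Insertion order is the plan order.
DEPENDS_ON: dict[str, list[str]] = {
    "Task 1: schema": [],
    "Task 2: repository": ["Task 1: schema"],  # assumed dependency
    "Task 3: service": ["Task 2: repository"],
    "Task 4: API endpoint": ["Task 3: service"],
}


def tasks_to_revisit(changed: str, depends_on: dict[str, list[str]]) -> list[str]:
    """Return every task downstream of `changed`, in plan order.

    Assumes each task depends only on tasks listed before it, so a single
    pass over the plan is enough.
    """
    affected = {changed}
    for task, dependencies in depends_on.items():
        if any(dep in affected for dep in dependencies):
            affected.add(task)
    return [task for task in depends_on if task in affected and task != changed]


print(tasks_to_revisit("Task 2: repository", DEPENDS_ON))
```

Under these assumptions, changing the repository methods flags only the service layer and the API endpoint for another pass; the schema task is left alone.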
Spec Writing Is Also Design Thinking
One of the less obvious benefits of spec-driven development is that writing the spec forces you to think through the design before any code exists. Many developers have found that the act of writing "inputs: X, outputs: Y, constraints: Z" surfaces ambiguities and missing decisions that would otherwise only appear as bugs after the code was generated.
"The spec is where I find the holes in my own thinking, before I pay for them in rework." — common practitioner observation
If you cannot write a complete spec for a task, that is a signal the task is not ready to be implemented yet — regardless of whether AI or a human is doing the implementation.
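One way to make this readiness signal concrete is a simple check for unanswered elements. The sketch below is an assumption-heavy illustration: the dictionary keys and the `missing_elements` helper are invented for this example, and a non-empty answer is of course no guarantee of a good one.

```python
# Hypothetical keys for the four spec elements.
REQUIRED_ELEMENTS = ("what", "inputs_outputs", "constraints", "out_of_scope")


def missing_elements(spec: dict[str, str]) -> list[str]:
    """Return the spec elements that are absent or blank."""
    return [name for name in REQUIRED_ELEMENTS if not spec.get(name, "").strip()]


draft = {
    "what": "Repository method that resets a user's notification preferences to defaults.",
    "inputs_outputs": "Input: integer user ID. Output: the saved default preferences.",
    "constraints": "   ",  # not decided yet
    # "out_of_scope" has not been written at all
}

gaps = missing_elements(draft)
if gaps:
    print("Not ready to implement. Unanswered:", ", ".join(gaps))
else:
    print("Spec is complete enough to prompt with.")
```

Here the draft is flagged because the constraints and out-of-scope answers are missing, which is exactly the point at which to stop and make the design decision rather than start prompting.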
The Prompt Plan Pattern
For larger features that require multiple sessions, Osmani recommends maintaining a prompt plan file: a simple text file that lists the tasks in order, with the spec for each. You work through them one by one. This has two benefits: it keeps the work ordered and prevents you from jumping ahead, and it gives you a record of what you asked and what constraints you specified — useful context if you need to revisit or extend the feature later.
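A prompt plan can be as simple as an ordered list of tasks with their specs and a done marker. The sketch below shows one possible layout; the `[ ]` / `[x]` markers, the `PlanEntry` class, and both helper functions are assumptions for illustration, not a format prescribed by Osmani. The spec strings are abbreviated.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PlanEntry:
    """One task in the plan, with its spec text and completion state."""

    title: str
    spec: str
    done: bool = False


def render_plan(entries: list[PlanEntry]) -> str:
    """Render the plan as plain text, one numbered block per task."""
    blocks = []
    for number, entry in enumerate(entries, start=1):
        marker = "x" if entry.done else " "
        blocks.append(f"[{marker}] Task {number}: {entry.title}\n{entry.spec}")
    return "\n\n".join(blocks) + "\n"


def next_task(entries: list[PlanEntry]) -> Optional[PlanEntry]:
    """Return the first unfinished task, or None when the plan is complete."""
    return next((entry for entry in entries if not entry.done), None)


plan = [
    PlanEntry("Database schema", "Spec: preferences table columns, constraints, indexes.", done=True),
    PlanEntry("Repository methods", "Spec: save, load for a user, reset to defaults."),
    PlanEntry("Service layer", "Spec: validate input, call repository, emit change event."),
    PlanEntry("API endpoint", "Spec: route, request validation, call service, return response."),
]

print(render_plan(plan))
upcoming = next_task(plan)
print("Next up:", upcoming.title if upcoming else "nothing left")
```

Saving the rendered text to a file in the repository (for example with `pathlib.Path.write_text`) gives the record of tasks and constraints described above.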
- A two-minute spec investment typically saves far more time than it costs in downstream correction
- Good specs answer four questions: what to build, inputs/outputs, constraints, and what is out of scope
- One AI-sized task = one spec that does not need to explain two unrelated things
- If you cannot write a complete spec for a task, the task is not ready to be implemented yet
- A prompt plan file — tasks listed in order with their specs — keeps multi-session features coherent