I’m going to be honest with you before I say anything else.

We’re figuring this out too.

Not from scratch — 9 years and 120+ shipped products give you pattern recognition that matters. But anyone who tells you they’ve cracked the AI-native development workflow is either selling something or hasn’t been doing it long enough to hit the edge cases. The people I trust most on this are the ones who say the same thing: we’re better at it than we were yesterday, and we’re watching what works for other teams.

That’s the context for this post. Not a methodology handed down from a mountain. A practitioner’s field notes from a studio that decided, two years ago, to rebuild how it works from the ground up.

§

The old crime

For three decades, the software industry treated documentation as a deliverable.

You wrote the spec to get sign-off. You wrote the PRD to satisfy the process. You wrote the README when someone complained there wasn’t one. And the moment the sprint started, those documents began their slow, inevitable drift away from reality. Six weeks later, the spec said one thing, the code did another, and the only person who knew the truth was the developer who’d been staring at both.

This wasn’t laziness. It was structural. Documents and code lived in different worlds — different tools, different owners, different update cycles. The spec was a photograph of an intention. The code was the thing that actually happened. They were never going to stay in sync.

We accepted this. We called it “technical debt” and moved on.

§

What AI agents just broke open

Here’s the thing about AI agents that nobody warned me about clearly enough: they have no intuition.

A junior developer joining your team can pick up conventions by osmosis. They notice how you name things. They see patterns in the existing code and follow them. They ask questions when something doesn’t make sense.

An AI agent does none of this.

As McKinsey’s engineering team found when building agentic development workflows, every guideline must be explicit and machine-readable — because agents bring no background knowledge, no intuition from past projects. Feed an agent an ambiguous spec and it doesn’t ask a clarifying question. It fills the gap with its best guess. Sometimes that guess is good. Sometimes it builds you something technically correct that solves the wrong problem entirely.

The teams I’ve watched burn the most money on AI-assisted development share one pattern: they started with the models before they fixed the inputs. They pointed a capable agent at a vague brief and expected quality to emerge. It didn’t.

The teams doing it well started somewhere less exciting: they fixed their documents first.

§

The new artifact

What’s emerging — and “emerging” is the right word, because this isn’t settled — is a different kind of document.

Not a spec you write to communicate with humans, then file away. A structured artifact that serves two audiences simultaneously: a person reads it and understands intent; a machine reads it and knows exactly what to execute. It lives in Git. It has a defined state. It travels with the code it describes. When the intent changes, the artifact changes first — not after, not eventually, not “we should probably update the docs.”
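To make the idea concrete, here is a minimal sketch of what such an artifact might look like as a machine-readable object. The field names, states, and example values are illustrative assumptions, not a standard — the point is only that intent, scope, and a defined state live in one structure a program can check:

```python
# Illustrative sketch only: field names and states below are assumptions,
# not a standard. The idea is a spec that both a human and a program can read.
from dataclasses import dataclass, field
from enum import Enum


class SpecState(Enum):
    DRAFT = "draft"
    REVIEWED = "reviewed"
    APPROVED = "approved"
    SUPERSEDED = "superseded"


@dataclass
class SpecArtifact:
    title: str
    state: SpecState
    intent: str                      # the "why", written for humans
    out_of_scope: list[str] = field(default_factory=list)
    done_criteria: list[str] = field(default_factory=list)

    def is_executable(self) -> bool:
        """An agent should only act on an approved spec with defined 'done' criteria."""
        return self.state is SpecState.APPROVED and bool(self.done_criteria)


spec = SpecArtifact(
    title="Password reset flow",
    state=SpecState.APPROVED,
    intent="Let users recover accounts without contacting support",
    out_of_scope=["SSO account recovery"],
    done_criteria=["Reset email arrives within 60s", "Token expires after 15 min"],
)
print(spec.is_executable())  # True
```

Stored as a file in the repo, a structure like this versions with the code and gives an agent an unambiguous gate: no approved state and no done criteria, no execution.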

Gartner’s 2025 AI Hype Cycle named AI-native software engineering as a new category for the first time — describing the shift to practices where AI is embedded across every phase of the SDLC, not just coding assistance. One of the quiet prerequisites of that shift, barely mentioned in the headline findings, is this: you can’t embed AI across every phase if there’s nothing structured for it to read at each phase.

§

There is no right way. Yet.

This is the part I want to be careful about.

Every team I’ve spoken to that’s doing this seriously has arrived at a different shape. Different file formats, different state machines, different levels of structure, different opinions on what lives in the artifact versus what lives in the code comments versus what gets generated dynamically at runtime.

The thing they all agree on: the right level of structure depends on how much human judgment you’re keeping in the loop.

This is not a binary. It’s a dial.

At one end: a single developer working with an AI coding assistant, maintaining a lightweight spec in Markdown that mostly serves as their own thinking tool, with the AI reading it for context. The human is deeply in the loop. The spec can afford to be rough.

At the other end: an agentic pipeline where multiple specialised agents hand off work across phases — a requirements agent, an architecture agent, a coding agent, a testing agent — where the artifact is the only connective tissue between them. Here, every artifact needs a consistent structure, a defined state machine, and machine-readable metadata. The human-in-the-loop exists at review gates, not inside every step. The artifact has to carry everything the next agent needs.
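The handoff logic at this end of the dial can be sketched in a few lines. The phase names and transition rules here are assumptions for illustration — real pipelines will differ in shape — but the two properties the paragraph describes are visible: transitions are defined, and nothing advances without passing a review gate:

```python
# Sketch of a state machine for artifact handoffs between specialised agents.
# Phase names and transitions are illustrative assumptions, not a standard.
ALLOWED_TRANSITIONS = {
    "requirements": "architecture",
    "architecture": "coding",
    "coding": "testing",
    "testing": "done",
}


def hand_off(artifact: dict, human_approved: bool) -> dict:
    """Advance an artifact to the next phase, but only through a review gate."""
    if not human_approved:
        raise PermissionError(f"Review gate not passed at phase {artifact['phase']!r}")
    next_phase = ALLOWED_TRANSITIONS.get(artifact["phase"])
    if next_phase is None:
        raise ValueError(f"No transition defined from phase {artifact['phase']!r}")
    return {**artifact, "phase": next_phase}


artifact = {"id": "SPEC-042", "phase": "requirements"}
artifact = hand_off(artifact, human_approved=True)
print(artifact["phase"])  # architecture
```

The human-in-the-loop lives in that `human_approved` flag: inside the phases the agents run autonomously, but every transition is a checkpoint.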

Most teams are somewhere in between. And where you sit on that dial should drive the shape of your artifacts — not the other way around.

This is exactly what happened with Agile. Nobody ran “pure Scrum.” The principles stayed consistent. The implementation diverged in a thousand productive directions. Artifact-driven development is going the same way.

§

What a well-managed artifact actually unlocks

When a spec lives in Git, properly structured, it stops being documentation. It becomes the product’s memory.

For build agents: Not just “what to write” but what constraints apply, what the human already decided and why, what is explicitly out of scope, and what “done” looks like before a single test runs.

For test agents: The artifact defines what correct behaviour looks like. A test agent reading a well-formed spec doesn’t have to infer intent — it’s stated. Edge cases the team thought about are captured. The difference between a bug and a deliberate design choice is documented.

Spec Kit — GitHub’s open-source spec tooling — passed 50K+ GitHub stars in 2025, largely on the insight that treating specs as executable artifacts rather than throwaway docs gives AI coding assistants something predictable to execute against. The demand signal is real and accelerating.

For maintenance agents: This is the one most people miss. The product you’re maintaining 18 months from now is going to be touched by agents that have no memory of the conversation in which the original decisions were made. An artifact that captured intent — not just implementation — is the difference between a maintenance agent that understands what it’s looking at and one that’s pattern-matching on code it doesn’t contextually understand.

A codebase without living artifacts gets harder to maintain over time, whether a human or an agent is doing the maintaining. A codebase with living artifacts gets easier — because the knowledge compounds.
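One way the compounding shows up in practice: a test agent can derive its work items directly from the spec’s stated success conditions instead of inferring intent from the code. A rough sketch — the `done_criteria` field and the stub-naming scheme are assumptions, not a real tool’s API:

```python
# Illustrative only: how a test agent might consume a spec's stated success
# conditions rather than guessing intent from code. Field names are assumptions.
def derive_test_stubs(spec: dict) -> list[str]:
    """Turn each 'done' criterion into a named test stub the agent must fill in."""
    stubs = []
    for i, criterion in enumerate(spec.get("done_criteria", []), start=1):
        slug = "_".join(criterion.lower().split())[:50]
        stubs.append(f"test_{i:02d}_{slug}")
    return stubs


spec = {
    "title": "Password reset flow",
    "done_criteria": ["Reset email arrives quickly", "Token expires after use"],
}
for name in derive_test_stubs(spec):
    print(name)
```

Because the criteria live in the versioned artifact, the same derivation works 18 months later, for an agent that never saw the original conversation.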

§

What we’re doing at specshop.dev

When we rebuilt from the ground up, the spec-first model wasn’t a client-facing promise about reducing scope creep (though it does that). It was an engineering decision about where knowledge should live.

Every project starts with a spec. The spec is the first deliverable. It’s reviewed, iterated, and signed off before a line of code is written. It lives in the repo. When things change — and they always do — the spec changes first.


We built spec2web partly to operationalise this at the toolchain level. The ops layer around a project shouldn’t have to be rebuilt from scratch every time. The patterns for how artifacts are structured, versioned, and consumed by the tools we use should be reusable. They should get better over time.

Are we done? No. We’re better at this than we were six months ago. We’ll be better still in six months. The field is young enough that the most honest contribution anyone can make right now is to show their working.

§

The question worth sitting with

If an AI agent were added to your current project tomorrow — not to replace your team, just to help — what would it read to understand what you’re building and why?

If the answer is “the code, I guess” or “there’s a Google Doc somewhere from Q3”, that’s the gap.

The artifact isn’t the boring part of software development. In an AI-native workflow, it’s the foundation everything else stands on.

How you build that foundation is yours to figure out. There’s no certification, no framework to buy. Just the discipline of writing things down in a way that still makes sense six months later — to a human, to an agent, and to whoever picks this up after you.

§

Questions we get asked about this

What is a structured artifact in software development?
A structured artifact is a document that serves two audiences simultaneously: a person reads it and understands intent; a machine reads it and knows exactly what to execute. It lives in Git alongside the code it describes, has a defined state, and travels with the project it governs. When intent changes, the artifact changes first — not after, not eventually.
Why should software specs live in Git?
When a spec lives in Git, it stops being documentation and becomes the product’s memory. It versions with the code, so the intent behind any change is traceable. A spec filed in a shared drive drifts away from reality the moment a sprint starts. A spec in Git changes in the same commit stream as the code, under the same review process — so drift becomes a visible review failure rather than a silent default.
How does spec-first development help AI agents produce better output?
AI agents have no intuition. Unlike a human developer who asks a clarifying question when something doesn’t make sense, an agent fills ambiguity with its best guess. A structured spec eliminates the ambiguity. It defines scope, user intent, success conditions, and explicit out-of-scope boundaries before the agent receives a brief. The result is an agent that executes against a defined target rather than an assumed one.
What is the difference between documentation and a living artifact?
Documentation is a photograph of an intention — written to communicate, then filed away. A living artifact is a versioned, structured file that evolves with the product. The difference isn’t the format. It’s the ownership model: documentation is a deliverable; a living artifact is infrastructure.