In 2020 I sat through a SAFe training with a good instructor and walked away thinking: this is brilliant — for someone else.

We were running a 15–20 person product studio at the time. Dual-track agile. Dedicated discovery track running ahead of delivery. Product people — not just engineers and designers — deeply involved in shaping what got built before a line of code was written. We shipped well. Clients trusted us. The work was good.

SAFe felt like a framework designed for the org chart we’d never have. Agile Release Trains. Program Increment planning. Value streams. All of it made sense in the abstract and felt like overhead in practice. I didn’t finish the certification. I didn’t think I needed it.

What we mean by these terms

Spec-first development is a delivery methodology in which every user journey is mapped, every screen is defined, and human sign-off is obtained before any code is written or any AI agent is given a brief. The spec is the primary coordination artifact — not the working prototype.

A coordination layer is the set of explicit specs, defined handoffs, and human review gates that governs how multiple actors — human or AI — align on intent, sequence their work, and validate outputs before proceeding. In an AI-native workflow, the coordination layer does the work that team culture used to do when teams were small enough for proximity to cover for process.
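To make the definition concrete, here is a minimal sketch of a coordination layer's first gate in Python. The `Spec` fields mirror the four elements named later in this piece (scope, intent, success condition, boundaries); the class and function names are illustrative, not a real schema.

```python
from dataclasses import dataclass, field

# Hypothetical sketch: the spec as an explicit coordination artifact.
# Field names are illustrative, not an actual specshop.dev schema.
@dataclass
class Spec:
    scope: str                  # what the work covers
    intent: str                 # what the user is trying to achieve
    success_condition: str      # how we know the output is right
    boundaries: list[str] = field(default_factory=list)  # what the agent must NOT do
    signed_off: bool = False    # the human review gate

def ready_for_agent(spec: Spec) -> bool:
    """An agent only receives a brief once every field is filled and a human has signed off."""
    return (all([spec.scope, spec.intent, spec.success_condition])
            and bool(spec.boundaries) and spec.signed_off)

draft = Spec(scope="checkout flow",
             intent="let a signed-in user complete payment",
             success_condition="payment succeeds in under 3 steps")
assert not ready_for_agent(draft)  # no boundaries, no sign-off: the brief is gated

draft.boundaries.append("do not touch the payment provider integration")
draft.signed_off = True
assert ready_for_agent(draft)      # locked spec: the agent may proceed
```

The point of the sketch is the gate, not the data structure: nothing reaches an agent until the artifact is complete and a human has locked it.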

I was right that I didn't need the certification. I was completely wrong about why.

§

The certification I didn’t finish

What I missed is that SAFe was never about team size.

SAFe was about the coordination problem — what happens when multiple actors with different contexts, moving at different speeds, need to converge on a shared outcome without stepping on each other or building in the wrong direction.

Large enterprises had that problem because they had hundreds of people. We didn’t have it because we had 15. The problem was proportional to the team, and our team was small enough that alignment happened through proximity and culture rather than process.

Then we added AI agents.

§

What I missed

Here’s what nobody in the “AI will 10x your productivity” conversation is being honest about.

The productivity gains are real. But the teams that are not seeing those gains? They bolted AI onto broken or uncoordinated workflows. And they got faster chaos.

40% of agentic AI projects will be cancelled by the end of 2027, and not because the models aren't capable: the drivers are escalating costs, unclear business value, and inadequate risk controls (Gartner, 2025). That's not a technology failure. That's a coordination failure with a technology label on it.
§

The productivity gains nobody’s honest about

McKinsey research from late 2025 found that top-performing AI software teams — the ones seeing quality improvements of 30–45% and real time-to-market gains — weren’t just adding AI tools to existing workflows. They were embedding AI across the entire development lifecycle. Spec to deployment. Not just code generation.

Key finding

Companies reaching 80–100% developer AI adoption saw productivity gains above 110%. The constraint wasn’t the tools — it was the clarity of intent flowing into them. The teams that got the gains were the ones where spec quality, dependency management, and feedback loops kept pace with the speed of generation.

That is a SAFe insight. Expressed without the ceremonies.

This is the operating principle behind how we build at specshop.dev — spec clarity as the input constraint, not an afterthought.

§

Why the team size argument breaks with agents

A director at a major tech company said something last year that’s been sitting with me:

“We’re starting to rename 2-pizza teams to 1-pizza teams. With AI, large teams just no longer make sense.”

He’s right about the human headcount. But the coordination surface didn’t shrink — it exploded.

A two-person studio running multi-agent workflows is making the equivalent coordination decisions of a 15-person team. Except faster, with less friction, and with much less forgiveness when the alignment is off.

The dangerous part

An agent that’s been given an ambiguous brief will build confidently and completely in the wrong direction. It won’t push back. It won’t ask the question a good developer would ask. It will simply execute. At full speed. Until someone reviews the output and finds it’s elegantly wrong.

§

What SAFe actually looks like in a two-person AI-native studio

It doesn’t look like PI Planning. It doesn’t look like ARTs or SAFe Scrum Masters or Lean-Agile Centres of Excellence.

It looks like this:

  • The spec is the alignment artifact. Every piece of work — before any agent touches it — goes through a structured spec. The spec defines scope, the user’s intent, the success condition, and the explicit boundaries of what the agent should not do. That is PI Planning. For one project. In two hours. Not two days.
  • The spec gates the build. No agent generates production code from an ambiguous brief. The spec is reviewed, challenged, and locked. That is the inspect-and-adapt loop, running before the sprint rather than after.
  • The handoffs between agents are defined. Which agent does what, in what sequence, with what output. That is dependency management — without the sticky notes on a physical board.
  • The human stays in the architecture seat. We review every agent output against the spec before it moves forward. Not because we don’t trust the models — because we’ve shipped 120+ products and we know that confidence and correctness are not the same thing.
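The four bullets above can be wired together in a few lines of Python, as a sketch: agents run in a defined sequence, and a review gate checks every output before the next handoff. The agent names and the review logic are placeholders, not our actual tooling.

```python
from typing import Callable

Agent = Callable[[str], str]   # takes the current artifact, returns a new one

def run_pipeline(brief: str,
                 agents: list[tuple[str, Agent]],
                 review: Callable[[str, str], bool]) -> str:
    """Run agents in a defined sequence; each output must pass review before the next handoff."""
    artifact = brief
    for name, agent in agents:
        artifact = agent(artifact)
        if not review(name, artifact):   # the human-in-the-loop gate
            raise RuntimeError(f"{name}: output rejected at the review gate")
    return artifact

# Toy agents standing in for a real sequence (e.g. scaffold, then implement).
agents: list[tuple[str, Agent]] = [
    ("architect", lambda spec: spec + " | scaffold"),
    ("builder",   lambda scaffold: scaffold + " | implementation"),
]

# A stand-in review check; in practice this is a human comparing output to the spec.
result = run_pipeline("locked spec", agents,
                      review=lambda name, out: out.startswith("locked spec"))
# result == "locked spec | scaffold | implementation"
```

The design choice worth noting is that the review gate sits inside the loop, not at the end: a rejected output stops the sequence before the next agent can build on a wrong foundation.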

None of this is SAFe by name. All of it is SAFe by principle.

Here is the mapping explicitly — what each SAFe principle looks like when implemented spec-first in a two-person AI-native studio:

  • Alignment before execution. Enterprise ceremony: PI Planning (a 2-day quarterly event). Spec-first equivalent: a structured spec review before any agent receives a brief (2 hours per project).
  • Spec clarity before build. Enterprise ceremony: feature definition in the Program Backlog. Spec-first equivalent: every screen defined and signed off before build starts.
  • Dependency management. Enterprise ceremony: dependency tracking boards and ART sync. Spec-first equivalent: explicit agent sequencing, with which agent does what, in what order, and with what output all defined.
  • Inspect and adapt. Enterprise ceremony: sprint retrospectives and PI retrospectives. Spec-first equivalent: human review of every agent output against the spec before it moves forward.
  • Feedback loops at every handoff. Enterprise ceremony: system demos and Scrum of Scrums. Spec-first equivalent: 3-tier QA: agent output → architect review → client sign-off.
§

The insight I missed in 2020: principles are separable from ceremonies

SAFe built the ceremonies because large organisations needed rituals to enforce alignment across people who didn’t share a building, a culture, or sometimes even a timezone. The ceremonies were the delivery mechanism for the principles, not the principles themselves.

When you’re two people with a shared toolchain and a decade of product intuition between you, you don’t need the ceremonies. But you absolutely need the principles — especially when your effective output surface is running at the speed of agents.

Gartner projects that 40% of enterprise applications will be integrated with task-specific AI agents by the end of 2026. The teams that capture value from that shift aren't the ones with the most agents; they're the ones with the clearest coordination model running underneath them.
§

What I’d tell myself in 2020

Don’t finish the certification. But don’t dismiss the principles either.

The ceremonies were built for a scale you’ll never reach. But alignment before execution, spec clarity before build, dependency gates, feedback loops at every handoff — those aren’t enterprise problems. They’re software problems. They show up at every team size. They just become invisible when the team is small enough that culture covers for process.

Add AI agents to the workflow, and the culture can’t cover anymore. The coordination has to be explicit. The spec has to do the work that a good product conversation used to do.

That’s the thing about building on the right principles without understanding them: you get away with it, until the leverage changes. AI changed the leverage.

At specshop.dev

We run a spec-first model — not because it’s a methodology we read about, but because 9 years and 120+ shipped products taught us that the work always breaks in the same place: between what someone meant and what got built. The spec is the gap-closer. The agents are faster now. The gap is more expensive than ever.

If you’re a founder or a product team starting to add agentic workflows and wondering why the outputs keep drifting from the intent — the answer probably isn’t the model. It’s the coordination layer you haven’t built yet.

§

Questions we get asked about this

Why do AI agents need coordination frameworks?
AI agents execute confidently from whatever brief they are given. Unlike a human developer who will push back, ask clarifying questions, or flag ambiguity, an agent will build completely in the wrong direction if the spec is unclear. The coordination layer — clear spec, defined handoffs, human review gates — is what keeps agent output aligned with intent.
What is spec-first development?
Spec-first development is a software delivery methodology in which every user journey is mapped and every screen is defined and approved before any code is written. The spec — not the first working build — is the primary alignment artifact between a product team and its builders, whether human or AI.
Can SAFe principles apply to small teams using AI agents?
Yes. The ceremonies of SAFe — PI Planning, ARTs, LACEs — are designed for large organisations. The underlying principles — alignment before execution, spec clarity before build, dependency management, inspect-and-adapt loops — apply at any team size. A two-person studio running multi-agent workflows faces the same coordination surface as a 15-person team, just without the headcount to absorb misalignment.
Why do agentic AI projects fail?
Gartner projects that 40% of agentic AI projects will be cancelled by end of 2027 due to escalating costs, unclear business value, and inadequate risk controls. These are coordination failures with a technology label on them, not technology failures. Teams that bolted AI onto uncoordinated workflows got faster chaos, not faster delivery.
What is a coordination layer for AI-native software development?
A coordination layer for AI-native software development is the set of explicit specs, defined agent handoffs, and human review gates that govern how AI agents receive work, execute it, and have their outputs validated. Without it, agents operating in sequence or parallel will build in conflicting directions regardless of how capable the underlying models are.
What is the difference between SAFe ceremonies and SAFe principles?
SAFe ceremonies — PI Planning, Agile Release Trains, Lean-Agile Centres of Excellence — are the delivery mechanism for the principles. They exist because large organisations need rituals to enforce alignment across people who don’t share context, culture, or timezone. The principles underneath — alignment before execution, spec clarity, dependency management, feedback loops — apply independently of team size.
How does specshop.dev manage AI agents in the build process?
specshop.dev runs a spec-first model in which every piece of work is specified — scope, user intent, success condition, explicit agent boundaries — before any AI agent touches it. Agent outputs are reviewed against the spec by a human before moving forward. This approach is built on 9 years and 120+ products shipped across 5 countries, and one consistent finding: confidence and correctness are not the same thing.