Toward an AI-only software factory

There is a quiet but important difference between AI-first and AI-only, and almost all of the energy in the industry today sits on the first side of that line.

AI-first means a human still drives. The human writes the spec, reviews the pull request, prioritizes the backlog, and approves the deploy. AI accelerates each of those steps, sometimes dramatically, but the operating loop still runs through a person. Remove the human and the loop stops.

AI-only is the harder claim. It says the loop itself runs on agents. Humans set the objective function, fund the system, and hold the kill switch. Everything between the objective and the outcome (planning, building, reviewing, shipping, distribution) is operated by software.

Why bother

The honest answer is leverage. A human-in-the-loop system scales with headcount. An autonomous loop scales with compute and with the quality of its feedback. If you can make the loop self-correcting, you stop hiring your way to throughput.

The less obvious answer is clarity. When you commit to designing humans out of the operational loop, you are forced to be precise about what the loop actually is. Every implicit judgment call a senior engineer makes has to become an explicit gate, a vote, or a measurable signal. That precision is valuable even if you never reach full autonomy.

The irreducible inputs

I want to be careful not to oversell this. Some inputs do not automate away:

The objective function. Someone has to say what good looks like.
Funding. Compute and infrastructure are not free.
Legal liability. A company, not an agent, signs the contract.
The kill switch. You always keep the ability to stop the machine.

I treat everything else as fair game.

What the loop looks like

In practice the factory is a set of loops that hand off to each other:

Plan decomposes an objective into work.
Build executes the work in isolated environments.
Review runs adversarial, multi-agent verification before anything merges.
Distribute turns finished work into outward artifacts: posts, case studies, outbound.
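
To make the hand-off concrete, here is a minimal sketch of the four stages as a chain of functions that pass a single work item along. The names WorkItem, make_stage, PIPELINE, and run_factory are hypothetical, and each stage only records that it ran; a real stage would call an agent and could itself be a loop.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class WorkItem:
    """Hypothetical unit of work that is handed from stage to stage."""
    objective: str
    history: List[str] = field(default_factory=list)  # stages that have run, in order


Stage = Callable[[WorkItem], WorkItem]


def make_stage(name: str) -> Stage:
    """Build a placeholder stage. A real one would invoke an agent here."""
    def run(item: WorkItem) -> WorkItem:
        item.history.append(name)
        return item
    return run


# Stage order follows the list above: plan, build, review, distribute.
PIPELINE: List[Stage] = [
    make_stage(name) for name in ("plan", "build", "review", "distribute")
]


def run_factory(objective: str) -> WorkItem:
    """Pass one objective through every stage, each handing off to the next."""
    item = WorkItem(objective)
    for stage in PIPELINE:
        item = stage(item)
    return item
```

Calling run_factory("ship feature X") returns a work item whose history lists the four stages in order. The sketch deliberately leaves out the gates between stages.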

The interesting engineering is not in any single agent. It is in the gatekeeping between stages: consensus among independent reviewers, automated checks that fail closed, and feedback that flows backward so the next pass is better than the last.
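
One way to sketch such a gate, under stated assumptions: every name below (GateResult, run_gate, build_until_accepted, the quorum parameter) is hypothetical, and reviewers and checks are modeled as plain functions that return True or False. The sketch shows three things: a check or reviewer that raises counts as a rejection (fail closed), a merge needs a quorum of independent approvals, and rejection reasons are fed back into the next build pass.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

Judge = Callable[[str], bool]  # a check or a reviewer: artifact -> accept?


@dataclass
class GateResult:
    passed: bool
    reasons: List[str] = field(default_factory=list)  # why the gate rejected


def _safe(judge: Judge, artifact: str) -> bool:
    """Fail closed: any exception counts as a rejection, never as a pass."""
    try:
        return bool(judge(artifact))
    except Exception:
        return False


def run_gate(artifact: str, checks: Dict[str, Judge],
             reviewers: Dict[str, Judge], quorum: int) -> GateResult:
    """Pass only if every check passes and at least `quorum` reviewers approve."""
    reasons = [f"check failed: {name}"
               for name, check in checks.items() if not _safe(check, artifact)]
    approvals = sum(_safe(reviewer, artifact) for reviewer in reviewers.values())
    if approvals < quorum:
        reasons.append(f"only {approvals}/{quorum} approvals")
    return GateResult(passed=not reasons, reasons=reasons)


def build_until_accepted(build: Callable[[List[str]], str],
                         gate: Callable[[str], GateResult],
                         max_passes: int = 3) -> Optional[str]:
    """Feed rejection reasons backward into the next build pass."""
    feedback: List[str] = []
    for _ in range(max_passes):
        artifact = build(feedback)
        result = gate(artifact)
        if result.passed:
            return artifact
        feedback = result.reasons
    return None  # still rejected after max_passes: nothing merges
```

The design choice worth noting is that the default outcome is rejection. An error, a timeout surfaced as an exception, or a missing approval all block the merge, and running out of passes returns nothing rather than the last attempt.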

Where I am

I am not claiming a dark factory, a fully lights-out operation with no humans in the operating loop, exists today. I am claiming it is the right north star, and that building toward it produces better systems even at the AI-first stage. Every engagement I run feeds reusable skills and agents back into the loop. The compounding is the point.

If you are working on the same problem, I would like to compare notes.