When your agents start breaking each other's code
Two agents modified the same file independently and created database locks. The fleet closed 135 issues in one day, and ran straight into the coordination problem that comes with that pace.
Duration: 8:51 | Size: 8.11 MB
When two people edit the same document at the same time, we call it a collaboration problem. When two AI agents edit the same file independently and one of them creates database locks that crash the production engine, we call it Tuesday.
This happened today in Authexis. The content model was being rewritten — a JSONB blob flattened into proper columns, the stage machinery replaced with a simple status field, 38 commits in a single day. Two parallel agents were working on different issues that both touched the same function. Both added dual-write logic to update_stage_field. The result: double UPDATEs that created database locks and hung the engine.
Nobody coordinated them. Nobody told agent B that agent A had already modified that function. There was no pull request review, no merge conflict, no warning. Both agents did exactly what their issue specs said. Both were correct in isolation. Together, they broke production.
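The failure mode is easy to reconstruct in miniature. Here is a minimal sketch using SQLite; the table, column names, and the update_stage_field signature are hypothetical stand-ins for Authexis's actual schema. Each agent's line is correct alone; merged, every call writes the same row twice.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE content (id INTEGER PRIMARY KEY, stage TEXT, status TEXT)")
conn.execute("INSERT INTO content VALUES (1, 'draft', NULL)")

def update_stage_field(conn, content_id, value):
    # Agent A's dual-write: keep the legacy stage column in sync during the migration.
    conn.execute("UPDATE content SET stage = ? WHERE id = ?", (value, content_id))
    # Agent B's dual-write, added independently hours later: sync the new status column.
    # Each line satisfies its own issue spec; together, every call now issues two
    # UPDATEs on the same row, doubling how long the row lock is held per transaction.
    conn.execute("UPDATE content SET status = ? WHERE id = ?", (value, content_id))

update_stage_field(conn, 1, "review")
row = conn.execute("SELECT stage, status FROM content WHERE id = 1").fetchone()
print(row)  # ('review', 'review')
```

SQLite on a single connection serializes this, so the sketch only shows the shape of the bug; on a server database, two concurrent sessions calling a doubled-up function like this hold their row locks across both writes, widening the window in which they block each other.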
This is the coordination problem that nobody talks about when they talk about AI agents. The discourse is about making agents smarter, giving them better tools, longer context windows, more sophisticated reasoning. All individual capability. But the moment you have two agents working on the same codebase, you have an organizational problem, not a capability problem. And organizational problems are not solved by making individuals smarter.
The second thing worth examining is what happened everywhere else while Authexis was having its coordination crisis. Dinly shipped a post-meal feedback loop and a pantry inventory system — 18 issues in one session, test count doubled. Eclectis fixed XSS vulnerabilities and added 28 handler tests. Scholexis went from zero tests to 75 and fixed an IDOR vulnerability that would have let an attacker delete another user’s data. Prakta’s serve-don’t-show experience went end-to-end. Phantasmagoria replaced its entire generation architecture. SimpleBooks completed its Rails-to-Next.js port.
135 issues closed across 11 projects. In one day.
This number should probably worry me more than it does. Not because the work is bad — the work is good. Real security fixes. Real architectural improvements. Real tests catching real bugs. But 135 issues means 135 changes to 11 codebases that nobody reviewed holistically. Each one was correct within its spec. But specs don’t know about each other. The Authexis database lock proved that.
The third insight is about testing as a fleet-wide pattern. Scholexis went from zero tests to 75. Dinly went from 33 to 65. Eclectis from 135 to 163. That's 135 new tests across three projects in one day, all discovered independently by scout agents running parallel scans.
Nobody mandated a testing initiative. No one wrote a memo saying “all projects need better test coverage.” Each scout looked at its own codebase and independently concluded “this critical path is untested.” The pattern emerged because the same methodology — the scout skill with its parallel scan dimensions — was applied to different codebases and found the same category of problem everywhere.
This is how standards actually form in organizations. Not through mandate. Through independent discovery of the same truth. When three separate teams all conclude they need more tests, that’s stronger than a CTO saying “write more tests.” It’s convergent evolution. The conclusion is more robust because it was reached independently.
The fourth thing is about architectural rewrites happening simultaneously. Authexis rebuilt its content model. Phantasmagoria switched from roller-first to narrative-first generation. Prakta completed the serve-don’t-show experience. Three fundamental redesigns of how each product works, all shipping on the same day.
In a traditional organization, a rewrite is a quarter-long initiative. You plan it, you allocate resources, you sequence it against other priorities, you do the migration in phases. Here, three rewrites shipped in parallel, each driven by a single agent session that understood its own codebase deeply enough to make the change in one sitting.
The advantage is speed. The risk is exactly what happened in Authexis. When a rewrite ships in one session, there’s no time for the broader team to adjust. The function that agent B assumed was unchanged had been rewritten by agent A three hours earlier. In a quarter-long migration, that information would have propagated through code reviews, standup meetings, shared documentation. In a one-day rewrite, the information gap is measured in hours, and hours is enough for a production incident.
The fifth thing: the fleet moved to a server today. All 11 sessions now run on speedy-gonzales, a Mac Mini on Tailscale. Dev servers accessible remotely. Discord channel on the supervisor. Orchestrate agents cycling. The fleet no longer depends on my laptop being open.
This is a quiet change that restructures everything. When the fleet ran on my laptop, I was the infrastructure. If I closed the lid, everything stopped. Now the fleet runs while I sleep. The agents work overnight. The orchestrate loop dispatches issues at 2am. The daily rollover writes work logs at the day boundary.
The implications are still settling. When you can close your laptop and the work continues, your relationship to the work changes. You stop being the engine and start being the steering wheel. The daily rhythms shift — instead of “what should I build today?” it becomes “what did the fleet build while I was away, and is any of it wrong?”
That’s the real coordination problem. Not two agents editing the same file. That’s a technical bug with a technical fix — better locking, sequential merges, file-level ownership. The deeper problem is: when the fleet works faster than you can review, how do you stay confident that the work is correct? 135 issues in a day means 135 decisions you didn’t make. Some of them were security fixes that matter. Some of them were architectural choices that compound. And you find out about all of them in the morning, after they’ve already shipped to main.
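The file-level ownership mitigation can be sketched in a few lines. This is a hypothetical orchestrator-side ledger, not the fleet's actual dispatcher: before an issue is handed to an agent, the files it declares are claimed, and a second agent asking for the same file is refused until the first releases.

```python
class FileOwnership:
    """Hypothetical dispatch-time ledger mapping file paths to the agent holding them."""

    def __init__(self):
        self._owner = {}  # path -> agent id

    def claim(self, agent, paths):
        # Refuse the whole claim if any requested file belongs to another agent.
        conflicts = {p: self._owner[p] for p in paths
                     if self._owner.get(p) not in (None, agent)}
        if conflicts:
            raise RuntimeError(f"files already claimed: {conflicts}")
        for p in paths:
            self._owner[p] = agent

    def release(self, agent):
        self._owner = {p: a for p, a in self._owner.items() if a != agent}

ledger = FileOwnership()
ledger.claim("agent-a", ["engine/content.py"])
try:
    ledger.claim("agent-b", ["engine/content.py"])  # same file: rejected
except RuntimeError as e:
    print(e)
ledger.release("agent-a")
ledger.claim("agent-b", ["engine/content.py"])  # now allowed
```

The refusal is the point: agent B learns about agent A at dispatch time, not in production.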
The sixth thing: today we also built a marketing agency. Not a marketing tool. An agency. With roles (strategist, writer, launch manager), a methodology (NVN atoms — noun-verb-noun, hermetically sealed engagements), client onboarding, deliverable templates, and a review cycle that feeds corrections back into the methodology.
The core insight was Leibniz’s. The monad has no windows. Each engagement takes all its inputs at the start — the role doc, the guide, the template, the client documents — and produces a deliverable at the end. No side channels during execution. No interrupting the strategist mid-analysis to ask a question. If the inputs are incomplete, the dependency resolver catches it before the monad fires.
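The "no windows" rule reduces to a small check before anything runs. A hedged sketch, assuming the engagement's inputs arrive as a dict; the input names mirror the ones listed above, but the resolver's actual shape is a guess.

```python
# Inputs every engagement must declare up front; names follow the list above.
REQUIRED_INPUTS = ["role_doc", "guide", "template", "client_documents"]

def resolve(inputs: dict) -> dict:
    # Block the engagement before it starts if any input is absent or empty.
    missing = [name for name in REQUIRED_INPUTS if not inputs.get(name)]
    if missing:
        raise ValueError(f"engagement blocked, missing inputs: {missing}")
    return inputs  # the monad fires only with a complete input set

resolve({"role_doc": "strategist.md",
         "guide": "competitive-analysis.md",
         "template": "deliverable.md",
         "client_documents": ["eclectis/brief.md"]})
```

Everything after this check runs with no side channels: a failed resolution is a refusal to start, not a mid-flight interruption.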
The first real engagement worked. The strategist role produced a competitive analysis for Eclectis. An external review found gaps. Those gaps became patterns in the strategist’s role doc and the competitive analysis guide. The next engagement for any client will be better because of this one. The methodology compounds.
What made this work wasn’t AI capability. It was structure. The role doc, the guide, and the template are just markdown files. They could be loaded into any LLM. The intelligence isn’t in the model. It’s in the documents the model reads. And those documents get smarter after every engagement because corrections update them.
This is the same principle as the testing convergence. The value isn’t in the individual agent. It’s in the methodology that the agents execute. The scout skill finds the same categories of problems across all codebases because the skill encodes what to look for. The strategist produces a competitive analysis because the guide encodes how. The model provides the reasoning. The documents provide the judgment.
Today the fleet closed 135 issues, added 135 tests, shipped three architectural rewrites, migrated to a server, and ran its first marketing engagement. The number that matters isn’t 135. It’s the one database lock that two agents created together that neither would have created alone.
The question I’m sitting with: is the coordination problem solvable at all, or is it just the tax you pay for parallel execution? Every system that runs agents in parallel will eventually have two of them touch the same resource. Locks, queues, ownership rules, review gates — these are all mitigations, not solutions. The fundamental tension is that agents work fastest when they’re independent, and independence means they don’t know what each other did.
Maybe the answer isn’t coordination. Maybe it’s detection. Don’t try to prevent the conflict. Build systems that catch it immediately when it happens. The database lock was caught within hours, not weeks. In a traditional organization, that dual-write bug could have lurked for months.
Speed of discovery might matter more than prevention of errors. Build fast, break fast, catch fast. 135 issues and one production incident might be a better ratio than 20 issues and zero incidents, if what you’re optimizing for is learning, not safety.
I’m not sure about that yet. But I notice the fleet is teaching me faster than I could learn on my own.