Dev reflection - February 15, 2026
Duration: 7:09 | Size: 6.56 MB
Hey, it’s Paul. Sunday, February 15, 2026.
I want to talk about what happens when something stops being a tool and becomes plumbing. Because that shift is happening in my work right now, and I think it’s happening everywhere, and most people haven’t noticed the consequences yet.
Here’s what I mean. When you use a tool, you pick it up, you do something with it, you put it down. If it breaks, you see the break. You know what went wrong. A hammer with a cracked handle is obviously a hammer with a cracked handle. But when that same capability gets embedded into your infrastructure—when it becomes part of the pipes that connect everything else—the failure modes change completely. You don’t see a broken hammer anymore. You see water coming out brown and you have no idea why.
AI just made that transition in several of my systems. Not today specifically—it’s been building for weeks—but today I looked at what was running and realized: the AI isn’t producing the output anymore. It’s routing the work. It’s scoring, sequencing, deciding what goes where and in what order. It’s middleware. And that changes everything about how you think about quality, about failure, about trust.
Think about this in any organization. When you hire a consultant to write a report, that’s a tool. The report lands on your desk, you read it, you judge it. But when that same consultant’s methodology gets baked into your quarterly planning process—when their framework is just how decisions flow through the org—you stop evaluating the framework. You start evaluating the decisions that come out of it, and if those decisions start drifting, you blame the people making them. You blame the data. You blame the market. You almost never trace it back to the invisible methodology shaping everything upstream.
That’s the problem with AI-as-infrastructure. When AI was generating a draft and handing it to you, quality was your problem. You could see it, judge it, fix it. When AI is orchestrating a pipeline—deciding what content gets scored how, what gets routed where, what gets compiled into what—quality becomes a systems problem. And systems problems are harder to see, harder to diagnose, and way harder to fix, because the failure looks like it’s happening somewhere downstream of where it actually originated.
So the question isn’t whether AI is good enough to be infrastructure. It probably is, for a lot of tasks. The question is whether your observability is good enough to handle AI as infrastructure. And for most people, most teams, most organizations, the honest answer is no. Not even close.
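To make that concrete, here is a minimal sketch of what observability at the routing layer could look like: every scoring and routing decision gets written down with enough context to trace a bad downstream output back to the upstream decision that shaped it. All names, thresholds, and the file-based log are hypothetical; treat it as one possible shape, not my actual pipeline.

```python
# Minimal sketch of a decision log for an AI routing layer (all names hypothetical).
# The point is not the routing itself, but that every score and destination is
# recorded with enough context to trace a bad downstream output back upstream.
import json
import time
import uuid


def route_item(item: dict, score_fn, destinations: dict, log_path: str = "decisions.jsonl") -> str:
    """Score an item, pick a destination, and record why."""
    score = score_fn(item)                          # e.g. a model call; treated as a black box here
    dest = "review" if score < 0.7 else "publish"   # threshold is an illustrative assumption

    record = {
        "decision_id": str(uuid.uuid4()),
        "timestamp": time.time(),
        "item_id": item.get("id"),
        "score": score,
        "threshold": 0.7,
        "destination": dest,
        "scorer_version": getattr(score_fn, "version", "unknown"),
    }
    with open(log_path, "a") as f:
        f.write(json.dumps(record) + "\n")          # append-only, so history survives later "fixes"

    destinations[dest](item)                        # hand the item off to whichever queue won
    return dest


if __name__ == "__main__":
    queues = {
        "review": lambda i: print("to review:", i["id"]),
        "publish": lambda i: print("published:", i["id"]),
    }
    route_item({"id": "ep-2026-02-15", "text": "..."}, lambda item: 0.64, queues)
```

The value isn't in the routing code. It's that when something looks wrong downstream, you can grep the decision log for the item and see which score, threshold, and scorer version shaped it, instead of guessing.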
Second thing I’ve been noticing. There’s a pattern in how I’m handling coordination across multiple systems, and it’s a pattern I see in every organization I’ve ever consulted with. When you need two or more systems to agree on what just happened, you have two choices. You can build positive coordination—explicit handshakes, transaction protocols, formal agreements that guarantee consistency. Or you can build exclusion logic—rules about what to ignore, what to skip, what to filter out so that conflicts can’t arise in the first place.
Almost everyone picks exclusion. Almost every time.
And it works. It works really well, actually, right up until the moment it doesn’t. Because exclusion logic is essentially saying: I’m going to make the world small enough that coordination isn’t necessary. I’m going to limit the scope of what can happen so that the hard synchronization problem never comes up.
You see this in organizations constantly. Instead of building a real process for cross-departmental decision-making, you draw boundaries. Marketing doesn’t touch product decisions. Engineering doesn’t touch pricing. You avoid the coordination problem by making sure the parties never collide. And it’s efficient. It’s clean. Until someone needs to make a decision that crosses those boundaries, and then nobody knows how, because the system was designed to prevent that situation rather than handle it.
The tradeoff is stark: simpler systems that are nearly impossible to debug when they fail in ways you didn’t anticipate. Because the failure isn’t a bug in the logic. The failure is reality violating the assumption that made the exclusion work in the first place. And your system has no vocabulary for that. It just… behaves strangely. Outputs go wrong. People blame each other. Nobody can point to the root cause because the root cause is an invisible assumption, not a line of code or a policy document.
I don’t have a clean answer for this. Positive coordination is expensive and slow. Exclusion logic is fragile in ways you can’t see until it breaks. But I think the minimum viable response is this: if you’re using exclusion logic—if your system works by avoiding certain states rather than handling them—you need to at least document the assumptions. Write down what you’re assuming can’t happen. Because when it does happen, and it will, that document is the only thing that’ll help you find the problem before you waste weeks looking in the wrong place.
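One lightweight way to do that is to make the assumptions executable: pair each exclusion rule with a named assumption that logs loudly when reality violates it, instead of silently filtering. This is a sketch under assumed names, not a prescription.

```python
# A sketch of an "assumption registry" for exclusion logic (names are illustrative).
# Each rule that skips or filters something is paired with the assumption that
# makes the skip safe, so a violated assumption shows up in the logs instead of
# as unexplained weirdness weeks later.
import logging

logger = logging.getLogger("assumptions")

ASSUMPTIONS = {
    "no_duplicate_ids": "Upstream system never sends the same event id twice.",
    "single_currency": "All invoices are in USD, so no conversion step exists.",
}


def check_assumption(name: str, holds: bool, context: dict) -> bool:
    """Return whether the assumption holds, and shout if it doesn't."""
    if not holds:
        logger.warning("ASSUMPTION VIOLATED: %s (%s) context=%s",
                       name, ASSUMPTIONS.get(name, "undocumented"), context)
    return holds


def handle_event(event: dict, seen_ids: set) -> None:
    """An exclusion rule that quietly drops events, now paired with its assumption."""
    if not check_assumption("no_duplicate_ids", event["id"] not in seen_ids,
                            {"event_id": event["id"]}):
        # Still excluded, but now the exclusion is visible when reality disagrees.
        return
    seen_ids.add(event["id"])
    # ... actual processing would go here
```

The system still behaves the same way on the happy path. The difference is that when the "can't happen" thing happens, there's a log line with the assumption's name in it, which is exactly the document you'd otherwise spend weeks reconstructing.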
Third thing. I’ve been watching the line between “this needs human review” and “this can ship automatically” become an explicit design decision rather than an implicit cultural norm. And that’s genuinely new.
One of my systems now has an auto-approve flag. Skip the review, send it straight to done. And the existence of that flag is more interesting than what it does, because it forces a question that most organizations answer by feel: when is human oversight actually adding value, and when is it just ceremony?
Here’s the uncomfortable version of that question. If you trust the automated output enough to build a bypass flag, why is review the default? And if you don’t trust it enough to bypass review, why does the flag exist? The flag is an admission that you haven’t resolved the underlying question. It’s a shrug encoded in software.
But I think that shrug is honest. Because the real answer is: it depends. It depends on the stakes, the reversibility, the cost of being wrong. And those things change. A marketing email for an internal update? Auto-approve, who cares. A marketing email to your entire customer base announcing a pricing change? You want human eyes on that.
The interesting design problem isn’t whether to auto-approve. It’s building systems that know the difference. Systems that understand consequence. And right now, almost nobody is building that. We’re building bypass flags and hoping people use good judgment about when to flip them.
That’s the delegation problem every manager faces, by the way. Not “can this person do the work” but “does this person know when to escalate.” The answer to that question is never binary. It’s contextual. And encoding context into systems is one of the hardest problems in any domain—software, management, policy, anything.
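Here's a rough sketch of what "a system that understands consequence" might look like at its simplest: the auto-approve decision becomes a function of stakes, reversibility, and blast radius rather than a single flag someone flips. The fields and thresholds are assumptions for illustration, not a spec.

```python
# Sketch of a consequence-aware approval gate (thresholds and field names are
# assumptions). Instead of one auto_approve flag, the decision is a function of
# stakes, reversibility, and blast radius.
from dataclasses import dataclass


@dataclass
class Consequence:
    audience_size: int              # how many people see the result
    reversible: bool                # can we pull it back after it ships?
    estimated_cost_of_error: float  # rough cost if it turns out to be wrong


def requires_human_review(c: Consequence) -> bool:
    """Auto-approve only when the downside is small and recoverable."""
    if not c.reversible:
        return True
    if c.audience_size > 500:
        return True
    if c.estimated_cost_of_error > 1_000:
        return True
    return False


# Internal update to a small list: ships on its own.
print(requires_human_review(Consequence(audience_size=40, reversible=True,
                                        estimated_cost_of_error=50)))       # False
# Pricing-change email to the whole customer base: goes to a human.
print(requires_human_review(Consequence(audience_size=80_000, reversible=False,
                                        estimated_cost_of_error=250_000)))  # True
```

Even this toy version moves the judgment out of someone's head and into something you can argue about, tune, and audit, which is most of what "encoding context" means in practice.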
Last thing, briefly. Half my projects were quiet yesterday. No real changes, just maintenance commits. And that’s actually the most important signal of the day. Because silence from infrastructure is either stability or neglect, and they look identical from the outside.
This is true for every foundation you build—technical, organizational, personal. The habits that run quietly in the background, the processes that just work, the relationships you don’t have to actively manage. Silence means they’re either solid or eroding. And you can’t tell which without actively checking. Which means you need a practice of checking. Not fixing. Checking. Because the worst failure mode isn’t something breaking loudly. It’s something degrading quietly while you focus on the exciting work at the edges.
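For the technical foundations, at least, the practice can be embarrassingly simple: keep a last-verified timestamp for each quiet system and periodically ask whose silence you haven't actually confirmed is stability. A minimal sketch, with invented names and an arbitrary thirty-day interval:

```python
# A sketch of a "silence audit" (structure and intervals are assumptions).
# Quiet systems get a last_verified timestamp; the audit doesn't fix anything,
# it just lists whose silence hasn't been confirmed as stability recently.
from datetime import datetime, timedelta

foundations = [
    {"name": "nightly-backup",   "last_verified": datetime(2026, 1, 5)},
    {"name": "billing-webhooks", "last_verified": datetime(2026, 2, 14)},
    {"name": "content-pipeline", "last_verified": datetime(2025, 12, 20)},
]

CHECK_INTERVAL = timedelta(days=30)


def overdue(systems, now=None):
    """Return the quiet systems nobody has looked at within the interval."""
    now = now or datetime.now()
    return [s["name"] for s in systems if now - s["last_verified"] > CHECK_INTERVAL]


print(overdue(foundations, now=datetime(2026, 2, 15)))
# ['nightly-backup', 'content-pipeline'] -- silent, and unverified long enough to worry about
```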
So here’s the question I’m sitting with: as more of the work gets automated, as more of the judgment gets embedded in pipelines and orchestration layers, what does it even look like to check the foundations? What are you looking for when the system isn’t failing—yet?