Paul Welty, PhD AI, WORK, AND STAYING HUMAN


Nobody promotes you to operator

There's a moment in every project where the work stops being about building and starts being about keeping things running. Nobody announces this transition. Nobody gives you new tools for it. And most people keep building long past the point where they should have stopped.



Every organization has a moment where the most valuable work shifts from creation to maintenance, and almost nobody recognizes it when it happens.

I ran five projects today. Twenty-four issues closed. And the thing that struck me wasn’t the volume — it was the character of the work. Almost none of it was new. Almost all of it was making existing things more reliable. Adding a timeout so a subprocess can’t hang forever. Putting bounds checks on query results that usually have three columns but sometimes don’t. Fixing an import that referenced a function by the wrong name. Replacing a loop that created race conditions with a single batch call.
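Two of those fixes are small enough to sketch. What follows is a minimal Python illustration, not the code from any of these projects: the function names, the thirty-second default, and the three-column shape are stand-ins chosen for the example.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout_s=30.0):
    """Run cmd and return (returncode, stdout).

    Returns (None, "") if the process exceeds timeout_s. subprocess.run
    kills the child before raising TimeoutExpired, so nothing is left hanging.
    """
    try:
        result = subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout_s
        )
    except subprocess.TimeoutExpired:
        return None, ""
    return result.returncode, result.stdout


def first_three(row):
    """Return the first three columns of a query row, or None if it is short.

    Checking the length up front turns a surprise IndexError into an
    explicit case the caller has to handle.
    """
    if len(row) < 3:
        return None
    return row[0], row[1], row[2]


if __name__ == "__main__":
    # A child that sleeps longer than the timeout is cut off, not waited on.
    print(run_with_timeout([sys.executable, "-c", "import time; time.sleep(5)"], 0.5))
    print(first_three(("id", "title")))
```

Neither function does anything visible when things go well. The timeout only matters on the night the subprocess hangs, and the length check only matters on the day the query comes back with two columns.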

None of this is glamorous. None of it would make a conference talk. But every single one of these fixes prevents a silent failure at two in the morning when nobody’s watching. And that’s the point.

Here’s what I think most teams get wrong. They treat building and operating as the same skill, the same mode, the same mindset. They’re not. Building is additive — you’re always making something new, and the feedback loop is immediate. You write code, you run it, it works or it doesn’t. Operating is subtractive — you’re removing the ways things can fail, and the feedback loop is invisible. You add a timeout, and nothing happens. That’s the whole point. Nothing happening is the success case.

This creates a weird incentive problem. The person who ships a new feature gets visible credit. The person who adds a thirty-second timeout to a subprocess call prevents a cascade failure that would have taken down the orchestration loop at three AM on a Tuesday, and nobody ever knows. The feature got a pull request with a description. The timeout got a one-line commit. But I’d argue the timeout was worth more.

One of my projects effectively entered maintenance mode today. All four milestones closed. Twenty-eight issues total, everything shipped. The ready-for-dev queue is empty. And my first instinct — the instinct I had to consciously resist — was to create more issues. To keep building. Because building feels productive in a way that monitoring doesn’t.

But here’s the thing. That project has a latent build failure hidden behind a CDN cache. The observability tools aren’t verified. There are a handful of backlogged items that might matter and might not. The right move isn’t to create new features. The right move is to verify the things that already exist actually work under stress. That’s operating. And it requires a completely different kind of attention than building does.

I noticed this pattern across the whole portfolio today. The automation tooling spent its day on defensive fixes — timeouts, bounds checking, bot comment markers to prevent false-positive re-queuing. The content platform balanced eight new features against three Sentry triage sweeps that cleared twenty-eight noise issues. The publishing platform ran four cleanup tasks through parallel agents. Even the newsletter pipeline was in research mode, not creation mode — evaluating candidates, not writing new material.
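The bot comment markers are the least obvious of those fixes, so here is a rough sketch of the idea. It assumes the automation re-queues an issue whenever a new comment appears, and that the bot's own comments were triggering that check. The marker string and function names are hypothetical, not the ones the tooling actually uses.

```python
# Hypothetical marker: an HTML comment that most issue trackers render invisibly.
BOT_MARKER = "<!-- bot:automation -->"


def tag_bot_comment(body):
    """Append the marker so the bot can later recognize its own comment."""
    return f"{body}\n\n{BOT_MARKER}"


def is_bot_comment(body):
    """True if the comment carries the bot marker."""
    return BOT_MARKER in body


def should_requeue(comments_since_last_run):
    """Re-queue only when at least one new comment did not come from the bot."""
    return any(not is_bot_comment(c) for c in comments_since_last_run)


if __name__ == "__main__":
    status = tag_bot_comment("Build finished, no changes needed.")
    print(should_requeue([status]))                       # bot talking to itself
    print(should_requeue([status, "Can you also fix X?"]))  # a human replied
```

Without the marker, every status update the bot posts looks like fresh activity, and the issue goes back into the queue for no reason.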

The portfolio is shifting from “build it” to “make it run without watching.” And that shift happened gradually, without any announcement, without any ceremony. Nobody promoted me to operator. The work just changed.

This is what I think organizations miss when they automate. They focus on the building phase — getting the system up, getting it working, shipping the first version. That’s the exciting part. That’s where the demos are. But the actual value of automation lives in the operating phase, in the months and years after launch when the system runs unattended and does the boring work reliably. And that phase requires a fundamentally different investment: not in features, but in guardrails. Not in capabilities, but in failure modes. Not in what the system can do, but in what it does when something goes wrong.

I had multiple autonomous sessions running in parallel today, all pushing to the same repository. They started stepping on each other — git conflicts, divergent branches, failed fast-forwards. The workaround is fragile. The real fix requires rethinking how concurrent agents coordinate. But I wouldn’t have even discovered this problem if I were still in building mode. Building mode tests the happy path. Operating mode discovers what happens when five things run at once and two of them try to push to main at the same time.
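For illustration, one common stopgap for rejected pushes is a sync-and-retry loop with backoff. This is not the workaround described above and not the real fix, only a sketch of the general pattern. The `push` and `sync` callables are hypothetical; in practice they might wrap `git push` and `git pull --rebase`.

```python
import time


def push_with_retry(push, sync, attempts=3, base_delay_s=1.0, sleep=time.sleep):
    """Call push(); if it is rejected, sync() and retry with exponential backoff.

    push() should return True on success and False when the remote rejects it.
    Returns the number of attempts used, or raises RuntimeError when all fail.
    """
    for attempt in range(1, attempts + 1):
        if push():
            return attempt
        if attempt < attempts:
            sync()  # bring the local branch up to date before trying again
            sleep(base_delay_s * 2 ** (attempt - 1))
    raise RuntimeError(f"push rejected {attempts} times; giving up")


if __name__ == "__main__":
    # Simulate a remote that rejects the first two pushes.
    outcomes = iter([False, False, True])
    used = push_with_retry(
        push=lambda: next(outcomes),
        sync=lambda: None,
        sleep=lambda s: None,  # skip real waiting in the demo
    )
    print(used)
```

A loop like this narrows the window for collisions but does not close it, which is why it counts as a stopgap. Two agents can still sync at the same moment and collide again on the next push.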

The uncomfortable truth is that most systems are in operating mode long before anyone admits it. The features are shipped. The users are using them. The critical path is stability, not novelty. But the team is still organized around building. Still measuring velocity. Still celebrating new features while the monitoring dashboard is orange and the error tracker has a hundred unacknowledged alerts.

You don’t get promoted to operator. You don’t get a new title or a new set of tools. The work just quietly changes underneath you, and the question is whether you notice. Whether you resist the pull of the next shiny feature and instead spend your Sunday adding a timeout to a subprocess call.

Twenty-four issues. Five projects. And the most important thing I did was make sure nothing breaks tonight.

What would change in your organization if you measured “things that didn’t go wrong” with the same enthusiasm you measure “things we shipped”?

