Paul Welty, PhD
AI, WORK, AND STAYING HUMAN

Dev reflection - January 27, 2026


Duration: 8:02 | Size: 7.37 MB


Hey, it’s Paul. January 27th, 2026.

So here’s something I’ve been sitting with this week. I’ve been building systems that generate content—podcast scripts, social media posts, that kind of thing—and almost immediately after getting them working, I found myself building a second layer on top. A layer whose entire job is to make the AI-generated stuff sound less like AI wrote it.

And that’s… interesting, right? Because it raises this question I can’t quite shake: what does it mean when you can systematically identify and remove the fingerprints of machine authorship? And maybe more importantly—what’s left after you do that?

Here’s the thing. When you work with language models enough, you start to notice patterns. Not bugs exactly. More like verbal tics. The word “crucial” shows up constantly. “Testament to” appears way more than any human would naturally use it. “Additionally” as a transition. “Groundbreaking” as an intensifier. These aren’t mistakes. They’re defaults. The model reaches for them because they’re safe, they’re confident-sounding, they paper over uncertainty by sounding emphatic.

What’s wild is that the model knows what these patterns look like. You can literally ask it to identify and remove its own fingerprints, and it will. It’s like… imagine if you had a distinctive accent, and someone could ask you to speak without it, and you could just… do that. That’s what’s happening here.

So I built what I’ve been calling a humanizer. It’s basically a checklist—twenty-four specific phrases pulled from, of all places, Wikipedia’s guide to identifying AI-written text. The system generates something, then runs it through a second pass that strips out the telltale markers. Two-pass processing. Generate, then clean.
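To make the shape of that concrete, here's a rough sketch of the second pass. The names and the abbreviated phrase list are stand-ins, not the actual code, and the real cleanup step asks the model to rewrite the flagged spans rather than just marking them:

```python
import re

# Hypothetical names and a partial phrase list; the real checklist has
# twenty-four entries pulled from Wikipedia's guide.
TELLTALE = ["crucial", "testament to", "additionally", "groundbreaking"]

def humanize(draft: str) -> str:
    """Second pass: flag each telltale phrase so a rewrite (or a second
    model call) can replace it. A naive filter standing in for the real step."""
    for phrase in TELLTALE:
        draft = re.sub(rf"\b{re.escape(phrase)}\b", f"[{phrase}?]",
                       draft, flags=re.IGNORECASE)
    return draft

print(humanize("Additionally, this groundbreaking fix is a testament to crucial work."))
```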

But here’s where it gets philosophically murky. Is the result “human-written”? I mean, a human—me—prompted the generation. A human decided what patterns to remove. A human is reading the output and deciding whether to publish it. But the actual words? Those came from a machine, got cleaned by the same machine, and now exist in this weird liminal space.

I don’t have a clean answer to that. I’m not sure there is one.

What I do know is that there are two approaches to this problem, and I’ve been using both. The humanizer is reactive—it strips patterns after they appear. But you can also be proactive. You can write prompts that explicitly forbid the patterns in the first place. “Do not start with ‘I built a system that…’” “Do not use the word crucial.” “Here’s a translation guide: instead of ‘automation,’ say ‘systems that run themselves.’”
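In sketch form, the proactive version looks something like this. The variable names and exact wording are illustrative, not my production prompts:

```python
# Bake the prohibitions and the translation guide into the prompt itself,
# so the first pass never produces the patterns in the first place.
FORBIDDEN = ["crucial", "testament to", "additionally", "groundbreaking"]
TRANSLATIONS = {"automation": "systems that run themselves"}

def build_prompt(task: str) -> str:
    rules = "\n".join(f"- Do not use the word or phrase '{p}'." for p in FORBIDDEN)
    guide = "\n".join(f"- Instead of '{k}', say '{v}'." for k, v in TRANSLATIONS.items())
    return (
        f"{task}\n\n"
        "Constraints:\n"
        "- Do not start with 'I built a system that...'\n"
        f"{rules}\n"
        "Translation guide:\n"
        f"{guide}\n"
    )

print(build_prompt("Write a short podcast intro about shipping software."))
```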

Both approaches acknowledge the same reality: AI writing has systematic fingerprints. The question is just when you intervene—during generation or after.

The proactive approach, the better prompts, seems to work. You can actually shift how the model expresses ideas by being explicit about what you don’t want. The defaults aren’t inevitable. They’re just defaults. And defaults can be overridden.

But this creates a cost question I haven’t fully resolved. Every piece of content now makes two calls instead of one. Generate, then humanize. For something substantial—a long synthesis, a podcast script—that doubles the expense. It’s not huge in absolute terms, maybe a quarter instead of twelve cents. But it adds up. At what volume does this become prohibitive? I don’t know yet. Worth watching.
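For a rough sense of scale, here's the back-of-envelope math, using those same numbers and some made-up monthly volumes:

```python
# Assumed costs from the rough figures above; the volumes are hypothetical.
single_pass = 0.12  # dollars per piece, generate only
two_pass = 0.25     # dollars per piece, generate + humanize

for pieces_per_month in (30, 300, 3000):
    extra = (two_pass - single_pass) * pieces_per_month
    print(f"{pieces_per_month:>5} pieces/month -> ~${extra:.2f}/month extra")
```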

Okay, so that’s one thread. Here’s another that’s been on my mind.

I built automation infrastructure this week for things that aren’t ready to be automated yet. Like, literally built the plumbing, tested it, confirmed it works, and then commented it out. Left it sitting there with little notes saying “disabled for now.”

On the surface, that sounds like wasted effort. Why build something you’re not going to use?

But here’s what I’ve realized: building the infrastructure forced me to answer questions I hadn’t thought to ask. When should this automation run? How do I prevent it from running twice and wasting money? What happens when something fails halfway through? Those questions matter, and I wouldn’t have asked them in the abstract. I asked them because I was writing the actual code and had to make actual decisions.

So the commented-out automation isn’t waste. It’s reconnaissance. It’s a specification written in executable form. The infrastructure documents intent, even when the intent isn’t ready to execute.
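Here's roughly what that reconnaissance looks like when it's written down. This is an illustrative sketch, not the real automation; the state file, the key scheme, and the disabled flag are stand-ins for the decisions those questions forced:

```python
import json
import pathlib

# Hypothetical run-once guard: answers "when does it run?", "how do we
# avoid paying twice?", and "what counts as done?" in executable form.
STATE_FILE = pathlib.Path("automation_state.json")
ENABLED = False  # disabled for now

def already_ran(key: str) -> bool:
    if not STATE_FILE.exists():
        return False
    return key in json.loads(STATE_FILE.read_text())

def mark_ran(key: str) -> None:
    state = json.loads(STATE_FILE.read_text()) if STATE_FILE.exists() else {}
    state[key] = True
    STATE_FILE.write_text(json.dumps(state))

def run(key: str) -> None:
    if not ENABLED:
        print(f"{key}: disabled for now")
        return
    if already_ran(key):
        print(f"{key}: already ran, skipping (no duplicate spend)")
        return
    # ... the actual generation call would go here ...
    mark_ran(key)  # only record success after the work completes

run("2026-01-27-podcast")
```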

There’s a related distinction I keep bumping into: the difference between something being built and something being shipped.

I have a text editor interface that works. You can use it. It does what it’s supposed to do. But to use it, you’d need to compile it from source, which means you’d need a whole development environment set up. Is that “shipped”?

For practical purposes, no. “You can use this if you compile from source” is true for other developers. It’s irrelevant for everyone else.

The work of shipping isn’t about making things function. It’s about removing barriers to use. Sometimes that’s packaging—bundling a binary so people can just download and run it. Sometimes it’s testing—gaining enough confidence to turn on the automation you already built. Sometimes it’s documentation—explaining what something is for so people know whether they want it.

Built but not shipped is a meaningful category. A lot of what I’ve been working on lives there right now. Functional but not accessible. The gap between those two states is smaller than the gap between “doesn’t exist” and “works,” but it’s the gap that actually matters for whether anyone uses the thing.

Here’s one more thread, and then I’ll try to pull these together.

I’ve been wrestling with how to handle something that should be simple: the title of a piece of content. It’s just a title. Basic metadata. But in the system I’m building, I wanted titles to work like every other piece of content data—configurable, editable inline, participating in the same workflows.

The problem is that titles have always been special. They’re indexed for search. They’re used as references. The database schema treats them differently because that’s how databases work—you want your most-queried field to be efficient.

So I ended up with dual storage. The title lives in two places: the traditional spot where the database expects it, and the new configurable spot where my field system expects it. They sync automatically. Change one, the other updates.

Is this elegant? No. It’s a workaround. But it lets me have database efficiency and interface consistency at the same time. The cost is sync logic—making sure the two versions never disagree.
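In miniature, the dual storage looks something like this. The class and field names are made up for the sketch; the real thing lives in a database schema and a field-configuration system, not a dataclass:

```python
from dataclasses import dataclass, field

@dataclass
class ContentItem:
    title: str = ""                              # the traditional, indexed spot
    fields: dict = field(default_factory=dict)   # the configurable field store

    def set_title(self, value: str) -> None:
        # One setter writes both places, so the copies can never disagree.
        self.title = value
        self.fields["title"] = value

item = ContentItem()
item.set_title("Dev reflection - January 27, 2026")
assert item.title == item.fields["title"]
```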

Every special case is a choice like this. You can accept the complexity of being special, with all the duplicated logic that implies. Or you can accept the complexity of making it not-special, with all the synchronization that implies. Neither is free.

Okay, so what ties all this together?

I think it’s something about cost models changing what matters.

When generating content was essentially free—just run a script, no external calls—it didn’t matter if you ran it twice by accident. Now that generation costs actual money, duplicate detection becomes necessary. The infrastructure has to encode that constraint.

When editing was manual—open a file, make changes, save—inline edit buttons were a luxury. Nice to have, not essential. But when editing is part of an automated flow, those buttons become the interface. They’re not optional anymore.

The tools shape what matters. As the tools change, what matters changes too.

And right now, what matters most is turning potential into actual. The humanizer works—now write prompts that don’t need it. The editor interface exists—now make it installable. The automation is built—now decide when to run it.

The work isn’t building new things. It’s activating what’s already there.

There’s something almost anticlimactic about that. We tend to celebrate the building. The moment of creation. But so much of what determines whether something is useful happens after it technically works. The packaging. The testing. The decision to turn it on.

I’ve been thinking about this as a kind of discipline. The discipline of finishing. Not in the sense of perfecting—things are never perfect—but in the sense of closing the gap between “I could use this” and “anyone could use this.”

That gap is where value actually gets created. Or doesn’t.

So that’s where I am. Building systems to make AI sound less like AI. Building automation for things that aren’t ready to automate. Wrestling with the difference between built and shipped. Watching how cost models reshape what matters.

No tidy conclusions. Just the ongoing work of turning potential into actual, one small decision at a time.

· development · daily, reflection, podcast
