Your AI agent is probably not an agent
The word “agent” has become meaningless. Everyone from chatbot vendors to autonomous system builders uses it. We’ve been here before — with self-driving cars — and it didn’t end well.
Here’s a fun exercise. Go to any tech company’s website and count how many times they use the word “agent.” Now try to figure out what they mean by it.
You won’t be able to, because the word has been stretched past the point of meaning anything. A chatbot that answers FAQ questions? Agent. An autocomplete that finishes your code? Agent. A system that decomposes problems, uses tools, adapts when steps fail, and maintains state across sessions? Also agent. The first two are calculators with a personality. The third is something genuinely new. Calling them the same thing isn’t just sloppy — it’s expensive.
Researchers have noticed. A paper out of Colorado State draws a hard line between “AI Agents” (modular, task-specific, LLM-driven) and “Agentic AI” (multi-agent collaboration, dynamic task decomposition, persistent memory, coordinated autonomy). These aren’t degrees of the same thing. They’re architecturally different systems with different failure modes. Agents hallucinate and get brittle. Agentic systems have emergent behavior and coordination failure. Treating them as interchangeable is like treating a calculator and a spreadsheet as the same product because they both do math.
Meanwhile, the Swarmia team and a group at Columbia have independently proposed five-level autonomy frameworks — think SAE Levels for AI. The levels are defined not by what the AI can do, but by what the human’s role becomes: operator, collaborator, consultant, approver, observer. Each step up means less human involvement between the AI receiving a goal and delivering a result.
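To make the shape of these frameworks concrete, here is a minimal sketch of a level-to-role mapping. The enum names follow the five roles named above; the role descriptions are my own paraphrase, not text from either framework:

```python
from enum import IntEnum

class AutonomyLevel(IntEnum):
    """Five-level scale; each step up removes the human further from the loop."""
    OPERATOR = 1
    COLLABORATOR = 2
    CONSULTANT = 3
    APPROVER = 4
    OBSERVER = 5

# What the human does between "AI receives a goal" and "AI delivers a result".
HUMAN_ROLE = {
    AutonomyLevel.OPERATOR: "drives every step; the AI only assists",
    AutonomyLevel.COLLABORATOR: "works alongside the AI, trading drafts",
    AutonomyLevel.CONSULTANT: "is consulted when the AI gets stuck",
    AutonomyLevel.APPROVER: "reviews and signs off on finished results",
    AutonomyLevel.OBSERVER: "watches; no approval gate remains",
}

def human_role(level: AutonomyLevel) -> str:
    """Return the human's role at a given autonomy level."""
    return HUMAN_ROLE[level]
```

The point of modeling it this way is that the level is defined by the human's shrinking role, not by the AI's feature list.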
And that’s where I start getting nervous. Because we’ve seen this movie before.
The Society of Automotive Engineers created Levels 0–5 for self-driving cars. The intention was clarity. What actually happened was that every car company claimed “Level 4 autonomy” while shipping what amounted to glorified cruise control. “Level 4” became a marketing term divorced from its technical meaning. People trusted it. Some of them died.
I’m not being dramatic. The stakes with AI agents are lower than with literal cars, but the pattern is identical: a taxonomy designed for engineers gets captured by marketing, and the gap between what people think they’re buying and what they’re actually getting widens until something breaks.
The Cloud Security Alliance already recognizes this. Their January 2026 guidance says different autonomy levels should require different authorization authority — and that Level 5 (fully autonomous) “is not appropriate for enterprise deployment today.” The Linux Foundation launched the Agentic AI Foundation in late 2025, trying to play the W3C role before the definitions calcify around whatever vendors find most profitable.
So what’s the practical takeaway? When someone sells you an “AI agent,” ask one question: what happens when it fails? If the answer is “it stops and asks you,” that’s a chatbot with extra steps. If the answer is “it tries a different approach, logs why, and keeps going,” you might be looking at something real. The failure mode tells you the autonomy level. The marketing never will.
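The failure-mode test above can be sketched in code. This is an illustration, not anyone's actual product: the function and strategy names are hypothetical. The agent-like behavior is the loop itself, which tries an alternative and logs why, and only escalates to the human after exhausting its options:

```python
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("agent")

def run_with_fallbacks(goal, strategies):
    """Try each strategy in turn; log each failure and keep going.

    `strategies` is a list of callables that either return a result
    or raise. A "chatbot with extra steps" would stop and ask the
    human on the first exception; this loop adapts first.
    """
    for strategy in strategies:
        try:
            return strategy(goal)
        except Exception as exc:
            log.info("strategy %s failed (%s); trying next", strategy.__name__, exc)
    # Only after every approach fails does it hand the problem back.
    raise RuntimeError(f"all strategies failed for goal: {goal!r}")

# Toy strategies for illustration.
def exact_match(goal):
    raise KeyError("no exact match")

def fuzzy_match(goal):
    return f"fuzzy result for {goal}"

result = run_with_fallbacks("find docs", [exact_match, fuzzy_match])
```

Here `exact_match` fails, the failure is logged with its reason, and `fuzzy_match` produces the result without any human intervention. That trace — tried, failed, logged why, tried something else — is what a real autonomy claim looks like.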
Why customer tools are organized wrong
This article reveals a fundamental flaw in how customer support tools are designed—organizing by interaction type instead of by customer—and explains why this fragmentation wastes time and obscures the full picture you need to help users effectively.
Infrastructure shapes thought
The tools you build determine what kinds of thinking become possible. On infrastructure, friction, and building deliberately for thought rather than just throughput.
Server-side dashboard architecture: Why moving data fetching off the browser changes everything
How choosing server-side rendering solved security, CORS, and credential management problems I didn't know I had.
The work of being available now
A book on AI, judgment, and staying human at work.
The practice of work in progress
Practical essays on how work actually gets done.
Your design philosophy is already written
Builders who work across multiple projects leave fingerprints everywhere. The same mind solves the same problem differently in every domain — and usually doesn't notice. You need someone to read it back to you.
The day nothing satisfying happened
The most productive day in an organization's life usually looks like nothing happened. No launches, no features, no announcements. Just people quietly making the existing work more honest.
The 19% slowdown nobody wants to talk about
Experienced developers are 19% slower with AI tools — and they don't even know it. The data says the productivity revolution isn't about faster code. It's about fixing the system around the code.
Manual fluency is the prerequisite for agent supervision
You cannot responsibly automate what you cannot do manually. AI agents speed up work for people who already know how to do it. They do not replace the need to learn the work in the first place.
Your process was built for a different speed
When work changes velocity, governance systems don't just fall behind. They become theater. And theater is worse than nothing—it gives you the feeling of control without any of the substance.