Work log synthesis: February 22, 2026
Cross-project synthesis for February 22, 2026
When the Grinder Becomes the Bottleneck
What happens when your automation tooling gets so good at executing work that the human becomes the constraint—not in deciding what to build, but in supervising how it gets built? Across three projects on February 22nd, a pattern emerged that challenges the typical narrative about AI-assisted development: the problem isn’t agent capability anymore, it’s context switching costs and the cognitive overhead of babysitting autonomous systems that are perfectly capable of running unsupervised.
The Two-Tab Insight: Separating Supervision from Strategy
The paulos planning session diagnosed something subtle: when a single Claude conversation handles both strategic design discussions and grind supervision (monitoring sub-agents, merging PRs, handling failures), the design conversations get starved. Not because the agent can’t multitask, but because supervision creates constant interruptions that fragment the context needed for deep architectural thinking. The solution—a two-tab workflow where one tab runs /grind autonomously while another stays conversational for /scout, /prep, and design work—reveals something important about how AI development tooling should be structured.
The handoff mechanism matters here: GitHub issues become the contract between tabs. Tab 2 creates and preps issues with complete specifications, tab 1 grinds them without needing shared conversational context. This validates the investment in /prep as a tool—if issues are truly self-contained work packets, agents don’t need massive context windows or conversational history to execute them. The Authexis session proved this works in practice: one tab shipped 28 issues across two full context windows while the other tab designed an entire REST API architecture and created 7 new grindable issues. No coordination overhead, no context pollution.
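To make the "self-contained work packet" idea concrete, here is a minimal sketch in TypeScript. All names here (`WorkPacket`, `isGrindable`, the field names, the example issue body) are hypothetical illustrations, not the actual paulos tooling; the point is that everything the grinder needs travels in the issue, not in conversational context.

```typescript
// Hypothetical shape of a prepped, self-contained work packet.
// The GitHub issue is the contract between the design tab and the grinder tab.
interface WorkPacket {
  issue: number;        // GitHub issue number
  title: string;
  files: string[];      // exact file paths the change touches
  spec: string;         // inputs, outputs, and behavior in prose
  acceptance: string[]; // criteria the grinder can verify before closing
  dependsOn: number[];  // issues that must close first (e.g. auth before CRUD)
}

// An issue is grindable only when its spec is complete and
// every dependency has already closed.
function isGrindable(p: WorkPacket, closed: Set<number>): boolean {
  return (
    p.files.length > 0 &&
    p.acceptance.length > 0 &&
    p.dependsOn.every((dep) => closed.has(dep))
  );
}

// Illustrative packet loosely modeled on the API v2 work described below.
const packet: WorkPacket = {
  issue: 402,
  title: "API v2: contents CRUD endpoints",
  files: ["app/api/v2/contents/route.ts"],
  spec: "REST CRUD over contents, returning documented JSON response shapes",
  acceptance: ["GET returns 200 with a list", "POST validates the body"],
  dependsOn: [401], // blocked until auth middleware ships
};

console.log(isGrindable(packet, new Set()));      // → false: auth not shipped yet
console.log(isGrindable(packet, new Set([401]))); // → true: unblocked
```

The dependency check is the part that matters for the two-tab workflow: the grinder can decide for itself what is ready without asking the design tab.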
But this creates a new problem: the grinder tab committed directly to main multiple times despite branch mode configuration. When supervision and execution live in separate contexts, configuration drift becomes dangerous. The branch mode bug (tracked but unfixed) didn’t matter much in single-tab workflows where you could catch and fix mistakes conversationally. In a two-tab world where the grinder runs unsupervised, it becomes a production risk.
The API Extraction Pattern: When Business Logic Outgrows Its Container
Authexis hit an inflection point where the server action architecture—perfectly fine for a Next.js web app—became a constraint. The trigger wasn’t technical debt or performance problems; it was the need to build a universal Apple app that shares the same business logic. The audit revealed the real insight: 35% of server actions contain genuine business logic (state machines, orchestration, queue management), 40% are thin Supabase wrappers, and 25% are external service calls. The architecture question isn’t “should we build an API?” but “where does business logic live when multiple clients need it?”
The design decision—extract business logic into shared modules that both server actions and API routes call—is more interesting than it sounds. It’s not a rewrite or a refactor; it’s a recognition that the web app’s server actions were accidentally doing double duty as both UI glue and business logic. The REST API at authexis.app/api/v2/ becomes the canonical interface, and server actions become thin clients of that API (or direct callers of the shared modules). This means the web app and the Apple app are peers, not a primary and a port.
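The shared-module shape can be sketched in a few lines. This is an illustration only: the function and type names are invented, and real Authexis code would call Supabase and Next.js APIs rather than the framework-free stand-ins below. What it shows is the ownership split: the state machine lives once in a shared module, and the server action and API route are thin peers over it.

```typescript
// Hypothetical shared module (e.g. something like lib/pipeline.ts):
// business logic with no knowledge of any particular client.
type PipelineState = "draft" | "researching" | "generating" | "delivered";

const transitions: Record<PipelineState, PipelineState[]> = {
  draft: ["researching"],
  researching: ["generating"],
  generating: ["delivered"],
  delivered: [],
};

// The state machine is defined exactly once.
function advancePipeline(from: PipelineState, to: PipelineState): PipelineState {
  if (!transitions[from].includes(to)) {
    throw new Error(`invalid transition: ${from} -> ${to}`);
  }
  return to;
}

// Thin server action: UI glue only (stand-in for a "use server" function).
function advancePipelineAction(from: PipelineState, to: PipelineState) {
  return { ok: true, state: advancePipeline(from, to) };
}

// Thin API route handler: same logic, different transport (stand-in for route.ts).
function advancePipelineRoute(from: PipelineState, to: PipelineState) {
  return { status: 200, body: { state: advancePipeline(from, to) } };
}
```

Because both wrappers delegate to the same function, the web app and the Apple app cannot drift apart on what a valid state transition is; neither client owns the rules.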
The seven grindable issues (GH-401 through GH-407) covering auth middleware, contents CRUD, pipeline actions, research endpoints, deliverables, distribution, and settings represent a complete decomposition of the business domain. But the other six are blocked on GH-401 (auth middleware + rate limiting) because authentication is the foundation. This sequencing matters: you can’t grind API endpoints in parallel until the auth layer exists, which means the two-tab workflow can’t fully parallelize this work yet. The grinder tab will be idle until the design tab finishes the foundation.
The Grindability Spectrum: What Makes Work Ready for Autonomous Execution
Three projects, three different states of grindability. Authexis had 28 issues ready to grind and shipped them all in one session. Paulos had zero grindable issues—the session was pure planning. Polymathic-h had carry-over issues but none were picked up because they weren’t quite ready. The difference isn’t project complexity or domain difficulty; it’s specification completeness and dependency clarity.
The Authexis citations system (GH-368 through GH-371) shipped in four sequential phases without any conversational intervention: source enrichment, citation-aware generation, bibliography output, deliverable attribution. Each issue was a complete work packet with clear inputs, outputs, and acceptance criteria. The grinder executed them in order and moved on. Contrast this with the polymathic-h newsletter automation, which has been “carry-over” for multiple sessions because it’s described as “matching the podcast pattern” without a concrete spec. The pattern exists in code, but translating it to a new domain requires design decisions that aren’t yet documented.
The API v2 issues (GH-402–407) sit in an intermediate state: they exist, they’re decomposed, but they need “augmenting with exact file paths and response shapes before they’re fully grindable.” This is the /prep gap—the issues were created from the design conversation but haven’t been run through the preparation step that turns architectural decisions into executable specifications. The two-tab workflow makes this gap more visible: if the design tab creates issues but doesn’t prep them, the grinder tab can’t pick them up, and the parallelization benefit disappears.
The Pricing Decision as Configuration Archaeology
Buried in the Authexis grinder output is a detail that reveals how product decisions propagate through code: pricing alignment across landing page, /pricing, and billing page, all reading from a single plans.ts config (GH-385). Individual $99/mo, Team $249/mo. The implementation created new Stripe env var names (STRIPE_PRICE_INDIVIDUAL, STRIPE_PRICE_TEAM), which means the deploy is blocked until those products exist in Stripe and the env vars are set in Vercel. Legacy price IDs are handled in the webhook for existing subscribers.
This is configuration archaeology—the code now embeds a product decision (two tiers, specific pricing) that requires external system setup (Stripe products) and deployment configuration (Vercel env vars) before it can work. The risk isn’t the code; it’s the coordination between the code change, the Stripe admin work, and the deployment. The grinder can’t do this coordination because it requires access to external systems and production credentials. The human has to do it, but the work log flags it clearly: “If new Stripe products aren’t created and env vars aren’t set before deploy, checkout will break.”
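A sketch of what a single-source-of-truth `plans.ts` might look like. The real file will differ; only the tier names, prices, and env var names below come from the work log, and the startup guard is a suggested mitigation, not something the log says exists.

```typescript
// Hypothetical plans.ts-style config: pricing and Stripe price IDs defined
// once, read by the landing page, /pricing, and billing alike.
interface Plan {
  name: string;
  monthlyUsd: number;
  stripePriceId: string; // resolved from env; empty until Vercel is configured
}

const plans: Record<"individual" | "team", Plan> = {
  individual: {
    name: "Individual",
    monthlyUsd: 99,
    stripePriceId: process.env.STRIPE_PRICE_INDIVIDUAL ?? "",
  },
  team: {
    name: "Team",
    monthlyUsd: 249,
    stripePriceId: process.env.STRIPE_PRICE_TEAM ?? "",
  },
};

// One way to guard the coordination risk flagged above: fail loudly at
// startup rather than letting checkout break at request time when the
// Stripe env vars were never set in Vercel.
function assertPricingConfigured(): void {
  for (const plan of Object.values(plans)) {
    if (!plan.stripePriceId) {
      throw new Error(`Missing Stripe price ID for plan: ${plan.name}`);
    }
  }
}
```

The guard turns the deploy-blocking dependency into an explicit failure instead of a silent one: the human still has to create the Stripe products, but the app tells them immediately if they forgot.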
The trial enforcement issue (GH-384) was closed as “not-planned, nag only for now”—a product decision to show trial status banners but not gate features yet. This is the kind of decision that can’t be automated because it’s about business strategy (how aggressive to be about conversion) not technical implementation. The grinder can build either version, but only the human can decide which version to build.
Questions This Raises
- If the two-tab workflow works, should /grind become a long-running background process rather than a conversational command? What would supervision look like if the grinder ran continuously and reported status asynchronously?
- When business logic gets extracted from server actions into shared modules, who owns the tests—the API layer, the shared modules, or both? Does the grinder know how to maintain test coverage across that refactor?
- Why is branch mode still broken after being flagged multiple times? Is it a paulos bug, a Claude MCP limitation, or a GitHub API issue? Does it matter more now that unsupervised grinders are committing directly to main?
- What’s the right granularity for grindable issues? The citations system worked as four sequential issues, but could it have been one larger issue? Would that have been more or less grindable?
- If /prep is the bottleneck between design and execution, should it be automated? Could the design tab create rough issues and hand them to a prep agent that augments them with file paths and response shapes before the grinder picks them up?
What Matters About This
The two-tab workflow isn’t just a productivity hack; it’s a recognition that AI development assistance has moved from “help me write this code” to “execute this work while I think about the next thing.” The constraint is no longer agent capability—the Authexis grinder shipped 28 issues across two context windows without human intervention—it’s human bandwidth for specification and supervision. When the grinder can outpace the designer, the architecture of the tooling needs to change.
The API extraction pattern in Authexis represents a broader inflection point: when a product grows beyond a single client, the business logic needs to live somewhere that isn’t coupled to any particular UI framework. The decision to extract shared modules rather than just build a parallel API reveals an understanding that the web app and the Apple app should be peers calling the same business logic, not a primary app and a second-class port. This matters because it changes what gets built: not “an API for the mobile app” but “the canonical business logic interface that all clients use.”
The grindability spectrum—from Authexis’s 28 ready-to-grind issues to polymathic-h’s perpetual carry-over—shows that specification completeness is the real bottleneck. The grinder is fast, reliable, and capable. But it can only grind what’s been prepped. The investment in /prep as a tool, and the discipline to run it before handing issues to the grinder, is what enables the two-tab workflow to actually parallelize work. Without that discipline, the design tab creates issues the grinder can’t execute, and the parallelization benefit disappears.
Where This Could Go
Immediate (next session):
- Fix branch mode in paulos so grinders use feature branches—this is now a production risk with unsupervised execution
- Create and prep the Stripe products (Individual $99, Team $249) and set Vercel env vars before deploying Authexis pricing changes
- Run /prep on GH-402–407 (API v2 endpoints) to add file paths and response shapes, making them grindable
- Test the two-tab workflow on paulos itself: design tab creates the March 2026 milestone and runs /scout, grinder tab executes the resulting issues
Near-term (this week):
- Document the two-tab workflow pattern in paulos CLAUDE.md so it’s repeatable across projects
- Audit polymathic-h carry-over issues and either prep them or close them—perpetual carry-over is a smell
- Build GH-401 (API auth middleware) to unblock the other six API issues and enable parallel grinding
- Test citations pipeline end-to-end on real Authexis content to validate the four-phase implementation
Strategic (this month):
- Evaluate whether /prep should be automated—could a prep agent augment rough issues from the design tab before the grinder picks them up?
- Design supervision mechanisms for long-running grinders—if the grinder tab runs for hours, how does the design tab get status updates without context switching?
- Extract the “grindability checklist” from these sessions and turn it into a tool—what makes an issue ready to grind vs. needing more design work?