Work log: 2026-03-07
What shipped today
The skopos pipeline collapse landed and was validated end-to-end across two projects. The old five-stage pipeline (triage, prep, dev, QA, review, merge) is now three stages: triage, prep, grind. A single grind agent keeps full context through implement, test, self-review, merge, and close. This eliminates the context loss that made the old handoffs unreliable and cuts the median grind cost to ~$1.
The grind pipeline was stress-tested with 10+ successful grinds across paulos and eclectis. The AgentProvider refactoring chain (#231, #232, #233) validated the new dependency-awareness feature: prep correctly identified blocked issues, the orchestrator auto-unblocked them as dependencies closed, and each grind executed in the right order. On eclectis, the grind agent fixed a SQL injection vulnerability (#82) and replaced silent error swallowing with Sentry logging (#81) — both merged autonomously.
Running on eclectis exposed three cross-project bugs that would never have surfaced on paulos: the AgentProvider refactor dropped the SKOPOS_ANTHROPIC_API_KEY env var fallback, the CODEBASE.md generator produced a 4MB file (62k lines of node_modules and build artifacts), and the backlog label wasn’t created by skopos init. All fixed. The CODEBASE.md generator now properly excludes .venv, .next, node_modules, build outputs, and binary files — eclectis went from 62k lines to 190.
Completed
- #231 — Extract AgentProvider base class and AnthropicProvider implementation
- #232 — Refactor _run_agent() to use AgentProvider interface
- #233 — Refactor _compact_messages() and triage_issue() to use AgentProvider interface
- #224 — Fix TestPushoverPlatform _clear_env fixture
- #211 — Ping Discord with question and link when skopos needs clarification
- #210 — Surface duration metrics in skopos costs display
- #209 — Aggregate and average duration metrics per-project
- #208 — Capture wall-clock duration for each skopos cycle run
- #207 — Duration tracking per skopos agent run
- #204 — Add /scout skill and run_scout() to Skopos pipeline
- eclectis #82 — Sanitize admin_usage SQL interpolation
- eclectis #81 — Replace silent .catch(() => {}) with Sentry error logging
Release progress
- March 2026: 22/24 closed (due 2026-03-30)
- April 2026: 2/2 closed (due 2026-04-30)
Carry-over
- #234 (provider-agnostic cost tracking) is
blockedwaiting on #231-233 which are now done — next cycle should unblock and grind it - eclectis has 2 more
ready-for-grindissues (#78 article_fetch crash, #79 composite indexes) - Prep queue has 7 issues across paulos
Next session
- Run a cycle on paulos — #234 should auto-unblock now that #231-233 are closed
- Run remaining eclectis grinds (#78, #79)
- Consider #190 (SQLite/DuckDB for run logs)
- Try authexis if eclectis grinds go well
Why customer tools are organized wrong
This article reveals a fundamental flaw in how customer support tools are designed—organizing by interaction type instead of by customer—and explains why this fragmentation wastes time and obscures the full picture you need to help users effectively.
Infrastructure shapes thought
The tools you build determine what kinds of thinking become possible. On infrastructure, friction, and building deliberately for thought rather than just throughput.
Server-side dashboard architecture: Why moving data fetching off the browser changes everything
How choosing server-side rendering solved security, CORS, and credential management problems I didn't know I had.
The work of being available now
A book on AI, judgment, and staying human at work.
The practice of work in progress
Practical essays on how work actually gets done.
The second project problem
Your system works. Then you try it somewhere else and it falls apart. The gap between 'works here' and 'works anywhere' is where most automation dies — and most organizations never look.
The smartest code you'll ever delete
The most dangerous kind of waste isn't the thing that doesn't work. It's the thing that works beautifully and shouldn't exist.
The first real user breaks everything
Your product works until someone actually uses it. The gap between 'works in dev' and 'works for a person' is where most systems fail — and most organizations avoid looking.