Paul Welty, PhD AI, WORK, AND STAYING HUMAN

Work log: Phantasmagoria — March 24, 2026

What shipped today

The narrative-first pipeline produced its first testable build. After fixing validators, updating API keys, and cleaning up old events, we rendered a clean Windows build with 9 events — all generated by the new Phase 1 → Phase 2 → Phase 3 flow. This is the first time the full pipeline has run end-to-end through the production generators, through validation, through rendering, and out the other side as a playable Stellaris mod.

Getting there required fixing several validator mismatches. Both the event validator and the source validator (validate_release.py) still enforced the old format: they required anomaly_fail (not supported in Stellaris 4.2+), required effects on every option (which breaks auto-resolve dig site chapters), and did not recognize the new effect categories (give_technology, add_research_option, add_tech_progress, add_deposit). The follow-up slug format had also changed from _followup_1 to _followup_a (keyed by option), which the regex didn’t match. All fixed.
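As a rough illustration of the slug fix, here is a minimal sketch of a pattern that accepts the new letter-keyed suffix and still tolerates the legacy numeric one. The pattern, the `parse_followup_slug` helper, and the example slugs are assumptions made for illustration; the actual regex in the validators is not shown in this log.

```python
import re

# Hypothetical pattern: matches the new option-keyed suffix (_followup_a)
# and, for backward compatibility, the legacy numeric suffix (_followup_1).
FOLLOWUP_SLUG = re.compile(
    r"^(?P<parent>[a-z0-9_]+?)_followup_(?P<key>[a-z]|\d+)$"
)


def parse_followup_slug(slug: str):
    """Return (parent_slug, option_key), or None if slug is not a follow-up."""
    match = FOLLOWUP_SLUG.match(slug)
    if match is None:
        return None
    return match.group("parent"), match.group("key")
```

A validator that only needs the new format could drop the `\d+` alternative so stale numeric slugs are flagged rather than silently accepted.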

A second round of dead code cleanup removed another 1,331 lines: the old build_prompt(), build_phase2_prompt(), and save_event_to_yaml() functions, plus three orphaned GUIDE_*.md files that the narrative-first templates replaced. Combined with yesterday’s -737 lines, the generators have shed ~2,068 lines of dead code across the two sessions. Added 16 tests verifying template variable replacement in all three generators’ build_phase1_prompt() functions. Also changed load_dotenv() to load_dotenv(override=True) across all scripts so .env changes always take effect.
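The template variable tests might look something like the sketch below. Everything here is a stand-in: the `{{name}}` placeholder syntax, the function signature, and the sample template are assumptions, since the real build_phase1_prompt() implementations and templates are not shown in this log.

```python
import re

# Assumption: placeholders look like {{name}}; the real templates may differ.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def build_phase1_prompt(template: str, variables: dict) -> str:
    """Stand-in for a generator's build_phase1_prompt(): fill every placeholder."""
    def substitute(match):
        name = match.group(1)
        if name not in variables:
            raise KeyError(f"template variable not provided: {name}")
        return str(variables[name])

    return PLACEHOLDER.sub(substitute, template)


def unreplaced_placeholders(prompt: str) -> list:
    """List placeholder names still present in a rendered prompt."""
    return PLACEHOLDER.findall(prompt)


def test_all_variables_replaced():
    template = "Theme: {{theme}}\nEvent type: {{event_type}}"
    prompt = build_phase1_prompt(
        template, {"theme": "celestial_equinox", "event_type": "anomaly"}
    )
    assert "celestial_equinox" in prompt
    assert unreplaced_placeholders(prompt) == []
```

The useful property to assert is the negative one: no placeholder survives into the prompt that gets sent to the model.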

Completed

  • #238 — Remove dead code from post-refactor generators, round 2 (-1,331 lines)
  • #239 — Add tests for build_phase1_prompt() template variable replacement (16 tests)
  • Fix stale test assertions in test_outcome_prompt.py (2 tests)
  • Fix validators for narrative-first pipeline (event_validator.py, validate_release.py)
  • First testable Windows build with narrative-first events (9 events, 0 errors)

Carry-over

  • Playtest the Windows build in Stellaris 4.3 — build is at output/Phantasmagoria_2603240934_windows/. Only 3 event types (1 standalone, 1 anomaly, 1 dig site) plus follow-ups. Need Paul to test on his Windows machine.
  • Generate a full batch — current build has just 1 of each type. Need 3+ standalones, 3+ anomalies, 3+ dig sites for a real release.
  • Old roller events deleted — the celestial_equinox release now only has narrative-first events. The old events are gone from disk (not committed yet — just deleted). Need to decide: commit the deletion, or regenerate the full batch first?

Risks

  • The test_render_lint_integration test still fails (1 pre-existing failure). It tries to render all releases including dark_discoveries which has old-format events. Not blocking but should be investigated.
  • The deletion of old events is uncommitted. If generation fails or the build is bad, the only way back to the old events is to restore them from git.

Flags and watch-outs

  • load_dotenv(override=True) was needed because the shell had an old API key cached. This is fragile — if someone sets ANTHROPIC_API_KEY in their shell profile, .env won’t override it without override=True.
  • Title diversity still poor — “The Cartographer’s Bones”, “The Cartographers of Extinction”, “The Sediment of Light” all from the same celestial_equinox theme. Batch context will help but may need stronger anti-repetition guidance in the prompts.
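The precedence rule behind the load_dotenv note can be modeled in a few lines of standard-library Python. This is an illustration of the rule, not python-dotenv's code: `merge_env` and the key values are hypothetical, and the scripts themselves still call load_dotenv(override=True).

```python
def merge_env(environ: dict, dotenv_values: dict, override: bool = False) -> dict:
    """Model how .env values combine with variables already in the environment."""
    merged = dict(environ)
    for key, value in dotenv_values.items():
        # Without override, a variable already set in the shell is left alone.
        if override or key not in merged:
            merged[key] = value
    return merged


# Hypothetical values: a stale key exported by the shell, a fresh one in .env.
shell = {"ANTHROPIC_API_KEY": "old-key-from-shell"}
dotenv_file = {"ANTHROPIC_API_KEY": "new-key-from-dotenv"}

stale = merge_env(shell, dotenv_file)                 # default: shell value wins
fresh = merge_env(shell, dotenv_file, override=True)  # .env value wins
```

The trade-off is that override=True also prevents someone from deliberately overriding a .env value from the shell for a one-off run.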

Next session

  1. Paul playtests the Windows build — copy output/Phantasmagoria_2603240934_windows/ to Stellaris mod folder and test in 4.3.
  2. Generate full batch: run generate_standalone.py --count 3, generate_anomaly.py --count 3, and generate_site.py --count 3 against celestial_equinox. Commit the results.
  3. Render and playtest the full batch — the real test of whether the narrative-first pipeline produces a good player experience at scale.
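The batch generation step could be wrapped in a small driver like the sketch below. The script names and the --count flag come from the list above; the `build_commands` and `run_batch` helpers are hypothetical. How the celestial_equinox release is selected is not shown in this log, so the sketch leaves that argument out.

```python
import subprocess
import sys

# Script names as listed in the plan above.
GENERATORS = ["generate_standalone.py", "generate_anomaly.py", "generate_site.py"]


def build_commands(count: int = 3) -> list:
    """Build one command per generator, each asking for `count` events."""
    return [[sys.executable, script, "--count", str(count)] for script in GENERATORS]


def run_batch(count: int = 3, dry_run: bool = True) -> list:
    """Print the commands (dry run) or execute them in order."""
    commands = build_commands(count)
    for command in commands:
        if dry_run:
            print(" ".join(command))
        else:
            # check=True stops the batch at the first failing generator.
            subprocess.run(command, check=True)
    return commands
```

Running with dry_run=True first gives a chance to confirm the commands before spending API calls on generation.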
