Work log: March 1, 2026

What shipped today

Today was about two things: finishing the user-facing polish from yesterday’s telemetry/email sprint, and making a big leap forward on article discovery.

Search infrastructure overhaul. We replaced Google CSE with Serper.dev for article discovery (GH-716). This was Roland’s pain point — he needs international content for his music touring business, and Google CSE was returning US-centric results exclusively. After evaluating Kagi (closed beta, not viable), Tavily, and Exa.ai, Serper won on cost ($0.001/search vs $0.005), international support, and API simplicity. The swap was clean: POST to google.serper.dev/search with an API key header, parse organic results. We then added per-workspace search_countries — Roland’s workspace now searches across US, Germany, Austria, UK, Spain, Italy, and France. His first scan pulled in 192 new articles from 232 found, tripling his library from 86 to 278 articles.
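
A minimal sketch of the request shape described above, not the code that shipped. The X-API-KEY header name, the gl country parameter, and the title/link/snippet fields are assumptions about Serper's API. The function names and the URL-based de-duplication across countries are hypothetical and are not described in this log.

```python
import json
import urllib.request

SERPER_URL = "https://google.serper.dev/search"


def build_request(query: str, country: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) one POST request for a single country."""
    # Assumption: "gl" selects the country and the key travels in X-API-KEY.
    payload = json.dumps({"q": query, "gl": country}).encode("utf-8")
    return urllib.request.Request(
        SERPER_URL,
        data=payload,
        headers={"X-API-KEY": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def http_fetch(request: urllib.request.Request) -> dict:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=15) as response:
        return json.load(response)


def parse_organic(body: dict) -> list[dict]:
    """Keep only organic results that carry a link."""
    results = []
    for item in body.get("organic", []):
        link = item.get("link")
        if not link:
            continue
        results.append(
            {
                "title": item.get("title", ""),
                "url": link,
                "snippet": item.get("snippet", ""),
            }
        )
    return results


def search_countries(query, countries, api_key, fetch=http_fetch):
    """Run one search per country and merge results, skipping repeated URLs."""
    seen, merged = set(), []
    for country in countries:
        body = fetch(build_request(query, country, api_key))
        for result in parse_organic(body):
            if result["url"] not in seen:
                seen.add(result["url"])
                merged.append({**result, "country": country})
    return merged
```

Passing fetch as a parameter keeps the network call swappable, so the parsing and merge logic can be exercised without spending searches.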

Scoring architecture fix. The multi-country search surfaced a latent bug: our batch scoring approach (sending all results to Claude in one prompt, expecting a massive JSON array back) collapsed at scale. With 229 results across 7 countries, Claude couldn’t reliably generate a 229-element JSON array within the token limit. We refactored to score each result individually with Haiku — simpler, more reliable, and eliminates the index-matching complexity of batch scoring. This also uncovered a cross-module dependency issue: content_research_sources.py imported the old filter_with_claude function name, causing deploy failures until we tracked down and fixed the import.
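
The per-result pattern can be sketched roughly as follows. This is an illustration under assumptions, not the refactored handler: the prompt wording, the {"score": N} reply format, and the call_model callable (a stand-in for whatever wraps the Haiku request) are all hypothetical.

```python
import json
from typing import Callable


def build_prompt(result: dict) -> str:
    """Hypothetical prompt: one article in, one small JSON object out."""
    return (
        "Rate this article's relevance from 0 to 10. "
        'Reply with JSON only, like {"score": 7}.\n'
        f"Title: {result.get('title', '')}\n"
        f"Snippet: {result.get('snippet', '')}"
    )


def score_one(result: dict, call_model: Callable[[str], str]):
    """Score a single result; return None if the reply cannot be parsed."""
    try:
        reply = json.loads(call_model(build_prompt(result)))
        return int(reply["score"])
    except (ValueError, KeyError, TypeError):
        # A malformed reply affects only this result, not the whole scan.
        return None


def score_results(results: list[dict], call_model: Callable[[str], str]) -> list[dict]:
    """Attach a score to each result; no index matching is needed."""
    return [{**r, "score": score_one(r, call_model)} for r in results]
```

Because each reply is tied to the result that produced it, there is no long array to keep aligned, and one bad reply costs one score rather than the whole batch. Errors raised by call_model itself (network failures, rate limits) are deliberately left to propagate in this sketch.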

Morning sprint items. Earlier in the day we shipped article sort-by-score (GH-717), fixed Google scan’s search term parsing (JSON decode + per-line splitting), repaired a corrupted Xcode project file, and landed the email/telemetry work from yesterday’s session (signed_in events, email_override fix, newsletter pitch in briefings).

Completed

GH-716 — Evaluate search providers beyond Google CSE (Serper.dev selected and implemented)
GH-717 — Sort articles by AI score descending by default
GH-697 — User activity tracking and telemetry (signed_in event)
GH-713 — Include newsletter pitch in briefing email footer
GH-707 — Add topic CTA and UTM tracking to briefing emails
GH-714 — Hide scans page from non-admin users
GH-715 — Hide recent activity on dashboard for non-admin users
GH-700 — Investigate PostHog analytics gaps
GH-663 — Telemetry discussion (closed, superseded by GH-697)
GH-712 — Make dashboard recent activity user-facing (closed, superseded by GH-715)
GH-718, 719, 720, 721 — Deploy failure issues (all resolved)

Carry-over

Kelly Pentecost’s Google scan returning 0 results — her search terms may need investigation or her workspace may need search_countries configured
Tags field format inconsistency — some articles have tags stored as plain comma-separated strings instead of JSON arrays, causing parse failures in scripts
Temp .ts scripts in web/ — debugging scripts (check-articles.ts, check-scan.ts, check-final.ts, clear-requeue.ts) should be cleaned up
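
For the tags inconsistency above, a tolerant reader along these lines might let scripts cope with both formats until the stored data is normalized. It is a sketch only, and normalize_tags is a hypothetical helper name, not something in the codebase.

```python
import json


def normalize_tags(raw) -> list[str]:
    """Accept a list, a JSON array string, or a comma-separated string."""
    if raw is None:
        return []
    if isinstance(raw, list):
        return [str(tag).strip() for tag in raw if str(tag).strip()]
    text = str(raw).strip()
    if not text:
        return []
    if text.startswith("["):
        try:
            parsed = json.loads(text)
        except json.JSONDecodeError:
            parsed = None
        if isinstance(parsed, list):
            return [str(tag).strip() for tag in parsed if str(tag).strip()]
    # Fall back to the plain comma-separated form.
    return [part.strip() for part in text.split(",") if part.strip()]
```

Text that starts with a bracket but is not valid JSON falls through to the comma split with its brackets intact, so such rows would still need manual review.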

Risks

Serper API key exposure: the key lives in Railway env vars and the engine .env. The .env is gitignored, but it is worth confirming it stays that way.
Scoring cost at scale: one Haiku call per result means 232 API calls per scan for Roland. At Haiku pricing this is pennies, but it is worth monitoring as more workspaces come online with multi-country search.

Flags and watch-outs

Railway deploys from GitHub pushes — do NOT use railway redeploy CLI (it hangs in non-interactive mode). Just push to main.
When renaming exported functions in handler modules, check all importers — content_research_sources.py imports from google_search_scan.py and was missed initially.
Batch scoring with Claude doesn’t scale past ~30 results. The individual scoring pattern is the right approach going forward.
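
One way to run the importer check before renaming an exported function is a small script like this. It is a sketch: find_imports and scan_tree are hypothetical names, and it only catches "from module import name" statements, not attribute access through "import module".

```python
import ast
from pathlib import Path


def find_imports(source: str, module: str, name: str) -> list[int]:
    """Return line numbers where `from <...module> import <name>` appears."""
    lines = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.ImportFrom) or node.module is None:
            continue
        # Match on the last dotted component so package prefixes do not matter.
        if node.module.split(".")[-1] != module:
            continue
        if any(alias.name == name for alias in node.names):
            lines.append(node.lineno)
    return sorted(lines)


def scan_tree(root: str, module: str, name: str) -> dict[str, list[int]]:
    """Map each .py file under root to the lines that import the name."""
    hits = {}
    for path in Path(root).rglob("*.py"):
        found = find_imports(path.read_text(encoding="utf-8"), module, name)
        if found:
            hits[str(path)] = found
    return hits
```

Run against the engine directory with the module google_search_scan and the name filter_with_claude, a check like this would be expected to list content_research_sources.py before the rename reaches a deploy.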

Next session

Check Kelly Pentecost’s workspace scan configuration — does she have search terms? Does she need search_countries?
Review Roland’s 192 new articles — are the international results actually relevant? Check score distribution and content quality.
Clean up temp .ts scripts from web/ directory.
Continue with briefings work (GH-705) or pick next from the queue.
