v5.2.0 Branch Completion — Design
Date: 2026-05-10
Status: Draft for review
Branch: release/v5.2.0
Predecessor work: 151 commits' worth of PHON-72/73/76/81-88/92-100/102/106/107/109/112/113
1. Frame
release/v5.2.0 is 151 commits ahead of develop and main. The CSP architecture has fully landed (PHON-109 productionized; RunPod/T5Gemma retired). Surface artifacts — public docs, version chips, the React frontend, Cloudflare configuration, and Jira state — have not caught up. This spec scopes the work needed to bring the release branch to a state where the only remaining step is "merge to main + back-merge to develop." Merge itself is out of scope and handled in a separate session.
End state: release/v5.2.0 is a clean, internally consistent release branch. CI is green. README/CONTRIBUTING describe the CSP architecture. Frontend renders top-K candidate output from the new CSP backend. Cloudflare Containers configuration (Dockerfile + binding) is committed. Version chips read 5.2.0. Jira release manifest is coherent.
Non-goals: merging anywhere; cutting tags; writing release notes; actually deploying the container; PHON-91 develop-history strip; .claude/settings.local.json cleanup; PHON-89 developmental membership; audio/Catalog backlog; any tickets not explicitly listed below.
2. Approach
Approach A: Sequential, single branch. Each workstream lands as one or more commits directly on release/v5.2.0. No nested feature branches, no PRs. Clean linear history; easy to revisit any single workstream without blocking adjacent work.
Approaches B (mini PRs per workstream) and C (parallel agents) were considered and rejected: B adds review ceremony that is unjustified for in-flight release polish; C adds coordination overhead in exchange for only marginal wall-clock savings on already-fast work.
3. Workstreams
3.1 Public-facing copy
Files:
- README.md — lines 15, 86, 132, 183 still describe T5Gemma 9B-2B + GPU/MPS/25GB-RAM requirements
- CONTRIBUTING.md — line 70 calls the generation server "FastAPI + T5Gemma"
Rewrite to describe the CSP architecture: constraint solver + 4-axis LightGBM reranker, no LM, CPU-only (~600MB RAM for runtime parquets + MiniLM-L6-v2 + reranker), ~25s cold start on CPU, ~1-2s warm requests. Update the "Three faces" / "Generation" sections of README. Update the pipeline diagram in README around line 132 to reflect packages/generation/server/ shape post-PHON-109.
Output: one commit.
3.2 Version bump 5.1.0 → 5.2.0
Files:
- packages/web/frontend/package.json:4
- packages/web/frontend/src/components/AppHeader.tsx:129 (chip label)
- packages/web/frontend/src/components/AppHeader.tsx:507 (footer)
- packages/web/frontend/src/components/AppHeader.tsx:579 (drawer)
- packages/web/workers/package.json:3
Per the version-bump-checklist memory, grep for any other v5\.1 / 5\.1\.0 strings before committing — there shouldn't be others, but verify. Python package versions (packages/data, packages/governors, packages/generators) stay at 0.1.0 — they're internal workspace packages, not user-facing.
Output: one commit.
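The pre-commit check for leftover version strings could be sketched as a small scan like the one below. This is only an illustration: `findStaleVersions`, `StaleHit`, and the sample text are hypothetical names and made-up content, and a plain grep over the repository serves the same purpose.

```typescript
// Hypothetical helper: report lines that still mention the 5.1 version line.
// The pattern matches "5.1", "5.1.0", "v5.1" and "v5.1.0", but not "15.1" or "5.10".
const STALE_VERSION = /\bv?5\.1(?:\.0)?\b/;

interface StaleHit {
  lineNumber: number; // 1-based line number within the scanned text
  text: string;
}

function findStaleVersions(source: string): StaleHit[] {
  const hits: StaleHit[] = [];
  source.split("\n").forEach((text, index) => {
    if (STALE_VERSION.test(text)) {
      hits.push({ lineNumber: index + 1, text: text.trim() });
    }
  });
  return hits;
}

// Made-up input resembling a package.json fragment plus a UI label.
const sample = [
  '{ "name": "frontend",',
  '  "version": "5.1.0" }',
  '<Chip label="v5.1" />',
  '"some-dep": "15.1.2"',
  '"version": "5.2.0"',
].join("\n");

const staleHits = findStaleVersions(sample);
```

In this sample only the second and third lines are reported; the unrelated dependency version and the already-bumped string are ignored.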
3.3 Jira hygiene
Close as Done (memory says complete; Jira out of sync):
- PHON-72 (FineWeb-Edu freq + POS — merged 2026-05-04 via PHON-88)
- PHON-79 (Data licensing & in-house norms — completed via PHON-72/88 chain)
- PHON-85 (FineWeb-Edu grade-banded freq — done via PHON-88)
- PHON-86 (PhonBank preschool age-graded freq — done via PHON-88)
- PHON-87 (CHILDES Eng-NA + Eng-UK age-graded freq — done via PHON-88)
Close as Won't Do:
- PHON-80 (Split RunPod API key prod/staging — RunPod retired in PHON-109)
Backfill fixVersion = v5.2.0 on shipped tickets so the release manifest is queryable: PHON-72, 73, 76, 81, 82, 83, 84, 85, 86, 87, 88, 92, 93, 94, 95, 96, 97, 98, 99, 100, 102, 106, 107, 109, 112, 113.
Output: no commits (Jira-only).
3.4 PHON-114 Cloudflare Containers configuration (approach α — native binding)
Decision: native CF Containers binding rather than external HTTPS host. Single-Worker-to-single-backend in the same CF account; private intra-CF routing avoids exposing the FastAPI server to the internet, removes the need for a separate domain/TLS/auth-header, and gets Durable Object-backed lifecycle management for free. Open-beta risk is bounded — falling back to external HTTPS is a one-line route change.
Files to add:
- packages/generation/server/Dockerfile — Python 3.10 or 3.11 base image; install phonolex_data, phonolex_governors, phonolex_generators editable; install server deps; bake data/runtime/{words,edges,selectional,pairs,skeletons}.parquet + data/runtime/reranker_v2.pkl into the image (LFS-resolved at build time); expose 8000; entrypoint uvicorn packages.generation.server.main:app --host 0.0.0.0 --port 8000. Expected image size ~700-800MB.
- packages/generation/server/.dockerignore — exclude tests/, research/, docs/, packages/web/, *.pyc, .git, __pycache__/, etc.
- packages/web/workers/src/containers/generation.ts — class GenerationServer extends Container { defaultPort = 8000; }
Files to modify:
- packages/web/workers/wrangler.toml — add [[containers]] block (image path ../../generation/server/Dockerfile, class_name GenerationServer, max_instances), [[durable_objects.bindings]] (name GENERATION_SERVICE, class_name GenerationServer), [[migrations]] (tag v1, new_sqlite_classes = ["GenerationServer"]). Mirror under [env.staging]. Drop the GENERATION_SERVER_URL = "" placeholder in both [vars] blocks.
- packages/web/workers/src/index.ts — re-export GenerationServer (DO classes must be exported from the worker entry module).
- packages/web/workers/src/routes/generation.ts — replace fetch(backendUrl) with env.GENERATION_SERVICE.getByName('default').fetch(req). Drop the URL-fallback backendUrl() helper. If the binding is missing, the request fails loudly instead of returning a silent 503.
- packages/web/workers/src/types.ts — add GENERATION_SERVICE to the Env type.
- packages/web/workers/package.json — add @cloudflare/containers dep.
- Tests for routes/generation.ts — mock the binding's getByName().fetch() instead of fetch.
Out of scope here: building/pushing the image; setting up the CF Containers registry; regenerating an OAuth token with container scopes (the current OAuth lacks them). All deploy-time concerns.
Output: ~3 commits (Dockerfile + dockerignore; Worker container class + wrangler binding + index re-export; route migration + tests).
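The route migration described above might look roughly like the sketch below. It assumes only what the file list states (a GENERATION_SERVICE binding, getByName('default'), and a fetch on the returned stub). The structural interfaces, the generic Req/Res parameters, `resolveGenerationStub`, `proxyToGeneration`, and the fake binding are all hypothetical stand-ins so the example runs without the Workers runtime or the (still unverified) @cloudflare/containers package.

```typescript
// Minimal structural types so the sketch stands alone. In the real Worker these
// shapes would come from the runtime/binding types; here they are assumptions.
interface GenerationStub<Req, Res> {
  fetch(req: Req): Promise<Res>;
}
interface GenerationNamespace<Req, Res> {
  getByName(name: string): GenerationStub<Req, Res>;
}
interface Env<Req, Res> {
  GENERATION_SERVICE?: GenerationNamespace<Req, Res>;
}

// Fail loudly when the binding is absent instead of falling back to a URL.
function resolveGenerationStub<Req, Res>(env: Env<Req, Res>): GenerationStub<Req, Res> {
  if (!env.GENERATION_SERVICE) {
    throw new Error("GENERATION_SERVICE binding is not configured");
  }
  return env.GENERATION_SERVICE.getByName("default");
}

// The route body reduces to forwarding the incoming request to the stub.
async function proxyToGeneration<Req, Res>(env: Env<Req, Res>, req: Req): Promise<Res> {
  return resolveGenerationStub(env).fetch(req);
}

// A fake binding of the kind the updated route tests could use.
const requestedNames: string[] = [];
const fakeEnv: Env<string, string> = {
  GENERATION_SERVICE: {
    getByName(name) {
      requestedNames.push(name);
      return { fetch: async (req) => `handled:${req}` };
    },
  },
};
```

Keeping the lookup in one small function is what keeps the HTTPS fallback cheap: only `resolveGenerationStub` would need to change.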
3.5 PHON-110 Frontend rewrite
The biggest workstream. The current frontend assumes streaming tokens with per-token compliance richtext + violation lists. CSP returns a candidate list with axis scores. Different paradigm.
3.5.1 — API client (packages/web/frontend/src/lib/generationApi.ts)
Drop the SSE parsing logic in generateContent(). Replace with two plain Promise-returning functions:
generateSentences(req: GenerateSentencesRequest): Promise<GenerateSentencesResponse>
generateParagraphs(req: GenerateParagraphsRequest): Promise<GenerateParagraphsResponse>
Both use fetch(url, { method: 'POST', body: JSON.stringify(req) }) and return parsed JSON. useServerStatus is preserved — /server/status (Worker) → /health (backend) is still wired and we need the cold-start signal in the UI.
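As a rough sketch of the two Promise-returning functions, the shared POST logic could be factored as below. Several things here are assumptions rather than facts from the codebase: the endpoint path `/api/generate/sentences`, the placeholder request and response fields, the explicit Content-Type header, and the injectable `FetchLike` parameter (used so the sketch does not depend on a browser or Workers global).

```typescript
// Placeholder shapes: the real ones mirror packages/generation/server/schemas.py.
interface GenerateSentencesRequest {
  spec: string;
  top_k: number;
}
interface GenerateSentencesResponse {
  candidates: unknown[];
}

interface PostInit {
  method: string;
  headers: Record<string, string>;
  body: string;
}
interface ResponseLike {
  ok: boolean;
  status: number;
  json(): Promise<unknown>;
}
// Structural subset of fetch, so a test double can be passed in.
type FetchLike = (url: string, init: PostInit) => Promise<ResponseLike>;

function buildPostInit(body: unknown): PostInit {
  return {
    method: "POST",
    headers: { "Content-Type": "application/json" }, // assumption: not stated in the spec
    body: JSON.stringify(body),
  };
}

async function postJson<Res>(url: string, body: unknown, fetchImpl: FetchLike): Promise<Res> {
  const res = await fetchImpl(url, buildPostInit(body));
  if (!res.ok) {
    throw new Error(`POST ${url} failed with status ${res.status}`);
  }
  return (await res.json()) as Res;
}

function generateSentences(
  req: GenerateSentencesRequest,
  fetchImpl: FetchLike,
): Promise<GenerateSentencesResponse> {
  // Hypothetical path; the actual Worker route may differ.
  return postJson<GenerateSentencesResponse>("/api/generate/sentences", req, fetchImpl);
}
```

generateParagraphs would follow the same pattern with its own request and response types.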
usePreflight and fetchPreflight are deleted. The /api/preflight endpoint is orphaned dead code in the FastAPI server (packages/generation/server/routes/preflight.py has broken imports against the v6 schema and is not registered in main.py); remove it as part of this workstream.
3.5.2 — Types (packages/web/frontend/src/types/governance.ts)
Add: AxisScores, SentenceCandidate, ParagraphCandidate, GenerateSentencesRequest, GenerateSentencesResponse, GenerateParagraphsRequest, GenerateParagraphsResponse. Schema mirrors packages/generation/server/schemas.py.
Replace the 5-type Constraint union (exclude / include / bound / bound_boost / contrastive) with the 7-type union from the server: exclude / include / bound / bound_boost / contrastive_minpair / contrastive_maxopp / contrastive_multopp.
Delete: RichToken, SingleGenerationResponse, GenerationEvent, GenerationResult, PreflightResponse, related v6 stream/token types. No successors.
3.5.3 — Constraint composer (store/constraintStore.ts + lib/constraintCompiler.ts)
Store gains:
- mode: 'sentences' | 'paragraphs' — drives endpoint and visible options
- spec: string — therapeutic noun targets (tokenized space-separated)
- band: string — frequency band selector
- axisWeights: { naturalness: number; grammaticality: number; ageAppropriate: number; coherence: number } — default 0.25 each, sum-to-1 invariant enforced in UI
- topK: number — sentence mode top_k (default 8)
- Paragraph-specific: nSentences, perSentenceTopK, nSubjectSeeds, discourseSubject (optional), usePronounCoref
Compiler maps StoreEntry → 7-type Constraint. The single legacy contrastive StoreEntry expands into three variants based on user selection; chips remain the unit of UI interaction so the existing chip rendering pipeline doesn't need restructuring.
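The expansion of the single legacy contrastive entry into three constraint variants could be sketched as a discriminated union plus an exhaustive switch. The per-variant fields follow the composer forms described in this spec (phoneme1/phoneme2/position, min_sonorant_diff, substitute/targets/n_targets). The discriminator name `type`, the `ContrastiveEntry` shape, and deriving `n_targets` from the target list length are assumptions made for illustration.

```typescript
interface MinPairConstraint {
  type: "contrastive_minpair";
  phoneme1: string;
  phoneme2: string;
  position: string;
}
interface MaxOppConstraint {
  type: "contrastive_maxopp";
  phoneme1: string;
  phoneme2: string;
  position: string;
  min_sonorant_diff: number;
}
interface MultOppConstraint {
  type: "contrastive_multopp";
  substitute: string;
  targets: string[];
  n_targets: number;
}
type ContrastiveConstraint = MinPairConstraint | MaxOppConstraint | MultOppConstraint;

// Hypothetical store entry: one contrastive chip plus the variant chosen in its menu.
type ContrastiveEntry =
  | { variant: "minpair"; phoneme1: string; phoneme2: string; position: string }
  | { variant: "maxopp"; phoneme1: string; phoneme2: string; position: string; minSonorantDiff: number }
  | { variant: "multopp"; substitute: string; targets: string[] };

function compileContrastive(entry: ContrastiveEntry): ContrastiveConstraint {
  switch (entry.variant) {
    case "minpair":
      return {
        type: "contrastive_minpair",
        phoneme1: entry.phoneme1,
        phoneme2: entry.phoneme2,
        position: entry.position,
      };
    case "maxopp":
      return {
        type: "contrastive_maxopp",
        phoneme1: entry.phoneme1,
        phoneme2: entry.phoneme2,
        position: entry.position,
        min_sonorant_diff: entry.minSonorantDiff,
      };
    case "multopp":
      return {
        type: "contrastive_multopp",
        substitute: entry.substitute,
        targets: [...entry.targets],
        n_targets: entry.targets.length, // assumption: one target slot per listed phoneme
      };
  }
}

const compiledMultopp = compileContrastive({
  variant: "multopp",
  substitute: "t",
  targets: ["k", "s", "f"],
});
```

Because the switch covers every variant, adding a fourth variant to `ContrastiveEntry` would surface as a type error in the compiler rather than as a silently dropped chip.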
3.5.4 — Composer UI
Existing chip-based composer keeps its shape. Three changes:
- Contrastive chip menu expands to three variants: minimal pair, maximal opposition, multiple opposition. Each has its own form (minpair: phoneme1/phoneme2/position; maxopp: same + min_sonorant_diff; multopp: substitute + targets[] + n_targets).
- A new chip variant bound_boost mirrors the bound chip but renders as a soft/annotation chip rather than a hard filter chip.
- New top-of-panel controls: spec text input, band selector, mode toggle (sentences/paragraphs), top_k slider, four axis-weight sliders (with sum-normalization on commit). Paragraph mode reveals additional options below.
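Sum-normalization of the four axis weights on commit could be sketched as follows. The field names come from the store's axisWeights shape. Clamping negative or non-finite slider values to zero and falling back to the 0.25 defaults when every weight is zero are assumptions of this sketch, not decisions made in the spec.

```typescript
interface AxisWeights {
  naturalness: number;
  grammaticality: number;
  ageAppropriate: number;
  coherence: number;
}

const DEFAULT_AXIS_WEIGHTS: AxisWeights = {
  naturalness: 0.25,
  grammaticality: 0.25,
  ageAppropriate: 0.25,
  coherence: 0.25,
};

// Treat negative or non-finite inputs as zero weight.
function cleanWeight(value: number): number {
  return Number.isFinite(value) && value > 0 ? value : 0;
}

// Rescale the weights so they sum to 1, preserving their ratios.
function normalizeAxisWeights(raw: AxisWeights): AxisWeights {
  const naturalness = cleanWeight(raw.naturalness);
  const grammaticality = cleanWeight(raw.grammaticality);
  const ageAppropriate = cleanWeight(raw.ageAppropriate);
  const coherence = cleanWeight(raw.coherence);
  const total = naturalness + grammaticality + ageAppropriate + coherence;
  if (total === 0) {
    // Nothing usable was set: fall back to equal weights.
    return { ...DEFAULT_AXIS_WEIGHTS };
  }
  return {
    naturalness: naturalness / total,
    grammaticality: grammaticality / total,
    ageAppropriate: ageAppropriate / total,
    coherence: coherence / total,
  };
}

const committedWeights = normalizeAxisWeights({
  naturalness: 2,
  grammaticality: 1,
  ageAppropriate: 1,
  coherence: 0,
});
```

Normalizing on commit rather than on every slider movement lets the user drag one slider freely without the other three shifting underneath.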
3.5.5 — Output rendering (OutputFeed + OutputCard)
Sentence mode: vertical card list. Each SentenceCandidate card shows:
- Sentence surface text (large)
- Composite score badge (color-coded by quartile)
- Four axis bars: naturalness, grammaticality, age_appropriate, coherence
- Collapsible details: verb, fillers (per role), skeleton arg-structure, feature_distance, sonorant_diff, ppmi_total
Paragraph mode: outer paragraph card with discourse-subject title + paragraph-level composite + axis scores; nested sentence cards below (smaller, same content as sentence mode).
Selection + export (.txt, .csv) — preserved with the new candidate shape. CSV columns: sentence, composite_score, naturalness, grammaticality, age_appropriate, coherence, verb, skeleton, constraints.
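A minimal sketch of the CSV export with the columns listed above, using standard double-quote escaping. The `SentenceCandidate` shape is a guess for illustration: in particular the nested `axis_scores` field and passing the constraint summary as a single preformatted string are assumptions.

```typescript
interface AxisScores {
  naturalness: number;
  grammaticality: number;
  age_appropriate: number;
  coherence: number;
}
// Assumed subset of the candidate shape needed for export.
interface SentenceCandidate {
  sentence: string;
  composite_score: number;
  axis_scores: AxisScores;
  verb: string;
  skeleton: string;
}

const CSV_COLUMNS = [
  "sentence", "composite_score", "naturalness", "grammaticality",
  "age_appropriate", "coherence", "verb", "skeleton", "constraints",
] as const;

// Quote a field only when it contains a delimiter, quote, or line break.
function csvEscape(value: string | number): string {
  const text = String(value);
  return /[",\r\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
}

function candidatesToCsv(candidates: SentenceCandidate[], constraints: string): string {
  const rows = candidates.map((c) =>
    [
      c.sentence,
      c.composite_score,
      c.axis_scores.naturalness,
      c.axis_scores.grammaticality,
      c.axis_scores.age_appropriate,
      c.axis_scores.coherence,
      c.verb,
      c.skeleton,
      constraints,
    ].map(csvEscape).join(","),
  );
  return [CSV_COLUMNS.join(","), ...rows].join("\r\n");
}

// Made-up candidate, chosen to exercise the quoting path.
const exampleCsv = candidatesToCsv(
  [{
    sentence: 'The dog said "hi", loudly.',
    composite_score: 0.82,
    axis_scores: { naturalness: 0.9, grammaticality: 0.8, age_appropriate: 0.7, coherence: 0.6 },
    verb: "say",
    skeleton: "NP-V-NP",
  }],
  "exclude:/r/",
);
```

Sentences are the one column likely to contain commas and quotation marks, which is why the escaping is applied uniformly rather than per column.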
TokenDisplay.tsx is deleted (per-token richtext is not part of the CSP output paradigm).
3.5.6 — Tool layout (GovernedGenerationTool/index.tsx)
Header: server-status pill (idle / warming / ready) + visible cold-start banner the first time the user opens the tool in a session ("First request takes ~25s while the backend warms.")
Two-column body: left composer panel (constraints + spec + band + axis weights + mode toggle + mode-specific options); right output feed (top-K candidate cards).
3.5.7 — Iteration loop
Bring up backend (uv run uvicorn packages.generation.server.main:app --host 0.0.0.0 --port 8000) + frontend (cd packages/web/frontend && npm run dev) locally. Iterate together. Per the port-management memory, kill stale processes rather than hopping ports.
Output: ~5-8 commits across this workstream, paced by iteration.
3.6 PHON-90 UI/UX audit
After PHON-110 lands, walk the six functional tools and verify the v5.2 data substrate is surfaced. Tools to audit:
1. Custom Word Lists — confirm new properties from PHON-72/73/76/81-83/85-88 appear as filter checkboxes; spot-check filter SQL.
2. Text Analysis — confirm new norms render in per-word stats and aggregate percentile readouts.
3. Contrastive Sets — covered by PHON-111; skip here.
4. Sound Similarity — sanity-only (depends on phoneme features, not new norms).
5. Lookup — confirm word-detail panel renders the new norm fields and PHON-88's 4-table joined columns.
6. Governed Generation — covered by PHON-110; skip here.
Most of properties.ts is auto-generated from packages/web/workers/scripts/config.py's PropertyDef records. The audit is "verify auto-wiring covered everything" — if all the PropertyDef records are in place, the audit may produce zero net-new code.
Output: zero or more small commits depending on findings.
3.7 PHON-111 Contrastive Sets parity
The web app's Contrastive Sets tool (Worker route + frontend) uses discrete distinctive-feature counts for distance. CSP runtime uses continuous L2 over learned 27-d posterior vectors from packages/features/. Bring the web app to parity.
Approach: cache the learned 27-d phoneme vectors in the Worker isolate at cold start (~45 phonemes × 27 floats ≈ 5KB; trivial relative to the existing similarity dot-product cache). Replace the Worker route's distance computation with L2 over the cached vectors. Update minimal-pair / maximal-opposition / multiple-opposition handlers to use the continuous distance.
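The cache-plus-distance module could be sketched as below. `buildVectorCache`, `l2Distance`, and the row shape are hypothetical names, and the vectors in the example are toy values rather than real learned posteriors. Storing each vector as a Float32Array is also an assumption; at 4 bytes per float, 45 × 27 values come to 4,860 bytes, in line with the ~5KB estimate.

```typescript
const FEATURE_DIM = 27;

type PhonemeVectors = Map<string, Float32Array>;

// Build the in-isolate cache once (for example at cold start) from loaded rows.
function buildVectorCache(rows: Array<{ phoneme: string; vector: number[] }>): PhonemeVectors {
  const cache: PhonemeVectors = new Map();
  for (const row of rows) {
    if (row.vector.length !== FEATURE_DIM) {
      throw new Error(`vector for ${row.phoneme} has ${row.vector.length} dims, expected ${FEATURE_DIM}`);
    }
    cache.set(row.phoneme, Float32Array.from(row.vector));
  }
  return cache;
}

// Euclidean (L2) distance between two cached phoneme vectors.
function l2Distance(cache: PhonemeVectors, a: string, b: string): number {
  const va = cache.get(a);
  const vb = cache.get(b);
  if (!va || !vb) {
    throw new Error(`unknown phoneme: ${!va ? a : b}`);
  }
  let sumOfSquares = 0;
  for (let i = 0; i < FEATURE_DIM; i++) {
    const diff = va[i] - vb[i];
    sumOfSquares += diff * diff;
  }
  return Math.sqrt(sumOfSquares);
}

// Toy vectors padded with zeros to 27 dimensions.
function pad(values: number[]): number[] {
  return [...values, ...new Array<number>(FEATURE_DIM - values.length).fill(0)];
}

const toyCache = buildVectorCache([
  { phoneme: "p", vector: pad([3, 0]) },
  { phoneme: "b", vector: pad([0, 4]) },
]);
const toyDistance = l2Distance(toyCache, "p", "b");
```

With these toy vectors the distance is 5 (a 3-4-5 triangle in the first two dimensions), which makes a convenient fixture value for the updated Worker tests.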
Files:
- packages/web/workers/src/lib/ — new module exposing cached phoneme vectors + L2 distance fn (mirroring the existing similarity cache pattern)
- packages/web/workers/src/routes/contrastive.ts (or wherever the Contrastive Sets route lives) — switch distance source
- Worker tests — update fixtures
- Frontend ContrastiveInterventionTool.tsx — verify rendered distance values still display correctly (they're already floats, so likely no UI change needed)
Tool UX semantics unchanged.
Output: ~2 commits.
4. Sequencing
1. Docs rewrite (README + CONTRIBUTING): §3.1, single commit
2. Version bump 5.1.0 → 5.2.0: §3.2, single commit
3. PHON-114 Containers config: §3.4, ~3 commits
4. PHON-110 frontend rewrite (iterative w/ servers): §3.5, ~5-8 commits
5. PHON-90 UI/UX audit (post-PHON-110): §3.6, commits per finding
6. PHON-111 contrastive parity: §3.7, ~2 commits
7. Jira hygiene (close stale + backfill fixVersion): §3.3, no commits
Quick wins first to clear stale-doc bugs and free our heads. PHON-114 is independent infra so it can land before any UI work. Then the frontend rewrite (the biggest chunk; we iterate with servers up). Audit runs against the final surface. Contrastive parity drops in. Jira hygiene closes the loop.
5. Risks and open questions
- Cloudflare Containers open-beta risk. API surface may shift. Mitigation: the route handler abstraction is small (~5 lines), so if we need to swap back to external HTTPS the change is contained.
- Migration tag lock-in (new_sqlite_classes). Once tagged, we can't undo it cleanly. Acceptable: it's a one-way door for a single class we expect to keep.
- Image size at deploy. Baking parquet artifacts into the image gives ~700-800MB. CF Containers may have practical limits (TBD at deploy time). Fallback if image is too big: pull artifacts from R2 at container startup. Out of branch scope.
- @cloudflare/containers package presence. Unverified that this is the canonical package name in current docs vs. an in-Workers-runtime export. The plan task that adds the dep should verify before adding.
- Existing tests for routes/generation.ts. Must be updated to mock the binding shape; failing fast is preferable to silently passing against fetch mocks that no longer match.
6. Acceptance
release/v5.2.0 is "branch-complete" when:
- README + CONTRIBUTING describe CSP architecture; no T5Gemma/RunPod references in user-facing docs.
- Version chips read 5.2.0.
- Frontend GovernedGenerationTool renders top-K candidates with axis bars; constraint composer supports all 7 constraint types; cold-start banner is visible.
- Cloudflare Containers configuration (Dockerfile + .dockerignore + wrangler binding + Worker class + route migration) is committed; tests updated; type-check + Vitest green.
- Contrastive Sets tool uses continuous L2 distances from learned vectors.
- Tool audit documented; any required fixes committed.
- Jira: stale tickets closed; fixVersion = v5.2.0 set on shipped tickets.
- CI green on release/v5.2.0.
Merge to develop/main is the next session's work.