# Governed Generation — Product Plan
2026-03-12 — planning document for monorepo migration, architecture unification, and next phase
## The Big Picture
Three repos currently do overlapping work with duplicated utilities and fragmented data handling:
| Repo | What it is | What it does |
|---|---|---|
| PhonoLex | Phonological analysis platform | 5 tools (word lists, text analysis, contrastive sets, similarity search, lookup), 15 datasets, 68K+ cognitive edges, Cloudflare Workers + D1, React + MUI frontend |
| phonolex-governors | Constraint library | Gates/Boosts/Projections, LookupBuilder, dataset loaders, HuggingFace adapter |
| constrained_chat | Governed generation dashboard | FastAPI + React, T5Gemma 9B-2B, chat + generate modes, compliance panel |
## The duplication problem
Same data loaded 3+ ways:
- CMU dict: PhonoLex Python (`english_data_loader.py`), phonolex-governors (`datasets.cmudict_to_phono()`), `build_lookup_phonolex.py`
- Psycholinguistic norms: PhonoLex `export-to-d1.py`, phonolex-governors (`datasets.load_kuperman()` etc.), `build_lookup_phonolex.py`
- IPA/ARPAbet conversion: PhonoLex `data/mappings/`, phonolex-governors internal
- Syllabification: PhonoLex `syllabification.py` (full onset-nucleus-coda), phonolex-governors (none — hence WCM is all zeros)
- Phoneme normalization: PhonoLex Workers `normalize.ts`, `build_lookup_phonolex.py` `IPA_NORMALIZE`
## The vision: one thing with a few faces
One unified platform with three faces:
- PhonoLex (existing) — interactive word analysis, list building, contrastive sets, similarity search. The datahub.
- Governed Generation (existing) — live governed content generation with real-time compliance. The clinical tool.
- Content Catalog (new) — batch-generate 10K stories with specified properties, package, and provide a browsable catalog of static compliant content. Same governor tech, no live model needed to consume.
Shared backbone:
- Unified data layer (PhonoLex is the source of truth for all phonological + psycholinguistic data)
- Governor engine (constraint compilation, logit processing)
- Lookup/enrichment pipeline (tokenizer-aware, subword-attributed)
## 1. Monorepo Architecture
### Proposed structure
```
phonolex/                              # The product IS PhonoLex
├── packages/
│   ├── data/                          # THE data layer (single source of truth)
│   │   ├── sources/                   # Raw dataset files (CMU, PHOIBLE, norms)
│   │   ├── mappings/                  # IPA/ARPAbet conversion tables
│   │   ├── loaders/                   # Python data loading (ONE place)
│   │   │   ├── cmudict.py             # CMUdict → phonological features
│   │   │   ├── norms.py               # All 13 norm datasets
│   │   │   ├── associations.py        # 7 cognitive edge datasets
│   │   │   ├── phoible.py             # Phoneme feature vectors
│   │   │   ├── vocab_lists.py         # Curated vocabulary lists
│   │   │   └── g2p_alignment.py       # PhonoLex G2P alignment data
│   │   ├── phonology/                 # Phonological computation (ONE place)
│   │   │   ├── syllabification.py     # Onset-nucleus-coda extraction
│   │   │   ├── wcm.py                 # Word Complexity Measure
│   │   │   ├── similarity.py          # Soft Levenshtein (Python port)
│   │   │   └── normalize.py           # IPA normalization (ɡ→g etc.)
│   │   └── pyproject.toml             # pip install phonolex-data
│   │
│   ├── governors/                     # Constraint engine
│   │   ├── src/phonolex_governors/    # Gates, Boosts, Projections
│   │   │   ├── core.py                # Governor composition
│   │   │   ├── gates.py               # HardGate (Exclude, VocabOnly)
│   │   │   ├── boosts.py              # SoftBoost (Include — NEW)
│   │   │   ├── projections.py         # Dynamic (Coverage — NEW)
│   │   │   ├── constraints.py         # Constraint → layer compilation
│   │   │   └── adapters/
│   │   │       └── huggingface.py     # HFGovernorProcessor
│   │   ├── lookup/                    # LookupBuilder (uses packages/data)
│   │   └── pyproject.toml             # pip install phonolex-governors
│   │
│   ├── web/                           # PhonoLex web app (face #1)
│   │   ├── workers/                   # Cloudflare Workers API
│   │   │   ├── src/routes/            # words, similarity, contrastive, text, etc.
│   │   │   └── scripts/export-to-d1.py
│   │   └── frontend/                  # React + MUI
│   │
│   ├── dashboard/                     # Governed Generation (face #2)
│   │   ├── server/                    # FastAPI backend
│   │   │   ├── model.py               # T5Gemma loading + generation
│   │   │   ├── governor.py            # Governor cache + HF adapter
│   │   │   ├── profiles.py            # Constraint profiles
│   │   │   └── routes/
│   │   ├── frontend/                  # React + Tailwind
│   │   └── scripts/
│   │       └── build_lookup.py        # Uses packages/data + packages/governors
│   │
│   └── catalog/                       # Content Catalog (face #3 — NEW)
│       ├── generator/                 # Batch generation pipeline
│       │   ├── batch.py               # Run N generations with profile
│       │   ├── indexer.py             # Build searchable catalog
│       │   └── export.py              # Package for distribution
│       ├── server/                    # Browse/search API
│       └── frontend/                  # Catalog browser UI
│
├── docs/
├── pyproject.toml                     # Workspace root
└── README.md
```
### Key architectural decisions
**`packages/data` is the single source of truth.** All data loading, phonological computation, and normalization lives here. No more `datasets.cmudict_to_phono()` in governors *and* `english_data_loader.py` in PhonoLex *and* inline loading in scripts.

**`packages/governors` depends on `packages/data`.** LookupBuilder calls `data.loaders.cmudict`, `data.loaders.norms`, `data.phonology.syllabification`, etc. No internal dataset copies.

**The PhonoLex web app keeps its Cloudflare architecture.** The D1 database is a precomputed snapshot of `packages/data` output — the `export-to-d1.py` script is the bridge. No runtime dependency on Python data loaders.

**The dashboard's `build_lookup.py` uses the shared data layer.** Instead of importing from `phonolex_governors.datasets`, it imports from `packages/data`. The governors package provides the constraint engine; the data package provides the data.
## 2. Immediate Changes (Tomorrow)
### 2a. Remove `vocab_memberships` from RichToken
Strip from wire format. Keep in lookup for governor use.
Files:
- `server/schemas.py` — remove field from RichToken
- `server/model.py` — stop passing in `enrich_tokens()`
- `frontend/src/types.ts` — remove from interface
- Tests: update fixtures
### 2b. Inline Turn Compliance Cards
Replace the separate CompliancePanel sparkline with per-turn compliance cards rendered inline with the chat.
Layout:
```
┌─────────────────────────────────┬──────────────────────────┐
│ Chat                            │ Turn Compliance          │
├─────────────────────────────────┼──────────────────────────┤
│ 🧑 Tell me about your day       │                          │
│                                 │                          │
│ 🤖 I went to the park and       │ ✓ 32/32 pass             │
│    played with my dog.          │ AoA: avg 3.2 (≤ 8.0)     │
│                                 │ /ɹ/: 0 tokens ✓          │
│                                 │ Conc: avg 4.1 (≥ 3.0)    │
├─────────────────────────────────┼──────────────────────────┤
│ 🧑 What did you see there?      │                          │
│                                 │                          │
│ 🤖 I saw birds and flowers.     │ ✓ 18/18 pass             │
│    The sun was warm.            │ AoA: avg 2.8 (≤ 8.0)     │
│                                 │ /ɹ/: 0 tokens ✓          │
│                                 │ ★ "flowers" AoA=5.8      │
└─────────────────────────────────┴──────────────────────────┘
```
Each card shows:
- Pass/fail ratio
- Per-constraint summary (Exclude: count of blocked phoneme tokens; Bound: avg/max vs threshold)
- Flagged tokens: edge cases, highest-value tokens near thresholds
- Future: Include hit rate, Coverage running %
Component: `TurnComplianceCard` — computed client-side from `Turn.assistant_tokens[].compliance` + `.norms`
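The card's aggregation can be sketched in Python (the real component is client-side TypeScript); field names and the flagging rule (AoA at or above 70% of the threshold) are illustrative assumptions, not the repo's actual schema:

```python
def summarize_turn(tokens, aoa_max=8.0):
    """Aggregate per-token compliance into the numbers a turn card
    shows: pass ratio, average AoA, and flagged near-threshold tokens.
    Flagging rule (>= 70% of threshold) is an illustrative choice."""
    passed = sum(1 for t in tokens if t["compliance"]["passed"])
    aoas = [t["norms"]["aoa"] for t in tokens
            if t["norms"].get("aoa") is not None]
    avg_aoa = sum(aoas) / len(aoas) if aoas else None
    flagged = [t["text"] for t in tokens
               if (t["norms"].get("aoa") or 0) >= 0.7 * aoa_max]
    return {"pass": passed, "total": len(tokens),
            "avg_aoa": avg_aoa, "flagged": flagged}

# Toy turn: three tokens, one near the AoA bound
tokens = [
    {"text": "I", "compliance": {"passed": True}, "norms": {}},
    {"text": "saw", "compliance": {"passed": True}, "norms": {"aoa": 2.6}},
    {"text": "flowers", "compliance": {"passed": True}, "norms": {"aoa": 5.8}},
]
card = summarize_turn(tokens)
```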
## 3. New Constraint Types
### 3a. Include (Boost layer)
Purpose: Increase probability of tokens containing target phonemes.
```python
class IncludeConstraint(BaseModel):
    type: Literal["include"]
    phonemes: list[str]
    strength: float = 2.0  # logit boost magnitude
```
Governor layer: SoftBoost — adds +strength to logits of tokens whose phonemes overlap with targets. Unlike gates (binary), boosts shift the distribution without eliminating options.
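A minimal sketch of that rule over plain lists (the real layer would operate on a logits tensor through the HF adapter; `token_phonemes` is an illustrative name for the lookup's per-token phoneme sets):

```python
def apply_soft_boost(logits, token_phonemes, targets, strength=2.0):
    """Add `strength` to the logit of every token whose phonemes
    overlap the targets. Unlike a hard gate, nothing is eliminated;
    the distribution just tilts toward target-bearing tokens."""
    target_set = set(targets)
    return [
        logit + strength if target_set & set(phonemes) else logit
        for logit, phonemes in zip(logits, token_phonemes)
    ]

# Toy vocab of three tokens with known phonemes
logits = [0.0, 0.0, 0.0]
phonemes = [["b", "æ", "t"], ["k", "æ", "t"], ["d", "ɔ", "g"]]
boosted = apply_soft_boost(logits, phonemes, targets=["b", "d"])
# tokens containing /b/ or /d/ gain +2.0; the /k/ token is untouched
```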
Clinical use: "Generate text rich in /b/ and /d/ for minimal pair practice"
### 3b. Coverage Rate (Projection layer)
Purpose: Hit a target phoneme rate across the generation.
```python
class CoverageConstraint(BaseModel):
    type: Literal["coverage"]
    phonemes: list[str]
    target_rate: float      # 0.0–1.0
    window: int | None = None
```
Governor layer: Stateful projection. Each generation step:
1. Count content tokens generated so far
2. Count how many contained target phonemes
3. Compute `current_rate = target_count / total_count`
4. If `current_rate < target_rate`: boost target tokens proportionally
5. If `current_rate ≥ target_rate`: reduce or zero out the boost
Needs: `GovernorContext` must carry generation history (the current `input_ids` provides this). The projection layer in phonolex-governors needs stateful callback support.
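The per-step boost computation can be sketched as a pure function of the running counts (an illustration of the proportional rule, not the governors package's API):

```python
def coverage_boost(total_count, target_count, target_rate, base_strength=2.0):
    """Stateful projection step: boost target-phoneme tokens in
    proportion to how far the running rate lags the target, and
    drop the boost entirely once the target rate is met."""
    if total_count == 0:
        return base_strength  # no history yet: apply the full boost
    current_rate = target_count / total_count
    if current_rate >= target_rate:
        return 0.0  # target met: stop boosting
    # scale the boost by the relative shortfall
    return base_strength * (target_rate - current_rate) / target_rate

# 10 content tokens so far, 1 contained a target phoneme, aiming for 20%:
# halfway to target, so the boost is half of base_strength
b = coverage_boost(10, 1, 0.2)
```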
### 3c. Thematic Constraint
Purpose: Keep generation within a semantic field using PhonoLex's cognitive association graph.
```python
class ThematicConstraint(BaseModel):
    type: Literal["thematic"]
    seed_words: list[str]
    strength: float = 1.5
```
Implementation: At build time, expand seed_words through association graph (USF data from PhonoLex) to get a set of related tokens. Apply as a soft boost to those tokens.
Data source: PhonoLex already has 1M+ cognitive edges across 7 relationship types. The export-to-d1.py pipeline or a direct Python loader can provide these.
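The build-time expansion could be sketched as a bounded breadth-first walk; the graph shape (word → list of weighted neighbors) and the depth/weight cutoffs are assumptions for illustration, not PhonoLex's actual schema:

```python
from collections import deque

def expand_seeds(graph, seeds, max_depth=2, min_weight=0.1):
    """Breadth-first expansion of seed words through a weighted
    association graph (word -> [(neighbor, weight), ...]). Returns
    the related-word set that would receive the soft boost."""
    related = set(seeds)
    frontier = deque((w, 0) for w in seeds)
    while frontier:
        word, depth = frontier.popleft()
        if depth >= max_depth:
            continue
        for neighbor, weight in graph.get(word, []):
            if weight >= min_weight and neighbor not in related:
                related.add(neighbor)
                frontier.append((neighbor, depth + 1))
    return related

# Toy USF-style association graph
graph = {
    "dog": [("bone", 0.4), ("cat", 0.3)],
    "cat": [("whiskers", 0.2)],
}
words = expand_seeds(graph, ["dog"])
```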
## 4. Norms Expansion
### Available in lookup, not yet in profiles
| Norm | Coverage | Clinical use | Priority |
|---|---|---|---|
| `valence` | 34K | Emotional tone (positive/negative) | High |
| `arousal` | 34K | Calm vs exciting language | High |
| `imageability` | 15K | Vivid, concrete descriptions | High |
| `frequency` / `log_frequency` | 79K | Common words only | Medium |
| `socialness` | 19K | Interpersonal vocabulary | Medium |
| `dominance` | 34K | Empowering language | Medium |
| `familiarity` | 15K | Familiar vs rare | Medium |
| `iconicity` | 34K | Sound-meaning correspondence | Low |
| `boi` | 22K | Body-object interaction | Low |
| `semantic_diversity` | 60K | Meaning flexibility | Low |
| `contextual_diversity` | 79K | Context breadth | Low |
| `lexical_decision_rt` | 59K | Processing speed | Low |
| Sensorimotor (11 dims) | 46K | Modality-specific content | Low |
### Quick-win profiles to add
- Calm therapeutic: `arousal ≤ 3.0`
- Positive affect: `valence ≥ 6.0`
- High imageability: `imageability ≥ 5.0`
- High frequency only: `log_frequency ≥ 3.0`
- Social language: `socialness ≥ 5.0`
### UI change
`NormSliders.tsx` should dynamically list available norms from a registry rather than hardcoding AoA + concreteness.
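One possible shape for such a registry, sketched server-side in Python so the frontend can fetch it instead of hardcoding sliders; the keys, labels, and ranges here are illustrative, not the repo's actual values:

```python
# Hypothetical norm registry: each entry carries the display metadata
# a dynamic NormSliders component would need. Values are illustrative.
NORM_REGISTRY = {
    "aoa":          {"label": "Age of acquisition", "range": (1.0, 18.0), "default_op": "<="},
    "concreteness": {"label": "Concreteness",       "range": (1.0, 5.0),  "default_op": ">="},
    "valence":      {"label": "Valence",            "range": (1.0, 9.0),  "default_op": ">="},
    "arousal":      {"label": "Arousal",            "range": (1.0, 9.0),  "default_op": "<="},
    "imageability": {"label": "Imageability",       "range": (1.0, 7.0),  "default_op": ">="},
}

def available_norms():
    """What an API endpoint would return for the frontend to render."""
    return [{"key": k, **v} for k, v in NORM_REGISTRY.items()]

norms = available_norms()
```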
## 5. Phonological Data Gaps
### WCM, syllables, shapes — all zeros
Root cause: `cmudict_to_phono()` in phonolex-governors doesn't compute syllable structure or WCM. PhonoLex's `syllabification.py` **does** — full onset-nucleus-coda extraction, stress handling, maximal onset principle. And PhonoLex's D1 export computes WCM from 8 phonological parameters.
Fix: In the unified monorepo, `build_lookup.py` would call `packages/data/phonology/syllabification.py` to get syllable structure and compute WCM. No more depending on governors for phonological computation — that's the data layer's job.

Blocked on: monorepo migration (or, at minimum, making PhonoLex's syllabification importable from constrained_chat).

Interim fix: Import PhonoLex's `syllabification.py` directly into `build_lookup_phonolex.py` and compute WCM using the same formula as PhonoLex's `export-to-d1.py`.
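For reference, a simplified WCM sketch after Stoel-Gammon's scoring scheme, fed by syllabified (onset, nucleus, coda) tuples. The actual `export-to-d1.py` formula with its 8 parameters may differ; treat this only as an illustration of what syllabified input enables:

```python
def wcm_score(syllables, stress_first=True):
    """Simplified Word Complexity Measure (after Stoel-Gammon 2010).
    `syllables` is a list of (onset, nucleus, coda) phoneme tuples,
    as produced by onset-nucleus-coda syllabification."""
    VELARS = {"k", "g", "ŋ"}
    LIQUIDS = {"l", "ɹ", "r"}
    FRICATIVES = {"f", "v", "s", "z", "ʃ", "ʒ", "θ", "ð", "tʃ", "dʒ"}
    VOICED_FRIC = {"v", "z", "ʒ", "ð", "dʒ"}
    score = 0
    if len(syllables) > 2:
        score += 1                      # word longer than two syllables
    if not stress_first:
        score += 1                      # non-initial stress
    if syllables and syllables[-1][2]:
        score += 1                      # word ends in a consonant
    for onset, _nucleus, coda in syllables:
        for cluster in (onset, coda):
            if len(cluster) > 1:
                score += 1              # consonant cluster
        for ph in (*onset, *coda):
            if ph in VELARS:
                score += 1
            if ph in LIQUIDS:
                score += 1
            if ph in FRICATIVES:
                score += 1
            if ph in VOICED_FRIC:
                score += 1              # extra point for voiced fric/affricate
    return score

# "basket" /ˈbæs.kət/: syllables (b, æ, s) and (k, ə, t)
score = wcm_score([(("b",), "æ", ("s",)), (("k",), "ə", ("t",))])
```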
### Cluster phonemes
Currently empty for subword tokens. Computable from syllable onset/coda data once syllabification is wired in.
## 6. Content Catalog (Face #3)
### Concept
Batch-generate thousands of compliant texts with various constraint profiles, index them, and provide a browsable catalog. Clinicians can search for "stories about animals with AoA ≤ 5.0 and no /ɹ/" and get pre-generated, pre-validated content.
### Why
- Live generation requires GPU hardware
- Pre-generated content can be served statically
- Quality can be human-reviewed before distribution
- Enables offline use
### Architecture
Generation pipeline:
```
profiles × prompts × N repetitions
  → governed generation (T5Gemma)
  → enrichment + compliance verification
  → index with metadata
```
Catalog:
```
{
  id: "cat-story-r-aoa5-001",
  prompt: "Tell me a story about a cat",
  profile: "Exclude /ɹ/ + AoA ≤ 5.0",
  text: "Clement, a tabby cat with a coat the color...",
  metadata: {
    token_count: 128,
    avg_aoa: 3.8,
    phoneme_distribution: {...},
    compliance: { passed: true, violations: 0 }
  }
}
```
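The batch pipeline could look roughly like this, with `generate` and `verify` as stand-ins for the model call and the enrichment/compliance pass (both hypothetical interfaces, not existing functions):

```python
import hashlib
import itertools

def build_catalog(profiles, prompts, n, generate, verify):
    """Cross profiles × prompts × N repetitions, run governed
    generation, keep only compliant outputs, and emit catalog
    entries shaped like the example above."""
    entries = []
    for (pname, profile), prompt, i in itertools.product(
            profiles.items(), prompts, range(n)):
        text = generate(prompt, profile)
        report = verify(text, profile)
        if not report["passed"]:
            continue  # only pre-validated content enters the catalog
        uid = hashlib.sha1(f"{pname}|{prompt}|{i}".encode()).hexdigest()[:8]
        entries.append({"id": f"cat-{uid}", "prompt": prompt,
                        "profile": pname, "text": text,
                        "metadata": report})
    return entries

# Toy stand-ins for the real model + verifier
entries = build_catalog(
    {"no-r": {}}, ["Tell me a story about a cat"], 2,
    generate=lambda p, _: f"{p} ... once upon a time",
    verify=lambda t, _: {"passed": True, "violations": 0},
)
```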
### Relation to `generation_sweep.py`
The existing `scripts/generation_sweep.py` is a prototype of this. Extend it with:
- Multiple prompts per profile
- N repetitions per prompt × profile combination
- JSON catalog output with full metadata
- Search/filter API
- Simple browse UI
## 7. PhonoLex Features → Governed Generation Integration
PhonoLex has tools that directly serve the governed generation use case but aren't connected:
| PhonoLex Tool | Governed Generation Use |
|---|---|
| Text Analysis | Analyze generated text for readability, highlight by property — same as compliance panel but richer |
| Contrastive Sets | Generate minimal pair targets, then use Include/Coverage constraints to elicit them |
| Phonological Similarity | Find alternative words when governor blocks a word — suggest rewording |
| Word Lists | Build targeted vocab lists for VocabOnly constraint palettes |
| Lookup + Associations | Thematic constraint seed expansion; show clinician why a word was flagged |
### Future integration
- Compliance panel could use PhonoLex's text analysis percentile engine
- "Suggest alternatives" feature: when a word is flagged, query PhonoLex similarity for compliant alternatives
- Contrastive set generation → automatic Include constraint targets
- Association graph → thematic constraint seed expansion
## 8. Cleanup
### Remove from constrained_chat
- `phase0_eval*.py`, `phase1_*.py`, `phase2_*.py` — archive to `docs/archive/`
- `WORKING_IMPLEMENTATIONS.md` — fold into docs
- `governor-t5-plan.md` — superseded by this document
- `__pycache__/` directories — add to `.gitignore`
- Unused `_normalize_phoneme_list` in `build_lookup_phonolex.py`
### Frontend polish
- "Violations" highlight mode is useless with airtight governor — replace with "target phonemes" mode for future Include constraint
- Compliance panel: restructure around turn cards, not aggregate sparklines
- Loading states during model warmup and generation
## 9. Priority Sequence
### Phase A — Quick wins ✅
- ~~Remove `vocab_memberships` from RichToken wire format~~
- ~~Add quick-win norm profiles (arousal, valence, imageability, frequency)~~
- ~~Make NormSliders dynamic~~ → replaced by command language
- ~~Turn compliance cards in chat UI~~ → replaced by inline system messages
### Phase B — Data unification ✅
- ~~Extract shared data layer from PhonoLex + phonolex-governors~~
- ~~Wire PhonoLex syllabification into lookup builder (fix WCM)~~
- ~~Wire PhonoLex association data into lookup builder~~
- ~~Monorepo scaffolding~~
### Phase C — New constraints ✅
- ~~Include constraint (Boost layer in governors)~~ → density-weighted `IncludeConstraint` + `/include` command
- ~~Coverage rate imperative (unified into Include)~~ → `IncludeConstraint(target_rate=0.2)` + `/include k 20%`
- ~~VocabBoost (soft vocabulary targeting)~~ → `VocabBoostConstraint` + `/vocab-boost` command
- ~~Thematic constraint (association-backed boost)~~ → `ThematicConstraint` + `/theme` command
### Phase B2 — Command language ✅
- Slash command language with 15 verbs (`/exclude`, `/include`, `/aoa`, `/msh`, `/boost`, etc.)
- ConstraintStore (Zustand) replaces hardcoded ConstraintPanel
- Pinned constraint bar with dismissible color-coded chips + preset loader/save
- CLI-style monospace system messages in chat feed
- API migrated from `profile_id` to inline constraints with hash-keyed cache
- 5 constraint schemas (Include, VocabBoost, MSH, MinPairBoost, MaxOppositionBoost)
- NormCovered auto-inserted server-side for Bound constraints
- Profiles dissolve into store on selection — no "active profile" state
- Autocomplete dropdown + IPA keyboard toggle for command input
- Compliance panel removed — violations always on, single-column layout
- Syllabification + WCM wired into governor lookup builder (was all zeros)
- Token card cleaned up: filtered norms, scrollable, no compliance/vocab sections
- See `docs/superpowers/specs/2026-03-16-governed-chat-command-language-design.md` for the full spec
### Phase D — Content catalog (next)
- Batch generation pipeline
- Catalog indexer + search
- Browse UI
### Phase E — Polish
- Loading states during model warmup and generation
- Token card styling/colors
- Clickable compliance indicator on messages (expandable detail)