Include/Coverage Redesign + VocabBoost¶
2026-03-17 — Design spec for principled phoneme inclusion, vocabulary boosting, and session-level coverage tracking
Problem¶
The current IncludeConstraint and CoverageConstraint are unprincipled:
- Flat boost. Every token containing a target phoneme gets the same +strength logit bias regardless of how phoneme-rich the token is. A token with /k/ as one of six phonemes gets the same boost as a token that's mostly /k/.
- Three separate pieces. Two classes (IncludeConstraint, CoverageConstraint) plus _CoverageProjection implement what is conceptually one mechanism with two modes.
- Inconsistent phono access. include.py does raw dict access (feats.get("phono")) while every other constraint uses _parse_phono() from constraints.py.
- Per-generation coverage only. Each turn's coverage tracking starts from zero. The clinician has no way to target phoneme density across a therapy session.
- No vocabulary boosting. VocabOnly hard-gates to a word list. There is no soft alternative for encouraging target vocabulary without eliminating everything else.
Additionally, the audit identified several governor-wide gaps:
- Density constraint is dead code (deprecated, no schema, unreachable from API)
- MSHStage not checked in _check_compliance
- MaxOppositionBoost validation error propagates as a 500
- Only 7 of 25+ norms reachable from the command language parser
- HFGovernorProcessor.reset() not called between turns
Solution¶
1. Unified Boost-with-Coverage Pattern¶
A shared two-mode pattern used by both IncludeConstraint and VocabBoostConstraint:
Static mode — user specifies strength, boost applied every step:
boost(token) = weight(token) * strength
Coverage mode — user specifies target_rate, boost scales with the gap:
boost(token) = weight(token) * max_boost * gap(ctx)
gap(ctx) = max(0, target_rate - current_rate)
When coverage is met or exceeded, boost drops to zero. The mechanism self-regulates.
Mode selection is implicit from parameters:
- target_rate provided → coverage mode (stateful projection, mechanism_kind = "projection")
- target_rate absent → static mode (LogitBoost, mechanism_kind = "boost")
strength and target_rate are mutually exclusive: in coverage mode, strength plays no role and max_boost controls the dynamic scaling. Pydantic validators enforce this: setting a non-default strength together with target_rate raises a ValueError.
One class, one build path per constraint type. The stateful tracking lives inside a shared _CoverageMechanism that wraps the weight tensor and handles gap computation.
Deliberate asymmetry: weighting vs. counting. The boost magnitude uses the weight function (density for Include, binary for VocabBoost), so phoneme-rich tokens get stronger encouragement. But coverage tracking counts tokens as binary hit/miss — a token either contains a target phoneme or it doesn't. This is intentional: coverage answers "how many tokens practiced the sound" (binary), while density controls "how strongly to prefer phoneme-rich tokens among those that do" (weighted).
2. IncludeConstraint — Phoneme Density Weighting¶
Replace the flat +strength with normalized phoneme density:
weight(token) = count(p in token_phonemes where p ∈ target_phonemes) / len(token_phonemes)
Occurrences are counted with multiplicity: a token that repeats a target phoneme scores higher, as the /k ɹ æ k/ row below shows.
Examples targeting /k/:
| Token phonemes | Density | Rationale |
|---|---|---|
| [k, æ, t] | 1/3 = 0.33 | One target phoneme, three total |
| [k, ɹ, æ, k] | 2/4 = 0.50 | Two /k/ occurrences, four total |
| [k, ə] | 1/2 = 0.50 | Short token, high target density |
| [s, t, ɹ, ɪ, ŋ] | 0/5 = 0.00 | No target phonemes, no boost |
Multi-phoneme targets work naturally: /include k t targeting {k, t} against token [k, æ, t] → 2/3 = 0.67.
Tokens with no phonemes in the lookup get weight 0 — they pass through unboosted. Phono access uses _parse_phono() from constraints.py.
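The density function follows directly from the formula and table above (a sketch; in the real build path the phoneme lists come from _parse_phono()):

```python
def density_weight(target_phonemes: set[str], token_phonemes: list[str]) -> float:
    """Normalized phoneme density: target occurrences (with multiplicity)
    over total phonemes. Empty phono data passes through unboosted."""
    if not token_phonemes:
        return 0.0
    hits = sum(1 for p in token_phonemes if p in target_phonemes)
    return hits / len(token_phonemes)
```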
Parameters:
- phonemes: set[str] — IPA phonemes to target
- strength: float = 2.0 — static mode boost (mutually exclusive with target_rate)
- target_rate: float | None = None — coverage mode target (0.0–1.0)
- max_boost: float = 3.0 — maximum boost in coverage mode
Content tokens for coverage: All tokens with phonological data (non-empty phoneme lists in the lookup). Target tokens are those whose phonemes intersect the target set. This parallels VocabBoost where content = tokens with lookup data, target = tokens in the target vocabulary.
Clinical purpose: Phoneme elicitation — maximize practice opportunities for target sounds. The density weighting ensures tokens rich in the target phoneme are preferred over tokens where the target is incidental.
Future extension: Position-aware filtering (onset, coda, nucleus) can layer on top by filtering token_phonemes to only the relevant syllable positions before computing density. The syllable data is already in the lookup (PhonoFeatures.syllables with onset, nucleus, coda lists). The scoring function doesn't change — only the input. This is deferred from this spec to assess coherence impact separately.
3. VocabBoostConstraint — Soft Vocabulary Targeting¶
A new constraint type parallel to IncludeConstraint, using the same dual-mode pattern with binary membership weighting:
weight(token) = 1.0 if token_word ∈ target_vocab else 0.0
target_vocab is built from two optional sources (same as VocabOnly):
- lists: list[str] | None — named vocabulary sets from vocab_memberships in the lookup
- words: list[str] | None — explicit word strings, resolved to token IDs via the tokenizer
Single-token resolution only for words. If a word doesn't map to a single token, it is not boosted. This matches VocabOnly behavior and avoids fuzzy subword boosting.
At least one of lists or words is required. If words is provided, the tokenizer must be present in the build kwargs; build() raises ValueError if missing.
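Single-token resolution can be sketched as follows; `encode` stands in for the tokenizer interface, and the stub data at the bottom is purely illustrative:

```python
def resolve_word_ids(words: list[str], encode) -> set[int]:
    """Resolve explicit words to token IDs; words that don't map to a
    single token are silently skipped (not boosted), matching VocabOnly."""
    ids: set[int] = set()
    for w in words:
        toks = encode(w)
        if len(toks) == 1:
            ids.add(toks[0])
    return ids

# Hypothetical stub: "cat" is a single token, "doghouse" splits into two.
fake_encode = {"cat": [7], "doghouse": [9, 4]}.get
```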
Parameters:
- lists: list[str] | None — named vocab sets
- words: list[str] | None — explicit target words
- strength: float = 2.0 — static mode (mutually exclusive with target_rate)
- target_rate: float | None = None — coverage mode target (0.0–1.0)
- max_boost: float = 3.0 — maximum boost in coverage mode
- include_punctuation: bool = True — exempt punctuation tokens from content_ids in coverage mode
Content tokens for coverage: Any token in the lookup with data (a real word), excluding punctuation tokens when include_punctuation is True. Punctuation tokens are identified via the tokenizer (same set as VocabOnly). They pass through unboosted and do not affect the coverage rate. Target tokens are those in the target vocabulary.
Relationship to VocabOnly: VocabOnly (hard gate) remains for strict restriction. VocabBoost is the soft alternative. They compose — gate on a broad list, boost a narrow one within it.
Command syntax:
- /vocab-boost <list_name> — boost a named list (static)
- /vocab-boost <word> <word> ... — boost specific words (static)
- /vocab-boost <list_or_words> N% — coverage mode
- /remove vocab-boost — remove
4. Session-Level Coverage Tracking¶
Coverage mode tracks phoneme/vocabulary hit rates across the entire chat session, not just the current generation.
CoverageTracker — a lightweight counter that lives alongside the session:
```
CoverageTracker:
    key: str            # deterministic coverage key (see below)
    content_count: int  # total content tokens across completed turns
    target_count: int   # total target tokens across completed turns
```
Coverage key computation. The key is a deterministic string derived from the constraint's identity fields:
- IncludeConstraint: "include:{sorted_phonemes}" — e.g., "include:k" or "include:k,t"
- VocabBoostConstraint: "vocab_boost:{sorted_lists}:{sorted_words}" — e.g., "vocab_boost:ogden_basic:" or "vocab_boost::cat,dog,fish"
The same algorithm is used in two places: (1) the constraint's build() method, which passes the key to _CoverageMechanism; and (2) the route handler, which uses the key to look up/create trackers. Both derive the key from the schema constraint's fields, ensuring they always match. A shared utility function coverage_key_for(constraint) in governor.py computes this from the schema constraint.
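A duck-typed sketch of coverage_key_for that reproduces the examples above (the real utility reads the Pydantic schema constraint):

```python
from types import SimpleNamespace  # used only for the illustrative objects below

def coverage_key_for(constraint) -> str:
    """Deterministic coverage key derived from the constraint's identity fields."""
    if getattr(constraint, "type", "") == "include":
        return "include:" + ",".join(sorted(constraint.phonemes))
    lists = ",".join(sorted(constraint.lists or []))
    words = ",".join(sorted(constraint.words or []))
    return f"vocab_boost:{lists}:{words}"
```

Sorting the fields makes the key independent of the order the user typed them in, so build() and the route handler always agree.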
Lifecycle:
1. When a coverage constraint is in the active set at generation time, the route handler looks up or creates a tracker keyed by the constraint identity.
2. Prior counts are passed to the governor mechanism via a new field on GovernorContext: prior_coverage: dict[str, tuple[int, int]] | None — key → (content_count, target_count).
3. After generation completes, the route handler counts content/target tokens in the output and updates the tracker.
4. When the constraint is removed from the store, the tracker is dropped.
5. Re-adding starts fresh from zero.
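Steps 1–3 can be sketched as a toy route-handler loop; every name here is illustrative, not the real handler:

```python
def run_turn(trackers: dict[str, tuple[int, int]], active_keys: list[str],
             generate, count_tokens) -> list[int]:
    """One turn: look up/create trackers, generate with priors, update trackers."""
    # steps 1-2: collect priors for the active coverage constraints
    priors = {k: trackers.setdefault(k, (0, 0)) for k in active_keys}
    token_ids = generate(priors)  # priors reach the mechanism via GovernorContext
    # step 3: fold the completed assistant turn into each tracker
    for k in active_keys:
        c, t = count_tokens(k, token_ids)
        pc, pt = trackers[k]
        trackers[k] = (pc + c, pt + t)
    return token_ids
```

Removal (step 4) is just deleting the tracker entry, which makes re-adding (step 5) start from (0, 0) automatically.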
GovernorContext addition:
```python
@dataclass
class GovernorContext:
    step: int = 0
    total_steps: int = 1
    token_ids: torch.Tensor | None = None
    mask_positions: torch.Tensor | None = None
    device: str | torch.device = "cpu"
    prior_coverage: dict[str, tuple[int, int]] | None = None  # NEW
```
What counts toward coverage: Only assistant tokens from completed turns. User tokens are excluded — the clinician isn't the one practicing. The counting uses the same content_ids / target_ids sets that the mechanism builds at governor construction time. The mechanism exposes a count_tokens(token_ids: list[int]) -> tuple[int, int] method that the route handler calls post-generation to update the tracker.
Gap computation with priors:
```python
def _compute_gap(self, ctx):
    prior_content, prior_target = 0, 0
    if ctx.prior_coverage and self.coverage_key in ctx.prior_coverage:
        prior_content, prior_target = ctx.prior_coverage[self.coverage_key]
    # Count current generation tokens
    current_content, current_target = self._count_current(ctx.token_ids)
    total_content = prior_content + current_content
    total_target = prior_target + current_target
    if total_content == 0:
        return self.target_rate  # full gap, no data yet
    current_rate = total_target / total_content
    return max(0.0, self.target_rate - current_rate)
```
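Plugging illustrative numbers into that computation (40 content / 6 target tokens from prior turns, 10 / 1 from the current generation, targeting 20%):

```python
target_rate = 0.20
total_content = 40 + 10                      # prior + current content tokens
total_target = 6 + 1                         # prior + current target tokens
current_rate = total_target / total_content  # 7 / 50 = 0.14
gap = max(0.0, target_rate - current_rate)   # 0.06 -> boost scales by max_boost * 0.06
```

The session-wide rate sits below target, so a small residual boost remains; once it crosses 0.20, the gap clamps to zero.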
Injecting priors into the HF adapter. HFGovernorProcessor.reset() gains an optional prior_coverage parameter:
```python
def reset(self, total_steps: int | None = None,
          prior_coverage: dict[str, tuple[int, int]] | None = None):
    self._step = 0
    self._prior_coverage = prior_coverage
    if total_steps is not None:
        self.total_steps = total_steps
```
The processor stores prior_coverage and includes it in every GovernorContext it constructs during that generation run. The route handler calls reset(prior_coverage=priors) before each generation.
Accessing count_tokens post-generation. The Governor exposes a get_coverage_mechanisms() -> dict[str, _CoverageMechanism] method that returns coverage mechanisms keyed by their coverage_key. HFGovernorProcessor delegates this through as get_coverage_counters(). The route handler calls this post-generation to get the count_tokens() method for each active coverage constraint.
Cache impact: Only one chat session is active at a time, so the GovernorCache can continue to cache built governors by constraint hash. The per-turn prior counts are injected via reset(), not baked into the build.
Edge case — early tokens: When total content count is 0 or very small (start of session, start of first turn), current_rate is unstable. The mechanism returns target_rate as the gap (full boost), which is the correct behavior — at the start, full encouragement is appropriate.
Sessionless generation (/generate-single): The /generate-single endpoint operates without a session. Coverage mode falls back to per-generation tracking with zero priors — the same behavior as today's _CoverageProjection. Session-level accumulation requires the /generate endpoint with a session.
5. Mechanism Implementation¶
Shared _CoverageMechanism class implements the Mechanism protocol and handles the stateful coverage mode for both IncludeConstraint and VocabBoostConstraint:
```python
class _CoverageMechanism:
    def __init__(self, weights, target_ids, content_ids,
                 target_rate, max_boost, vocab_size, coverage_key):
        self.weights = weights  # (vocab_size,) — density or binary
        self.target_ids = target_ids
        self.content_ids = content_ids
        self.target_rate = target_rate
        self.max_boost = max_boost
        self.coverage_key = coverage_key

    def apply(self, logits, ctx):
        gap = self._compute_gap(ctx)
        if gap <= 0.0:
            return logits
        return logits + self.weights.to(logits.device) * (self.max_boost * gap)

    def count_tokens(self, token_ids: list[int]) -> tuple[int, int]:
        """Count content and target tokens for post-generation tracker update."""
        content = sum(1 for t in token_ids if t in self.content_ids)
        target = sum(1 for t in token_ids if t in self.target_ids)
        return content, target
```
Static mode uses LogitBoost directly (no _CoverageMechanism). The weight tensor is precomputed at build time:
```python
# Static: bias[tid] = density(tid) * strength                  (Include)
# Static: bias[tid] = 1.0 * strength if in vocab else 0.0      (VocabBoost)
return LogitBoost(bias)
```
6. Schema and API Changes¶
Remove CoverageConstraint as a separate schema type.
Modify IncludeConstraint schema to accept optional target_rate:
```python
class IncludeConstraint(BaseModel):
    type: Literal["include"] = "include"
    phonemes: list[str]
    strength: float = 2.0
    target_rate: float | None = None  # NEW — triggers coverage mode
    max_boost: float = 3.0            # NEW — used in coverage mode
```
Add VocabBoostConstraint schema:
```python
class VocabBoostConstraint(BaseModel):
    type: Literal["vocab_boost"] = "vocab_boost"
    lists: list[str] | None = None
    words: list[str] | None = None
    strength: float = 2.0
    target_rate: float | None = None
    max_boost: float = 3.0
    include_punctuation: bool = True
```
Update the Constraint discriminated union to include VocabBoostConstraint and remove the separate CoverageConstraint.
Update _to_dg_constraint in governor.py to handle the unified IncludeConstraint (with optional target_rate) and the new VocabBoostConstraint.
7. Frontend Command Changes¶
IncludeConstraint — no syntax change. /include k and /include k 20% work as before. The existing "coverage" StoreEntry variant is removed; the "include" variant gains an optional targetRate field:
```typescript
// Before: two entry types
type IncludeEntry = { type: "include"; phoneme: string; strength: number };
type CoverageEntry = { type: "coverage"; phoneme: string; targetRate: number };

// After: one unified entry type
type IncludeEntry = { type: "include"; phoneme: string; strength: number; targetRate?: number };
```
The compiler emits a single IncludeConstraint with or without target_rate. Naming convention: StoreEntry uses camelCase (targetRate), API constraint uses snake_case (target_rate). The compiler handles the conversion.
target_rate conversion: The parser stores percentages as user-entered values (e.g., 20 from /include k 20%). The compiler divides by 100 before sending to the API (e.g., target_rate: 0.20). The API schema validates 0.0 <= target_rate <= 1.0 via a @field_validator.
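A minimal sketch of that conversion and range check (helper name hypothetical; in the real code the division happens in the TypeScript compiler and the range check in a Pydantic @field_validator):

```python
def compile_target_rate(percent: float) -> float:
    """Convert a user-entered percentage (e.g. 20 from '/include k 20%')
    to the API's 0.0-1.0 target_rate, rejecting out-of-range values."""
    rate = percent / 100.0
    if not 0.0 <= rate <= 1.0:
        raise ValueError("target_rate must be between 0.0 and 1.0")
    return rate
```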
VocabBoostConstraint — new command:
- /vocab-boost <list_name> — boost a named list
- /vocab-boost <word> <word> ... — boost specific words
- /vocab-boost <list_or_words> N% — coverage mode
The parser needs a new parseVocabBoost function. The compiler needs a new VocabBoostEntry → VocabBoostConstraint path. A new chip color for vocab-boost entries in the ConstraintBar.
StoreEntry union gains a vocab_boost variant:
```typescript
type VocabBoostEntry = {
  type: "vocab_boost";
  lists?: string[];
  words?: string[];
  targetRate?: number;
};
```
8. Cleanup and Consistency Fixes¶
These are included in this redesign scope:
8a. Remove Density constraint. Delete from constraints.py and __init__.py. Dead code — deprecated, no schema, unreachable from API.
8b. Add MSH to _check_compliance. Check MSHStage in model.py post-hoc compliance. Tokens without phono data still pass through the governor (by design), but tokens with phono data that exceed the stage limit surface as violations.
8c. MaxOppositionBoost validation. Catch ValueError from the sonorant/obstruent class check in _to_dg_constraint and return HTTP 422 with a clear message instead of an unhandled 500.
8d. Expand norm allowlist. Add to the parser's NORM_COMMANDS: dominance, socialness, boi, iconicity, semantic_diversity, contextual_diversity, lexical_decision_rt, and sensorimotor dimensions. These are all present in the lookup already. Each needs a norm key, default direction, and help text entry.
8e. HFGovernorProcessor.reset() between turns. The route handler calls reset(prior_coverage=priors) before each generation to restart the step counter and inject session-level coverage priors (see Section 4).
8f. Unify phono access in include.py. Use _parse_phono() from constraints.py instead of raw dict access. This aligns Include with every other constraint's access pattern.
Files Affected¶
Engine (packages/governors/src/diffusion_governors/)¶
- include.py — rewrite: unified IncludeConstraint with density weighting and dual mode; new VocabBoostConstraint; shared _CoverageMechanism; delete _CoverageProjection and the old CoverageConstraint
- core.py — add prior_coverage field to GovernorContext
- constraints.py — delete Density; export _parse_phono for use by include.py (or move to a shared util)
- boosts.py — no changes
- gates.py — no changes
- cdd.py — no changes
- lookups.py — no changes
- __init__.py — update exports: remove Density and CoverageConstraint; add VocabBoostConstraint
Dashboard server (packages/dashboard/server/)¶
- schemas.py — remove the CoverageConstraint schema; add target_rate/max_boost to IncludeConstraint; add the VocabBoostConstraint schema; update the Constraint union
- governor.py — update _to_dg_constraint for unified Include and new VocabBoost; catch the MaxOppositionBoost ValueError as 422; call reset() on the processor between turns
- model.py — add MSH to _check_compliance
- sessions.py — add CoverageTracker storage alongside sessions (or as a field on Session)
- routes/generate.py — wire the coverage tracker lifecycle: look up priors, pass them to the context, update after generation
Dashboard frontend (packages/dashboard/frontend/src/)¶
- types.ts — remove CoverageEntry; add targetRate/maxBoost to IncludeEntry; add VocabBoostEntry to the StoreEntry union; add VocabBoostConstraint to the Constraint union
- commands/parser.ts — update parseInclude to emit a single entry type; add parseVocabBoost; expand NORM_COMMANDS with the new norms
- commands/registry.ts — add the vocab-boost verb definition and help text; add the new norm entries
- commands/compiler.ts — update Include compilation (no more separate Coverage); add VocabBoost compilation
- store/constraintStore.ts — minor: handle the new entry types
- components/ConstraintBar/ — new chip color for vocab-boost entries
Tests¶
- Engine tests for density weighting, coverage gap computation with priors, VocabBoost static/coverage modes
- Integration tests for session-level coverage tracking across multiple turns
- Parser tests for /vocab-boost command variants
- Compiler tests for unified Include output
What This Does NOT Change¶
- Gate constraints (Exclude, ExcludeInClusters, Complexity, MSHStage, Bound, NormCovered, VocabOnly) — untouched.
- Boost constraints (MinPairBoost, MaxOppositionBoost) — untouched except the 422 fix.
- CDDProjection — untouched. Bound(mechanism="cdd") still works (unreachable from commands, but the engine path is preserved).
- Lookup format — no changes to build_lookup.py or the lookup JSON structure.
- Governor composition order — gates → boosts → projections, unchanged.
- Position-aware Include — deferred. The syllable data is in the lookup; filtering by onset/coda/nucleus can be added later without changing the scoring function.