PHON-112: Pair-driven CSP design

Goal

Replace the verb-fixed, skeleton-driven sentence resolver with a constraint-driven resolver in which the verb is just another constrained slot and contrastive constraints (minpair / maxopp / multopp) drive resolution by producing a filler set that the selectional table maps onto roles. Skeletons become host filters, not the entry point.

Motivation

PHON-106 v1 hardcoded (nsubj, dobj) as the linked pair under a fixed caller-supplied verb. The architecture has the wrong constraint topology:

A MinpairConstraint(p1, p2) request has to satisfy three conjuncts at once: (i) (nsubj, dobj) is a real minimal pair at (p1, p2), (ii) BOTH halves have nonzero PMI as roles of the supplied verb, (iii) both halves are in the spec lexicon. The intersection is sparse — the v1 realization test had to switch from (k, b) initial to (d, z) final to find any pair survivors at all under verb="cut", and the maxopp probe with (k, m) initial skips for the same reason.
The verb is exempt from constraint filtering. An ExcludeConstraint(/ɹ/) request that supplies verb="run" silently violates the constraint.
Verb selection is unscoped — paradigm_3_csp.solve(verb, ...) requires the caller to know which verb. The eval harness hardcodes a 10-verb list.

The fix is to drop the verb-fixed, skeleton-first stance: constraints (including phonological + bound + contrastive) shape the filler space, the selectional table tells us which (verb, role, filler) combinations are realizable, and skeletons are a host filter selected from skeletons.parquet.

Architecture

constraints
   │
   ├─ filter words.parquet  → constraint_filtered_lexicon  (allow set per slot type)
   │     │
   │     ├─ verb candidates  = lexicon ∩ POS_VERB ∩ has_selectional_mass
   │     ├─ noun candidates  = lexicon ∩ POS_NOUN
   │     └─ ...
   │
   ├─ filter pairs.parquet (if contrastive) → filler_set
   │
   └─ join to selectional.parquet (the routing oracle)
         │
         ├─ contrastive path: filler_set ⋈ sel × sel
         │     → rows (verb, role_a, w1, role_b, w2, band, ppmi_a + ppmi_b)
         │
         └─ non-contrastive: per-slot filler enumeration over (verb_candidates × constraint_filtered_lexicon)
               → rows (verb, {role: filler}*, band, total_ppmi)

  → skeleton match: pick skeleton(s) from skeletons.parquet whose role schema covers the produced roles
  → surface realize (existing path, agnostic to where words came from)
  → reranker score (PHON-107)

The selectional table is the routing oracle: for any (verb, role, filler, band) it gives the PPMI score. The contrastive path operates a self-join on this table, joined against the constraint-filtered pair frame. The non-contrastive path enumerates per slot but with the verb iterated rather than fixed.

API

# Old (PHON-106 v1 — to retire)
paradigm_3_csp.solve(
    verb: str,
    spec_name: str,
    spec_words: frozenset[str],
    sel_df: pl.DataFrame,
    *,
    constraints: list[Constraint] | None = None,
    word_df: pl.DataFrame | None = None,
    pairs_df: pl.DataFrame | None = None,
    ...,
) -> tuple[list[Candidate], dict]

# New
solve(
    *,
    spec_name: str,
    spec_words: frozenset[str],
    store: WordStore,             # carries word_df, pairs_df, sel_df
    band: str,                    # PMI is band-keyed; required
    constraints: list[Constraint] | None = None,
    locked_slots: dict[str, str] | None = None,  # e.g. {"V": "cut", "nsubj": "cat"}; None = nothing locked
    top_k: int = 8,
) -> tuple[list[Candidate], dict]

Notable shape changes:

- Verb leaves the positional signature. To fix the verb, pass locked_slots={"V": "cut"}. Same lock pattern that already exists for nsubj/dobj.
- WordStore replaces the three separate DataFrame args. Cleaner; matches PHON-93's runtime data layer.
- band is required (the existing default of "fineweb_adult" was implicit; making it explicit forces the caller to declare).
- constraints is the only place to specify phonological / norm / contrastive demands.

solve_shape (the single-skeleton resolver) gets a parallel signature change. Existing internal helpers (_load_pairs_for_request, _should_use_vectorized, _enumerate_vectorized, etc.) are reorganized but most survive.

Pipeline (request walk-through)

A request solve(spec_name="spec1", spec_words=spec_words, store=store, band="fineweb_adult", constraints=[MinpairConstraint("k", "b", "initial")], top_k=8):

1. Resolve constraint-filtered lexicon (per-slot allow sets)

Build per-slot-type allow sets. Two distinct paths:

Filler slots (nsubj, dobj, iobj, pobj_X) — filler allow set = spec_words ∩ constraints. The spec is the therapeutic target lexicon; nouns in those slots should be drawn from it.

Verb slot (V) — verb allow set = full_lexicon ∩ constraints. The verb is NOT restricted by spec_words. Rationale: spec is the target lexicon for THINGS the patient learns; verbs CONNECT those things and are typically not lexicon targets themselves. A k-themed spec like spec1 contains few verbs, but the system should still pick "cut", "see", "make" etc. to construct sentences AROUND the spec nouns.

Both paths apply hard constraints (Exclude, Bound) — Exclude(/ɹ/) rules out "run", "read" as verbs the same way it rules them out as nouns. Soft constraints (Include, BoundBoost) annotate per-word axes for scoring (Task 9 wires those).

verb_candidates additionally requires has_selectional_mass: at least one row in selectional.parquet for the band, with the exact row-count threshold left to calibration (see Open questions). A verb with no rows has no PMI signal for any role and produces no useful sentences.

A pair-frame word can land in any slot, including V, either when the default slots=("filler_a", "filler_b") lets the join decide or when the user explicitly requests slots=("V", "dobj"). The verb does not have to be in a minimal pair; it's one possible host for a pair-frame word, and the join finds whichever assignment yields the highest-scoring complete sentence. If user constraints over-constrain (e.g., Exclude(/k/) when the spec contains only k-pair targets), the result is correctly empty.
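The two allow-set paths can be sketched in plain Python. This is a minimal illustration, not the resolver's code: the Word dataclass, build_allow_sets, the toy transcriptions, and the verbs_with_mass set are all hypothetical stand-ins for words.parquet and the selectional index.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Word:
    text: str
    pos: str                   # "NOUN" or "VERB"
    phonemes: tuple[str, ...]  # toy transcription, assumed for illustration


def build_allow_sets(lexicon, spec_words, excluded, verbs_with_mass):
    """Return {"filler": ..., "V": ...} after hard phonological filtering."""
    def passes(word):
        # Hard Exclude: reject any word containing an excluded phoneme.
        return not (set(word.phonemes) & excluded)

    # Filler slots draw from the spec lexicon only.
    fillers = {w.text for w in lexicon
               if w.pos == "NOUN" and w.text in spec_words and passes(w)}
    # The verb slot draws from the full lexicon but needs selectional mass.
    verbs = {w.text for w in lexicon
             if w.pos == "VERB" and w.text in verbs_with_mass and passes(w)}
    return {"filler": fillers, "V": verbs}


lexicon = [
    Word("cat", "NOUN", ("k", "æ", "t")),
    Word("rat", "NOUN", ("ɹ", "æ", "t")),
    Word("hat", "NOUN", ("h", "æ", "t")),  # not in the spec
    Word("cut", "VERB", ("k", "ʌ", "t")),
    Word("run", "VERB", ("ɹ", "ʌ", "n")),
    Word("see", "VERB", ("s", "i")),       # no selectional rows in this toy
]
allow = build_allow_sets(
    lexicon,
    spec_words={"cat", "rat"},
    excluded={"ɹ"},
    verbs_with_mass={"cut", "run"},
)
```

With Exclude(/ɹ/), "rat" and "run" drop out by the same rule, "hat" is kept out of the filler set because it is not in the spec, and "see" is kept out of the verb set because it has no selectional mass in this toy setup.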

2. Resolve contrastive filler set (if a contrastive constraint is present)

Call _load_pairs_for_request(constraint, pairs_df, filtered_spec=verb_candidates ∪ noun_candidates) to produce a 4-column frame (filler_a, filler_b, feature_distance, sonorant_diff). The helper from PHON-106 Task 7 is reused with its logic unchanged; only the output column names generalize from nsubj/dobj to filler_a/filler_b, since the role assignment hasn't been computed yet.

For multopp the helper is extended to emit (substitute, target_1, ..., target_N) row tuples.

3. Selectional join (contrastive path)

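import polars as pl

# Restrict the selectional table BEFORE the self-join (see Risks).
pair_words = pl.concat([pairs["filler_a"], pairs["filler_b"]]).unique().to_list()
sel_window = sel.filter(
    pl.col("filler").is_in(pair_words)
    & (pl.col("band") == band)
    & pl.col("verb").is_in(list(verb_candidates))
).select("verb", "role", "filler", "ppmi")  # score column name "ppmi" is assumed

# Polars DataFrames have no SQL-style table alias, so each side of the
# self-join gets its own column names instead of "a.role" / "b.role".
side_a = sel_window.rename({"role": "role_a", "filler": "w1", "ppmi": "ppmi_a"})
side_b = sel_window.rename({"role": "role_b", "filler": "w2", "ppmi": "ppmi_b"})

# Self-join on verb, role differs, fillers come from a pair row
joined = (
    side_a.join(side_b, on="verb")
    .filter(pl.col("role_a") != pl.col("role_b"))
    .join(pairs, left_on=["w1", "w2"], right_on=["filler_a", "filler_b"])
    .with_columns((pl.col("ppmi_a") + pl.col("ppmi_b")).alias("total_ppmi"))
)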

Each row of joined is a complete sentence spec: (verb, role_a, w1, role_b, w2, ppmi_a + ppmi_b, feature_distance, sonorant_diff). The verb falls out of the join; the role assignments fall out of the join; nothing was iterated by the caller.

For non-contrastive requests, step 2 is skipped and step 3 becomes per-slot filler enumeration over (verb_candidates × per-slot allow sets) — the existing _slot_fillers algorithm extended to iterate over verb_candidates rather than receiving a fixed verb.
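The non-contrastive path can be sketched the same way in plain Python. This is not the existing _slot_fillers algorithm, only an illustration of iterating over verb candidates instead of receiving one verb; enumerate_rows and the toy rows are hypothetical, and skipping rows that reuse one word in two slots is an assumption.

```python
from itertools import product


def enumerate_rows(sel_rows, verb_candidates, allow, roles):
    """sel_rows: (verb, role, filler, ppmi) tuples already cut to one band."""
    table = {}  # verb -> role -> {filler: ppmi}
    for verb, role, filler, ppmi in sel_rows:
        if verb in verb_candidates and filler in allow and role in roles:
            table.setdefault(verb, {}).setdefault(role, {})[filler] = ppmi

    out = []
    for verb, by_role in table.items():
        if any(role not in by_role for role in roles):
            continue  # this verb cannot fill every requested role
        for combo in product(*(by_role[role].items() for role in roles)):
            words = [word for word, _ in combo]
            if len(set(words)) < len(words):
                continue  # assumption: one word may not fill two slots
            total = sum(ppmi for _, ppmi in combo)
            out.append((verb, dict(zip(roles, words)), total))
    return sorted(out, key=lambda r: r[2], reverse=True)


sel_rows = [
    ("cut", "nsubj", "cat", 1.0),
    ("cut", "dobj", "cake", 2.0),
    ("cut", "dobj", "cat", 0.5),
    ("see", "nsubj", "cat", 1.5),   # "see" has no dobj row here
    ("run", "nsubj", "cat", 3.0),   # "run" is not a verb candidate
    ("run", "dobj", "cake", 3.0),
]
rows = enumerate_rows(
    sel_rows,
    verb_candidates={"cut", "see"},
    allow={"cat", "cake"},
    roles=("nsubj", "dobj"),
)
```

The caller never names a verb: "cut" is chosen because it is the only candidate with PMI-positive fillers for both roles.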

4. Skeleton match (host filter)

For each row of joined, the role pair {role_a, role_b} defines a minimum role schema. Query skeletons.parquet for skeletons whose arg_structure contains both roles, restricted to the request's band. Verb compatibility falls out of the selectional join (a verb that has rows for {role_a, role_b} is by construction a verb that uses those roles in this band) — skeletons.parquet itself stores only verb_lemma_count, not the verb-lemma set, so the host filter doesn't probe per-verb membership.

The existing schema is (band, arg_structure, pos_template, freq, verb_lemma_count, example). Rank candidate hosts by:

1. arg_structure ⊇ {role_a, role_b} (hard filter)
2. band == request.band (hard filter)
3. freq desc (popular skeletons preferred, since they produce more idiomatic surfaces)
4. Shorter arg_structure first when freq ties (fewer non-content slots = less surface complexity)

5. Surface realize + score

The existing renderer (determiner placement, conjugation, advmod fill) runs unchanged on the resolved (verb, fillers, skeleton) tuple. The reranker (PHON-107) scores the candidates, and the top_k highest-scoring ones are returned.

Constraint dispatch table

| Constraint | Effect on lexicon | Effect on join | Notes |
| --- | --- | --- | --- |
| `ExcludeConstraint(phonemes)` | Remove violating words from per-slot allow sets (incl. verb_candidates) | None (pre-filter) | Verb gets the same filter as fillers |
| `IncludeConstraint(phonemes)` | Annotate per-word axes; no hard filter | Adds `include_*` score axis | Per-word, all slots |
| `BoundConstraint(norm, min/max)` | Remove out-of-range words from per-slot allow sets | None (pre-filter) | Norm columns from words.parquet |
| `BoundBoostConstraint(norm, min/max)` | Annotate per-word axes | Adds `bound_boost_*` score axis | Per-word, all slots |
| `MinpairConstraint(p1, p2, pos, slots=("filler_a", "filler_b"))` | None | Drives filler set; role_a/role_b can be ANY pair of roles incl. V | New `slots` parameter; default `("filler_a", "filler_b")` lets join decide role_a/role_b |
| `MaxoppConstraint(p1, p2, pos, min_son_diff)` | None | Drives filler set with sonorant_diff filter | `feature_distance` becomes scoring axis |
| `MultoppConstraint(sub, targets, n_targets)` | None | Extends filler set to (N+1)-tuples | Realizes as N sentences in a paragraph (PHON-113 scope) |

Two semantic generalizations:

- Phonological constraints (Exclude, Include) apply to the verb candidate set the same way they apply to noun candidates. Exclude(/ɹ/) rules out "run", "read", "drive" as verbs.
- Contrastive constraints don't lock to (nsubj, dobj). The join produces all (role_a, role_b) combinations; ranking determines which is preferred. A constraint with a slots parameter (e.g., slots=("V", "dobj")) hints the join to retain only that role pair, which is useful for clinically-targeted verb-noun contrasts.

Skeleton-as-host (not driver)

Today, the caller picks a skeleton (shape) and solve_shape fills it. Tomorrow, the join produces (role_a, role_b) and the skeleton is selected to host them.

Mechanics:

1. The join's output has columns verb, role_a, role_b, .... Define produced_roles = {role_a, role_b}.
2. Look up skeletons in skeletons.parquet where arg_structure is a superset of produced_roles.
3. Rank by freq within the request's band and take the top-K skeletons. (skeletons.parquet has no per-verb frequency; it stores only verb_lemma_count.)
4. For each skeleton, the join row already determines which words go in the "linked" roles; the remaining content slots (e.g., a pobj_with in nsubj,V,dobj,pobj_with) get per-slot enumeration restricted to fillers that pass the constraints and are PMI-positive for that (verb, role).
5. Render + score per skeleton; the top_k across skeletons survives.

This means a single (verb, w1, w2) join row may produce multiple sentence candidates differing in skeleton. That's correct: different skeletons emphasize different aspects of the same content.

Scope

In scope:

- solve() rewrite per new API
- solve_shape() rewrite to consume the join
- New verb-as-slot constraint dispatch
- Skeleton-as-host migration
- Tests under the new join-driven path
- Retire v1 PHON-106 linked-slot code (_enumerate_vectorized's contrast_pair_frame branch, hardcoded (nsubj, dobj), etc.)

Out of scope:

- Paragraph composition (paragraph_csp.solve_paragraph): PHON-113
- Multopp realization as N-sentence paragraphs: PHON-113
- Reranker v2 (PHON-107): unchanged interface; PHON-107 trains on the new path's output
- Productionization to /api/generate-single: PHON-109
- Frontend reframe: PHON-110

Migration plan

Survives unchanged: pairs.parquet (PHON-106 Task 3), _load_pairs_for_request helper (Task 7), constraint dataclasses (Task 6), WordStore + emit_pairs_parquet (Task 4 + post-Task-3 fix), skeletons.parquet (existing).
Gets retired: _enumerate_vectorized's contrast_pair_frame branch, the hardcoded (nsubj, dobj) linked mode, solve_shape's pair-frame parameters, paradigm_3_csp.solve(verb, ...) positional signature.
Gets rewritten: the orchestration layer between constraint resolution and skeleton selection. The new core is a Polars self-join routine; the existing per-skeleton path becomes a downstream consumer of the join output.
Tests: the 12 contrastive scorer tests in test_contrastive_scorers.py get rewritten. Six of them (the helper tests + error tests) survive in shape; the realization test, the routing tests, and the maxopp feature-distance test get replaced with join-driven assertions.
Branch state: this work goes onto feature/csp-iteration on top of PHON-106. No PR until PHON-109 productionization.

Risks

Self-join cost on full sel: 5.44M × 5.44M is huge. Mitigation: pre-filter sel to filler IN pair_words AND verb IN verb_candidates AND band == request.band BEFORE the self-join. pair_words is typically <2K, verb_candidates <500 after constraints, and the band slice is ~5–10% of rows. Effective join size: ~10K × 10K. Polars handles this trivially.
Multi-skeleton expansion: a single join row producing 5 sentences across skeletons can blow up top_k. Mitigation: rank skeletons per join row, take top_k_per_row × n_join_rows, then global top_k.
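verb_candidates ∩ has_selectional_mass: needs an index. Build once at startup from sel.filter(pl.col("band") == band).group_by("verb").agg(pl.len().alias("n_rows")).filter(pl.col("n_rows") >= threshold). The count must be aliased and filtered as a column; a bare pl.len() inside the final filter would measure the number of groups, not each verb's row count. Cache on WordStore.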
Multopp deferred to PHON-113 means the constraint dispatch table has a path that errors today: same behavior as PHON-106 v1 (raise ValueError("paragraph composition")).

Open questions

Should band be a constraint? Currently passed as a top-level kwarg. Could become BandConstraint(band) for symmetry. Defer; not load-bearing in v1.
What's the threshold for has_selectional_mass? Probably >= 100 PMI rows for the band. Calibrate empirically; doesn't affect the architecture.
Should MinpairConstraint accept slots=("V", "dobj") in v1 or v2? v1 of PHON-112 should accept it (it's a one-line filter on the join); v2 polish (e.g., accepting slots=None as an alias for the let-the-join-decide default) follows.
Does the eval harness build_judging_set.py need updating? Yes — drop the hardcoded VERBS = [...] list since verbs are constraint-driven now. Goes in PHON-112's plan.

Self-review

[x] All decisions concrete: API signature, pipeline stages, constraint dispatch all named explicitly.
[x] No "TBD" / placeholder language.
[x] Internal consistency: WordStore carries pairs_df + sel_df + word_df throughout; constraint dispatch table covers all 7 constraint types.
[x] Scope is decomposed correctly: paragraphs are PHON-113 because paragraph-level coherence is a distinct design problem (shared subject, pronoun coref, agreement, multopp's natural N-sentence shape).
[x] Ambiguity check: slots parameter on MinpairConstraint defaults to ("filler_a", "filler_b") (= "let join pick role pair"), explicit pair (e.g., ("V", "dobj")) hints the join. No silent fallback.