PHON-103: CSP Domain Caching Implementation Plan
For agentic workers: REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (- [ ]) syntax for tracking.
Goal: Add three module-level LRU caches to the CSP spike so verb-independent setup work (spec_lexicon, filtered_spec, per_word_axes) is computed once per constraint set and reused across multi-verb paragraph composition.
Architecture: New domain_cache.py module with OrderedDict-backed LRU caches. spec_lexicon keyed on (spec_id, id(word_df)). filtered_spec keyed on (id(spec_words_frozenset), hard_constraints, id(word_df)). per_word_axes keyed on (soft_constraints, id(word_df)). Public wrappers replace direct calls in paradigm_3_csp.py and paragraph_csp.py.
Tech Stack: Python 3.12, Polars, collections.OrderedDict, pytest.
Spec: docs/superpowers/specs/2026-05-08-phon-103-csp-domain-caching-design.md
File map
| File | Action |
|---|---|
| packages/generation/research/2026-05-07-sentence-generation-paradigms/domain_cache.py | Create |
| packages/generation/research/2026-05-07-sentence-generation-paradigms/test_domain_cache.py | Create |
| packages/generation/research/2026-05-07-sentence-generation-paradigms/bench_domain_cache.py | Create |
| packages/generation/research/2026-05-07-sentence-generation-paradigms/paradigm_3_csp.py | Modify (spec_lexicon returns frozenset; _resolve_domain_words body wrapped; solve()'s per_word_axes call wrapped) |
| packages/generation/research/2026-05-07-sentence-generation-paradigms/paragraph_csp.py | Modify (_filtered_domain body wrapped; solve_paragraph()'s per_word_axes call wrapped) |
All paths in this plan are relative to the repo root /Users/jneumann/Repos/PhonoLex/. The spike directory is referenced as <spike>/ for brevity:
<spike>/ = packages/generation/research/2026-05-07-sentence-generation-paradigms/.
Test command throughout:
cd packages/generation && uv run python -m pytest research/2026-05-07-sentence-generation-paradigms/test_domain_cache.py -v
Task 1: Bootstrap domain_cache.py with empty cache infrastructure
Files:
- Create: <spike>/domain_cache.py
- Create: <spike>/test_domain_cache.py
- [ ] Step 1.1: Write the failing test for clear_caches and get_cache_stats
Create <spike>/test_domain_cache.py:
"""Tests for domain_cache.py (PHON-103)."""
from __future__ import annotations

import sys
from pathlib import Path

import pytest

sys.path.insert(0, str(Path(__file__).parent))
import domain_cache


def test_clear_caches_resets_stats():
    domain_cache.clear_caches()
    stats = domain_cache.get_cache_stats()
    expected_keys = {"spec_lexicon", "filtered_spec", "per_word_axes"}
    assert set(stats.keys()) == expected_keys
    for cache_name, counts in stats.items():
        assert counts == {"hits": 0, "misses": 0, "evictions": 0}, (
            f"{cache_name} stats not zeroed: {counts}"
        )


def test_get_cache_stats_returns_snapshot_not_reference():
    """Mutating the returned dict must not affect internal state."""
    domain_cache.clear_caches()
    stats = domain_cache.get_cache_stats()
    stats["spec_lexicon"]["hits"] = 999
    fresh = domain_cache.get_cache_stats()
    assert fresh["spec_lexicon"]["hits"] == 0
- [ ] Step 1.2: Run test, verify it fails
cd packages/generation && uv run python -m pytest research/2026-05-07-sentence-generation-paradigms/test_domain_cache.py -v
Expected: FAIL with ModuleNotFoundError: No module named 'domain_cache'.
- [ ] Step 1.3: Create minimal domain_cache.py
Create <spike>/domain_cache.py:
"""Domain caching for CSP (PHON-103).

Three module-level OrderedDict-backed LRU caches keyed on the constraint set,
so verb-independent setup work (spec_lexicon, filtered_spec, per_word_axes)
is computed once per (spec marker × constraints) tuple and reused across
calls within a process lifetime.

Mirrors the existing `_ADVMOD_PMI_CACHE` / `_PHONEME_CACHE` patterns from
`skeleton_csp.py` and `constraint_surface.py`.
"""
from __future__ import annotations

from collections import OrderedDict
from typing import Callable, Iterable, TypeVar

import polars as pl
from phonolex_data.runtime.store import WordStore
from phonolex_generators.cfg_seed.spec_filters import SPEC_FILTERS

from constraint_surface import (
    BoundBoostConstraint,
    BoundConstraint,
    Constraint,
    ExcludeConstraint,
    IncludeConstraint,
    domain_trace,
    hard_filter_expr,
    per_word_axes as _per_word_axes_uncached,
)

_HARD_TYPES = (ExcludeConstraint, BoundConstraint)
_SOFT_TYPES = (IncludeConstraint, BoundBoostConstraint)

_MAX_SPEC_LEXICON = 8
_MAX_FILTERED_SPEC = 64
_MAX_PER_WORD_AXES = 64

_SPEC_LEXICON_CACHE: OrderedDict = OrderedDict()
_FILTERED_SPEC_CACHE: OrderedDict = OrderedDict()
_PER_WORD_AXES_CACHE: OrderedDict = OrderedDict()

_CACHE_STATS: dict[str, dict[str, int]] = {
    "spec_lexicon": {"hits": 0, "misses": 0, "evictions": 0},
    "filtered_spec": {"hits": 0, "misses": 0, "evictions": 0},
    "per_word_axes": {"hits": 0, "misses": 0, "evictions": 0},
}

T = TypeVar("T")


def _lru_get_or_compute(
    cache: OrderedDict,
    max_size: int,
    stat_key: str,
    key,
    compute: Callable[[], T],
) -> T:
    """Generic LRU get-or-compute. move_to_end on hit; popitem(last=False) on overflow."""
    if key in cache:
        cache.move_to_end(key)
        _CACHE_STATS[stat_key]["hits"] += 1
        return cache[key]
    _CACHE_STATS[stat_key]["misses"] += 1
    value = compute()
    cache[key] = value
    if len(cache) > max_size:
        cache.popitem(last=False)
        _CACHE_STATS[stat_key]["evictions"] += 1
    return value


def clear_caches() -> None:
    """Drop all entries and reset stats. Used by tests."""
    _SPEC_LEXICON_CACHE.clear()
    _FILTERED_SPEC_CACHE.clear()
    _PER_WORD_AXES_CACHE.clear()
    for stats in _CACHE_STATS.values():
        for k in stats:
            stats[k] = 0


def get_cache_stats() -> dict[str, dict[str, int]]:
    """Snapshot of hits/misses/evictions per cache. Defensive copy."""
    return {k: dict(v) for k, v in _CACHE_STATS.items()}
- [ ] Step 1.4: Run test, verify both pass
cd packages/generation && uv run python -m pytest research/2026-05-07-sentence-generation-paradigms/test_domain_cache.py -v
Expected: 2 passed.
- [ ] Step 1.5: Commit
git add packages/generation/research/2026-05-07-sentence-generation-paradigms/domain_cache.py \
packages/generation/research/2026-05-07-sentence-generation-paradigms/test_domain_cache.py
git commit -m "$(cat <<'EOF'
PHON-103: bootstrap domain_cache module

Empty cache infrastructure: OrderedDict-backed LRU primitives,
_CACHE_STATS, clear_caches(), get_cache_stats(). Tests verify
stat snapshots are defensive copies.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
EOF
)"
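The move-to-end and pop-oldest behaviour that _lru_get_or_compute relies on can be tried in isolation. The following is a minimal sketch, not the module itself: it drops the stats bookkeeping, and the name lru_get_or_compute and the two-entry demo cache are illustrative assumptions.

```python
from collections import OrderedDict
from typing import Callable, Hashable, TypeVar

T = TypeVar("T")


def lru_get_or_compute(
    cache: OrderedDict,
    max_size: int,
    key: Hashable,
    compute: Callable[[], T],
) -> T:
    """Return cache[key], computing and storing the value on a miss."""
    if key in cache:
        cache.move_to_end(key)  # hit: mark as most recently used
        return cache[key]
    value = compute()
    cache[key] = value  # new keys land at the "recent" end
    if len(cache) > max_size:
        cache.popitem(last=False)  # overflow: drop the least recently used
    return value


# Illustrative two-entry cache (the plan uses caps of 8 and 64).
demo: OrderedDict = OrderedDict()
for k in ("a", "b"):
    lru_get_or_compute(demo, 2, k, lambda k=k: k.upper())
lru_get_or_compute(demo, 2, "a", lambda: "unused")  # hit, promotes "a"
lru_get_or_compute(demo, 2, "c", lambda: "C")       # miss, evicts "b"
print(list(demo))  # ['a', 'c']
```

Because the hit on "a" promotes it, the later overflow evicts "b" instead. This is the same property Task 6 checks against the real cache.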
Task 2: Implement _hashable_constraints helper
Files:
- Modify: <spike>/domain_cache.py
- Modify: <spike>/test_domain_cache.py
- [ ] Step 2.1: Write failing tests for _hashable_constraints
Append to <spike>/test_domain_cache.py:
from constraint_surface import (
    BoundBoostConstraint,
    BoundConstraint,
    ContrastiveConstraint,
    ExcludeConstraint,
    IncludeConstraint,
)


def test_hashable_constraints_filters_by_type():
    excl = ExcludeConstraint(phonemes=("ɹ",))
    bnd = BoundConstraint(norm="aoa", max_value=6.0)
    incl = IncludeConstraint(phonemes=("k",))
    contrastive = ContrastiveConstraint(pair_type="minpair", phoneme1="k", phoneme2="g")
    hard_types = (ExcludeConstraint, BoundConstraint)
    result = domain_cache._hashable_constraints([excl, bnd, incl, contrastive], hard_types)
    assert set(result) == {excl, bnd}
    assert isinstance(result, tuple)


def test_hashable_constraints_order_invariant():
    excl = ExcludeConstraint(phonemes=("ɹ",))
    bnd = BoundConstraint(norm="aoa", max_value=6.0)
    hard_types = (ExcludeConstraint, BoundConstraint)
    forward = domain_cache._hashable_constraints([excl, bnd], hard_types)
    reverse = domain_cache._hashable_constraints([bnd, excl], hard_types)
    assert forward == reverse


def test_hashable_constraints_empty_returns_empty_tuple():
    incl = IncludeConstraint(phonemes=("k",))
    hard_types = (ExcludeConstraint, BoundConstraint)
    assert domain_cache._hashable_constraints([incl], hard_types) == ()
    assert domain_cache._hashable_constraints([], hard_types) == ()
- [ ] Step 2.2: Run test, verify it fails
cd packages/generation && uv run python -m pytest research/2026-05-07-sentence-generation-paradigms/test_domain_cache.py -v
Expected: 3 new tests fail with AttributeError: module 'domain_cache' has no attribute '_hashable_constraints'.
- [ ] Step 2.3: Implement _hashable_constraints
Insert into <spike>/domain_cache.py, directly above _lru_get_or_compute:
def _hashable_constraints(
    constraints: Iterable[Constraint],
    types: tuple[type, ...],
) -> tuple[Constraint, ...]:
    """Filter to relevant Constraint types and sort for stable hashing.

    Constraints are frozen dataclasses (hashable). Sort key uses (type, repr)
    so two semantically-equivalent lists in different orders produce the same
    cache key.
    """
    relevant = [c for c in constraints if isinstance(c, types)]
    return tuple(sorted(relevant, key=lambda c: (c.type, repr(c))))
- [ ] Step 2.4: Run tests, verify all pass
cd packages/generation && uv run python -m pytest research/2026-05-07-sentence-generation-paradigms/test_domain_cache.py -v
Expected: 5 passed.
- [ ] Step 2.5: Commit
git add packages/generation/research/2026-05-07-sentence-generation-paradigms/domain_cache.py \
packages/generation/research/2026-05-07-sentence-generation-paradigms/test_domain_cache.py
git commit -m "$(cat <<'EOF'
PHON-103: add _hashable_constraints helper

Filters by Constraint subtype and sorts by (type, repr) so reordered
input lists yield the same cache key.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
EOF
)"
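The filter-then-sort idea behind _hashable_constraints can be illustrated without the real constraint classes. The sketch below uses hypothetical stand-in dataclasses (Exclude, Bound, Include) and sorts on the class name rather than a `type` attribute, since the stand-ins have no such field; treat both as assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Exclude:  # hypothetical stand-in for ExcludeConstraint
    phonemes: tuple[str, ...]


@dataclass(frozen=True)
class Bound:  # hypothetical stand-in for BoundConstraint
    norm: str
    max_value: float


@dataclass(frozen=True)
class Include:  # hypothetical stand-in for IncludeConstraint
    phonemes: tuple[str, ...]


def hashable_key(constraints: Iterable[object], types: tuple[type, ...]) -> tuple:
    """Keep only instances of `types`, then sort so input order is irrelevant."""
    relevant = [c for c in constraints if isinstance(c, types)]
    # Assumption: class name + repr gives a total order over the stand-ins.
    return tuple(sorted(relevant, key=lambda c: (type(c).__name__, repr(c))))


excl, bnd, incl = Exclude(("r",)), Bound("aoa", 6.0), Include(("k",))
forward = hashable_key([excl, bnd, incl], (Exclude, Bound))
reverse = hashable_key([bnd, incl, excl], (Exclude, Bound))

# Same tuple, hence the same hash and the same dict/OrderedDict slot.
print(forward == reverse, hash(forward) == hash(reverse))
```

Frozen dataclasses get a generated `__hash__`, so the resulting tuple can be used directly as part of a cache key.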
Task 3: Implement get_spec_lexicon with WordStore fixture
Files:
- Modify: <spike>/domain_cache.py
- Modify: <spike>/test_domain_cache.py
- [ ] Step 3.1: Add a session-scoped WordStore fixture and a clear_caches autouse fixture
Insert near the top of <spike>/test_domain_cache.py, directly after the existing imports:
@pytest.fixture(scope="session")
def store():
    """Session-scoped WordStore. ~1s to load."""
    from phonolex_data.runtime.store import WordStore
    repo_root = Path(__file__).resolve().parents[4]
    return WordStore.from_parquet(repo_root / "data" / "runtime" / "words.parquet")


@pytest.fixture(autouse=True)
def _reset_caches():
    """Clear caches between tests for isolation."""
    domain_cache.clear_caches()
    yield
- [ ] Step 3.2: Write failing tests for get_spec_lexicon
Append to <spike>/test_domain_cache.py:
def test_get_spec_lexicon_correctness_vs_uncached(store):
    """Cached result must equal direct spec_lexicon call."""
    from phonolex_generators.cfg_seed.spec_filters import SPEC_FILTERS
    expected = frozenset(
        store.subset(SPEC_FILTERS["spec1"])
        .get_column("word")
        .str.to_lowercase()
        .to_list()
    )
    cached = domain_cache.get_spec_lexicon("spec1", store)
    assert cached == expected
    assert isinstance(cached, frozenset)


def test_get_spec_lexicon_returns_same_object_on_hit(store):
    """Repeated calls return the SAME frozenset object (id stable). This is
    what allows downstream get_filtered_spec to use id() as a cache key."""
    a = domain_cache.get_spec_lexicon("spec1", store)
    b = domain_cache.get_spec_lexicon("spec1", store)
    assert a is b


def test_get_spec_lexicon_second_call_hits_cache(store):
    domain_cache.get_spec_lexicon("spec1", store)
    domain_cache.get_spec_lexicon("spec1", store)
    stats = domain_cache.get_cache_stats()
    assert stats["spec_lexicon"]["misses"] == 1
    assert stats["spec_lexicon"]["hits"] == 1


def test_get_spec_lexicon_different_specs_separate_entries(store):
    domain_cache.get_spec_lexicon("spec1", store)
    domain_cache.get_spec_lexicon("spec6", store)
    stats = domain_cache.get_cache_stats()
    assert stats["spec_lexicon"]["misses"] == 2
    assert stats["spec_lexicon"]["hits"] == 0
- [ ] Step 3.3: Run tests, verify they fail
cd packages/generation && uv run python -m pytest research/2026-05-07-sentence-generation-paradigms/test_domain_cache.py -v
Expected: 4 new tests fail with AttributeError: module 'domain_cache' has no attribute 'get_spec_lexicon'.
- [ ] Step 3.4: Implement get_spec_lexicon
Append to <spike>/domain_cache.py:
def get_spec_lexicon(spec_id: str, store: WordStore) -> frozenset[str]:
    """Cached spec_lexicon. Returns the SAME frozenset on repeated calls,
    so downstream caches keyed on id() of the result remain stable.

    Keyed on (spec_id, id(store.df)). Cache lifetime is process lifetime;
    underlying SPEC_FILTERS and store.df are immutable singletons.
    """
    key = (spec_id, id(store.df))

    def compute() -> frozenset[str]:
        return frozenset(
            store.subset(SPEC_FILTERS[spec_id])
            .get_column("word")
            .str.to_lowercase()
            .to_list()
        )

    return _lru_get_or_compute(
        _SPEC_LEXICON_CACHE, _MAX_SPEC_LEXICON, "spec_lexicon", key, compute,
    )
- [ ] Step 3.5: Run tests, verify all pass
cd packages/generation && uv run python -m pytest research/2026-05-07-sentence-generation-paradigms/test_domain_cache.py -v
Expected: 9 passed.
- [ ] Step 3.6: Commit
git add packages/generation/research/2026-05-07-sentence-generation-paradigms/domain_cache.py \
packages/generation/research/2026-05-07-sentence-generation-paradigms/test_domain_cache.py
git commit -m "$(cat <<'EOF'
PHON-103: implement get_spec_lexicon cache

Keyed on (spec_id, id(store.df)). Returns the SAME frozenset on repeated
calls, which matters because downstream get_filtered_spec keys on
id(spec_words). Tests verify object identity stability.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
EOF
)"
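Why object identity matters for the downstream key can be seen with plain frozensets. This is a small sketch under stated assumptions: the toy cache, the expensive_len function and the word lists are invented for illustration and are not part of the plan's modules.

```python
# Toy cache keyed on id(), mirroring the id(spec_words) key shape.
cache: dict[int, int] = {}
compute_calls = 0


def expensive_len(words: frozenset[str]) -> int:
    """Pretend-expensive computation cached on the identity of `words`."""
    global compute_calls
    key = id(words)
    if key not in cache:
        compute_calls += 1
        cache[key] = len(words)
    return cache[key]


spec = frozenset(["cat", "dog"])
rebuilt = frozenset(sorted(spec))  # equal content, but a distinct object

expensive_len(spec)
expensive_len(spec)      # same reference: served from the cache
expensive_len(rebuilt)   # equal set, different id(): computed again

print(spec == rebuilt, spec is rebuilt, compute_calls)
```

Two caveats follow from this. Callers that rebuild an equal set on every call never hit an id()-keyed cache, which is why get_spec_lexicon hands back the same object. Also, Python only guarantees that id() is unique among objects alive at the same time, so the scheme assumes the keyed objects (store.df, the cached frozensets) stay alive for as long as their entries do.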
Task 4: Implement get_filtered_spec
Files:
- Modify: <spike>/domain_cache.py
- Modify: <spike>/test_domain_cache.py
- [ ] Step 4.1: Write failing tests for get_filtered_spec
Append to <spike>/test_domain_cache.py:
import polars as pl  # required by the pl.col(...) checks below


def test_get_filtered_spec_correctness_no_constraints(store):
    """Empty constraints → returns spec_words unchanged, empty trace."""
    spec_words = domain_cache.get_spec_lexicon("spec1", store)
    filtered, trace = domain_cache.get_filtered_spec(spec_words, [], store.df)
    assert filtered == spec_words
    assert trace == []


def test_get_filtered_spec_correctness_with_hard_constraint(store):
    spec_words = domain_cache.get_spec_lexicon("spec1", store)
    excl = ExcludeConstraint(phonemes=("ɹ",))
    filtered, trace = domain_cache.get_filtered_spec(spec_words, [excl], store.df)
    # All filtered words must lack /ɹ/
    spec_df = store.df.filter(pl.col("word").is_in(list(filtered)))
    has_r = spec_df.filter(pl.col("phonemes_str").str.contains("|ɹ|", literal=True))
    assert has_r.height == 0, "filter leaked words containing /ɹ/"
    assert filtered < spec_words  # strict subset (some words contain ɹ)
    assert len(trace) == 1
    assert trace[0]["constraint_label"] == "exclude /ɹ/"


def test_get_filtered_spec_constraint_order_invariance(store):
    spec_words = domain_cache.get_spec_lexicon("spec1", store)
    excl = ExcludeConstraint(phonemes=("ɹ",))
    bnd = BoundConstraint(norm="aoa", max_value=6.0)
    a, _ = domain_cache.get_filtered_spec(spec_words, [excl, bnd], store.df)
    b, _ = domain_cache.get_filtered_spec(spec_words, [bnd, excl], store.df)
    assert a == b
    stats = domain_cache.get_cache_stats()
    assert stats["filtered_spec"]["misses"] == 1
    assert stats["filtered_spec"]["hits"] == 1


def test_get_filtered_spec_word_df_none_passthrough(store):
    """word_df=None → returns (spec_words, []), no caching."""
    spec_words = domain_cache.get_spec_lexicon("spec1", store)
    excl = ExcludeConstraint(phonemes=("ɹ",))
    filtered, trace = domain_cache.get_filtered_spec(spec_words, [excl], None)
    assert filtered == spec_words
    assert trace == []
    stats = domain_cache.get_cache_stats()
    assert stats["filtered_spec"] == {"hits": 0, "misses": 0, "evictions": 0}


def test_get_filtered_spec_trace_mutation_isolated(store):
    """Mutating returned trace must not corrupt the cached version."""
    spec_words = domain_cache.get_spec_lexicon("spec1", store)
    excl = ExcludeConstraint(phonemes=("ɹ",))
    a, trace_a = domain_cache.get_filtered_spec(spec_words, [excl], store.df)
    trace_a[0]["constraint_label"] = "MUTATED"
    b, trace_b = domain_cache.get_filtered_spec(spec_words, [excl], store.df)
    assert trace_b[0]["constraint_label"] == "exclude /ɹ/"


def test_get_filtered_spec_same_frozenset_hits(store):
    """Same spec_words frozenset reference → cache hit on second call."""
    spec_words = domain_cache.get_spec_lexicon("spec1", store)
    excl = ExcludeConstraint(phonemes=("ɹ",))
    domain_cache.get_filtered_spec(spec_words, [excl], store.df)
    domain_cache.get_filtered_spec(spec_words, [excl], store.df)
    stats = domain_cache.get_cache_stats()
    assert stats["filtered_spec"]["hits"] == 1
- [ ] Step 4.2: Run tests, verify they fail
cd packages/generation && uv run python -m pytest research/2026-05-07-sentence-generation-paradigms/test_domain_cache.py -v
Expected: 6 new tests fail with AttributeError: module 'domain_cache' has no attribute 'get_filtered_spec'.
- [ ] Step 4.3: Implement get_filtered_spec
Append to <spike>/domain_cache.py:
def get_filtered_spec(
    spec_words: frozenset[str],
    constraints: Iterable[Constraint],
    word_df: pl.DataFrame | None,
) -> tuple[frozenset[str], list[dict]]:
    """Cached spec ∩ hard-constraint filter.

    Caller must pass `spec_words` as a frozenset and hold the same reference
    across calls that should hit. `get_spec_lexicon` returns a stable cached
    frozenset; paragraph_csp callers compose `spec1 | spec6` once per request
    and reuse the union across the verb loop.

    Returns (filtered_word_set, domain_trace). Domain trace is a fresh list of
    fresh dicts on every call so caller mutation doesn't corrupt the cache.

    word_df=None → pass-through (no caching).
    """
    if word_df is None:
        return spec_words, []
    hard = _hashable_constraints(constraints, _HARD_TYPES)
    key = (id(spec_words), hard, id(word_df))

    def compute() -> tuple[frozenset[str], tuple[dict, ...]]:
        if not hard:
            return spec_words, ()
        spec_df = word_df.filter(pl.col("word").is_in(list(spec_words)))
        expr = hard_filter_expr(list(hard))
        if expr is None:
            return spec_words, ()
        trace = tuple(domain_trace(list(hard), spec_df))
        filtered = frozenset(spec_df.filter(expr).get_column("word").to_list())
        return filtered, trace

    filtered, trace = _lru_get_or_compute(
        _FILTERED_SPEC_CACHE, _MAX_FILTERED_SPEC, "filtered_spec", key, compute,
    )
    return filtered, [dict(t) for t in trace]
- [ ] Step 4.4: Run tests, verify all pass
cd packages/generation && uv run python -m pytest research/2026-05-07-sentence-generation-paradigms/test_domain_cache.py -v
Expected: 15 passed.
- [ ] Step 4.5: Commit
git add packages/generation/research/2026-05-07-sentence-generation-paradigms/domain_cache.py \
packages/generation/research/2026-05-07-sentence-generation-paradigms/test_domain_cache.py
git commit -m "$(cat <<'EOF'
PHON-103: implement get_filtered_spec cache

Keyed on (id(spec_words), hard_constraints, id(word_df)). The id()-on-
frozenset key shape lets paragraph_csp's spec1|spec6 unions cache
correctly within a request. word_df=None bypasses caching to preserve
the existing fallback behavior.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
EOF
)"
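The trace handling in get_filtered_spec (store a tuple of dicts, return a fresh list of fresh dicts) can be sketched on its own. The cached entry below is made up for illustration; in particular the "remaining" field is a hypothetical stand-in, since the plan only shows the "constraint_label" key.

```python
# Stand-in for what the cache would hold: an immutable tuple of dicts.
# The "remaining" field is hypothetical and only here for illustration.
cached_trace: tuple[dict, ...] = (
    {"constraint_label": "exclude /r/", "remaining": 10},
)


def read_trace() -> list[dict]:
    """Hand out a new list containing shallow copies of each trace entry."""
    return [dict(entry) for entry in cached_trace]


first = read_trace()
first[0]["constraint_label"] = "MUTATED"  # caller edits its own copy
first.append({"constraint_label": "extra"})

second = read_trace()
print(second[0]["constraint_label"], len(second))
```

The copies are shallow, so this protects flat dicts of scalars. If trace entries ever carried nested lists or dicts, those inner objects would still be shared with the cache.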
Task 5: Implement get_per_word_axes
Files:
- Modify: <spike>/domain_cache.py
- Modify: <spike>/test_domain_cache.py
- [ ] Step 5.1: Write failing tests for get_per_word_axes
Append to <spike>/test_domain_cache.py:
def test_get_per_word_axes_correctness_vs_uncached(store):
    from constraint_surface import per_word_axes as uncached_per_word_axes
    incl = IncludeConstraint(phonemes=("k",))
    expected = uncached_per_word_axes([incl], store.df)
    cached = domain_cache.get_per_word_axes([incl], store.df)
    assert cached == expected


def test_get_per_word_axes_second_call_hits_cache(store):
    incl = IncludeConstraint(phonemes=("k",))
    domain_cache.get_per_word_axes([incl], store.df)
    domain_cache.get_per_word_axes([incl], store.df)
    stats = domain_cache.get_cache_stats()
    assert stats["per_word_axes"]["misses"] == 1
    assert stats["per_word_axes"]["hits"] == 1


def test_get_per_word_axes_filters_out_hard_constraints(store):
    """Hard constraints must NOT be part of the per_word_axes key:
    different hard constraints with same soft constraints should hit."""
    incl = IncludeConstraint(phonemes=("k",))
    excl = ExcludeConstraint(phonemes=("ɹ",))
    bnd = BoundConstraint(norm="aoa", max_value=6.0)
    domain_cache.get_per_word_axes([incl, excl], store.df)
    domain_cache.get_per_word_axes([incl, bnd], store.df)
    stats = domain_cache.get_cache_stats()
    assert stats["per_word_axes"]["misses"] == 1
    assert stats["per_word_axes"]["hits"] == 1


def test_get_per_word_axes_word_df_none_returns_empty(store):
    incl = IncludeConstraint(phonemes=("k",))
    result = domain_cache.get_per_word_axes([incl], None)
    assert result == {}
    stats = domain_cache.get_cache_stats()
    assert stats["per_word_axes"] == {"hits": 0, "misses": 0, "evictions": 0}
- [ ] Step 5.2: Run tests, verify they fail
cd packages/generation && uv run python -m pytest research/2026-05-07-sentence-generation-paradigms/test_domain_cache.py -v
Expected: 4 new tests fail with AttributeError: module 'domain_cache' has no attribute 'get_per_word_axes'.
- [ ] Step 5.3: Implement get_per_word_axes
Append to <spike>/domain_cache.py:
def get_per_word_axes(
    constraints: Iterable[Constraint],
    word_df: pl.DataFrame | None,
) -> dict[str, dict[str, float]]:
    """Cached per_word_axes lookup tables.

    Keyed on (sorted soft constraints, id(word_df)). Hard constraints are
    filtered out because they don't affect axis values, so changing only
    hard constraints should hit the cache.

    word_df=None → returns empty dict (no axes), no caching.
    """
    if word_df is None:
        return {}
    soft = _hashable_constraints(constraints, _SOFT_TYPES)
    key = (soft, id(word_df))
    return _lru_get_or_compute(
        _PER_WORD_AXES_CACHE, _MAX_PER_WORD_AXES, "per_word_axes", key,
        lambda: _per_word_axes_uncached(list(soft), word_df),
    )
- [ ] Step 5.4: Run tests, verify all pass
cd packages/generation && uv run python -m pytest research/2026-05-07-sentence-generation-paradigms/test_domain_cache.py -v
Expected: 19 passed.
- [ ] Step 5.5: Commit
git add packages/generation/research/2026-05-07-sentence-generation-paradigms/domain_cache.py \
packages/generation/research/2026-05-07-sentence-generation-paradigms/test_domain_cache.py
git commit -m "$(cat <<'EOF'
PHON-103: implement get_per_word_axes cache

Keyed on (soft_constraints, id(word_df)). Hard constraints are excluded
from the key: toggling them must not bust the per_word_axes cache.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
EOF
)"
Task 6: Add LRU eviction tests
Files:
- Modify: <spike>/test_domain_cache.py
- [ ] Step 6.1: Write LRU eviction tests
Append to <spike>/test_domain_cache.py:
def _make_unique_excludes(n: int) -> list[list[ExcludeConstraint]]:
    """Generate n distinct constraint lists for forcing cache fills."""
    return [[ExcludeConstraint(phonemes=(f"x{i}",))] for i in range(n)]


def test_filtered_spec_lru_evicts_oldest_when_full(store):
    spec_words = domain_cache.get_spec_lexicon("spec1", store)
    # _MAX_FILTERED_SPEC = 64. Insert 65 distinct keys → 1 eviction.
    constraint_lists = _make_unique_excludes(65)
    for cs in constraint_lists:
        domain_cache.get_filtered_spec(spec_words, cs, store.df)
    stats = domain_cache.get_cache_stats()
    assert stats["filtered_spec"]["misses"] == 65
    assert stats["filtered_spec"]["evictions"] == 1
    assert stats["filtered_spec"]["hits"] == 0


def test_filtered_spec_lru_move_to_end_on_hit(store):
    """Re-accessing the oldest entry should promote it; next eviction targets the new oldest."""
    spec_words = domain_cache.get_spec_lexicon("spec1", store)
    constraint_lists = _make_unique_excludes(64)
    for cs in constraint_lists:
        domain_cache.get_filtered_spec(spec_words, cs, store.df)
    # Cache is full. Touch the oldest entry (index 0), which promotes it.
    domain_cache.get_filtered_spec(spec_words, constraint_lists[0], store.df)
    stats = domain_cache.get_cache_stats()
    assert stats["filtered_spec"]["hits"] == 1
    # Insert a new entry: should evict index-1 (the new oldest), not index-0.
    domain_cache.get_filtered_spec(
        spec_words, [ExcludeConstraint(phonemes=("zNew",))], store.df,
    )
    stats = domain_cache.get_cache_stats()
    assert stats["filtered_spec"]["evictions"] == 1
    # Verify index-0 is still in cache by re-accessing it (should hit).
    domain_cache.get_filtered_spec(spec_words, constraint_lists[0], store.df)
    stats = domain_cache.get_cache_stats()
    assert stats["filtered_spec"]["hits"] == 2


def test_per_word_axes_lru_evicts_at_max(store):
    """_MAX_PER_WORD_AXES = 64. Insert 65 → 1 eviction."""
    constraint_lists = [
        [IncludeConstraint(phonemes=(f"y{i}",))] for i in range(65)
    ]
    for cs in constraint_lists:
        domain_cache.get_per_word_axes(cs, store.df)
    stats = domain_cache.get_cache_stats()
    assert stats["per_word_axes"]["evictions"] == 1


def test_spec_lexicon_cap_above_real_spec_count(store):
    """Touch every real spec_id; _MAX_SPEC_LEXICON = 8 ≥ len(SPEC_FILTERS)
    so no eviction expected."""
    from phonolex_generators.cfg_seed.spec_filters import SPEC_FILTERS
    for spec_id in SPEC_FILTERS:
        domain_cache.get_spec_lexicon(spec_id, store)
    stats = domain_cache.get_cache_stats()
    assert stats["spec_lexicon"]["evictions"] == 0
    assert stats["spec_lexicon"]["misses"] == len(SPEC_FILTERS)
- [ ] Step 6.2: Run tests, verify all pass
cd packages/generation && uv run python -m pytest research/2026-05-07-sentence-generation-paradigms/test_domain_cache.py -v
Expected: 23 passed.
- [ ] Step 6.3: Commit
git add packages/generation/research/2026-05-07-sentence-generation-paradigms/test_domain_cache.py
git commit -m "$(cat <<'EOF'
PHON-103: LRU eviction + move-to-end tests

Verifies oldest entry evicted at capacity, hit promotes entry past
next eviction, and spec_lexicon cap is comfortably above the
SPEC_FILTERS count.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
EOF
)"
Task 7: Wire spec_lexicon callsites in paradigm_3_csp.py
Files:
- Modify: <spike>/paradigm_3_csp.py
The function spec_lexicon(store, spec_id) is defined at line 77 and called at lines 264, 320, 385–386, 451–452, 560, 625 (8 callsites total). We replace the local definition with a wrapper that returns the cached frozenset directly.
- [ ] Step 7.1: Add import for the cache wrappers
Find the existing block (around line 35–47):
from constraint_surface import (  # noqa: E402
    Constraint,
    BoundConstraint,
    ExcludeConstraint,
    IncludeConstraint,
    BoundBoostConstraint,
    ContrastiveConstraint,
    cross_slot_axes,
    domain_trace,
    hard_filter_expr,
    per_word_axes,
)
from reranker import rerank  # noqa: E402
After this block, add:
from domain_cache import (  # noqa: E402
    get_filtered_spec,
    get_per_word_axes,
    get_spec_lexicon,
)
- [ ] Step 7.2: Replace the local spec_lexicon function with a thin wrapper
Find:
def spec_lexicon(store: WordStore, spec_id: str) -> set[str]:
    return set(
        store.subset(SPEC_FILTERS[spec_id])
        .get_column("word")
        .str.to_lowercase()
        .to_list()
    )
Replace with:
def spec_lexicon(store: WordStore, spec_id: str) -> frozenset[str]:
    """Backwards-compat wrapper around domain_cache.get_spec_lexicon.

    Returns a `frozenset[str]`. Downstream callers do `set & spec_words`
    intersections, which work identically on frozenset. The frozenset return
    type is REQUIRED: callers must pass the same frozenset object to
    get_filtered_spec to hit the cache (key is id(spec_words)).
    """
    return get_spec_lexicon(spec_id, store)
- [ ] Step 7.3: Check whether SPEC_FILTERS is referenced elsewhere; remove import if unused
grep -n "SPEC_FILTERS" packages/generation/research/2026-05-07-sentence-generation-paradigms/paradigm_3_csp.py
If SPEC_FILTERS is no longer referenced anywhere in paradigm_3_csp.py, remove the line from phonolex_generators.cfg_seed.spec_filters import SPEC_FILTERS. Otherwise leave it.
- [ ] Step 7.4: Run cache tests to verify no regression
cd packages/generation && uv run python -m pytest research/2026-05-07-sentence-generation-paradigms/test_domain_cache.py -v
Expected: 23 passed.
- [ ] Step 7.5: Smoke-test paradigm_3_csp imports + a baseline solve
cd packages/generation/research/2026-05-07-sentence-generation-paradigms && \
uv run python -c "
import paradigm_3_csp
from phonolex_data.runtime.store import WordStore
from pathlib import Path
import polars as pl
repo = Path('../../../..').resolve()
store = WordStore.from_parquet(repo / 'data' / 'runtime' / 'words.parquet')
sel_df = pl.read_parquet(repo / 'data' / 'runtime' / 'selectional.parquet')
spec_words = paradigm_3_csp.spec_lexicon(store, 'spec1')
top, stats = paradigm_3_csp.solve('cut', 'spec1', spec_words, sel_df, word_df=store.df)
print(f'top-1: {top[0][\"sentence\"]} domain_size={stats[\"nsubj_domain_size\"]}')
print(f'spec_words type: {type(spec_words).__name__}')
"
Expected: a sentence printed and spec_words type: frozenset.
- [ ] Step 7.6: Commit
git add packages/generation/research/2026-05-07-sentence-generation-paradigms/paradigm_3_csp.py
git commit -m "$(cat <<'EOF'
PHON-103: route spec_lexicon through domain_cache

Local definition becomes a wrapper around get_spec_lexicon, returning
the cached frozenset directly. The frozenset return type is required
for downstream get_filtered_spec keying on id(spec_words). Set ops
downstream (set & frozenset) are unaffected.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
EOF
)"
Task 8: Wire _resolve_domain_words and per_word_axes callsites in paradigm_3_csp.py
Files:
- Modify: <spike>/paradigm_3_csp.py
- [ ] Step 8.1: Replace the body of _resolve_domain_words
Find (around line 93):
def _resolve_domain_words(
    spec_words: set[str],
    constraints: list[Constraint],
    word_df: pl.DataFrame | None,
) -> tuple[set[str], list[dict]]:
    """Apply hard constraints to the spec lexicon. Returns (filtered, trace)."""
    hard = [c for c in constraints if isinstance(c, (ExcludeConstraint, BoundConstraint))]
    if not hard or word_df is None:
        return spec_words, []
    spec_df = word_df.filter(pl.col("word").is_in(list(spec_words)))
    expr = hard_filter_expr(hard)
    if expr is None:
        return spec_words, []
    trace = domain_trace(hard, spec_df)
    filtered = set(spec_df.filter(expr).get_column("word").to_list())
    return filtered, trace
Replace with:
def _resolve_domain_words(
    spec_words: frozenset[str],
    constraints: list[Constraint],
    word_df: pl.DataFrame | None,
) -> tuple[frozenset[str], list[dict]]:
    """Apply hard constraints to the spec lexicon via the domain cache.

    Returns (filtered_words, trace). spec_words must be a frozenset (the
    caller pattern: spec_words = spec_lexicon(store, spec_id) which now
    returns frozenset). Callers that need a mutable set can wrap with set().
    """
    return get_filtered_spec(spec_words, constraints, word_df)
- [ ] Step 8.2: Replace the per_word_axes callsite in solve()
Find (around line 170):
word_axes = per_word_axes(constraints, word_df) if word_df is not None else {}
Replace with:
word_axes = get_per_word_axes(constraints, word_df)
(The if word_df is not None guard moves into get_per_word_axes itself.)
- [ ] Step 8.3: Run cache tests to verify no regression
cd packages/generation && uv run python -m pytest research/2026-05-07-sentence-generation-paradigms/test_domain_cache.py -v
Expected: 23 passed.
- [ ] Step 8.4: Smoke-test paradigm_3_csp end-to-end
cd packages/generation/research/2026-05-07-sentence-generation-paradigms && \
uv run python -c "
import paradigm_3_csp, domain_cache
from phonolex_data.runtime.store import WordStore
from pathlib import Path
import polars as pl
repo = Path('../../../..').resolve()
store = WordStore.from_parquet(repo / 'data' / 'runtime' / 'words.parquet')
sel_df = pl.read_parquet(repo / 'data' / 'runtime' / 'selectional.parquet')
spec_words = paradigm_3_csp.spec_lexicon(store, 'spec1')
domain_cache.clear_caches()
for _ in range(3):
    paradigm_3_csp.solve('cut', 'spec1', spec_words, sel_df, word_df=store.df)
print(domain_cache.get_cache_stats())
"
Expected: filtered_spec shows 1 miss + 2 hits; per_word_axes shows 1 miss + 2 hits.
- [ ] Step 8.5: Commit
git add packages/generation/research/2026-05-07-sentence-generation-paradigms/paradigm_3_csp.py
git commit -m "$(cat <<'EOF'
PHON-103: route _resolve_domain_words and solve()'s per_word_axes through cache

_resolve_domain_words becomes a thin pass-through to get_filtered_spec.
solve()'s per_word_axes call routes through get_per_word_axes, which
absorbs the word_df=None guard.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
EOF
)"
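When reading the smoke-test output from Step 8.4, it can help to turn the raw counters into hit rates. The helper below is a hypothetical convenience, not part of domain_cache; the sample dict simply restates the expected counts from Step 8.4 in the shape that get_cache_stats() returns.

```python
def hit_rates(stats: dict[str, dict[str, int]]) -> dict[str, float]:
    """Hits divided by total lookups per cache; 0.0 when nothing was looked up."""
    rates: dict[str, float] = {}
    for name, counts in stats.items():
        lookups = counts["hits"] + counts["misses"]
        rates[name] = counts["hits"] / lookups if lookups else 0.0
    return rates


# Counts taken from the Step 8.4 expectation (1 miss + 2 hits each).
expected_stats = {
    "filtered_spec": {"hits": 2, "misses": 1, "evictions": 0},
    "per_word_axes": {"hits": 2, "misses": 1, "evictions": 0},
}

rates = hit_rates(expected_stats)
for name, rate in rates.items():
    print(f"{name}: {rate:.0%} hit rate")
```

In a real run the argument would be domain_cache.get_cache_stats(). A rate well below two thirds after three identical solves would suggest a key that changes between calls, for example a rebuilt spec_words set.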
Task 9: Wire _filtered_domain and per_word_axes callsites in paragraph_csp.py
Files:
- Modify: <spike>/paragraph_csp.py
paragraph_csp.py has its own _filtered_domain function (line 94) that duplicates _resolve_domain_words logic, plus a per_word_axes call in solve_paragraph (line 231). Both route through the cache.
- [ ] Step 9.1: Add the cache import
Find the import block (around lines 50–58):
from constraint_surface import (
    ...
    per_word_axes,
)
from skeleton_csp import (
    ...
)
After the skeleton_csp import block, add:
from domain_cache import get_filtered_spec, get_per_word_axes
Also remove per_word_axes from the from constraint_surface import (...) block — it's no longer used directly.
- [ ] Step 9.2: Replace the body of _filtered_domain
Find (around line 94):
def _filtered_domain(
    spec_words: set[str],
    constraints: tuple[Constraint, ...],
    word_df: pl.DataFrame,
) -> set[str]:
    """Apply hard constraints to spec_words → narrowed domain."""
    hard = [c for c in constraints if isinstance(c, (ExcludeConstraint, BoundConstraint))]
    if not hard:
        return spec_words
    spec_df = word_df.filter(pl.col("word").is_in(list(spec_words)))
    expr = hard_filter_expr(hard)
    if expr is None:
        return spec_words
    return set(spec_df.filter(expr).get_column("word").to_list())
Replace with:
def _filtered_domain(
    spec_words: frozenset[str],
    constraints: tuple[Constraint, ...],
    word_df: pl.DataFrame,
) -> frozenset[str]:
    """Apply hard constraints to spec_words via the domain cache.

    Routes through get_filtered_spec; keys on id(spec_words) so the same
    frozenset reference reused across the verb loop hits the cache.
    """
    filtered, _ = get_filtered_spec(spec_words, list(constraints), word_df)
    return filtered
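The docstring's point about keying on id(spec_words) can be illustrated with a toy identity-keyed LRU. This is only a sketch under assumptions: `cached_filter`, `MAX_ENTRIES`, and the substring-based "filter" are hypothetical stand-ins, and storing a reference to `spec_words` alongside the result (so its id cannot be recycled while the entry lives) is a choice made for this sketch, not something the plan confirms about `domain_cache.py`.

```python
from collections import OrderedDict

_cache: OrderedDict = OrderedDict()
_stats = {"hits": 0, "misses": 0}
MAX_ENTRIES = 8  # hypothetical capacity


def cached_filter(spec_words: frozenset, hard: tuple) -> frozenset:
    """Toy filter keyed on the identity of spec_words, not its contents."""
    key = (id(spec_words), hard)
    if key in _cache:
        _stats["hits"] += 1
        _cache.move_to_end(key)  # mark as most recently used
        return _cache[key][1]
    _stats["misses"] += 1
    # Stand-in for the real hard-constraint filter: drop words containing
    # any excluded substring.
    result = frozenset(w for w in spec_words if not any(h in w for h in hard))
    # Hold spec_words itself so its id stays unique while the entry is cached.
    _cache[key] = (spec_words, result)
    if len(_cache) > MAX_ENTRIES:
        _cache.popitem(last=False)  # evict least recently used
    return result


a = frozenset({"cat", "car"})
b = frozenset({"kit"})

stable = a | b
for _ in range(3):
    cached_filter(stable, ("r",))
stats_stable = dict(_stats)  # same reference each time: 1 miss, 2 hits

rebuilt = a | b  # equal contents, but a distinct object
cached_filter(rebuilt, ("r",))
stats_after_rebuild = dict(_stats)  # identity differs: a second miss
```

The practical consequence is that a caller which rebuilds the union on every request gets no reuse from an identity-keyed entry, whereas holding one frozenset for the whole verb loop does.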
- [ ] Step 9.3: Replace the per_word_axes callsite in solve_paragraph()
Find (around line 231):
word_axes = per_word_axes(list(spec.constraints), store_df)
Replace with:
word_axes = get_per_word_axes(list(spec.constraints), store_df)
- [ ] Step 9.4: Verify hard_filter_expr and domain_trace are no longer referenced; remove if unused
grep -n "hard_filter_expr\|domain_trace\|^from constraint_surface" packages/generation/research/2026-05-07-sentence-generation-paradigms/paragraph_csp.py
If hard_filter_expr is no longer referenced (the only caller was inside the now-replaced _filtered_domain body), remove it from the from constraint_surface import (...) block. Apply the same check to domain_trace.
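If grep output is ambiguous (for example, the name also appears in comments or docstrings), a syntax-aware check is one alternative. The sketch below is an optional aid, not part of the plan: `name_usage` is a hypothetical helper, and `SAMPLE` is a made-up excerpt rather than the real contents of paragraph_csp.py.

```python
import ast


def name_usage(source: str, name: str) -> dict:
    """Report whether `name` is imported via from-import and how often it is used."""
    tree = ast.parse(source)  # parses only; nothing is imported or executed
    imported = False
    uses = 0
    for node in ast.walk(tree):
        if isinstance(node, ast.ImportFrom):
            if any((alias.asname or alias.name) == name for alias in node.names):
                imported = True
        elif isinstance(node, ast.Name) and node.id == name:
            uses += 1  # a bare-name reference somewhere in the module
    return {"imported": imported, "uses": uses}


# Made-up excerpt shaped like the post-edit module.
SAMPLE = '''
from constraint_surface import hard_filter_expr, Constraint
from domain_cache import get_filtered_spec

def _filtered_domain(spec_words, constraints, word_df):
    filtered, _ = get_filtered_spec(spec_words, list(constraints), word_df)
    return filtered
'''

report = name_usage(SAMPLE, "hard_filter_expr")
still_used = name_usage(SAMPLE, "get_filtered_spec")
```

A name that reports `imported` as true with zero uses is a candidate for removal from the import block.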
- [ ] Step 9.5: Smoke-test paragraph_csp end-to-end
cd packages/generation/research/2026-05-07-sentence-generation-paradigms && \
uv run python -c "
import paragraph_csp, paradigm_3_csp, domain_cache
from paragraph_csp import ParagraphSpec, solve_paragraph
from phonolex_data.runtime.store import WordStore
from pathlib import Path
import polars as pl
repo = Path('../../../..').resolve()
store = WordStore.from_parquet(repo / 'data' / 'runtime' / 'words.parquet')
sel_df = pl.read_parquet(repo / 'data' / 'runtime' / 'selectional.parquet')
spec_words = paradigm_3_csp.spec_lexicon(store, 'spec1') | paradigm_3_csp.spec_lexicon(store, 'spec6')
domain_cache.clear_caches()
spec = ParagraphSpec(verbs=('chase','sit','eat'), band='fineweb_adult', constraints=(),
n_paragraphs=1, per_sentence_top_k=2)
solve_paragraph(spec, store_df=store.df, sel_df=sel_df, spec_words=spec_words)
print(domain_cache.get_cache_stats())
"
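Expected: filtered_spec shows 1 miss and 0 hits (the single call in _filtered_domain; solve_shape in the verb loop does not route through the cache, as Task 10 spells out); per_word_axes shows 1 miss. The spec_lexicon counters reflect only calls made after clear_caches(), so the two lookups in the | setup are not counted.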
- [ ] Step 9.6: Run cache tests
cd packages/generation && uv run python -m pytest research/2026-05-07-sentence-generation-paradigms/test_domain_cache.py -v
Expected: 23 passed.
- [ ] Step 9.7: Commit
git add packages/generation/research/2026-05-07-sentence-generation-paradigms/paragraph_csp.py
git commit -m "$(cat <<'EOF'
PHON-103: route paragraph_csp _filtered_domain and per_word_axes through cache

_filtered_domain becomes a pass-through to get_filtered_spec.
solve_paragraph's per_word_axes call routes through the cache. With
spec_words held as a stable frozenset for the request duration,
multi-verb paragraphs share one filtered_spec entry.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
EOF
)"
Task 10: Behavior test — paragraph composition reuses cached domain
Files:
- Modify: <spike>/test_domain_cache.py
- [ ] Step 10.1: Write the multi-verb paragraph hit/miss test
Append to <spike>/test_domain_cache.py:
def test_solve_paragraph_reuses_cached_domain(store):
    """3-verb paragraph with shared constraints: solve_paragraph calls
    get_filtered_spec once (in _filtered_domain) and get_per_word_axes once,
    both outside the verb loop. solve_shape uses neither cache, so each
    cache records 1 miss and 0 hits."""
    from paragraph_csp import ParagraphSpec, solve_paragraph

    repo_root = Path(__file__).resolve().parents[4]
    sel_df = pl.read_parquet(repo_root / "data" / "runtime" / "selectional.parquet")
    spec_words = (
        domain_cache.get_spec_lexicon("spec1", store)
        | domain_cache.get_spec_lexicon("spec6", store)
    )
    # Reset to discount the spec_lexicon prep above
    domain_cache.clear_caches()

    spec = ParagraphSpec(
        verbs=("chase", "sit", "eat"),
        band="fineweb_adult",
        constraints=(IncludeConstraint(phonemes=("k",)),),
        n_paragraphs=1,
        per_sentence_top_k=2,
    )
    solve_paragraph(spec, store_df=store.df, sel_df=sel_df, spec_words=spec_words)

    stats = domain_cache.get_cache_stats()
    # filtered_spec: paragraph_csp._filtered_domain calls it once with
    # constraints=(IncludeConstraint,). IncludeConstraint isn't a _HARD_TYPE,
    # so the `hard` tuple is empty, but the first call is still a cache MISS.
    # solve_paragraph makes no further _filtered_domain calls, and the
    # verb-loop solve_shape calls do not route through filtered_spec.
    # Therefore: 1 miss, 0 hits.
    assert stats["filtered_spec"]["misses"] == 1
    # per_word_axes: solve_paragraph calls it once outside the verb loop.
    # solve_shape does NOT call per_word_axes (that's solve()'s job), and
    # paragraph_csp uses solve_shape directly. So: 1 miss, 0 hits.
    assert stats["per_word_axes"]["misses"] == 1


def test_solve_loop_reuses_cached_domain(store):
    """3-verb loop calling paradigm_3_csp.solve directly → 1 miss + 2 hits
    on filtered_spec and per_word_axes (solve() goes through both caches
    per call)."""
    import paradigm_3_csp

    repo_root = Path(__file__).resolve().parents[4]
    sel_df = pl.read_parquet(repo_root / "data" / "runtime" / "selectional.parquet")
    spec_words = paradigm_3_csp.spec_lexicon(store, "spec1")
    domain_cache.clear_caches()

    constraints = [IncludeConstraint(phonemes=("k",)), ExcludeConstraint(phonemes=("ɹ",))]
    for verb in ("cut", "chase", "eat"):
        paradigm_3_csp.solve(
            verb, "spec1", spec_words, sel_df,
            constraints=constraints, word_df=store.df,
        )

    stats = domain_cache.get_cache_stats()
    assert stats["filtered_spec"]["misses"] == 1
    assert stats["filtered_spec"]["hits"] == 2
    assert stats["per_word_axes"]["misses"] == 1
    assert stats["per_word_axes"]["hits"] == 2
The two tests cover the two distinct call-shapes: solve_paragraph (single call to each cache, no verb-loop hits because solve_shape doesn't use the caches) and paradigm_3_csp.solve looped per verb (verb-loop hits).
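The difference between the two call-shapes can be reproduced with a toy counting cache. This is a minimal sketch: `cached_axes`, `reset`, and the string key are hypothetical stand-ins and do not mirror the real signatures in `domain_cache.py`.

```python
stats = {"hits": 0, "misses": 0}
_store: dict = {}


def cached_axes(key: str) -> dict:
    """Toy cached lookup that records hits and misses."""
    if key in _store:
        stats["hits"] += 1
    else:
        stats["misses"] += 1
        _store[key] = {"axes_for": key}
    return _store[key]


def reset() -> None:
    _store.clear()
    stats.update(hits=0, misses=0)


VERBS = ("cut", "chase", "eat")

# Shape A (solve_paragraph-like): one lookup hoisted above the verb loop;
# the loop body reuses the local result and never touches the cache.
reset()
axes = cached_axes("constraints")
for verb in VERBS:
    _ = (verb, axes)
shape_a = dict(stats)

# Shape B (looped solve()-like): every iteration goes through the cache.
reset()
for verb in VERBS:
    _ = (verb, cached_axes("constraints"))
shape_b = dict(stats)
```

Both shapes compute the value once; only shape B registers hits, because only shape B asks the cache more than once.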
- [ ] Step 10.2: Run the tests, verify they pass
cd packages/generation && uv run python -m pytest research/2026-05-07-sentence-generation-paradigms/test_domain_cache.py -v
Expected: 25 passed. If either of the two new tests fails, the actual hit/miss counts in the failure message tell you the real call pattern; reconcile by adjusting expectations to match (the design only requires that some reuse occurs, not exact numbers).
- [ ] Step 10.3: Commit
git add packages/generation/research/2026-05-07-sentence-generation-paradigms/test_domain_cache.py
git commit -m "$(cat <<'EOF'
PHON-103: behavior tests — verb-loop reuses cached domain

Two tests cover the two call-shapes: solve_paragraph (one call to each
cache, no verb-loop hits because solve_shape bypasses the caches) and
paradigm_3_csp.solve looped per verb (verb-loop hits 1 miss + 2 hits).

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
EOF
)"
Task 11: Bench script + record baseline
Files:
- Create: <spike>/bench_domain_cache.py
- [ ] Step 11.1: Create the bench script
Create <spike>/bench_domain_cache.py:
"""Bench domain_cache speedup across realistic scenarios — PHON-103.

Run: uv run python research/2026-05-07-sentence-generation-paradigms/bench_domain_cache.py

Reports wall-clock for 4 conditions and the cache stats per condition.
"""
from __future__ import annotations

import sys
import time
from pathlib import Path

import polars as pl

sys.path.insert(0, str(Path(__file__).parent))

import domain_cache
import paradigm_3_csp
from constraint_surface import (
    BoundConstraint,
    ExcludeConstraint,
    IncludeConstraint,
)
from paragraph_csp import ParagraphSpec, solve_paragraph
from phonolex_data.runtime.store import WordStore


def _load_data() -> tuple[WordStore, pl.DataFrame]:
    repo_root = Path(__file__).resolve().parents[4]
    store = WordStore.from_parquet(repo_root / "data" / "runtime" / "words.parquet")
    sel_df = pl.read_parquet(repo_root / "data" / "runtime" / "selectional.parquet")
    return store, sel_df


def _run_paragraph(store: WordStore, sel_df: pl.DataFrame, constraints: tuple) -> float:
    spec_words = (
        paradigm_3_csp.spec_lexicon(store, "spec1")
        | paradigm_3_csp.spec_lexicon(store, "spec6")
    )
    spec = ParagraphSpec(
        verbs=("chase", "sit", "eat"),
        band="fineweb_adult",
        constraints=constraints,
        n_paragraphs=2,
        per_sentence_top_k=4,
    )
    t0 = time.perf_counter()
    solve_paragraph(spec, store_df=store.df, sel_df=sel_df, spec_words=spec_words)
    return time.perf_counter() - t0


def main() -> None:
    print("Loading WordStore + selectional.parquet…")
    store, sel_df = _load_data()

    constraints_a = (IncludeConstraint(phonemes=("k",)),)
    constraints_b = (
        ExcludeConstraint(phonemes=("ɹ",)),
        BoundConstraint(norm="aoa", max_value=6.0),
    )

    print("\n=== Condition 1: cache COLD, 1 paragraph ===")
    domain_cache.clear_caches()
    t = _run_paragraph(store, sel_df, constraints_a)
    print(f"wall: {t:.3f}s")
    print(f"stats: {domain_cache.get_cache_stats()}")

    print("\n=== Condition 2: cache WARM, repeat 1 paragraph ===")
    t = _run_paragraph(store, sel_df, constraints_a)
    print(f"wall: {t:.3f}s")
    print(f"stats: {domain_cache.get_cache_stats()}")

    print("\n=== Condition 3: 5 paragraphs, same constraints ===")
    domain_cache.clear_caches()
    t0 = time.perf_counter()
    for _ in range(5):
        _run_paragraph(store, sel_df, constraints_a)
    total = time.perf_counter() - t0
    print(f"wall: {total:.3f}s ({total/5:.3f}s avg)")
    print(f"stats: {domain_cache.get_cache_stats()}")

    print("\n=== Condition 4: 5 paragraphs, alternating constraint sets ===")
    domain_cache.clear_caches()
    t0 = time.perf_counter()
    for i in range(5):
        cs = constraints_a if i % 2 == 0 else constraints_b
        _run_paragraph(store, sel_df, cs)
    total = time.perf_counter() - t0
    print(f"wall: {total:.3f}s ({total/5:.3f}s avg)")
    print(f"stats: {domain_cache.get_cache_stats()}")


if __name__ == "__main__":
    main()
- [ ] Step 11.2: Run the bench
cd packages/generation && uv run python research/2026-05-07-sentence-generation-paradigms/bench_domain_cache.py
Expected: Four condition reports printed. Capture the numbers — they go in the spec.
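Single wall-clock samples can be noisy. If the captured numbers look unstable, one optional approach is to repeat a measurement and record the minimum. The sketch below is not part of the bench script: `time_runs` and `fake_workload` are hypothetical names, and the workload is a placeholder rather than a real solve_paragraph call.

```python
import time
from typing import Callable


def time_runs(fn: Callable[[], object], repeats: int = 3) -> list[float]:
    """Call fn `repeats` times and return each wall-clock duration in seconds."""
    samples = []
    for _ in range(repeats):
        t0 = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - t0)
    return samples


counter = {"calls": 0}


def fake_workload() -> None:
    # Placeholder for one benchmark condition.
    counter["calls"] += 1
    sum(range(10_000))


samples = time_runs(fake_workload, repeats=5)
best = min(samples)  # least-disturbed sample
```

Note that repeating a cold condition this way would require clearing the caches inside the timed callable, since every run after the first is otherwise warm.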
- [ ] Step 11.3: Append baseline numbers to the spec
Open docs/superpowers/specs/2026-05-08-phon-103-csp-domain-caching-design.md and append at the end:
## Empirical baseline (recorded 2026-05-08)
From `bench_domain_cache.py`:
| Condition | Wall-clock | spec_lexicon h/m/e | filtered_spec h/m/e | per_word_axes h/m/e |
|---|---|---|---|---|
| Cold, 1 paragraph | <FILL>s | <FILL> | <FILL> | <FILL> |
| Warm, 1 paragraph (repeat) | <FILL>s | <FILL> | <FILL> | <FILL> |
| 5 paragraphs, same constraints | <FILL>s total (<FILL>s avg) | <FILL> | <FILL> | <FILL> |
| 5 paragraphs, alternating | <FILL>s total (<FILL>s avg) | <FILL> | <FILL> | <FILL> |
Speedup: warm vs cold = <FILL>×; 5 same vs 5 alternating = <FILL>×.
Replace each <FILL> with the actual number from the bench output.
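For the two speedup cells, the ratio is the slower wall-clock divided by the faster one (cold over warm, alternating over same). A small sketch, assuming a hypothetical `speedup` helper; the inputs below are illustrative placeholders, not measurements from the bench.

```python
def speedup(baseline_s: float, candidate_s: float) -> float:
    """Return how many times faster the candidate is than the baseline."""
    if candidate_s <= 0:
        raise ValueError("candidate time must be positive")
    return baseline_s / candidate_s


def fmt_speedup(baseline_s: float, candidate_s: float) -> str:
    """Format the ratio the way the table's Speedup line expects."""
    return f"{speedup(baseline_s, candidate_s):.2f}×"


# Placeholder inputs only: substitute the wall-clock values printed by the bench.
example = fmt_speedup(2.0, 0.5)
```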
- [ ] Step 11.4: Commit
git add packages/generation/research/2026-05-07-sentence-generation-paradigms/bench_domain_cache.py \
docs/superpowers/specs/2026-05-08-phon-103-csp-domain-caching-design.md
git commit -m "$(cat <<'EOF'
PHON-103: bench script + recorded baseline

bench_domain_cache.py reports wall-clock + cache stats for 4
conditions. Baseline numbers folded into the design spec.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
EOF
)"
Done
After Task 11 commits, the cache implementation is feature-complete with tests + a bench. PHON-103 closes; PHON-104 (per-slot top-N pruning) is unblocked.