
ThematicConstraint Implementation Plan

For agentic workers: REQUIRED: Use superpowers:subagent-driven-development (if subagents available) or superpowers:executing-plans to implement this plan. Steps use checkbox (- [ ]) syntax for tracking.

Goal: Add a ThematicConstraint that uses SWOW/USF cognitive association data to boost tokens within a clinician-defined semantic field.

Architecture: Load SWOW + USF into a canonical association graph at server startup. ThematicConstraint scores each token's word against the graph using exemplar-based max aggregation, applies a minimum-score threshold, and produces a static LogitBoost. The graph flows through the existing kwarg chain to the constraint's build() method.
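The scoring path described above can be sketched end to end in plain Python (illustrative names only, not the plan's actual API):

```python
def field_scores(vocab, seeds, graph, threshold=0.02, strength=1.5):
    """Exemplar-based max aggregation: score = max association across seeds."""
    def assoc(a, b):
        # Canonical (alphabetical) key ordering covers both directions.
        return graph.get((min(a, b), max(a, b)), 0.0)

    scores = {}
    for token_id, word in vocab.items():
        s = max((assoc(word, seed) for seed in seeds), default=0.0)
        if s >= threshold:  # weak associations are dropped
            scores[token_id] = s * strength
    return scores

graph = {("dog", "puppy"): 0.25, ("cat", "dog"): 0.10}
scores = field_scores({0: "dog", 1: "puppy", 2: "tree"}, ["dog"], graph)
# scores == {1: 0.375}; "dog" itself and "tree" fall below the threshold
```

The resulting scores dict is exactly the shape a static LogitBoost would be built from.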

Tech Stack: Python (diffusion_governors engine, FastAPI dashboard, phonolex_data loaders), TypeScript/React (dashboard frontend), PyTest, Vitest

Spec: docs/superpowers/specs/2026-03-17-thematic-constraint-design.md


Chunk 1: Engine — ThematicConstraint + Association Graph

Task 1: Write failing tests for ThematicConstraint

Files:
- Create: packages/governors/tests/test_thematic.py

  • [ ] Step 1: Write tests

The tests use a small synthetic association graph instead of loading real data:

"""Tests for ThematicConstraint."""

from __future__ import annotations

import pytest
import torch

from diffusion_governors.core import Governor, GovernorContext
from diffusion_governors.boosts import LogitBoost
from diffusion_governors.lookups import PhonoFeatures, TokenFeatures


def _make_lookup() -> dict[str, dict]:
    """Lookup with 6 tokens for thematic testing."""
    entries = {
        "0": TokenFeatures(word="dog", phono=PhonoFeatures(phonemes=["d", "ɑ", "ɡ"])),
        "1": TokenFeatures(word="puppy", phono=PhonoFeatures(phonemes=["p", "ʌ", "p", "i"])),
        "2": TokenFeatures(word="cat", phono=PhonoFeatures(phonemes=["k", "æ", "t"])),
        "3": TokenFeatures(word="tree", phono=PhonoFeatures(phonemes=["t", "ɹ", "i"])),
        "4": TokenFeatures(word="the"),  # no phono, function word
        "5": TokenFeatures(word="fish", phono=PhonoFeatures(phonemes=["f", "ɪ", "ʃ"])),
    }
    return {k: v.to_dict() for k, v in entries.items()}


def _make_graph() -> dict[tuple[str, str], float]:
    """Synthetic association graph (canonical key ordering: a < b).

    Associations:
        dog ↔ puppy: 0.25
        cat ↔ puppy: 0.08
        dog ↔ cat: 0.10
        dog ↔ fish: 0.03  (below typical threshold)
        tree ↔ fish: 0.0  (not in graph)
    """
    return {
        ("dog", "puppy"): 0.25,
        ("cat", "puppy"): 0.08,
        ("cat", "dog"): 0.10,
        ("dog", "fish"): 0.03,
    }


VOCAB_SIZE = 6


class TestThematicConstraint:
    """Tests for ThematicConstraint."""

    def test_compiles_to_boost(self):
        from diffusion_governors.thematic import ThematicConstraint

        lookup = _make_lookup()
        graph = _make_graph()
        c = ThematicConstraint(seed_words=["dog"])
        gov = Governor.from_constraints(
            c, vocab_size=VOCAB_SIZE, device="cpu", lookup=lookup, assoc_graph=graph,
        )
        assert len(gov.boosts) == 1
        assert isinstance(gov.boosts[0], LogitBoost)

    def test_mechanism_kind_is_boost(self):
        from diffusion_governors.thematic import ThematicConstraint

        c = ThematicConstraint(seed_words=["dog"])
        assert c.mechanism_kind == "boost"

    def test_single_seed_field_scoring(self):
        """Seed 'dog' boosts words associated with 'dog'."""
        from diffusion_governors.thematic import ThematicConstraint

        lookup = _make_lookup()
        graph = _make_graph()
        c = ThematicConstraint(seed_words=["dog"], strength=2.0, threshold=0.05)
        gov = Governor.from_constraints(
            c, vocab_size=VOCAB_SIZE, device="cpu", lookup=lookup, assoc_graph=graph,
        )
        logits = torch.zeros(1, 1, VOCAB_SIZE)
        out = gov(logits, GovernorContext())

        # dog(0): self-association not in graph → 0
        assert out[0, 0, 0].item() == pytest.approx(0.0)
        # puppy(1): assoc(dog,puppy)=0.25 → 0.25 * 2.0 = 0.5
        assert out[0, 0, 1].item() == pytest.approx(0.5, abs=1e-4)
        # cat(2): assoc(dog,cat)=0.10 → 0.10 * 2.0 = 0.2
        assert out[0, 0, 2].item() == pytest.approx(0.2, abs=1e-4)
        # tree(3): no assoc with dog → 0
        assert out[0, 0, 3].item() == pytest.approx(0.0)
        # the(4): not in graph → 0
        assert out[0, 0, 4].item() == pytest.approx(0.0)
        # fish(5): assoc(dog,fish)=0.03, below threshold 0.05 → 0
        assert out[0, 0, 5].item() == pytest.approx(0.0)

    def test_multi_seed_max_aggregation(self):
        """Multiple seeds: field_score = max across seeds."""
        from diffusion_governors.thematic import ThematicConstraint

        lookup = _make_lookup()
        graph = _make_graph()
        # Seeds: dog + cat
        c = ThematicConstraint(seed_words=["dog", "cat"], strength=1.0, threshold=0.05)
        gov = Governor.from_constraints(
            c, vocab_size=VOCAB_SIZE, device="cpu", lookup=lookup, assoc_graph=graph,
        )
        logits = torch.zeros(1, 1, VOCAB_SIZE)
        out = gov(logits, GovernorContext())

        # puppy(1): max(assoc(dog,puppy)=0.25, assoc(cat,puppy)=0.08) = 0.25
        assert out[0, 0, 1].item() == pytest.approx(0.25, abs=1e-4)
        # cat(2): max(assoc(dog,cat)=0.10, self) = 0.10
        assert out[0, 0, 2].item() == pytest.approx(0.10, abs=1e-4)
        # dog(0): max(self, assoc(cat,dog)=0.10) = 0.10
        assert out[0, 0, 0].item() == pytest.approx(0.10, abs=1e-4)

    def test_threshold_filters_weak_associations(self):
        """Words below threshold get zero boost."""
        from diffusion_governors.thematic import ThematicConstraint

        lookup = _make_lookup()
        graph = _make_graph()
        c = ThematicConstraint(seed_words=["dog"], strength=1.0, threshold=0.15)
        gov = Governor.from_constraints(
            c, vocab_size=VOCAB_SIZE, device="cpu", lookup=lookup, assoc_graph=graph,
        )
        logits = torch.zeros(1, 1, VOCAB_SIZE)
        out = gov(logits, GovernorContext())

        # puppy(1): 0.25 >= 0.15 → boosted
        assert out[0, 0, 1].item() == pytest.approx(0.25, abs=1e-4)
        # cat(2): 0.10 < 0.15 → filtered out
        assert out[0, 0, 2].item() == pytest.approx(0.0)

    def test_no_matching_seeds_returns_zero_boost(self):
        """If no seeds have associations, all tokens get zero."""
        from diffusion_governors.thematic import ThematicConstraint

        lookup = _make_lookup()
        graph = _make_graph()
        c = ThematicConstraint(seed_words=["xylophone"], strength=2.0)
        gov = Governor.from_constraints(
            c, vocab_size=VOCAB_SIZE, device="cpu", lookup=lookup, assoc_graph=graph,
        )
        logits = torch.zeros(1, 1, VOCAB_SIZE)
        out = gov(logits, GovernorContext())
        assert torch.allclose(out, logits)

    def test_requires_assoc_graph_kwarg(self):
        """build() raises ValueError if assoc_graph not provided."""
        from diffusion_governors.thematic import ThematicConstraint

        lookup = _make_lookup()
        c = ThematicConstraint(seed_words=["dog"])
        with pytest.raises(ValueError, match="assoc_graph"):
            c.build(vocab_size=VOCAB_SIZE, device="cpu", lookup=lookup)

    def test_exported_from_package(self):
        from diffusion_governors import ThematicConstraint  # noqa: F401


class TestAssocStrength:
    """Tests for the assoc_strength helper."""

    def test_canonical_key_order(self):
        from diffusion_governors.thematic import assoc_strength

        graph = {("apple", "banana"): 0.5}
        # Both orderings should find it
        assert assoc_strength(graph, "apple", "banana") == 0.5
        assert assoc_strength(graph, "banana", "apple") == 0.5

    def test_missing_pair_returns_zero(self):
        from diffusion_governors.thematic import assoc_strength

        graph = {("apple", "banana"): 0.5}
        assert assoc_strength(graph, "apple", "cherry") == 0.0
  • [ ] Step 2: Run tests to verify they fail

Run: uv run pytest packages/governors/tests/test_thematic.py -v
Expected: FAIL — thematic module doesn't exist

  • [ ] Step 3: Commit failing tests
git add packages/governors/tests/test_thematic.py
git commit -m "test: add failing tests for ThematicConstraint"

Task 2: Implement ThematicConstraint

Files:
- Create: packages/governors/src/diffusion_governors/thematic.py
- Modify: packages/governors/src/diffusion_governors/__init__.py

  • [ ] Step 1: Create thematic.py
"""ThematicConstraint — association-backed semantic field boost.

Uses SWOW/USF cognitive association graph to define a semantic field
from exemplar words. Boosts tokens whose words are within the field,
weighted by association strength.

    ThematicConstraint(seed_words=["dog", "cat"], strength=1.5)
"""

from __future__ import annotations

from typing import Any

import torch

from diffusion_governors.constraints import Constraint
from diffusion_governors.core import Mechanism
from diffusion_governors.lookups import Lookup

# Type alias for the association graph
AssocGraph = dict[tuple[str, str], float]


def assoc_strength(graph: AssocGraph, a: str, b: str) -> float:
    """Look up association strength between two words.

    Uses canonical key ordering (alphabetical) so both directions are covered.
    Returns 0.0 if the pair is not in the graph.
    """
    key = (min(a, b), max(a, b))
    return graph.get(key, 0.0)


def build_assoc_graph(swow: dict[str, dict[str, float]],
                      usf: dict[str, dict[str, float]]) -> AssocGraph:
    """Merge SWOW and USF into a single canonical association graph.

    Both datasets use 0-1 normalized association strength.
    For each word pair, the maximum strength across sources is kept.

    Args:
        swow: {cue: {response: strength}} from load_swow()
        usf: {cue: {target: strength}} from load_free_association()

    Returns:
        {(word_a, word_b): strength} where word_a < word_b (canonical ordering)
    """
    graph: AssocGraph = {}

    for cue, responses in swow.items():
        for response, strength in responses.items():
            key = (min(cue, response), max(cue, response))
            graph[key] = max(graph.get(key, 0.0), strength)

    for cue, targets in usf.items():
        for target, strength in targets.items():
            key = (min(cue, target), max(cue, target))
            graph[key] = max(graph.get(key, 0.0), strength)

    return graph


class ThematicConstraint(Constraint):
    """Semantic field boost using cognitive association data.

    Defines a field from exemplar seed words. Sweeps the vocabulary and
    boosts tokens whose words are associated with any seed above a threshold.
    Uses exemplar-based max aggregation: field_score = max across seeds.

    Args:
        seed_words: Exemplar words defining the semantic field.
        strength: Scales the association-derived weights. Default 1.5.
        threshold: Minimum field_score for inclusion. Default 0.02.
    """

    def __init__(self, seed_words: list[str], strength: float = 1.5, threshold: float = 0.02):
        self.seed_words = [w.lower().strip() for w in seed_words]
        self.strength = strength
        self.threshold = threshold

    @property
    def mechanism_kind(self) -> str:
        return "boost"

    def build(self, **kwargs: Any) -> Mechanism:
        from diffusion_governors.boosts import LogitBoost

        vocab_size: int = kwargs["vocab_size"]
        lookup: Lookup = kwargs["lookup"]
        graph: AssocGraph | None = kwargs.get("assoc_graph")

        if graph is None:
            raise ValueError("ThematicConstraint requires assoc_graph in build kwargs")

        scores: dict[int, float] = {}
        for tid_str, entry in lookup.items():
            tid = int(tid_str)
            if tid >= vocab_size:
                continue
            word = entry.get("word", "").lower().strip()
            if not word:
                continue

            # Exemplar-based: max association across all seeds
            field_score = max(
                (assoc_strength(graph, word, seed) for seed in self.seed_words),
                default=0.0,
            )
            if field_score >= self.threshold:
                scores[tid] = field_score

        return LogitBoost.from_scores(scores, vocab_size, scale=self.strength)
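As a sanity check on the max-merge semantics, here is a standalone sketch that inlines a minimal copy of the merge loop (not one of the plan's files):

```python
def merge_assoc(swow, usf):
    """Minimal copy of the build_assoc_graph merge logic, for illustration."""
    graph = {}
    for source in (swow, usf):
        for cue, responses in source.items():
            for other, strength in responses.items():
                key = (min(cue, other), max(cue, other))
                graph[key] = max(graph.get(key, 0.0), strength)
    return graph

# The same pair seen from opposite directions collapses to one canonical
# key, keeping the stronger value across sources.
graph = merge_assoc(
    {"dog": {"puppy": 0.25}},  # SWOW: cue → response
    {"puppy": {"dog": 0.10}},  # USF: cue → target
)
# graph == {("dog", "puppy"): 0.25}
```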
  • [ ] Step 2: Update __init__.py — add ThematicConstraint to exports

Add to imports:

from diffusion_governors.thematic import ThematicConstraint

Add "ThematicConstraint" to __all__.

  • [ ] Step 3: Run tests

Run: uv run pytest packages/governors/tests/test_thematic.py -v
Expected: All pass

  • [ ] Step 4: Run all governor tests for regressions

Run: uv run pytest packages/governors/tests/ -v
Expected: All pass

  • [ ] Step 5: Commit
git add packages/governors/src/diffusion_governors/thematic.py \
       packages/governors/src/diffusion_governors/__init__.py
git commit -m "feat: add ThematicConstraint — association-backed semantic field boost"

Chunk 2: Dashboard Server — Graph Loading, Schema, Governor Bridge

Task 3: Load association graph at server startup

Files:
- Modify: packages/dashboard/server/model.py

  • [ ] Step 1: Add association graph loading to model.py

Add a new module-level singleton and load it in load_model():

# Add to module-level state section (after _lookup):
_assoc_graph: dict[tuple[str, str], float] | None = None

# Add accessor:
def get_assoc_graph():
    return _assoc_graph

# In load_model(), after loading _lookup:
    from phonolex_data.loaders import load_swow, load_free_association
    from diffusion_governors.thematic import build_assoc_graph
    print("Loading association graph (SWOW + USF)...")
    swow = load_swow()
    usf = load_free_association()
    _assoc_graph = build_assoc_graph(swow, usf)
    print(f"  Association graph: {len(_assoc_graph)} pairs")

Note: _assoc_graph must be declared global in load_model() alongside the existing globals.
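A minimal sketch of the singleton pattern that note describes (the stand-in dict replaces the real build_assoc_graph call):

```python
_assoc_graph = None

def get_assoc_graph():
    return _assoc_graph

def load_model():
    # Without this declaration, the assignment below would bind a local
    # variable and leave the module-level singleton as None.
    global _assoc_graph
    _assoc_graph = {("dog", "puppy"): 0.25}  # stand-in for build_assoc_graph(...)

load_model()
```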

  • [ ] Step 2: Run server tests

Run: uv run pytest packages/dashboard/server/tests/ -v
Expected: All pass (tests don't call load_model())

  • [ ] Step 3: Commit
git add packages/dashboard/server/model.py
git commit -m "feat: load SWOW + USF association graph at server startup"

Task 4: Add ThematicConstraint schema and wire governor bridge

Files:
- Modify: packages/dashboard/server/schemas.py
- Modify: packages/dashboard/server/governor.py
- Modify: packages/dashboard/server/routes/generate.py

  • [ ] Step 1: Add ThematicConstraint schema to schemas.py
class ThematicConstraint(BaseModel):
    type: Literal["thematic"] = "thematic"
    seed_words: list[str]
    strength: float = 1.5
    threshold: float = 0.02

    @field_validator("seed_words")
    @classmethod
    def validate_seed_words(cls, v):
        if not v:
            raise ValueError("At least one seed word is required")
        return v

Add ThematicConstraint to the Constraint discriminated union.
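A minimal sketch of the union wiring, assuming pydantic v2; OtherConstraint is a hypothetical stand-in for the existing union members:

```python
from typing import Annotated, Literal, Union

from pydantic import BaseModel, Field, TypeAdapter

class OtherConstraint(BaseModel):
    # Hypothetical stand-in for the existing members of the union.
    type: Literal["other"] = "other"

class ThematicConstraint(BaseModel):
    type: Literal["thematic"] = "thematic"
    seed_words: list[str]
    strength: float = 1.5
    threshold: float = 0.02

Constraint = Annotated[
    Union[OtherConstraint, ThematicConstraint],
    Field(discriminator="type"),
]

# The "type" field routes validation to the matching model.
c = TypeAdapter(Constraint).validate_python({"type": "thematic", "seed_words": ["dog"]})
```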

  • [ ] Step 2: Update governor.py

Add imports:

from diffusion_governors import ThematicConstraint as DGThematic
from server.schemas import ThematicConstraint

Add to _to_dg_constraint:

    elif isinstance(c, ThematicConstraint):
        return DGThematic(seed_words=c.seed_words, strength=c.strength, threshold=c.threshold)

Update build_governor signature and kwargs to thread assoc_graph:

def build_governor(
    constraints: list[Constraint], lookup: dict, vocab_size: int,
    tokenizer=None, assoc_graph=None,
) -> Governor | None:
    ...
    kwargs = dict(lookup=lookup, vocab_size=vocab_size)
    if tokenizer is not None:
        kwargs["tokenizer"] = tokenizer
    if assoc_graph is not None:
        kwargs["assoc_graph"] = assoc_graph
    return Governor.from_constraints(*dg_constraints, **kwargs)

Update build_logits_processor to accept and pass assoc_graph:

def build_logits_processor(
    constraints: list[Constraint], lookup: dict, vocab_size: int,
    tokenizer=None, assoc_graph=None,
) -> LogitsProcessorList | None:
    gov = build_governor(constraints, lookup, vocab_size, tokenizer=tokenizer, assoc_graph=assoc_graph)
    ...

Update GovernorCache.get_processor to accept and pass assoc_graph:

def get_processor(
    self, constraints: list, lookup: dict, vocab_size: int,
    tokenizer=None, assoc_graph=None,
) -> LogitsProcessorList | None:
    ...
    proc = build_logits_processor(
        constraints, lookup, vocab_size, tokenizer=tokenizer, assoc_graph=assoc_graph,
    )
    ...

  • [ ] Step 3: Pass assoc_graph in route handlers

In routes/generate.py, update both generate and generate_single to pass the graph:

    assoc_graph = model.get_assoc_graph()
    processor = governor_cache.get_processor(
        constraints, lookup, vocab_size,
        tokenizer=model.get_tokenizer(), assoc_graph=assoc_graph,
    )
  • [ ] Step 4: Add warnings field to response schemas

In schemas.py, add warnings: list[str] | None = None to AssistantResponse and SingleGenerationResponse.

  • [ ] Step 5: Run server tests

Run: uv run pytest packages/dashboard/server/tests/ -v
Expected: All pass

  • [ ] Step 6: Commit
git add packages/dashboard/server/schemas.py \
       packages/dashboard/server/governor.py \
       packages/dashboard/server/routes/generate.py
git commit -m "feat: wire ThematicConstraint through schema, governor bridge, and routes"

Chunk 3: Frontend — Types, Parser, Compiler, ConstraintBar

Task 5: Update frontend for ThematicConstraint

Files:
- Modify: packages/dashboard/frontend/src/types.ts
- Modify: packages/dashboard/frontend/src/commands/parser.ts
- Modify: packages/dashboard/frontend/src/commands/registry.ts
- Modify: packages/dashboard/frontend/src/commands/compiler.ts
- Modify: packages/dashboard/frontend/src/components/ConstraintBar/index.tsx

  • [ ] Step 1: Update types.ts

Add interface:

export interface ThematicConstraint {
  type: "thematic";
  seed_words: string[];
  strength: number;
  threshold?: number;
}

Add "thematic" to ConstraintType, add ThematicConstraint to Constraint union.

Add StoreEntry variant:

  | { type: "theme"; seedWords: string[]; strength: number }

  • [ ] Step 2: Add parseTheme to parser.ts
function parseTheme(args: string[]): ParseResult {
  if (args.length === 0) {
    return { type: "error", message: "Usage: /theme <word>... [strength]" };
  }

  // Check if last arg is a number (strength override)
  let strength = 1.5;
  const lastArg = args[args.length - 1];
  if (args.length >= 2 && !isNaN(parseFloat(lastArg)) && isFinite(Number(lastArg))) {
    strength = parseFloat(lastArg);
    args = args.slice(0, -1);
  }

  if (args.length === 0) {
    return { type: "error", message: "At least one seed word is required" };
  }

  const seedWords = args.map((w) => w.toLowerCase());

  return {
    type: "add",
    entries: [{ type: "theme" as const, seedWords, strength }],
    confirmation: `Theme: ${seedWords.join(", ")}${strength !== 1.5 ? ` (strength ${strength})` : ""}`,
  };
}

Add to parseCommand switch:

case "theme": return parseTheme(args);

Add to parseRemove switch:

case "theme":
  return { type: "remove", targetType: "theme", confirmation: "Removed Theme constraint" };
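The trailing-number heuristic is easy to get wrong; here it is restated as a small Python sketch (illustrative, not the plan's TypeScript):

```python
def parse_theme(args):
    """Split /theme args into (seed_words, strength).

    A trailing numeric arg is treated as a strength override, but only
    when at least one seed word precedes it.
    """
    strength = 1.5
    if len(args) >= 2:
        try:
            strength = float(args[-1])
            args = args[:-1]
        except ValueError:
            pass
    if not args:
        raise ValueError("At least one seed word is required")
    return [w.lower() for w in args], strength

parse_theme(["dog", "cat", "3.0"])  # (["dog", "cat"], 3.0)
parse_theme(["animals", "2.0"])     # (["animals"], 2.0)
parse_theme(["dog"])                # (["dog"], 1.5)
```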

  • [ ] Step 3: Add "theme" to VERBS in registry.ts

  • [ ] Step 4: Add theme compilation to compiler.ts

  // Theme
  const themes = entries.filter((e) => e.type === "theme");
  for (const t of themes) {
    result.push({ type: "thematic", seed_words: t.seedWords, strength: t.strength });
  }
  • [ ] Step 5: Update ConstraintBar

In chipCategory:

case "theme": return "theme";

In chipLabel:

case "theme": return entry.seedWords.join(", ");

In matchFields:

case "theme": return undefined;

  • [ ] Step 6: Run frontend tests

Run: cd packages/dashboard/frontend && npx vitest run
Expected: All pass (may need minor test updates if tests check exhaustive type coverage)

  • [ ] Step 7: Commit
git add packages/dashboard/frontend/src/types.ts \
       packages/dashboard/frontend/src/commands/parser.ts \
       packages/dashboard/frontend/src/commands/registry.ts \
       packages/dashboard/frontend/src/commands/compiler.ts \
       packages/dashboard/frontend/src/components/ConstraintBar/index.tsx
git commit -m "feat: frontend support for /theme command — types, parser, compiler, ConstraintBar"

Task 6: Add frontend tests for theme

Files:
- Modify: packages/dashboard/frontend/src/commands/__tests__/parser.test.ts
- Modify: packages/dashboard/frontend/src/commands/__tests__/compiler.test.ts

  • [ ] Step 1: Add parser tests for /theme
describe("/theme", () => {
  it("parses single seed word", () => {
    const r = parseCommand("/theme dog");
    expect(r?.type).toBe("add");
    expect(r?.entries).toHaveLength(1);
    expect(r?.entries[0]).toEqual({ type: "theme", seedWords: ["dog"], strength: 1.5 });
  });

  it("parses multiple seed words", () => {
    const r = parseCommand("/theme dog cat bird");
    expect(r?.entries[0].seedWords).toEqual(["dog", "cat", "bird"]);
  });

  it("parses strength override", () => {
    const r = parseCommand("/theme dog cat 3.0");
    expect(r?.entries[0].seedWords).toEqual(["dog", "cat"]);
    expect(r?.entries[0].strength).toBe(3.0);
  });

  it("single word with strength", () => {
    const r = parseCommand("/theme animals 2.0");
    expect(r?.entries[0].seedWords).toEqual(["animals"]);
    expect(r?.entries[0].strength).toBe(2.0);
  });

  it("returns error for empty args", () => {
    const r = parseCommand("/theme");
    expect(r?.type).toBe("error");
  });
});
  • [ ] Step 2: Add compiler tests for theme
describe("theme", () => {
  it("compiles theme entry", () => {
    const result = compileConstraints([
      { type: "theme", seedWords: ["dog", "cat"], strength: 1.5 },
    ]);
    expect(result).toEqual([
      { type: "thematic", seed_words: ["dog", "cat"], strength: 1.5 },
    ]);
  });
});
  • [ ] Step 3: Run tests

Run: cd packages/dashboard/frontend && npx vitest run
Expected: All pass

  • [ ] Step 4: Commit
git add packages/dashboard/frontend/src/commands/__tests__/parser.test.ts \
       packages/dashboard/frontend/src/commands/__tests__/compiler.test.ts
git commit -m "test: add parser and compiler tests for /theme command"

Chunk 4: Integration + Docs

Task 7: Integration smoke test

  • [ ] Step 1: Run all Python tests

Run: uv run pytest packages/governors/tests/ packages/dashboard/server/tests/ -v
Expected: All pass

  • [ ] Step 2: Run all frontend tests

Run: cd packages/dashboard/frontend && npx vitest run
Expected: All pass

  • [ ] Step 3: Verify package exports
uv run python -c "
from diffusion_governors import ThematicConstraint
from diffusion_governors.thematic import assoc_strength, build_assoc_graph
print('ThematicConstraint:', ThematicConstraint)
print('assoc_strength:', assoc_strength)
print('build_assoc_graph:', build_assoc_graph)
print('All imports OK')
"

Task 8: Update docs

Files:
- Modify: CLAUDE.md
- Modify: docs/product-plan.md

  • [ ] Step 1: Update CLAUDE.md

Add ThematicConstraint to the Governed Generation section — mention /theme command, association graph, exemplar-based scoring.

  • [ ] Step 2: Update product plan

Mark thematic constraint as complete in Phase C.

  • [ ] Step 3: Commit
git add CLAUDE.md docs/product-plan.md
git commit -m "docs: update CLAUDE.md and product plan for ThematicConstraint"