PHON-154 Variant-Aware Matching - Phase 4b: Frontend audio (per-variant)
For agentic workers: REQUIRED SUB-SKILL: superpowers:subagent-driven-development or superpowers:executing-plans. Steps use checkbox (- [ ]) syntax.
Goal: The Speech Analysis tab shows the production scored against EVERY attested pronunciation — each variant's target + deviation overlay + a "best match" marker — and feeds the session attribution from the best-matching variant.
Architecture: The Worker /api/audio/analyze now returns { produced, variants: [{ canonical, produced, positions, attribution, features }] }. Update audioAnalysisApi types to this shape; DeviationOverlay takes positions directly (so it renders per variant); ProductionCard renders the produced transcript once + each variant's target row + overlay, flagging the best match; AudioAnalysisTool derives each production's session-attribution features from its best-matching variant (lowest mean deviation).
Tech Stack: React + TypeScript + MUI, Vitest + Testing Library.
Spec: docs/superpowers/specs/2026-06-15-phon-154-variant-aware-matching-design.md. Depends on: Phase 4a (Worker returns {produced, variants}).
Out of scope (Phase 4c): Lookup variant display + the superscript has_variants flag on result rows.
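The response shape named under Architecture can be checked at runtime with a small guard. This is a minimal sketch, not part of the plan: the plan passes `res.json()` straight through, and `isPerVariantResult` is a hypothetical helper name. It only verifies the outer structure (`produced`, `variants`, and each variant's `canonical`/`positions` arrays), which is enough to tell the new payload from the pre-PHON-154 one.

```typescript
interface AnalyzePosition { phone: string; deviation: number | null; nearest: string | null; }
interface AnalyzeAttribution { source: string; distances: Record<string, number>; }
interface VariantAnalysis {
  canonical: string[];
  positions: AnalyzePosition[];
  attribution: AnalyzeAttribution | null;
  features: number[] | null;
}
interface AnalyzeResult { produced: string[]; variants: VariantAnalysis[]; }

function isStringArray(x: unknown): x is string[] {
  return Array.isArray(x) && x.every((s) => typeof s === 'string');
}

// Hypothetical guard (an assumption, not in the plan): accepts only the
// per-variant shape and checks the outer structure, not every field.
function isPerVariantResult(x: unknown): x is AnalyzeResult {
  if (typeof x !== 'object' || x === null) return false;
  const r = x as Record<string, unknown>;
  const variants = r['variants'];
  if (!isStringArray(r['produced']) || !Array.isArray(variants)) return false;
  return variants.every((v: unknown) => {
    if (typeof v !== 'object' || v === null) return false;
    const o = v as Record<string, unknown>;
    return isStringArray(o['canonical']) && Array.isArray(o['positions']);
  });
}
```

A guard like this would reject a stale Worker that still returns the flat `{canonical, produced, positions, ...}` payload, since that payload has no `variants` array.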
Reference: current shapes (post-PHON-153, on this branch)
- audioAnalysisApi.ts: AnalyzePosition {phone, deviation, nearest}; AnalyzeAttribution {source, distances}; AnalyzeResult {canonical, produced, positions, attribution, features}; analyzeProduction returns {kind:'ok', result: AnalyzeResult} | {kind:'warming'} | {kind:'error', detail}; analyzeProductionWithRetry; attributeSession.
- DeviationOverlay.tsx: props { result: AnalyzeResult }, reads result.positions.
- ProductionCard.tsx: renders result.produced (Heard row) + <DeviationOverlay result={result}/> (Target row); has warming/error/pending states + onRetry.
- AudioAnalysisTool.tsx: Production has result?: AnalyzeResult; features?: number[]; runAnalysis sets features: outcome.result.features ?? undefined; a featureSig/effect pools p.features via attributeSession.
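When migrating existing test fixtures, the relationship between the current flat shape and the per-variant shape can be sketched as a one-line adapter. `LegacyAnalyzeResult` and `toPerVariant` are hypothetical names introduced for illustration, and the nullability of the legacy `attribution`/`features` fields is an assumption based on the `?? undefined` in `runAnalysis`.

```typescript
interface AnalyzePosition { phone: string; deviation: number | null; nearest: string | null; }
interface AnalyzeAttribution { source: string; distances: Record<string, number>; }

// Post-PHON-153 shape: one target per result (nullability is assumed).
interface LegacyAnalyzeResult {
  canonical: string[];
  produced: string[];
  positions: AnalyzePosition[];
  attribution: AnalyzeAttribution | null;
  features: number[] | null;
}

// PHON-154 shape: the transcript once, plus one entry per attested pronunciation.
interface VariantAnalysis {
  canonical: string[];
  positions: AnalyzePosition[];
  attribution: AnalyzeAttribution | null;
  features: number[] | null;
}
interface AnalyzeResult { produced: string[]; variants: VariantAnalysis[]; }

// Hypothetical adapter: lift a legacy result into a single-variant result.
function toPerVariant(old: LegacyAnalyzeResult): AnalyzeResult {
  const { produced, ...variant } = old;
  return { produced, variants: [variant] };
}
```

Every per-variant field moves under `variants[0]` unchanged; only `produced` stays at the top level.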
Task 1: API types → per-variant shape
Files:
- Modify: packages/web/frontend/src/services/audioAnalysisApi.ts
- Test: packages/web/frontend/src/services/audioAnalysisApi.test.ts
- [ ] Step 1: Write the failing test
Add to audioAnalysisApi.test.ts:
import { bestVariant } from './audioAnalysisApi';

describe('bestVariant', () => {
  it('picks the variant with the lowest mean deviation', () => {
    const result = {
      produced: ['k', 'æ', 't'],
      variants: [
        { canonical: ['k', 'ə', 't'], positions: [
          { phone: 'k', deviation: 0.1, nearest: 'k' },
          { phone: 'ə', deviation: 1.4, nearest: 'æ' },
          { phone: 't', deviation: 0.1, nearest: 't' }], attribution: null, features: [9, 9] },
        { canonical: ['k', 'æ', 't'], positions: [
          { phone: 'k', deviation: 0.1, nearest: 'k' },
          { phone: 'æ', deviation: 0.1, nearest: 'æ' },
          { phone: 't', deviation: 0.1, nearest: 't' }], attribution: null, features: [1, 1] },
      ],
    };
    expect(bestVariant(result)?.features).toEqual([1, 1]);
  });

  it('returns null for no variants', () => {
    expect(bestVariant({ produced: [], variants: [] })).toBeNull();
  });

  it('handles all-null deviations (falls back to the first variant)', () => {
    const r = { produced: ['k'], variants: [
      { canonical: ['k'], positions: [{ phone: 'k', deviation: null, nearest: null }], attribution: null, features: [2] },
    ]};
    expect(bestVariant(r)?.features).toEqual([2]);
  });
});
- [ ] Step 2: Run — confirm fail
Run: cd packages/web/frontend && npx vitest run src/services/audioAnalysisApi.test.ts
Expected: FAIL (bestVariant undefined + the new AnalyzeResult shape).
- [ ] Step 3: Update the types + add bestVariant; update analyzeProduction
In audioAnalysisApi.ts, replace the AnalyzeResult interface and add VariantAnalysis + bestVariant:
export interface AnalyzePosition { phone: string; deviation: number | null; nearest: string | null; }
export interface AnalyzeAttribution { source: string; distances: Record<string, number>; }

/** One attested pronunciation scored against the production. */
export interface VariantAnalysis {
  canonical: string[];
  positions: AnalyzePosition[];
  attribution: AnalyzeAttribution | null;
  features: number[] | null;
}

export interface AnalyzeResult {
  /** The faithful transcript of what was heard (same across variants). */
  produced: string[];
  /** Per-attested-pronunciation scoring (primary first). */
  variants: VariantAnalysis[];
}

/** Mean of a variant's non-null deviations; Infinity when none are scorable. */
function meanDeviation(v: VariantAnalysis): number {
  const ds = v.positions.map((p) => p.deviation).filter((d): d is number => d !== null);
  return ds.length ? ds.reduce((a, b) => a + b, 0) / ds.length : Infinity;
}

/** The best-matching attested pronunciation (lowest mean deviation). The one
 *  whose feature vector feeds the session attribution. Null when there are no
 *  variants; falls back to the first variant when none are scorable. */
export function bestVariant(result: AnalyzeResult): VariantAnalysis | null {
  if (!result.variants.length) return null;
  let best = result.variants[0];
  let bestScore = meanDeviation(best);
  for (const v of result.variants.slice(1)) {
    const s = meanDeviation(v);
    if (s < bestScore) { best = v; bestScore = s; }
  }
  return best;
}
analyzeProduction already does return { kind: 'ok', result: await res.json() } — the parsed JSON is now {produced, variants}, matching the new AnalyzeResult, so no change to the fetch body is needed. (Confirm the function's return still type-checks against the new AnalyzeResult.)
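Three edge cases of the selector above are worth pinning down: ties, an unscorable primary, and partially scored variants. The sketch below repeats `meanDeviation`/`bestVariant` as given in this step so it runs on its own; the `variantOf` helper and its tag-in-`features` convention are illustrative assumptions only.

```typescript
interface AnalyzePosition { phone: string; deviation: number | null; nearest: string | null; }
interface AnalyzeAttribution { source: string; distances: Record<string, number>; }
interface VariantAnalysis {
  canonical: string[];
  positions: AnalyzePosition[];
  attribution: AnalyzeAttribution | null;
  features: number[] | null;
}
interface AnalyzeResult { produced: string[]; variants: VariantAnalysis[]; }

function meanDeviation(v: VariantAnalysis): number {
  const ds = v.positions.map((p) => p.deviation).filter((d): d is number => d !== null);
  return ds.length ? ds.reduce((a, b) => a + b, 0) / ds.length : Infinity;
}

function bestVariant(result: AnalyzeResult): VariantAnalysis | null {
  if (!result.variants.length) return null;
  let best = result.variants[0];
  let bestScore = meanDeviation(best);
  for (const v of result.variants.slice(1)) {
    const s = meanDeviation(v);
    if (s < bestScore) { best = v; bestScore = s; }
  }
  return best;
}

// Hypothetical fixture helper: `tag` is stored in `features` so the winner is easy to spot.
const variantOf = (tag: number, deviations: Array<number | null>): VariantAnalysis => ({
  canonical: deviations.map(() => 'k'),
  positions: deviations.map((deviation) => ({ phone: 'k', deviation, nearest: null })),
  attribution: null,
  features: [tag],
});

// Tie (both means are 0.5): the strict `<` keeps the earlier, primary variant.
const tieWinner = bestVariant({
  produced: ['k', 'k'],
  variants: [variantOf(1, [0.5, 0.5]), variantOf(2, [0.25, 0.75])],
});

// Unscorable primary (mean = Infinity): any scorable later variant wins.
const scorableWinner = bestVariant({
  produced: ['k'],
  variants: [variantOf(1, [null]), variantOf(2, [1.5])],
});

// Partially scored: nulls are skipped, so variant 2 scores 0.25 and beats 0.5.
const partialWinner = bestVariant({
  produced: ['k', 'k'],
  variants: [variantOf(1, [0.5, 0.5]), variantOf(2, [0.25, null])],
});
```

The last case shows that a variant scored on fewer positions can win on a single low deviation. Whether that is acceptable is a product decision the plan does not address.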
- [ ] Step 4: Run — confirm pass
Run: cd packages/web/frontend && npx vitest run src/services/audioAnalysisApi.test.ts
Expected: PASS.
- [ ] Step 5: Commit
git add packages/web/frontend/src/services/audioAnalysisApi.ts packages/web/frontend/src/services/audioAnalysisApi.test.ts
git commit -m "feat(phon-154): per-variant AnalyzeResult shape + bestVariant selector"
Task 2: DeviationOverlay takes positions directly
Files:
- Modify: packages/web/frontend/src/components/tools/AudioAnalysisTool/DeviationOverlay.tsx
- Test: packages/web/frontend/src/components/tools/AudioAnalysisTool/DeviationOverlay.test.tsx
- [ ] Step 1: Update the test to pass positions
In DeviationOverlay.test.tsx, change the render calls from <DeviationOverlay result={...}/> to <DeviationOverlay positions={fixture.positions}/> (keep the same assertions on chips/heat/substitution).
- [ ] Step 2: Run — confirm fail
Run: cd packages/web/frontend && npx vitest run src/components/tools/AudioAnalysisTool/DeviationOverlay.test.tsx
Expected: FAIL (prop is result, not positions).
- [ ] Step 3: Change the prop
In DeviationOverlay.tsx, change the interface + body to consume positions directly:
import type { AnalyzePosition } from '../../../services/audioAnalysisApi';

interface DeviationOverlayProps {
  positions: AnalyzePosition[];
}

const DeviationOverlay: React.FC<DeviationOverlayProps> = ({ positions }) => {
  if (positions.length === 0) return <Box />;
  return (
    <Stack direction="row" spacing={0.5} flexWrap="wrap" useFlexGap>
      {positions.map((p, i) => {
        // …unchanged chip/heat/tooltip body, using `p` …
      })}
    </Stack>
  );
};
(Keep heatColor, tooltipTitle, and the chip rendering exactly as-is; only the prop/source changes from result.positions to positions.)
- [ ] Step 4: Run — confirm pass
Run: same vitest command. Expected: PASS.
- [ ] Step 5: Commit
git add packages/web/frontend/src/components/tools/AudioAnalysisTool/DeviationOverlay.tsx packages/web/frontend/src/components/tools/AudioAnalysisTool/DeviationOverlay.test.tsx
git commit -m "refactor(phon-154): DeviationOverlay takes positions (per-variant ready)"
Task 3: ProductionCard renders per-variant
Files:
- Modify: packages/web/frontend/src/components/tools/AudioAnalysisTool/ProductionCard.tsx
- Test: packages/web/frontend/src/components/tools/AudioAnalysisTool/ProductionCard.test.tsx
- [ ] Step 1: Update the fixture + tests
In ProductionCard.test.tsx, change the fixture to the new shape and keep meaningful assertions:
const fixture: AnalyzeResult = {
  produced: ['k', 'æ', 'p'],
  variants: [
    { canonical: ['k', 'æ', 't'], attribution: null, features: null, positions: [
      { phone: 'k', deviation: 0.1, nearest: 'k' },
      { phone: 'æ', deviation: 0.2, nearest: 'æ' },
      { phone: 't', deviation: 1.4, nearest: 'p' }] },
    { canonical: ['k', 'æ', 'p'], attribution: null, features: null, positions: [
      { phone: 'k', deviation: 0.1, nearest: 'k' },
      { phone: 'æ', deviation: 0.1, nearest: 'æ' },
      { phone: 'p', deviation: 0.1, nearest: 'p' }] },
  ],
};
Keep the existing assertions on the Heard transcript (k æ p), the target label, and the deviation chips (getByTestId('pos-...')). Note: with two variants there are two chip groups, so scope each assertion to a variant container or assert with getAllByTestId. Add:
- a test that BOTH variants' targets render (k æ t and k æ p appear as target rows);
- a test that the best-matching variant (the k æ p one, all low deviation) is marked (e.g. a "best match" label/chip — assert getByText(/best match/i)).
Keep the warming/error/pending/onRetry tests (state shape unchanged). Update the "couldn't score" test to the new shape (a variant whose positions are all null).
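The fixtures in this step repeat a lot of position boilerplate. One way to keep them compact is a small builder, sketched below. `makeVariant` is a hypothetical test helper (not part of the plan); by default it assumes each scored position's nearest phone is the canonical phone itself, with an optional override for substitutions.

```typescript
interface AnalyzePosition { phone: string; deviation: number | null; nearest: string | null; }
interface AnalyzeAttribution { source: string; distances: Record<string, number>; }
interface VariantAnalysis {
  canonical: string[];
  positions: AnalyzePosition[];
  attribution: AnalyzeAttribution | null;
  features: number[] | null;
}
interface AnalyzeResult { produced: string[]; variants: VariantAnalysis[]; }

// Hypothetical fixture builder: one deviation (and optionally one nearest
// phone) per canonical phone. Unscored positions get nearest = null by default.
function makeVariant(
  canonical: string[],
  deviations: Array<number | null>,
  nearest: Array<string | null> = canonical.map((p, i) => ((deviations[i] ?? null) === null ? null : p)),
): VariantAnalysis {
  if (deviations.length !== canonical.length || nearest.length !== canonical.length) {
    throw new Error('canonical, deviations and nearest must have the same length');
  }
  return {
    canonical,
    positions: canonical.map((phone, i) => ({
      phone,
      deviation: deviations[i] ?? null,
      nearest: nearest[i] ?? null,
    })),
    attribution: null,
    features: null,
  };
}

// The two-variant fixture from this step, rebuilt with the helper.
const compactFixture: AnalyzeResult = {
  produced: ['k', 'æ', 'p'],
  variants: [
    makeVariant(['k', 'æ', 't'], [0.1, 0.2, 1.4], ['k', 'æ', 'p']),
    makeVariant(['k', 'æ', 'p'], [0.1, 0.1, 0.1]),
  ],
};

// A variant whose positions are all null, for the "couldn't score" test.
const unscorableVariant = makeVariant(['k', 'æ', 't'], [null, null, null]);
```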
- [ ] Step 2: Run — confirm fail
Run: cd packages/web/frontend && npx vitest run src/components/tools/AudioAnalysisTool/ProductionCard.test.tsx
Expected: FAIL.
- [ ] Step 3: Rewrite the result body of ProductionCard
Import bestVariant + VariantAnalysis. Replace the {result ? (...) : ...} RESULT branch (keep the pending/warming/error branches + onRetry exactly as-is) with: the Heard row once, then a list of variants, each with its canonical target label + DeviationOverlay, the best one flagged.
import DeviationOverlay from './DeviationOverlay';
import type { AnalyzeResult, VariantAnalysis } from '../../../services/audioAnalysisApi';
import { bestVariant } from '../../../services/audioAnalysisApi';

function allNull(v: VariantAnalysis): boolean {
  return v.positions.length > 0 && v.positions.every((p) => p.deviation === null);
}
Result branch JSX (replaces the current Stack with Target/Heard rows):
{result ? (
  <Stack spacing={1.25}>
    {/* What the model heard (same across variants). */}
    <Box>
      <RowLabel>Heard</RowLabel>
      <Typography variant="body2" sx={{ fontFamily: 'monospace' }}>
        {result.produced.join(' ')}
      </Typography>
    </Box>
    <Divider />
    {/* Each attested pronunciation, scored. Best match flagged; the
        clinician picks which target they care about. */}
    {result.variants.map((v, i) => {
      const isBest = result.variants.length > 1 && v === best;
      return (
        <Box key={i}>
          <Stack direction="row" spacing={1} alignItems="center" sx={{ mb: 0.25 }}>
            <RowLabel>
              {result.variants.length > 1 ? `Target ${i + 1}` : 'Target'} (deviation-colored)
            </RowLabel>
            <Typography variant="caption" sx={{ fontFamily: 'monospace', color: 'text.secondary' }}>
              {v.canonical.join(' ')}
            </Typography>
            {isBest && (
              <Chip label="best match" size="small" color="success" variant="outlined" />
            )}
          </Stack>
          {allNull(v) ? (
            <Typography variant="caption" color="text.secondary">
              Couldn't score — deviation signal unavailable for this pronunciation.
            </Typography>
          ) : (
            <DeviationOverlay positions={v.positions} />
          )}
        </Box>
      );
    })}
  </Stack>
) : status === 'pending' ? (
  /* …unchanged… */
Add const best = result ? bestVariant(result) : null; near the top of the component body (before the return), and import Chip from @mui/material (add to the existing import). Keep RowLabel, the header, and the non-result branches unchanged.
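The per-variant decisions the JSX makes (row label, best-match flag, scorable or not) can also be expressed as a pure function, which makes them easy to reason about without rendering. The sketch below mirrors the branch above under that framing; `VariantRow` and `variantRows` are hypothetical names, and the plan itself keeps this logic inline in the component.

```typescript
interface AnalyzePosition { phone: string; deviation: number | null; nearest: string | null; }
interface AnalyzeAttribution { source: string; distances: Record<string, number>; }
interface VariantAnalysis {
  canonical: string[];
  positions: AnalyzePosition[];
  attribution: AnalyzeAttribution | null;
  features: number[] | null;
}
interface AnalyzeResult { produced: string[]; variants: VariantAnalysis[]; }

function meanDeviation(v: VariantAnalysis): number {
  const ds = v.positions.map((p) => p.deviation).filter((d): d is number => d !== null);
  return ds.length ? ds.reduce((a, b) => a + b, 0) / ds.length : Infinity;
}

function bestVariant(result: AnalyzeResult): VariantAnalysis | null {
  if (!result.variants.length) return null;
  let best = result.variants[0];
  let bestScore = meanDeviation(best);
  for (const v of result.variants.slice(1)) {
    const s = meanDeviation(v);
    if (s < bestScore) { best = v; bestScore = s; }
  }
  return best;
}

function allNull(v: VariantAnalysis): boolean {
  return v.positions.length > 0 && v.positions.every((p) => p.deviation === null);
}

// Hypothetical view model: one row per attested pronunciation.
interface VariantRow { label: string; target: string; isBest: boolean; scorable: boolean; }

function variantRows(result: AnalyzeResult): VariantRow[] {
  const best = bestVariant(result);
  const many = result.variants.length > 1;
  return result.variants.map((v, i) => ({
    // Numbered only when there is more than one target, as in the JSX.
    label: `${many ? `Target ${i + 1}` : 'Target'} (deviation-colored)`,
    target: v.canonical.join(' '),
    // A lone variant is never flagged: "best of one" carries no information.
    isBest: many && v === best,
    scorable: !allNull(v),
  }));
}
```

Note that `isBest` compares by object identity, so it relies on `bestVariant` returning one of the elements of `result.variants` rather than a copy.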
- [ ] Step 4: Run — confirm pass
Run: same vitest command. Expected: PASS.
- [ ] Step 5: Commit
git add packages/web/frontend/src/components/tools/AudioAnalysisTool/ProductionCard.tsx packages/web/frontend/src/components/tools/AudioAnalysisTool/ProductionCard.test.tsx
git commit -m "feat(phon-154): ProductionCard renders per-variant targets with best-match flag"
Task 4: AudioAnalysisTool feeds attribution from the best variant
Files:
- Modify: packages/web/frontend/src/components/tools/AudioAnalysisTool/AudioAnalysisTool.tsx
- Test: packages/web/frontend/src/components/tools/AudioAnalysisTool/AudioAnalysisTool.test.tsx
- [ ] Step 1: Update the test mocks to the new shape
In AudioAnalysisTool.test.tsx, the analyzeProductionWithRetry mock currently resolves { kind:'ok', result: { canonical, produced, positions, attribution, features } }. Change the result to the new shape, e.g.:
result: {
  produced: ['k', 'æ', 't'],
  variants: [
    { canonical: ['k', 'æ', 't'], positions: [{ phone: 'k', deviation: 0.1, nearest: 'k' }],
      attribution: null, features: [1, 2, 3, 4, 5, 6] },
  ],
},
The existing assertions, that attributeSession is called with [[1,2,3,4,5,6]] (the best variant's features) and that k æ t renders, both still hold. Do the same for the batch test's mock (d ɔ ɡ).
- [ ] Step 2: Run — confirm fail
Run: cd packages/web/frontend && npx vitest run src/components/tools/AudioAnalysisTool/AudioAnalysisTool.test.tsx
Expected: FAIL (old result.features access / shape mismatch).
- [ ] Step 3: Derive features from the best variant
In AudioAnalysisTool.tsx, import bestVariant. In runAnalysis, where it currently does features: outcome.result.features ?? undefined, change to the best variant's features:
if (outcome.kind === 'ok') {
  const best = bestVariant(outcome.result);
  patchProduction(id, {
    result: outcome.result,
    features: best?.features ?? undefined,
    status: undefined,
    error: undefined,
  });
} else if (outcome.kind === 'warming') {
(Everything else — the featureSig/hasFeatures effect pooling p.features via attributeSession, the warming/error handling, retry — stays the same; Production.features is still number[] | undefined.)
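The data flow from a per-variant result to the pooled vectors handed to `attributeSession` can be sketched without React. Everything below is illustrative: the `Production` shape is reduced to the fields mentioned in this plan (its `id` field is an assumption), and `featuresFor`/`pooledFeatures` are hypothetical names standing in for the `runAnalysis` patch and the pooling effect.

```typescript
interface AnalyzePosition { phone: string; deviation: number | null; nearest: string | null; }
interface AnalyzeAttribution { source: string; distances: Record<string, number>; }
interface VariantAnalysis {
  canonical: string[];
  positions: AnalyzePosition[];
  attribution: AnalyzeAttribution | null;
  features: number[] | null;
}
interface AnalyzeResult { produced: string[]; variants: VariantAnalysis[]; }

function meanDeviation(v: VariantAnalysis): number {
  const ds = v.positions.map((p) => p.deviation).filter((d): d is number => d !== null);
  return ds.length ? ds.reduce((a, b) => a + b, 0) / ds.length : Infinity;
}

function bestVariant(result: AnalyzeResult): VariantAnalysis | null {
  if (!result.variants.length) return null;
  let best = result.variants[0];
  let bestScore = meanDeviation(best);
  for (const v of result.variants.slice(1)) {
    const s = meanDeviation(v);
    if (s < bestScore) { best = v; bestScore = s; }
  }
  return best;
}

// Reduced stand-in for the component's Production type (id is assumed).
interface Production { id: string; result?: AnalyzeResult; features?: number[]; }

// What runAnalysis stores: the best variant's features, or undefined when
// there is no variant or the best variant carries no feature vector.
function featuresFor(result: AnalyzeResult): number[] | undefined {
  return bestVariant(result)?.features ?? undefined;
}

// What the pooling effect would collect: one vector per production that has one.
function pooledFeatures(productions: Production[]): number[][] {
  return productions
    .map((p) => p.features)
    .filter((f): f is number[] => f !== undefined);
}
```

Productions whose best variant has `features: null` simply drop out of the pool, so the session attribution is computed from the remaining productions.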
- [ ] Step 4: Run — confirm pass
Run: same vitest command. Expected: PASS.
- [ ] Step 5: Commit
git add packages/web/frontend/src/components/tools/AudioAnalysisTool/AudioAnalysisTool.tsx packages/web/frontend/src/components/tools/AudioAnalysisTool/AudioAnalysisTool.test.tsx
git commit -m "feat(phon-154): session attribution from the best-matching variant"
Task 5: Full frontend regression
- [ ] Step 1: type-check + lint + build + test
Run, from packages/web/frontend:
npm run type-check && npm run lint && npm run build && npx vitest run
If any remaining consumer of the old AnalyzeResult shape (.canonical/.positions at the top level) exists (grep \.positions\b / result\.canonical under src/), update it to the new shape and re-run.
- [ ] Step 2: Commit (only if a stray consumer needed updating)
git add -A packages/web/frontend
git commit -m "fix(phon-154): align remaining consumers with per-variant AnalyzeResult"
Phase 4b done: exit criteria
- audioAnalysisApi exposes the per-variant AnalyzeResult + bestVariant; analyzeProduction parses {produced, variants}.
- ProductionCard shows the Heard transcript once + each variant's target + overlay, best match flagged.
- Session attribution feeds from the best-matching variant.
- Frontend type-check + lint + build + tests green.
Next: Phase 4c
- Lookup displays all variant pronunciations; superscript flag (derive "has variants" from variants.length > 1; robust to the deferred reseed) on result rows. Then local reseed + browser verification of the whole PHON-154 surface.