PHON-154 Variant-Aware Matching — Phase 4b: Frontend audio (per-variant)

For agentic workers: REQUIRED SUB-SKILL: superpowers:subagent-driven-development or superpowers:executing-plans. Steps use checkbox (- [ ]).

Goal: The Speech Analysis tab shows the production scored against EVERY attested pronunciation — each variant's target + deviation overlay + a "best match" marker — and feeds the session attribution from the best-matching variant.

Architecture: The Worker /api/audio/analyze now returns { produced, variants: [{ canonical, positions, attribution, features }] }. Four changes follow from that: (1) update the audioAnalysisApi types to this shape; (2) DeviationOverlay takes positions directly, so it can render once per variant; (3) ProductionCard renders the produced transcript once, then each variant's target row and overlay, flagging the best match; (4) AudioAnalysisTool derives each production's session-attribution features from its best-matching variant (lowest mean deviation).
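
As a rough illustration of the payload shape described above, the sketch below models the response in TypeScript and adds a shallow runtime guard. The interfaces follow the ones introduced in Task 1; isAnalyzeResult and isStringArray are hypothetical helpers that are not part of this plan or of the Worker, and the guard only checks the fields needed to tell the old flat shape from the new per-variant one.

```typescript
// Sketch only: shapes mirror the Task 1 interfaces; the guard is hypothetical.
interface AnalyzePosition { phone: string; deviation: number | null; nearest: string | null; }
interface VariantAnalysis {
  canonical: string[];
  positions: AnalyzePosition[];
  attribution: { source: string; distances: Record<string, number> } | null;
  features: number[] | null;
}
interface AnalyzeResult { produced: string[]; variants: VariantAnalysis[]; }

const isStringArray = (x: unknown): x is string[] =>
  Array.isArray(x) && x.every((s) => typeof s === 'string');

/** Shallow structural check of a parsed response body (assumed helper). */
function isAnalyzeResult(x: unknown): x is AnalyzeResult {
  if (typeof x !== 'object' || x === null) return false;
  const r = x as Record<string, unknown>;
  const variants = r['variants'];
  if (!isStringArray(r['produced']) || !Array.isArray(variants)) return false;
  return variants.every((v: unknown) => {
    if (typeof v !== 'object' || v === null) return false;
    const o = v as Record<string, unknown>;
    return isStringArray(o['canonical']) && Array.isArray(o['positions']);
  });
}

// The old flat shape has no `variants` array, so it is rejected.
const legacy = { canonical: ['k'], produced: ['k'], positions: [], attribution: null, features: null };
const current = {
  produced: ['k'],
  variants: [{ canonical: ['k'], positions: [], attribution: null, features: null }],
};
console.log(isAnalyzeResult(legacy), isAnalyzeResult(current)); // false true
```

A guard like this is optional: the plan itself relies on the static AnalyzeResult type and does not validate the response at runtime.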

Tech Stack: React + TypeScript + MUI, Vitest + Testing Library.

Spec: docs/superpowers/specs/2026-06-15-phon-154-variant-aware-matching-design.md. Depends on: Phase 4a (Worker returns {produced, variants}).

Out of scope (Phase 4c): Lookup variant display + the superscript has_variants flag on result rows.


Reference: current shapes (post-PHON-153, on this branch)

  • audioAnalysisApi.ts: AnalyzePosition {phone, deviation, nearest}; AnalyzeAttribution {source, distances}; AnalyzeResult {canonical, produced, positions, attribution, features}; analyzeProduction returns {kind:'ok', result: AnalyzeResult} | {kind:'warming'} | {kind:'error', detail}; analyzeProductionWithRetry; attributeSession.
  • DeviationOverlay.tsx: props { result: AnalyzeResult }, reads result.positions.
  • ProductionCard.tsx: renders result.produced (Heard row) + <DeviationOverlay result={result}/> (Target row); has warming/error/pending states + onRetry.
  • AudioAnalysisTool.tsx: Production has result?: AnalyzeResult; features?: number[]; runAnalysis sets features: outcome.result.features ?? undefined; a featureSig/effect pools p.features via attributeSession.

Task 1: API types → per-variant shape

Files:
  • Modify: packages/web/frontend/src/services/audioAnalysisApi.ts
  • Test: packages/web/frontend/src/services/audioAnalysisApi.test.ts

  • [ ] Step 1: Write the failing test

Add to audioAnalysisApi.test.ts:

import { bestVariant } from './audioAnalysisApi';

describe('bestVariant', () => {
  it('picks the variant with the lowest mean deviation', () => {
    const result = {
      produced: ['k', 'æ', 't'],
      variants: [
        { canonical: ['k', 'ə', 't'], positions: [
          { phone: 'k', deviation: 0.1, nearest: 'k' },
          { phone: 'ə', deviation: 1.4, nearest: 'æ' },
          { phone: 't', deviation: 0.1, nearest: 't' }], attribution: null, features: [9, 9] },
        { canonical: ['k', 'æ', 't'], positions: [
          { phone: 'k', deviation: 0.1, nearest: 'k' },
          { phone: 'æ', deviation: 0.1, nearest: 'æ' },
          { phone: 't', deviation: 0.1, nearest: 't' }], attribution: null, features: [1, 1] },
      ],
    };
    expect(bestVariant(result)?.features).toEqual([1, 1]);
  });

  it('returns null for no variants', () => {
    expect(bestVariant({ produced: [], variants: [] })).toBeNull();
  });

  it('handles all-null deviations (falls back to the first variant)', () => {
    const r = { produced: ['k'], variants: [
      { canonical: ['k'], positions: [{ phone: 'k', deviation: null, nearest: null }], attribution: null, features: [2] },
    ]};
    expect(bestVariant(r)?.features).toEqual([2]);
  });
});
  • [ ] Step 2: Run — confirm fail

Run: cd packages/web/frontend && npx vitest run src/services/audioAnalysisApi.test.ts Expected: FAIL (bestVariant undefined + the new AnalyzeResult shape).

  • [ ] Step 3: Update the types + add bestVariant; update analyzeProduction

In audioAnalysisApi.ts, replace the AnalyzeResult interface and add VariantAnalysis + bestVariant:

export interface AnalyzePosition { phone: string; deviation: number | null; nearest: string | null; }
export interface AnalyzeAttribution { source: string; distances: Record<string, number>; }

/** One attested pronunciation scored against the production. */
export interface VariantAnalysis {
  canonical: string[];
  positions: AnalyzePosition[];
  attribution: AnalyzeAttribution | null;
  features: number[] | null;
}

export interface AnalyzeResult {
  /** The faithful transcript of what was heard (same across variants). */
  produced: string[];
  /** Per-attested-pronunciation scoring (primary first). */
  variants: VariantAnalysis[];
}

/** Mean of a variant's non-null deviations; Infinity when none are scorable. */
function meanDeviation(v: VariantAnalysis): number {
  const ds = v.positions.map((p) => p.deviation).filter((d): d is number => d !== null);
  return ds.length ? ds.reduce((a, b) => a + b, 0) / ds.length : Infinity;
}

/** The best-matching attested pronunciation (lowest mean deviation). The one
 *  whose feature vector feeds the session attribution. Null when there are no
 *  variants; falls back to the first variant when none are scorable. */
export function bestVariant(result: AnalyzeResult): VariantAnalysis | null {
  if (!result.variants.length) return null;
  let best = result.variants[0];
  let bestScore = meanDeviation(best);
  for (const v of result.variants.slice(1)) {
    const s = meanDeviation(v);
    if (s < bestScore) { best = v; bestScore = s; }
  }
  return best;
}
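
One property of the selector above is worth spelling out: the comparison is strict (s < bestScore), so a tie keeps the earlier variant, and because variants arrive primary first, the primary pronunciation wins ties. The standalone sketch below restates the two functions with trimmed-down types (Pos and Variant are simplifications made here so the block runs on its own, and bestVariant takes the array directly) and exercises the tie, unscorable, and empty cases.

```typescript
// Trimmed copies of the Task 1 shapes so this block runs standalone.
interface Pos { deviation: number | null; }
interface Variant { canonical: string[]; positions: Pos[]; }

/** Mean of the non-null deviations; Infinity when none are scorable. */
function meanDeviation(v: Variant): number {
  const ds = v.positions.map((p) => p.deviation).filter((d): d is number => d !== null);
  return ds.length ? ds.reduce((a, b) => a + b, 0) / ds.length : Infinity;
}

function bestVariant(variants: Variant[]): Variant | null {
  if (!variants.length) return null;
  let best = variants[0];
  let bestScore = meanDeviation(best);
  for (const v of variants.slice(1)) {
    const s = meanDeviation(v);
    if (s < bestScore) { best = v; bestScore = s; } // strict: ties keep the earlier variant
  }
  return best;
}

// Both means are exactly 0.5, so the two variants tie.
const primary: Variant = { canonical: ['k', 'ə', 't'], positions: [{ deviation: 0.25 }, { deviation: 0.75 }] };
const tied: Variant = { canonical: ['k', 'æ', 't'], positions: [{ deviation: 0.5 }, { deviation: 0.5 }] };
const unscorable: Variant = { canonical: ['k', 'a', 't'], positions: [{ deviation: null }] };

console.log(bestVariant([primary, tied]) === primary);       // true: tie goes to the first
console.log(bestVariant([unscorable, primary]) === primary); // true: 0.5 < Infinity
console.log(bestVariant([unscorable]) === unscorable);       // true: fallback to the first
console.log(bestVariant([]));                                // null
```

Because null deviations are dropped before averaging, a variant with only a few scorable positions is compared on those positions alone.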

analyzeProduction already returns { kind: 'ok', result: await res.json() }. The parsed JSON is now {produced, variants}, which matches the new AnalyzeResult, so the fetch body needs no change. Confirm that the function's return value still type-checks against the new AnalyzeResult.

  • [ ] Step 4: Run — confirm pass

Run: cd packages/web/frontend && npx vitest run src/services/audioAnalysisApi.test.ts Expected: PASS.

  • [ ] Step 5: Commit
git add packages/web/frontend/src/services/audioAnalysisApi.ts packages/web/frontend/src/services/audioAnalysisApi.test.ts
git commit -m "feat(phon-154): per-variant AnalyzeResult shape + bestVariant selector"

Task 2: DeviationOverlay takes positions directly

Files:
  • Modify: packages/web/frontend/src/components/tools/AudioAnalysisTool/DeviationOverlay.tsx
  • Test: packages/web/frontend/src/components/tools/AudioAnalysisTool/DeviationOverlay.test.tsx

  • [ ] Step 1: Update the test to pass positions

In DeviationOverlay.test.tsx, change the render calls from <DeviationOverlay result={...}/> to <DeviationOverlay positions={fixture.positions}/> (keep the same assertions on chips/heat/substitution).

  • [ ] Step 2: Run — confirm fail

Run: cd packages/web/frontend && npx vitest run src/components/tools/AudioAnalysisTool/DeviationOverlay.test.tsx Expected: FAIL (prop is result, not positions).

  • [ ] Step 3: Change the prop

In DeviationOverlay.tsx, change the interface + body to consume positions directly:

import type { AnalyzePosition } from '../../../services/audioAnalysisApi';

interface DeviationOverlayProps {
  positions: AnalyzePosition[];
}

const DeviationOverlay: React.FC<DeviationOverlayProps> = ({ positions }) => {
  if (positions.length === 0) return <Box />;
  return (
    <Stack direction="row" spacing={0.5} flexWrap="wrap" useFlexGap>
      {positions.map((p, i) => {
        // …unchanged chip/heat/tooltip body, using `p` …
      })}
    </Stack>
  );
};
(Keep heatColor, tooltipTitle, the chip rendering exactly as-is — only the prop/source changes from result.positions to positions.)

  • [ ] Step 4: Run — confirm pass

Run: same vitest command. Expected: PASS.

  • [ ] Step 5: Commit
git add packages/web/frontend/src/components/tools/AudioAnalysisTool/DeviationOverlay.tsx packages/web/frontend/src/components/tools/AudioAnalysisTool/DeviationOverlay.test.tsx
git commit -m "refactor(phon-154): DeviationOverlay takes positions (per-variant ready)"

Task 3: ProductionCard renders per-variant

Files:
  • Modify: packages/web/frontend/src/components/tools/AudioAnalysisTool/ProductionCard.tsx
  • Test: packages/web/frontend/src/components/tools/AudioAnalysisTool/ProductionCard.test.tsx

  • [ ] Step 1: Update the fixture + tests

In ProductionCard.test.tsx, change the fixture to the new shape and keep meaningful assertions:

const fixture: AnalyzeResult = {
  produced: ['k', 'æ', 'p'],
  variants: [
    { canonical: ['k', 'æ', 't'], attribution: null, features: null, positions: [
      { phone: 'k', deviation: 0.1, nearest: 'k' },
      { phone: 'æ', deviation: 0.2, nearest: 'æ' },
      { phone: 't', deviation: 1.4, nearest: 'p' }] },
    { canonical: ['k', 'æ', 'p'], attribution: null, features: null, positions: [
      { phone: 'k', deviation: 0.1, nearest: 'k' },
      { phone: 'æ', deviation: 0.1, nearest: 'æ' },
      { phone: 'p', deviation: 0.1, nearest: 'p' }] },
  ],
};
Keep tests for: the produced transcript renders (k æ p), the target label, and the deviation chips (getByTestId('pos-...')). Note that with two variants there are two chip groups, so scope the assertion to a variant container or use getAllByTestId.

Add:
  • a test that BOTH variants' targets render (k æ t and k æ p appear as target rows);
  • a test that the best-matching variant (the k æ p one, all low deviation) is marked, e.g. with a "best match" label/chip (assert getByText(/best match/i)).

Keep the warming/error/pending/onRetry tests (state shape unchanged). Update the "couldn't score" test to the new shape (a variant whose positions all have a null deviation).

  • [ ] Step 2: Run — confirm fail

Run: cd packages/web/frontend && npx vitest run src/components/tools/AudioAnalysisTool/ProductionCard.test.tsx Expected: FAIL.

  • [ ] Step 3: Rewrite the result body of ProductionCard

Import bestVariant + VariantAnalysis. Replace the {result ? (...) : ...} RESULT branch (keep the pending/warming/error branches + onRetry exactly as-is) with: the Heard row once, then a list of variants, each with its canonical target label + DeviationOverlay, the best one flagged.

import DeviationOverlay from './DeviationOverlay';
import type { AnalyzeResult, VariantAnalysis } from '../../../services/audioAnalysisApi';
import { bestVariant } from '../../../services/audioAnalysisApi';

function allNull(v: VariantAnalysis): boolean {
  return v.positions.length > 0 && v.positions.every((p) => p.deviation === null);
}

Result branch JSX (replaces the current Stack with Target/Heard rows):

        {result ? (
          <Stack spacing={1.25}>
            {/* What the model heard (same across variants). */}
            <Box>
              <RowLabel>Heard</RowLabel>
              <Typography variant="body2" sx={{ fontFamily: 'monospace' }}>
                {result.produced.join(' ')}
              </Typography>
            </Box>

            <Divider />

            {/* Each attested pronunciation, scored. Best match flagged; the
                clinician picks which target they care about. */}
            {result.variants.map((v, i) => {
              const isBest = result.variants.length > 1 && v === best;
              return (
                <Box key={i}>
                  <Stack direction="row" spacing={1} alignItems="center" sx={{ mb: 0.25 }}>
                    <RowLabel>
                      {result.variants.length > 1 ? `Target ${i + 1}` : 'Target'} (deviation-colored)
                    </RowLabel>
                    <Typography variant="caption" sx={{ fontFamily: 'monospace', color: 'text.secondary' }}>
                      {v.canonical.join(' ')}
                    </Typography>
                    {isBest && (
                      <Chip label="best match" size="small" color="success" variant="outlined" />
                    )}
                  </Stack>
                  {allNull(v) ? (
                    <Typography variant="caption" color="text.secondary">
                      Couldn't score — deviation signal unavailable for this pronunciation.
                    </Typography>
                  ) : (
                    <DeviationOverlay positions={v.positions} />
                  )}
                </Box>
              );
            })}
          </Stack>
        ) : status === 'pending' ? (
          /* …unchanged… */

Add const best = result ? bestVariant(result) : null; near the top of the component body (before the return). Import Chip from @mui/material by adding it to the existing import, and do the same for Divider if it is not already imported there. Keep RowLabel, the header, and the non-result branches unchanged.
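
The per-variant display decisions in the JSX above (row label, best-match flag, "couldn't score" fallback) can also be read as a pure mapping from variants to rows. The sketch below is illustrative only: variantRows and VariantRow are hypothetical names that do not appear in the plan, and the types are trimmed so the block runs standalone.

```typescript
// Trimmed shapes; only the fields the display logic reads.
interface Pos { phone: string; deviation: number | null; }
interface Variant { canonical: string[]; positions: Pos[]; }

// Hypothetical view model for one target row.
interface VariantRow { label: string; target: string; isBest: boolean; unscorable: boolean; }

/** Mirrors the JSX: numbered labels and a best flag only when there are several variants. */
function variantRows(variants: Variant[], best: Variant | null): VariantRow[] {
  const many = variants.length > 1;
  return variants.map((v, i) => ({
    label: many ? `Target ${i + 1}` : 'Target',
    target: v.canonical.join(' '),
    isBest: many && v === best, // reference equality, as in the component
    unscorable: v.positions.length > 0 && v.positions.every((p) => p.deviation === null),
  }));
}

const kat: Variant = {
  canonical: ['k', 'æ', 't'],
  positions: [
    { phone: 'k', deviation: 0.1 },
    { phone: 'æ', deviation: 0.2 },
    { phone: 't', deviation: 1.4 },
  ],
};
const kap: Variant = {
  canonical: ['k', 'æ', 'p'],
  positions: [
    { phone: 'k', deviation: 0.1 },
    { phone: 'æ', deviation: 0.1 },
    { phone: 'p', deviation: 0.1 },
  ],
};

const rows = variantRows([kat, kap], kap);
console.log(rows.map((r) => `${r.label}: ${r.target}${r.isBest ? ' (best match)' : ''}`));
// [ 'Target 1: k æ t', 'Target 2: k æ p (best match)' ]
```

With a single variant the label is plain "Target" and no row is flagged, which matches the result.variants.length > 1 condition in the component.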

  • [ ] Step 4: Run — confirm pass

Run: same vitest command. Expected: PASS.

  • [ ] Step 5: Commit
git add packages/web/frontend/src/components/tools/AudioAnalysisTool/ProductionCard.tsx packages/web/frontend/src/components/tools/AudioAnalysisTool/ProductionCard.test.tsx
git commit -m "feat(phon-154): ProductionCard renders per-variant targets with best-match flag"

Task 4: AudioAnalysisTool feeds attribution from the best variant

Files:
  • Modify: packages/web/frontend/src/components/tools/AudioAnalysisTool/AudioAnalysisTool.tsx
  • Test: packages/web/frontend/src/components/tools/AudioAnalysisTool/AudioAnalysisTool.test.tsx

  • [ ] Step 1: Update the test mocks to the new shape

In AudioAnalysisTool.test.tsx, the analyzeProductionWithRetry mock currently resolves { kind:'ok', result: { canonical, produced, positions, attribution, features } }. Change the result to the new shape, e.g.:

result: {
  produced: ['k', 'æ', 't'],
  variants: [
    { canonical: ['k', 'æ', 't'], positions: [{ phone: 'k', deviation: 0.1, nearest: 'k' }],
      attribution: null, features: [1, 2, 3, 4, 5, 6] },
  ],
},
The test asserts attributeSession is called with [[1,2,3,4,5,6]] (the best variant's features) and that k æ t renders — both still hold. Do the same for the batch test's mock (d ɔ ɡ).

  • [ ] Step 2: Run — confirm fail

Run: cd packages/web/frontend && npx vitest run src/components/tools/AudioAnalysisTool/AudioAnalysisTool.test.tsx Expected: FAIL (old result.features access / shape mismatch).

  • [ ] Step 3: Derive features from the best variant

In AudioAnalysisTool.tsx, import bestVariant. In runAnalysis, where it currently does features: outcome.result.features ?? undefined, change to the best variant's features:

      if (outcome.kind === 'ok') {
        const best = bestVariant(outcome.result);
        patchProduction(id, {
          result: outcome.result,
          features: best?.features ?? undefined,
          status: undefined,
          error: undefined,
        });
      } else if (outcome.kind === 'warming') {

(Everything else — the featureSig/hasFeatures effect pooling p.features via attributeSession, the warming/error handling, retry — stays the same; Production.features is still number[] | undefined.)
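
To make the data flow concrete: each production contributes at most one feature vector, taken from its best-matching variant, and productions without one are skipped before pooling. The sketch below is a simplified stand-in for the existing effect, not its actual code. featuresOf and pooledFeatures are hypothetical names, and Production is reduced to the fields that matter here.

```typescript
// Reduced shapes for illustration; the real Production has more fields.
interface BestLike { features: number[] | null; }
interface Production { id: string; features?: number[]; }

/** Same expression as in runAnalysis: a null or missing vector becomes undefined. */
function featuresOf(best: BestLike | null): number[] | undefined {
  return best?.features ?? undefined;
}

/** Collect the per-production vectors that would be handed to attributeSession. */
function pooledFeatures(productions: Production[]): number[][] {
  return productions
    .map((p) => p.features)
    .filter((f): f is number[] => f !== undefined);
}

const session: Production[] = [
  { id: 'a', features: featuresOf({ features: [1, 2, 3] }) },
  { id: 'b', features: featuresOf({ features: null }) }, // best variant had no features
  { id: 'c', features: featuresOf(null) },               // no variants at all
  { id: 'd', features: featuresOf({ features: [4, 5, 6] }) },
];

const pooled = pooledFeatures(session);
console.log(JSON.stringify(pooled)); // [[1,2,3],[4,5,6]]
```

Normalising null to undefined at the point of storage is what lets Production.features keep its existing number[] | undefined type.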

  • [ ] Step 4: Run — confirm pass

Run: same vitest command. Expected: PASS.

  • [ ] Step 5: Commit
git add packages/web/frontend/src/components/tools/AudioAnalysisTool/AudioAnalysisTool.tsx packages/web/frontend/src/components/tools/AudioAnalysisTool/AudioAnalysisTool.test.tsx
git commit -m "feat(phon-154): session attribution from the best-matching variant"

Task 5: Full frontend regression

  • [ ] Step 1: type-check + lint + build + test

Run, from packages/web/frontend:

npm run type-check && npm run lint && npm run build && npx vitest run
Expected: all clean/green. Pay attention to the react-hooks@7 (React Compiler) lint gate: there should be no new violations. If any other consumer of the old AnalyzeResult shape exists (top-level .canonical or .positions; grep for \.positions\b and result\.canonical under src/), update it to the new shape and re-run.

  • [ ] Step 2: Commit (only if a stray consumer needed updating)
git add -A packages/web/frontend
git commit -m "fix(phon-154): align remaining consumers with per-variant AnalyzeResult"

Phase 4b done — exit criteria

  • audioAnalysisApi exposes the per-variant AnalyzeResult + bestVariant; analyzeProduction parses {produced, variants}.
  • ProductionCard shows the Heard transcript once + each variant's target + overlay, best match flagged.
  • Session attribution feeds from the best-matching variant.
  • Frontend type-check + lint + build + tests green.

Next: Phase 4c

  • Lookup displays all variant pronunciations; superscript flag (derive "has variants" from variants.length > 1 — robust to the deferred reseed) on result rows. Then local reseed + browser verification of the whole PHON-154 surface.