PHON-130: Acoustic Analysis `/dev/acoustic` Implementation Plan

For agentic workers: REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development. Steps use checkbox (- [ ]) syntax.

Goal: A /dev/acoustic dev page (Model #4) that extracts F1–F3 + F0 + duration from a vowel production via Parselmouth and overlays Hillenbrand percentile bands by target-vowel × speaker-group. Dev/validation surface, NOT the product.

Architecture: phonolex_audio gains a Parselmouth /acoustic endpoint (local Python); a build step turns Hillenbrand vowdata.dat into a bundled hillenbrandNorms.json; the Worker proxies extraction + computes the percentile in-TS (pinned to Python); an AcousticViewer dev page mirrors PronunciationViewer.

Tech Stack: Python (Parselmouth/Praat, the new dep), TypeScript (Workers overlay + route, React dev page), Vitest, pytest.

Spec: docs/superpowers/specs/2026-06-05-phon-130-acoustic-analysis-dev-page-design.md. Branch: research/phon-130-acoustic-analysis.

File Structure

research/2026-06-05-phon-130-acoustic/build_hillenbrand_norms.py — parse vowdata.dat → hillenbrandNorms.json.
packages/web/workers/src/config/hillenbrandNorms.json — bundled norm distributions (committed; Tier A).
packages/web/workers/src/lib/acousticOverlay.ts — percentile of a value vs a sorted norm array.
packages/web/workers/src/routes/audio.ts — POST /api/audio/acoustic (proxy + overlay).
packages/audio/src/phonolex_audio/acoustic.py — Parselmouth extraction.
packages/audio/src/phonolex_audio/server.py — POST /acoustic.
packages/audio/pyproject.toml — add praat-parselmouth.
packages/web/frontend/src/services/acousticApi.ts — analyzeAcoustic().
packages/web/frontend/src/components/tools/AcousticViewer.tsx (+ test) — the dev page.
packages/web/frontend/src/main.tsx — /dev/acoustic route.

Reuse: the phonolex_audio server multipart pattern, audio.ts proxy + JSON-import patterns, PronunciationViewer dev-page scaffold, the cumulative-percentile formula.

Task 1: Build Hillenbrand norm tables

Files: Create research/2026-06-05-phon-130-acoustic/build_hillenbrand_norms.py; Create packages/web/workers/src/config/hillenbrandNorms.json

[ ] Step 1: Inspect the data format first. Read /Volumes/ExternalData2/audio-datasets/hillenbrand_et_al_1995/h95-alldata/{readme.txt,vowdata.dat} (head). Confirm: the speaker-id prefix encodes group (m=men, w=women, b=boys, g=girls), the vowel code column (e.g. ae ah aw eh ei er ih iy oa oo uh uw → IPA æ ɑ ɔ ɛ eɪ ɝ ɪ i oʊ ʊ ʌ u), and the F0/F1/F2/F3 steady-state columns (Hz; 0 = not measured → drop). Note the exact column indices.

[ ] Step 2: Implement the builder (build_hillenbrand_norms.py): parse each row → (group, vowel_ipa, {f0,f1,f2,f3}), drop unmeasured (0) values, and emit sorted ascending arrays per (vowel, group):

"""Hillenbrand 1995 -> hillenbrandNorms.json for the Worker percentile overlay.
{ "vowels": ["i","ɪ",...], "groups": ["men","women","boys","girls"],
  "table": { "<vowel>|<group>": { "f1": [sorted Hz...], "f2": [...], "f3": [...], "f0": [...] } } }
Run: uv run python build_hillenbrand_norms.py"""
import json
import re
from collections import defaultdict
from pathlib import Path
DAT = Path("/Volumes/ExternalData2/audio-datasets/hillenbrand_et_al_1995/h95-alldata/vowdata.dat")
OUT = Path(__file__).resolve().parents[2] / "packages/web/workers/src/config/hillenbrandNorms.json"
GROUP = {"m": "men", "w": "women", "b": "boys", "g": "girls"}
VOWEL = {"ae":"æ","ah":"ɑ","aw":"ɔ","eh":"ɛ","ei":"eɪ","er":"ɝ",
         "ih":"ɪ","iy":"i","oa":"oʊ","oo":"ʊ","uh":"ʌ","uw":"u"}
# Data rows start with an id such as 'm01ae': group letter, 2-digit speaker, vowel code.
ROW_ID = re.compile(r"^([mwbg])\d{2}([a-z]{2})$")
# ASSUMED column order (id, duration, F0, F1, F2, F3 steady-state, ...).
# Adapt these indices to the REAL vowdata.dat layout confirmed in Step 1.
COLS = {"f0": 2, "f1": 3, "f2": 4, "f3": 5}

def main():
    acc = defaultdict(lambda: {k: [] for k in COLS})
    for line in DAT.read_text().splitlines():
        parts = line.split()
        m = ROW_ID.match(parts[0]) if parts else None
        if not m or m.group(2) not in VOWEL:
            continue  # header text or blank line
        cell = acc[(VOWEL[m.group(2)], GROUP[m.group(1)])]
        for key, idx in COLS.items():
            value = float(parts[idx])
            if value > 0:  # 0 = not measured -> drop
                cell[key].append(value)
    table = {}
    for (vowel, group), d in acc.items():
        # keep empty arrays: the Worker overlay expects all four keys in every cell
        table[f"{vowel}|{group}"] = {k: sorted(v) for k, v in d.items()}
    OUT.write_text(json.dumps({"vowels": sorted(VOWEL.values()),
                               "groups": list(GROUP.values()), "table": table}))
    print(f"wrote {OUT}: {len(table)} (vowel,group) cells")

if __name__ == "__main__":
    main()

[ ] Step 3: Run + verify. cd research/2026-06-05-phon-130-acoustic && uv run python build_hillenbrand_norms.py. Confirm ~48 cells (12 vowels × 4 groups), then print a couple of medians and sanity-check them against the published Hillenbrand means, which they should roughly track: e.g. men's /i/ (iy) F1 ≈ 340 Hz, F2 ≈ 2320 Hz; women's /ɑ/ (ah) F1 ≈ 940 Hz.
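The median check in Step 3 could be scripted along these lines. This is a minimal sketch, assuming the JSON schema described in Step 2; `cell_medians`, `load_norms`, and the `toy` dict are hypothetical names used only for illustration, and the toy numbers are made up rather than taken from Hillenbrand.

```python
import json
from pathlib import Path
from statistics import median


def load_norms(path: Path) -> dict:
    """Read the generated hillenbrandNorms.json."""
    return json.loads(path.read_text(encoding="utf-8"))


def cell_medians(norms: dict, vowel: str, group: str) -> dict:
    """Median of each measure in one (vowel, group) cell; None where the array is empty."""
    cell = norms["table"][f"{vowel}|{group}"]
    return {k: (median(v) if v else None) for k, v in cell.items()}


# Toy stand-in for the real file (made-up values, same shape as the Step 2 schema).
toy = {"table": {"i|men": {"f1": [300, 320, 340, 360, 380],
                           "f2": [2200.0, 2300.0], "f3": [], "f0": [120]}}}
print(cell_medians(toy, "i", "men"))
```

Against the real file, something like `cell_medians(load_norms(OUT), "i", "men")` would give the values to compare with the published figures.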

[ ] Step 4: Commit (JSON is a Tier-A derived statistic — ships):

git add research/2026-06-05-phon-130-acoustic/build_hillenbrand_norms.py packages/web/workers/src/config/hillenbrandNorms.json
git commit -m "data(phon-130): Hillenbrand vowel-formant norm tables (hillenbrandNorms.json)"

Task 2: In-Worker percentile overlay (`acousticOverlay.ts`)

Files: Create packages/web/workers/src/lib/acousticOverlay.ts; Create packages/web/workers/src/__tests__/acousticOverlay.test.ts

[ ] Step 1: Failing test

import { describe, it, expect } from 'vitest';
import { percentile, overlayFor, type HillenbrandNorms } from '../lib/acousticOverlay';

const NORMS: HillenbrandNorms = {
  vowels: ['i'], groups: ['men'],
  table: { 'i|men': { f1: [300, 320, 340, 360, 380], f2: [2200, 2220, 2240, 2260, 2280], f3: [], f0: [] } },
};

describe('percentile (cumulative bisect_right/N*100)', () => {
  it('is 0 below all, 100 at/above max, ~mid in the middle', () => {
    expect(percentile(290, [300, 320, 340, 360, 380])).toBe(0);
    expect(percentile(340, [300, 320, 340, 360, 380])).toBe(60); // bisect_right=3 -> 3/5*100
    expect(percentile(999, [300, 320, 340, 360, 380])).toBe(100);
  });
  it('empty norm array -> null (unmeasured)', () => {
    expect(percentile(340, [])).toBeNull();
  });
});

describe('overlayFor', () => {
  it('returns per-measure percentiles for the (vowel, group) cell', () => {
    const o = overlayFor({ f1: 340, f2: 2240, f3: 3000, f0: 120 }, 'i', 'men', NORMS);
    expect(o.f1).toBe(60);
    expect(o.f2).toBe(60);
    expect(o.f3).toBeNull(); // empty norms
  });
  it('missing cell -> all null', () => {
    const o = overlayFor({ f1: 340, f2: 2240, f3: 3000, f0: 120 }, 'u', 'boys', NORMS);
    expect(o).toEqual({ f1: null, f2: null, f3: null, f0: null });
  });
});

[ ] Step 2: Run, expect FAIL. cd packages/web/workers && npx vitest run src/__tests__/acousticOverlay.test.ts

[ ] Step 3: Implement

export interface HillenbrandNorms {
  vowels: string[]; groups: string[];
  table: Record<string, { f1: number[]; f2: number[]; f3: number[]; f0: number[] }>;
}
export interface Overlay { f1: number | null; f2: number | null; f3: number | null; f0: number | null; }

/** cumulative percentile: bisect_right(sorted, v) / N * 100. null if no norms or no measured value. */
export function percentile(v: number | null, sorted: number[]): number | null {
  // a null/NaN steady-state value (unvoiced or failed extraction) has no percentile
  if (v === null || !Number.isFinite(v) || sorted.length === 0) return null;
  let lo = 0, hi = sorted.length;
  while (lo < hi) { const m = (lo + hi) >> 1; if (sorted[m] <= v) lo = m + 1; else hi = m; }
  return (lo / sorted.length) * 100;
}

export function overlayFor(
  vals: { f1: number | null; f2: number | null; f3: number | null; f0: number | null },
  vowel: string, group: string, norms: HillenbrandNorms,
): Overlay {
  const cell = norms.table[`${vowel}|${group}`];
  if (!cell) return { f1: null, f2: null, f3: null, f0: null };
  return {
    f1: percentile(vals.f1, cell.f1), f2: percentile(vals.f2, cell.f2),
    f3: percentile(vals.f3, cell.f3), f0: percentile(vals.f0, cell.f0),
  };
}

[ ] Step 4: Run, expect PASS + npx tsc --noEmit.
[ ] Step 5: Fixture pin — add a small research/2026-06-05-phon-130-acoustic/percentile_fixture.py that computes bisect_right-based percentiles for a few (value, group, vowel) cases from hillenbrandNorms.json and a TS test asserting percentile/overlayFor match (the PHON-126/142 pattern). Confirms the Worker matches Python.
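The fixture script from Step 5 might look roughly like the following. It is a sketch under stated assumptions: `build_cases` and the row shape (`vowel`, `group`, `measure`, `value`, `expected`) are hypothetical, the `toy` table reuses the made-up numbers from the Step 1 test, and the real script would read `hillenbrandNorms.json` instead.

```python
from __future__ import annotations

import json
from bisect import bisect_right


def percentile(value: float, sorted_norms: list) -> float | None:
    """Cumulative percentile: bisect_right(sorted, v) / N * 100; None if no norms."""
    if not sorted_norms:
        return None
    return bisect_right(sorted_norms, value) / len(sorted_norms) * 100


def build_cases(norms: dict, probes: list) -> list:
    """One fixture row per (vowel, group, measure, value) probe."""
    rows = []
    for vowel, group, measure, value in probes:
        cell = norms["table"].get(f"{vowel}|{group}", {})  # missing cell -> no norms
        rows.append({"vowel": vowel, "group": group, "measure": measure,
                     "value": value,
                     "expected": percentile(value, cell.get(measure, []))})
    return rows


# Toy table (made-up values); the real fixture would load hillenbrandNorms.json.
toy = {"table": {"i|men": {"f1": [300, 320, 340, 360, 380], "f2": [], "f3": [], "f0": []}}}
cases = build_cases(toy, [("i", "men", "f1", 340),
                          ("i", "men", "f2", 2240),
                          ("u", "boys", "f1", 340)])
print(json.dumps(cases, ensure_ascii=False, indent=2))
```

The TS side would then load the emitted rows and assert that `percentile` / `overlayFor` return the same `expected` value for each one.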

[ ] Step 6: Commit

git add packages/web/workers/src/lib/acousticOverlay.ts packages/web/workers/src/__tests__/acousticOverlay.test.ts
git commit -m "feat(phon-130): in-Worker Hillenbrand percentile overlay"

Task 3: Parselmouth extraction (`acoustic.py`)

Files: Modify packages/audio/pyproject.toml; Create packages/audio/src/phonolex_audio/acoustic.py; Create packages/audio/tests/test_acoustic.py

[ ] Step 1: Add the dep. In packages/audio/pyproject.toml add "praat-parselmouth" to dependencies. Run uv sync --package phonolex-audio (or the workspace equivalent); confirm uv run python -c "import parselmouth; print(parselmouth.__version__)".

[ ] Step 2: Implement extraction (acoustic.py): audio bytes → features. Steady state = median over the central 40% of the voiced region.

"""Parselmouth acoustic extraction for Model #4. F1-F3 track + steady-state, F0, duration."""
from __future__ import annotations
import os
import tempfile
import numpy as np
import parselmouth
from parselmouth.praat import call

# formant ceiling by group (Praat convention): men 5000, women/children 5500.
CEILING = {"men": 5000.0, "women": 5500.0, "boys": 5500.0, "girls": 5500.0}

def _load(audio_bytes: bytes) -> parselmouth.Sound:
    # parselmouth.Sound reads from a file path (or raw samples), not a file-like
    # object, so round-trip the uploaded wav bytes through a temporary file.
    with tempfile.NamedTemporaryFile(suffix=".wav", delete=False) as tmp:
        tmp.write(audio_bytes)
    try:
        return parselmouth.Sound(tmp.name)
    finally:
        os.unlink(tmp.name)

def _jsonable(track) -> list:
    # NaN is not valid JSON; undefined/unvoiced frames become null.
    return [None if np.isnan(x) else float(x) for x in track]

def extract(audio_bytes: bytes, group: str = "women") -> dict:
    snd = _load(audio_bytes)
    dur_ms = round(snd.get_total_duration() * 1000)
    ceiling = CEILING.get(group, 5500.0)
    formant = snd.to_formant_burg(max_number_of_formants=5, maximum_formant=ceiling)
    pitch = snd.to_pitch()
    ts = formant.ts()  # frame times
    f1 = [call(formant, "Get value at time", 1, t, "Hertz", "Linear") for t in ts]
    f2 = [call(formant, "Get value at time", 2, t, "Hertz", "Linear") for t in ts]
    f3 = [call(formant, "Get value at time", 3, t, "Hertz", "Linear") for t in ts]
    f0_track = [pitch.get_value_at_time(t) for t in ts]  # NaN where unvoiced
    voiced = ~np.isnan(np.array(f0_track, float))
    def steady(track):
        # median over the central 40% of the voiced frames where the measure is defined
        a = np.array(track, float); a = a[voiced & ~np.isnan(a)]
        if a.size == 0: return None
        lo, hi = int(a.size*0.3), int(a.size*0.7) or a.size
        return float(np.median(a[lo:hi] if hi > lo else a))
    return {
        "formants": {"f1": steady(f1), "f2": steady(f2), "f3": steady(f3),
                     "track": {"t": [float(t) for t in ts], "f1": _jsonable(f1),
                               "f2": _jsonable(f2), "f3": _jsonable(f3)}},
        "f0": {"value": steady(f0_track), "track": _jsonable(f0_track)},
        "duration_ms": dur_ms,
        "group": group,
    }

[ ] Step 3: Test (test_acoustic.py) against a real Hillenbrand stimulus wav (the dataset has wavs, or synthesize a steady tone): assert duration_ms > 0, formants.f1 is a plausible Hz value (200–1200), f0.value plausible (80–400). Use a known Hillenbrand /ɑ/ clip and assert F1 in a sane window. Run uv run python -m pytest packages/audio/tests/test_acoustic.py -v.
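For the "synthesize a steady tone" fallback in Step 3, a standard-library-only generator along these lines may be enough. It is a sketch, not part of the plan: `synth_vowel_like_wav` and its parameters are hypothetical, and because the signal is a bare harmonic complex with no vocal-tract resonances it is only suitable for the duration and F0 assertions. The formant-plausibility assertion still needs a real Hillenbrand clip.

```python
import io
import math
import struct
import wave


def synth_vowel_like_wav(f0_hz: float = 120.0, dur_s: float = 0.3, sr: int = 16000) -> bytes:
    """Mono 16-bit WAV bytes: the first 10 harmonics of f0 with 1/k amplitudes."""
    n = round(dur_s * sr)
    samples = []
    for i in range(n):
        t = i / sr
        samples.append(sum(math.sin(2 * math.pi * k * f0_hz * t) / k
                           for k in range(1, 11)))
    peak = max(abs(s) for s in samples) or 1.0
    # scale to 80% of full scale and pack as little-endian int16
    pcm = struct.pack(f"<{n}h", *(int(0.8 * 32767 * s / peak) for s in samples))
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(sr)
        w.writeframes(pcm)
    return buf.getvalue()
```

A test could then pass `synth_vowel_like_wav()` to `extract(...)` and check that `duration_ms` is close to 300 and that `f0.value` falls in the plausible window the step describes.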

[ ] Step 4: Commit

git add packages/audio/pyproject.toml packages/audio/src/phonolex_audio/acoustic.py packages/audio/tests/test_acoustic.py
git commit -m "feat(phon-130): Parselmouth F1-F3/F0/duration extraction"

Task 4: `/acoustic` server endpoint

Files: Modify packages/audio/src/phonolex_audio/server.py; Modify packages/audio/tests/test_server.py

[ ] Step 1: Implement. Add POST /acoustic to build_app (multipart audio + optional group form field; default women), calling acoustic.extract(bytes, group). Reuse the existing multipart validation. /health may add acoustic: true.
[ ] Step 2: Test (test_server.py, stub acoustic.extract via monkeypatch — no real Praat in CI): /acoustic returns the feature JSON; missing audio → 400. Run uv run python -m pytest packages/audio/tests/test_server.py -v.

[ ] Step 3: Commit

git add packages/audio/src/phonolex_audio/server.py packages/audio/tests/test_server.py
git commit -m "feat(phon-130): phonolex_audio /acoustic endpoint"

Task 5: `/api/audio/acoustic` Worker route + overlay

Files: Modify packages/web/workers/src/routes/audio.ts; Modify packages/web/workers/src/__tests__/audio.test.ts

[ ] Step 1: Implement. Add POST /api/audio/acoustic: multipart (audio, target_vowel, group), proxy to the host /acoustic (reuse the AUDIO_INFERENCE_URL + warming pattern from fetchTranscript), then:

import hillenbrandNorms from '../config/hillenbrandNorms.json';
import { overlayFor, type HillenbrandNorms } from '../lib/acousticOverlay';
// after extraction `ex`:
const steady = { f1: ex.formants.f1, f2: ex.formants.f2, f3: ex.formants.f3, f0: ex.f0.value };
const percentiles = overlayFor(steady, target_vowel, group, hillenbrandNorms as HillenbrandNorms);
return c.json({ ...ex, target_vowel, group, percentiles });

Handle null steady values (unvoiced/failed extraction) → percentiles null, 200 (descriptive, no throw).

[ ] Step 2: Test (mirror the pronounce tests; fetchMock the host /acoustic): validation 400s; a mocked extraction returns + the response carries percentiles. Run npx vitest run + npx tsc --noEmit.

[ ] Step 3: Commit

git add packages/web/workers/src/routes/audio.ts packages/web/workers/src/__tests__/audio.test.ts
git commit -m "feat(phon-130): /api/audio/acoustic proxy + percentile overlay"

Task 6: `analyzeAcoustic()` frontend service

Files: Create packages/web/frontend/src/services/acousticApi.ts; Create packages/web/frontend/src/services/acousticApi.test.ts

[ ] Step 1: Failing test — mirror audioApi.pronounce.test.ts: analyzeAcoustic(blob, 'i', 'men') posts multipart, returns the result; 503 → TranscriberWarmingError.
[ ] Step 2: Implement acousticApi.ts (mirror pronounceAudio): multipart POST /api/audio/acoustic with audio/target_vowel/group; AcousticResult type {formants, f0, duration_ms, percentiles, target_vowel, group}; reuse TranscriberWarmingError/freshRequestId/baseUrl.

[ ] Step 3: Run test + tsc. Commit

git add packages/web/frontend/src/services/acousticApi.ts packages/web/frontend/src/services/acousticApi.test.ts
git commit -m "feat(phon-130): analyzeAcoustic frontend service"

Task 7: `AcousticViewer` dev page

Files: Read PronunciationViewer.tsx first; Create packages/web/frontend/src/components/tools/AcousticViewer.tsx (+ test); Modify main.tsx

[ ] Step 1: Read PronunciationViewer.tsx — reuse its capture scaffold (record/upload/preloaded, warming-state discriminated union, the clip-injection test mechanism).
[ ] Step 2: Failing component test — render AcousticViewer, set target vowel + group, mock analyzeAcoustic to resolve a result with percentiles.f1=18, supply a clip via the file-upload mechanism, click Analyze, assert the F1 value + its percentile render. (Mirror PronunciationViewer.test.tsx's selectFile approach.)
[ ] Step 3: Implement AcousticViewer.tsx: capture controls (mirror) + target-vowel <Select> (12 vowels) + group <Select> (men/women/boys/girls) → Analyze → display F1–F3/F0/duration each with its percentile (a band: green if 10–90th pct, amber/red outside; null → "no norm"). Register <Route path="/dev/acoustic" element={<AcousticViewer />} /> in main.tsx.
[ ] Step 4: Frontend matrix — npx vitest run && npx tsc --noEmit && npm run build.

[ ] Step 5: Commit

git add packages/web/frontend/src/components/tools/AcousticViewer.tsx packages/web/frontend/src/components/tools/AcousticViewer.test.tsx packages/web/frontend/src/main.tsx
git commit -m "feat(phon-130): /dev/acoustic AcousticViewer page"

Task 8: Praat-parity validation + RESULTS

Files: Create research/2026-06-05-phon-130-acoustic/{validate_parity.py,RESULTS.md}

[ ] Step 1: Parity script — validate_parity.py: on a few Hillenbrand stimulus wavs, compare acoustic.extract's steady-state F1–F3 against Praat-direct (parselmouth call on the same settings, OR the published vowdata.dat measured values for that exact stimulus) — assert F1–F3 within ±10 Hz, F0 within ±2 Hz (the umbrella §6 gate). Report pass/fail per clip.
[ ] Step 2: Run it (needs the local stimulus wavs + Parselmouth). Confirm the parity gate.
[ ] Step 3: Write RESULTS.md — parity table + 2–3 example extractions with their Hillenbrand percentiles (a known /i/ from a man should land near its own group's 50th pct).
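The per-clip gate from Step 1 can be sketched as a small comparison helper. This is an illustration only: `parity_row` and its row shape are hypothetical, and `got` / `ref` stand for the steady-state values from `acoustic.extract` and from whichever reference (Praat-direct or vowdata.dat) is chosen. The numbers in the usage lines are made up.

```python
from __future__ import annotations

FORMANT_TOL_HZ = 10.0  # gate from Step 1: F1-F3 within +/-10 Hz
F0_TOL_HZ = 2.0        # gate from Step 1: F0 within +/-2 Hz


def parity_row(clip: str, got: dict, ref: dict) -> dict:
    """Compare one clip's extracted values against its reference values."""
    diffs = {}
    ok = True
    for key in ("f1", "f2", "f3", "f0"):
        g, r = got.get(key), ref.get(key)
        if g is None or r is None:
            diffs[key] = None  # a value missing on either side fails the clip
            ok = False
            continue
        diffs[key] = g - r
        tol = F0_TOL_HZ if key == "f0" else FORMANT_TOL_HZ
        ok = ok and abs(diffs[key]) <= tol
    return {"clip": clip, "diffs": diffs, "pass": ok}


# Made-up numbers, for illustration only.
ref = {"f1": 340, "f2": 2240, "f3": 3000, "f0": 120}
good = parity_row("clip_a", {"f1": 345, "f2": 2230, "f3": 3005, "f0": 121}, ref)
bad = parity_row("clip_b", {"f1": 345, "f2": 2230, "f3": 3005, "f0": 123}, ref)
print(good["pass"], bad["pass"])
```

Rows in this shape map directly onto the parity table in RESULTS.md, one line per clip with the signed differences and the pass/fail flag.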

[ ] Step 4: Commit

git add research/2026-06-05-phon-130-acoustic/{validate_parity.py,RESULTS.md}
git commit -m "research(phon-130): Praat-parity validation + RESULTS"

Task 9: Full matrix

[ ] From the repo root: (cd packages/web/workers && npx vitest run && npx tsc --noEmit); (cd packages/web/frontend && npx vitest run && npx tsc --noEmit && npm run build); uv run python -m pytest packages/audio/tests/.
[ ] Manual (optional): start phonolex_audio (now serving /acoustic) + worker + frontend; on /dev/acoustic upload a vowel, pick the vowel + group, confirm F1–F3/F0 + percentile bands render and a Hillenbrand-matched production lands near the 50th pct.

Self-Review

Spec coverage: §3.1 extraction → Task 3; §3.2 norms → Task 1; §3.3 Worker proxy + overlay → Tasks 2,5; §3.4 dev page → Tasks 6,7; §4 validation → Task 8; §2 scope (vowel core, target-vowel+group) → Tasks 5,7; out-of-scope (VOT/COG, judgment) honored. ✓

Placeholder scan: Task 1's parse + Task 3's stimulus assertion adapt to the real vowdata.dat layout / Parselmouth output (inspect-first noted) — inherent to data-dependent research code; the overlay (Task 2) and route/service/viewer (Tasks 5–7) are complete code or pinned tests.

Type/name consistency: HillenbrandNorms/Overlay/percentile/overlayFor consistent Tasks 1–5; AcousticResult {formants:{f1,f2,f3,track}, f0:{value,track}, duration_ms, percentiles, target_vowel, group} consistent across server/route/service/viewer; group strings men/women/boys/girls, the 12 vowel IPA symbols, consistent across the norm table, route, and selectors.
