Tools Overview

PhonoLex ships five text-based clinician-facing tools that share a constraint vocabulary (phoneme patterns, CV shapes, psycholinguistic-norm bounds, contrastive pairs) and operate over the same data spine (CMU lexicon, in-house norms, learned phoneme vectors, curated corpus). A sixth tool, Speech Analysis (Beta), adds audio-based production analysis over the same learned feature space.

Custom Word Lists

Build word lists by composing rules across the SLP-curated property space:

Phoneme patterns: STARTS_WITH, ENDS_WITH, CONTAINS, CONTAINS_MEDIAL — with include / exclude mode per rule
Property filters: ~150 psycholinguistic dimensions, surfaced as filterable sliders + bucket chips in five SLP-curated accordions (Word Shape, Age Appropriateness, Imagery & Familiarity, Emotional Tone, Word Frequency)
CV shape: filter by syllable structure (CV, CVC, CCVC, CV-CVC, ...)
Sound similarity: anchor on a target word and find phonetically similar candidates with adjustable onset / nucleus / coda weights
Picture cards: a "has picture card" rule restricts results to words with a shipped image (Mulberry Symbols + OpenMoji, CC BY-SA, ~1,700 words); matching results render their picture card
AND logic: all active rules apply jointly

Targets the ~47K canonical content-POS subset (NOUN / VERB / ADJ / ADV) by default.
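The rule composition above can be sketched in a few lines of Python. This is a minimal illustration, not PhonoLex's implementation: the `Rule` class, `filter_words`, `cv_shape`, and the toy lexicon are hypothetical names introduced here, phonemes are assumed to be ARPAbet symbols as in the CMU lexicon, "medial" is assumed to mean neither first nor last, and the CV helper ignores the syllable boundaries that shapes such as CV-CVC encode.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence

# ARPAbet vowel symbols (stress digits removed).
VOWELS = {"AA", "AE", "AH", "AO", "AW", "AY", "EH", "ER",
          "EY", "IH", "IY", "OW", "OY", "UH", "UW"}


def strip_stress(phone: str) -> str:
    """Drop the trailing stress digit CMU attaches to vowels (AE1 -> AE)."""
    return phone.rstrip("012")


def cv_shape(phones: Sequence[str]) -> str:
    """Flat consonant/vowel skeleton; syllable boundaries are not modeled."""
    return "".join("V" if strip_stress(p) in VOWELS else "C" for p in phones)


@dataclass(frozen=True)
class Rule:
    """Hypothetical rule record: a pattern kind, a phoneme, and a mode."""
    kind: str             # STARTS_WITH, ENDS_WITH, CONTAINS, CONTAINS_MEDIAL
    phoneme: str
    include: bool = True  # False means exclude mode

    def matches(self, phones: Sequence[str]) -> bool:
        p = [strip_stress(x) for x in phones]
        if self.kind == "STARTS_WITH":
            hit = bool(p) and p[0] == self.phoneme
        elif self.kind == "ENDS_WITH":
            hit = bool(p) and p[-1] == self.phoneme
        elif self.kind == "CONTAINS":
            hit = self.phoneme in p
        elif self.kind == "CONTAINS_MEDIAL":
            # Assumption: medial = any position except first and last.
            hit = self.phoneme in p[1:-1]
        else:
            raise ValueError(f"unknown rule kind: {self.kind}")
        return hit if self.include else not hit


def filter_words(lexicon: Dict[str, List[str]], rules: Sequence[Rule]) -> List[str]:
    """AND logic: a word is kept only if every active rule accepts it."""
    return [w for w, phones in lexicon.items()
            if all(r.matches(phones) for r in rules)]


# Toy lexicon for illustration only.
LEXICON = {
    "cat": ["K", "AE1", "T"],
    "tack": ["T", "AE1", "K"],
    "sky": ["S", "K", "AY1"],
    "kit": ["K", "IH1", "T"],
    "bacon": ["B", "EY1", "K", "AH0", "N"],
}

k_initial_t_final = [Rule("STARTS_WITH", "K"), Rule("ENDS_WITH", "T")]
print(filter_words(LEXICON, k_initial_t_final))                       # ['cat', 'kit']
print(filter_words(LEXICON, k_initial_t_final + [Rule("CONTAINS", "AE", include=False)]))  # ['kit']
print(cv_shape(LEXICON["sky"]))                                        # CCV
```

Modeling exclude mode as a negated match keeps every rule a plain predicate, so joint application reduces to a single `all(...)`.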

Text Analysis

Analyze passages with per-word property overlay:

Aggregate statistics for the active property across the passage
Click-to-open word profile for any word in the passage
Color-gradient highlighting by selected property (frequency, AoA, concreteness, etc.)
Coverage tracking — which words are in-vocab vs. out-of-vocab
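As a rough illustration of the aggregate statistics and coverage tracking listed above, the sketch below tokenizes a passage, looks each word up in a norm table, and reports coverage alongside summary values. The `analyze` function, the regex tokenizer, and the age-of-acquisition numbers are all placeholders invented for this example; they are not PhonoLex's tokenizer or norms.

```python
import re
from statistics import mean
from typing import Dict


def analyze(passage: str, norms: Dict[str, float]) -> dict:
    """Summarize one property over a passage and track vocabulary coverage."""
    tokens = re.findall(r"[a-z']+", passage.lower())
    values = [norms[t] for t in tokens if t in norms]
    out_of_vocab = sorted({t for t in tokens if t not in norms})
    return {
        "n_tokens": len(tokens),
        # Share of tokens that have a value for the active property.
        "coverage": len(values) / len(tokens) if tokens else 0.0,
        "mean": mean(values) if values else None,
        "min": min(values) if values else None,
        "max": max(values) if values else None,
        "out_of_vocab": out_of_vocab,
    }


# Invented placeholder values standing in for an AoA norm column.
TOY_AOA = {"the": 3.0, "dog": 3.0, "ran": 4.0, "home": 4.0}

report = analyze("The dog ran home quickly.", TOY_AOA)
print(report["coverage"], report["mean"], report["out_of_vocab"])
# 0.8 3.5 ['quickly']
```

Out-of-vocab words are reported rather than silently dropped, so the statistics can be read with the gaps in view.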

Contrast Sets

Browse evidence-based contrastive intervention sets across the ~642K minimal pairs in the lexicon:

Minimal pairs: two words that differ by exactly one phoneme at one position
Maximal opposition (Gierut 1989): a minimal pair whose contrasting phonemes also fall on opposite sides of the sonorant / obstruent class boundary
Multiple opposition (Williams 2000): one substitute phoneme contrasted with N target phonemes

Filter by position (initial / medial / final / any) and inspect per-pair feature distance + sonorant-difference. A "Picture pairs only" toggle restricts results to pairs where both members have picture-card imagery.
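The minimal-pair definition and the position filter can be made concrete with a short sketch. It assumes stress-free ARPAbet phoneme lists and a hand-written set of sonorant consonants; the function names are hypothetical, and the learned feature distance the tool reports is not modeled here.

```python
from typing import Optional, Sequence

# Assumption: ARPAbet symbols without stress digits.
VOWELS = {"AA", "AE", "AH", "AO", "AW", "AY", "EH", "ER",
          "EY", "IH", "IY", "OW", "OY", "UH", "UW"}
# Nasals, liquids, and glides; vowels are sonorant as well.
SONORANTS = {"M", "N", "NG", "L", "R", "W", "Y"} | VOWELS


def minimal_pair_index(a: Sequence[str], b: Sequence[str]) -> Optional[int]:
    """Index of the single differing phoneme, or None if not a minimal pair."""
    if len(a) != len(b):
        return None
    diffs = [i for i, (x, y) in enumerate(zip(a, b)) if x != y]
    return diffs[0] if len(diffs) == 1 else None


def position_label(index: int, length: int) -> str:
    """Map an index to initial / medial / final (index 0 counts as initial)."""
    if index == 0:
        return "initial"
    if index == length - 1:
        return "final"
    return "medial"


def crosses_sonorant_class(x: str, y: str) -> bool:
    """True when exactly one of the two phonemes is a sonorant."""
    return (x in SONORANTS) != (y in SONORANTS)


bat, mat, pat, bad = ["B", "AE", "T"], ["M", "AE", "T"], ["P", "AE", "T"], ["B", "AE", "D"]

i = minimal_pair_index(bat, mat)
print(i, position_label(i, len(bat)), crosses_sonorant_class(bat[i], mat[i]))
# 0 initial True
i = minimal_pair_index(bat, pat)
print(i, position_label(i, len(bat)), crosses_sonorant_class(bat[i], pat[i]))
# 0 initial False
```

Under these assumptions bat / mat would qualify as a sonorant-crossing pair while bat / pat would not, since B and P are both obstruents.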

Lookup

Per-word profile surface:

Full phonological + psycholinguistic profile (~150 norm columns where available)
Picture card where imagery exists (Mulberry Symbols / OpenMoji)
Phoneme inventory with the 26-d learned feature vectors
Phoneme-by-phoneme feature comparison
Neighboring words via Qwensim semantic similarity (~1.6M edges)
Phonologically-similar words via soft-Levenshtein on learned vectors
Per-property percentile ranks
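One plausible reading of "soft-Levenshtein on learned vectors" is an edit distance whose substitution cost shrinks as two phonemes get closer in feature space. The sketch below follows that reading under stated assumptions: the substitution cost is Euclidean distance capped at 1.0, insertions and deletions cost 1.0, and the 2-d vectors are invented stand-ins for the 26-d learned vectors. The actual cost function PhonoLex uses may differ.

```python
import math
from typing import Dict, Sequence, Tuple

Vector = Tuple[float, ...]


def sub_cost(u: Vector, v: Vector) -> float:
    """Assumed substitution cost: Euclidean distance, capped at 1.0."""
    return min(1.0, math.dist(u, v))


def soft_levenshtein(a: Sequence[str], b: Sequence[str],
                     vectors: Dict[str, Vector], indel: float = 1.0) -> float:
    """Edit distance where similar phonemes are cheap to substitute."""
    prev = [j * indel for j in range(len(b) + 1)]
    for i, pa in enumerate(a, start=1):
        cur = [i * indel]
        for j, pb in enumerate(b, start=1):
            cur.append(min(
                prev[j] + indel,                                   # delete pa
                cur[j - 1] + indel,                                # insert pb
                prev[j - 1] + sub_cost(vectors[pa], vectors[pb]),  # substitute
            ))
        prev = cur
    return prev[-1]


# Invented 2-d vectors, for illustration only.
TOY_VECTORS = {
    "P": (0.0, 0.0),
    "B": (0.0, 0.25),
    "S": (1.0, 1.0),
    "T": (0.5, 0.0),
    "AE": (3.0, 0.0),
}

pat, bat, sat = ["P", "AE", "T"], ["B", "AE", "T"], ["S", "AE", "T"]
print(soft_levenshtein(pat, bat, TOY_VECTORS))  # 0.25
print(soft_levenshtein(pat, sat, TOY_VECTORS))  # 1.0
```

With these toy vectors, pat sits closer to bat than to sat, which is the graded behavior a plain Levenshtein distance (1 in both cases) cannot express.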

Targets the full ~125K CMU-phonology vocabulary.

Sentences

Retrieve naturalistic English sentences that satisfy the same constraint vocabulary as the other tools, drawn from a curated ~236K-sentence corpus (CoLA, UD-EWT, GUM, Tatoeba, OpenSubtitles) gated for SLP suitability:

Phoneme patterns / CV shapes / norm bounds: same rules as Custom Word Lists, applied at the sentence level
Contrastive pairs: the sentence must contain BOTH members of at least one minimal-pair, maximal-opposition, or multiple-opposition witness
Per-result word-highlight overlay: pair witnesses and include-rule hits are underlined; click any word to open its profile
Tiered ranking: sentences with more constraint-satisfying words rank higher, with rarity as the within-tier tiebreaker
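The tiered ranking can be pictured as a two-key sort. In this sketch the tier is the number of constraint-satisfying words in the sentence, and rarity breaks ties within a tier. Two things are assumptions made for illustration: that a higher rarity score sorts first, and that rarity arrives as a precomputed number. The `Candidate` class, `count_hits`, and the scores are hypothetical.

```python
import re
from dataclasses import dataclass
from typing import List, Set


@dataclass
class Candidate:
    text: str
    rarity: float  # hypothetical precomputed score; higher = rarer
    hits: int = 0  # filled in by rank()


def count_hits(text: str, target_words: Set[str]) -> int:
    """Number of tokens in the sentence that satisfy the active constraints."""
    return sum(1 for tok in re.findall(r"[a-z']+", text.lower())
               if tok in target_words)


def rank(candidates: List[Candidate], target_words: Set[str]) -> List[Candidate]:
    for c in candidates:
        c.hits = count_hits(c.text, target_words)
    # Primary key: more hits first. Secondary key: rarer first (assumed).
    return sorted(candidates, key=lambda c: (-c.hits, -c.rarity))


targets = {"cat", "sat", "mat"}
pool = [
    Candidate("The cat sat.", rarity=0.2),
    Candidate("A cat on a mat sat.", rarity=0.1),
    Candidate("The mat sat.", rarity=0.5),
]
for c in rank(pool, targets):
    print(c.hits, c.rarity, c.text)
# 3 0.1 A cat on a mat sat.
# 2 0.5 The mat sat.
# 2 0.2 The cat sat.
```

Because the hit count is the first sort key, a sentence in a lower tier never outranks one in a higher tier, however rare it is.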

See the Sentences guide for the full constraint catalog.

Speech Analysis (Beta)

Record or upload a spoken production of a target word and get back a faithful narrow transcript with a per-position deviation overlay. A session of several productions adds a secondary read on the speaker's overall pattern:

Faithful transcript + deviation overlay: what was actually said, aligned to the target, with each position graded by how far it drifted, the nearest sound shown, and substitutions flagged
Record / upload / batch: single clips or a verified, editable batch of files; targets are checked against the lexicon
Source attribution (bonus): whether the session's pattern resembles typical, accent-related, developmental, or motor speech; the estimate sharpens with more productions
Decision support, clinician-in-the-loop: structured evidence for interpretation, not a diagnosis (Beta)
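To give a feel for what a per-position overlay contains, the sketch below aligns a target phoneme sequence with a produced one and labels each position as a match, substitution, deletion, or insertion. It is a deliberately simplified stand-in: it uses Python's standard `difflib.SequenceMatcher` on symbolic phonemes, whereas the tool works from audio and grades drift in the learned feature space. The `deviation_overlay` function and the example productions are hypothetical.

```python
from difflib import SequenceMatcher
from typing import List, Optional, Sequence, Tuple

Cell = Tuple[Optional[str], Optional[str], str]  # (target, produced, label)


def deviation_overlay(target: Sequence[str], produced: Sequence[str]) -> List[Cell]:
    """Align produced phonemes to the target and label each position."""
    overlay: List[Cell] = []
    matcher = SequenceMatcher(a=list(target), b=list(produced), autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        t_seg, p_seg = list(target[i1:i2]), list(produced[j1:j2])
        if tag == "equal":
            overlay += [(t, t, "match") for t in t_seg]
        elif tag == "delete":
            overlay += [(t, None, "deletion") for t in t_seg]
        elif tag == "insert":
            overlay += [(None, p, "insertion") for p in p_seg]
        else:  # "replace": pair items up, then handle any leftover length
            for k in range(max(len(t_seg), len(p_seg))):
                t = t_seg[k] if k < len(t_seg) else None
                p = p_seg[k] if k < len(p_seg) else None
                if t is not None and p is not None:
                    overlay.append((t, p, "substitution"))
                elif t is not None:
                    overlay.append((t, None, "deletion"))
                else:
                    overlay.append((None, p, "insertion"))
    return overlay


# Target "rabbit" produced with W for R; target "stop" produced without S.
print(deviation_overlay(["R", "AE", "B", "AH", "T"], ["W", "AE", "B", "AH", "T"])[0])
# ('R', 'W', 'substitution')
print(deviation_overlay(["S", "T", "AA", "P"], ["T", "AA", "P"])[0])
# ('S', None, 'deletion')
```

A graded version would replace the categorical labels with a distance between the target and produced sounds at each aligned position.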

See the Speech Analysis guide and the Audio Model reference.