Tools Overview
PhonoLex ships five clinician-facing tools that share a constraint vocabulary (phoneme patterns, CV shapes, psycholinguistic-norm bounds, contrastive pairs) and operate over the same data spine (CMU lexicon, in-house norms, learned phoneme vectors, curated corpus).
Custom Word Lists
Build word lists by composing rules across the property space curated by speech-language pathologists (SLPs):
- Phoneme patterns: STARTS_WITH, ENDS_WITH, CONTAINS, CONTAINS_MEDIAL — with include / exclude mode per rule
- Property filters: ~150 psycholinguistic dimensions, surfaced as filterable sliders + bucket chips in five SLP-curated accordions (Word Shape, Age Appropriateness, Imagery & Familiarity, Emotional Tone, Word Frequency)
- CV shape: filter by syllable structure (CV, CVC, CCVC, CV-CVC, ...)
- Sound similarity: anchor on a target word and find phonetically similar candidates with adjustable onset / nucleus / coda weights
- AND logic: all active rules apply jointly
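As a rough illustration of how these rules could compose, the sketch below models each phoneme rule as a small object with an include / exclude flag and keeps a word only when every active rule passes (AND logic). The `PhonemeRule` class, the `filter_words` and `cv_shape` helpers, the toy lexicon, and the reading of CONTAINS_MEDIAL as "any position except the first and last" are assumptions made for this example, not PhonoLex's actual API. The CV-shape helper also ignores syllable boundaries, so it cannot tell CV-CVC apart from CVCVC.

```python
from dataclasses import dataclass

# ARPAbet vowel symbols (stress digits stripped), as used by the CMU lexicon.
VOWELS = {"AA", "AE", "AH", "AO", "AW", "AY", "EH", "ER",
          "EY", "IH", "IY", "OW", "OY", "UH", "UW"}


@dataclass(frozen=True)
class PhonemeRule:
    """Hypothetical rule object: a pattern kind, a phoneme, and a mode."""
    kind: str             # STARTS_WITH, ENDS_WITH, CONTAINS, CONTAINS_MEDIAL
    phoneme: str          # ARPAbet symbol without a stress digit
    include: bool = True  # False flips the rule into exclude mode

    def matches(self, phonemes):
        if self.kind == "STARTS_WITH":
            hit = bool(phonemes) and phonemes[0] == self.phoneme
        elif self.kind == "ENDS_WITH":
            hit = bool(phonemes) and phonemes[-1] == self.phoneme
        elif self.kind == "CONTAINS":
            hit = self.phoneme in phonemes
        elif self.kind == "CONTAINS_MEDIAL":
            # Assumption: "medial" means neither the first nor the last slot.
            hit = self.phoneme in phonemes[1:-1]
        else:
            raise ValueError(f"unknown rule kind: {self.kind}")
        return hit if self.include else not hit


def cv_shape(phonemes):
    """Map a phoneme sequence to a flat CV string, e.g. K AE T -> CVC."""
    return "".join("V" if p in VOWELS else "C" for p in phonemes)


def filter_words(lexicon, rules, shape=None):
    """Keep words that satisfy every rule (AND) and the optional CV shape."""
    return [
        word
        for word, phonemes in lexicon.items()
        if all(rule.matches(phonemes) for rule in rules)
        and (shape is None or cv_shape(phonemes) == shape)
    ]


# Toy lexicon for illustration only.
TOY_LEXICON = {
    "cat": ["K", "AE", "T"],
    "kit": ["K", "IH", "T"],
    "sky": ["S", "K", "AY"],
    "back": ["B", "AE", "K"],
    "bacon": ["B", "EY", "K", "AH", "N"],
}

starts_k_without_ae = [
    PhonemeRule("STARTS_WITH", "K"),
    PhonemeRule("CONTAINS", "AE", include=False),
]
print(filter_words(TOY_LEXICON, starts_k_without_ae))
```

Keeping the mode on the rule itself means an exclude rule is simply the negation of its include form, so the joint AND over all rules needs no special cases.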
Targets the ~47K canonical content-POS subset (NOUN / VERB / ADJ / ADV) by default.
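The adjustable onset / nucleus / coda weights of the sound-similarity rule can be pictured with a very small sketch. It assumes monosyllabic words, splits each into onset, nucleus, and coda at the first vowel, and scores a component as matching only when it is identical. The helper names and the exact-match scoring are assumptions for illustration; the tool itself presumably compares components with its learned phoneme vectors rather than by equality.

```python
# ARPAbet vowel symbols (stress digits stripped).
VOWELS = {"AA", "AE", "AH", "AO", "AW", "AY", "EH", "ER",
          "EY", "IH", "IY", "OW", "OY", "UH", "UW"}


def split_syllable(phonemes):
    """Split a monosyllable into (onset, nucleus, coda) at its first vowel."""
    for index, phoneme in enumerate(phonemes):
        if phoneme in VOWELS:
            return (tuple(phonemes[:index]), phoneme, tuple(phonemes[index + 1:]))
    raise ValueError("no vowel found in phoneme sequence")


def weighted_similarity(a, b, w_onset=1.0, w_nucleus=1.0, w_coda=1.0):
    """Weighted share of syllable components that are identical in a and b."""
    weights = (w_onset, w_nucleus, w_coda)
    total = sum(weights)
    if total <= 0:
        raise ValueError("at least one weight must be positive")
    matched = sum(
        weight
        for weight, part_a, part_b in zip(weights, split_syllable(a), split_syllable(b))
        if part_a == part_b
    )
    return matched / total


cat = ["K", "AE", "T"]
bat = ["B", "AE", "T"]
cap = ["K", "AE", "P"]

# Equal weights: bat and cap each share two of three components with cat.
print(weighted_similarity(cat, bat), weighted_similarity(cat, cap))
# Ignoring the onset makes cat and bat look identical (rhyme-style matching).
print(weighted_similarity(cat, bat, w_onset=0.0))
```

Raising one weight relative to the others shifts which candidates look closest to the anchor: a heavy coda weight favors rhymes, a heavy onset weight favors alliterative candidates.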
Text Analysis
Analyze passages with per-word property overlay:
- Aggregate statistics for the active property across the passage
- Click-to-open word profile for any word in the passage
- Color-gradient highlighting by selected property (frequency, AoA, concreteness, etc.)
- Coverage tracking — which words are in-vocab vs. out-of-vocab
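A minimal sketch of the coverage and aggregate-statistics ideas above, assuming a norms table shaped as a dict of per-word property dicts. The `analyze_passage` function, the tokenizer, and the numbers in `TOY_NORMS` are invented for illustration and are not real norm values or the tool's actual interface.

```python
import re
from statistics import mean


def analyze_passage(text, norms, prop):
    """Summarize one property over a passage and report vocabulary coverage."""
    tokens = re.findall(r"[a-z']+", text.lower())
    in_vocab = [t for t in tokens if t in norms]
    out_of_vocab = sorted({t for t in tokens if t not in norms})
    # Only words that actually carry a value for this property contribute.
    values = [norms[t][prop] for t in in_vocab if prop in norms[t]]
    return {
        "coverage": len(in_vocab) / len(tokens) if tokens else 0.0,
        "out_of_vocab": out_of_vocab,
        "mean": mean(values) if values else None,
        "min": min(values) if values else None,
        "max": max(values) if values else None,
    }


# Made-up values, used only to exercise the function.
TOY_NORMS = {
    "the": {"aoa": 2.0},
    "dog": {"aoa": 3.0},
    "ran": {"aoa": 4.0},
}

summary = analyze_passage("The dog ran zoomily.", TOY_NORMS, "aoa")
print(summary)
```

Reporting out-of-vocab words separately, rather than silently skipping them, keeps the aggregate honest: a mean computed over three of four words is labeled as such through the coverage figure.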
Contrast Sets
Browse evidence-based contrastive intervention sets across the ~642K minimal pairs in the lexicon:
- Minimal pairs: two words differing by one phoneme at one position
- Maximal opposition (Gierut 1989): a minimal pair whose two contrasting phonemes also differ in sonorant class (one sonorant, one obstruent)
- Multiple opposition (Williams 2000): one substitute phoneme contrasted with N target phonemes
Filter by position (initial / medial / final / any) and inspect per-pair feature distance and sonorant difference.
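The definitions above can be sketched in a few lines: two equal-length phoneme sequences form a minimal pair when they differ in exactly one slot, the slot index gives the position, and the sonorant difference records whether the contrast crosses the sonorant / obstruent boundary. The function name and return shape are hypothetical, and feature distance is left out because it depends on the learned vectors, which are not reproduced here.

```python
# ARPAbet vowels and sonorant consonants (stress digits stripped).
VOWELS = {"AA", "AE", "AH", "AO", "AW", "AY", "EH", "ER",
          "EY", "IH", "IY", "OW", "OY", "UH", "UW"}
SONORANT_CONSONANTS = {"M", "N", "NG", "L", "R", "W", "Y"}
SONORANTS = VOWELS | SONORANT_CONSONANTS


def minimal_pair_contrast(a, b):
    """Describe the contrast if a and b are a minimal pair, else return None."""
    if len(a) != len(b):
        return None
    diffs = [i for i, (x, y) in enumerate(zip(a, b)) if x != y]
    if len(diffs) != 1:
        return None
    index = diffs[0]
    if index == 0:
        position = "initial"
    elif index == len(a) - 1:
        position = "final"
    else:
        position = "medial"
    return {
        "position": position,
        "phonemes": (a[index], b[index]),
        # True when exactly one of the two phonemes is a sonorant.
        "sonorant_difference": (a[index] in SONORANTS) != (b[index] in SONORANTS),
    }


mat, sat = ["M", "AE", "T"], ["S", "AE", "T"]
pat, bat = ["P", "AE", "T"], ["B", "AE", "T"]

print(minimal_pair_contrast(mat, sat))  # contrast crosses the sonorant class
print(minimal_pair_contrast(pat, bat))  # both phonemes are obstruents
```

Under the definition used on this page, the first pair would qualify as a maximal-opposition candidate and the second would remain a plain minimal pair.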
Lookup
A per-word profile view:
- Full phonological + psycholinguistic profile (~150 norm columns where available)
- Phoneme inventory with the 26-d learned feature vectors
- Phoneme-by-phoneme feature comparison
- Neighboring words via Qwensim semantic similarity (~1.6M edges)
- Phonologically-similar words via soft-Levenshtein on learned vectors
- Per-property percentile ranks
Targets the full ~125K CMU-phonology vocabulary.
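One way to read "soft-Levenshtein on learned vectors" is an edit distance whose substitution cost shrinks as two phonemes' vectors get closer, so that swapping similar sounds costs less than swapping dissimilar ones. The sketch below follows that reading under several assumptions: cosine distance capped at 1.0 as the substitution cost, a flat cost of 1.0 for insertions and deletions, and hand-made 3-d toy vectors standing in for the 26-d learned ones. None of these choices is confirmed by the source.

```python
import math


def cosine_distance(u, v):
    """1 - cosine similarity of two non-zero vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return 1.0 - dot / (norm_u * norm_v)


def soft_levenshtein(a, b, vectors, indel=1.0):
    """Edit distance with vector-based substitution costs in [0, 1]."""
    previous = [j * indel for j in range(len(b) + 1)]
    for i, phoneme_a in enumerate(a, start=1):
        current = [i * indel]
        for j, phoneme_b in enumerate(b, start=1):
            if phoneme_a == phoneme_b:
                substitution = 0.0
            else:
                substitution = min(
                    1.0, cosine_distance(vectors[phoneme_a], vectors[phoneme_b])
                )
            current.append(min(
                previous[j] + indel,              # delete phoneme_a
                current[j - 1] + indel,           # insert phoneme_b
                previous[j - 1] + substitution,   # substitute or match
            ))
        previous = current
    return previous[-1]


# Toy vectors for illustration only; P and B are deliberately placed close.
TOY_VECTORS = {
    "P": (1.0, 0.0, 0.0),
    "B": (1.0, 1.0, 0.0),
    "M": (0.0, 0.0, 1.0),
    "AE": (0.0, 1.0, 1.0),
    "T": (1.0, 0.0, 1.0),
}

pat = ["P", "AE", "T"]
print(soft_levenshtein(pat, ["B", "AE", "T"], TOY_VECTORS))
print(soft_levenshtein(pat, ["M", "AE", "T"], TOY_VECTORS))
```

With these toy vectors, "bat" lands closer to "pat" than "mat" does, which is the behavior a plain Levenshtein distance (cost 1 for any substitution) cannot express.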
Sentences
Retrieve naturalistic English sentences that satisfy the same constraint vocabulary as the other tools, drawn from a curated ~236K-sentence corpus (CoLA, UD-EWT, GUM, Tatoeba, OpenSubtitles) gated for SLP suitability:
- Phoneme patterns / CV shapes / norm bounds: the same rules as Custom Word Lists, applied at the sentence level
- Contrastive pairs: the sentence must contain both members of at least one minimal-pair, maximal-opposition, or multiple-opposition witness
- Per-result word-highlight overlay: pair witnesses and include-rule hits are underlined; click any word to open its profile
- Tiered ranking: sentences with more constraint-satisfying words rank higher, with rarity as the within-tier tiebreaker
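The pair-witness check and the tiered ranking can be sketched as follows. The function names, the whitespace tokenizer, and the toy rarity scores are hypothetical. The sketch also assumes that a higher rarity score ranks first within a tier, which the description above does not specify, and it uses a spelling-based hit test as a stand-in for a real phoneme rule.

```python
def contains_pair(tokens, pairs):
    """True if the tokens include both members of at least one pair."""
    present = set(tokens)
    return any(first in present and second in present for first, second in pairs)


def rank_sentences(sentences, is_hit, rarity):
    """Order sentences by hit count, then by rarity score, both descending."""
    def sort_key(sentence):
        hits = sum(1 for token in sentence.lower().split() if is_hit(token))
        # Negate both values so that larger counts and scores sort first.
        return (-hits, -rarity.get(sentence, 0.0))
    return sorted(sentences, key=sort_key)


# Toy data for illustration only; the rarity scores are invented.
CANDIDATES = ["sam sat", "the sun set", "sue saw the sea"]
TOY_RARITY = {"sam sat": 0.2, "the sun set": 0.9, "sue saw the sea": 0.5}


def starts_with_s(token):
    """Spelling-based stand-in for a STARTS_WITH phoneme rule."""
    return token.startswith("s")


ranked = rank_sentences(CANDIDATES, starts_with_s, TOY_RARITY)
print(ranked)
print(contains_pair("the cat sat on a mat".split(), [("cat", "mat")]))
```

Sorting on a two-part key keeps the tiers strict: no rarity score, however high, can lift a sentence above one with more constraint-satisfying words.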
See the Sentences guide for the full constraint catalog.