Tools Overview

PhonoLex ships five clinician-facing tools that share a constraint vocabulary (phoneme patterns, CV shapes, psycholinguistic-norm bounds, contrastive pairs) and operate over the same data spine (CMU lexicon, in-house norms, learned phoneme vectors, curated corpus).

Custom Word Lists

Build word lists by composing rules across the SLP-curated property space:

  • Phoneme patterns: STARTS_WITH, ENDS_WITH, CONTAINS, CONTAINS_MEDIAL — with include / exclude mode per rule
  • Property filters: ~150 psycholinguistic dimensions, exposed as sliders and bucket chips grouped into five SLP-curated accordions (Word Shape, Age Appropriateness, Imagery & Familiarity, Emotional Tone, Word Frequency)
  • CV shape: filter by syllable structure (CV, CVC, CCVC, CV-CVC, ...)
  • Sound similarity: anchor on a target word and find phonetically similar candidates with adjustable onset / nucleus / coda weights
  • AND logic: all active rules apply jointly

Targets the ~47K canonical content-POS subset (NOUN / VERB / ADJ / ADV) by default.
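
The rule composition described above can be sketched in a few lines of Python. This is a minimal illustration, not PhonoLex's implementation: `PatternRule`, `cv_shape`, `build_word_list`, and the five-word `LEXICON` are hypothetical names and data invented for the example. It assumes ARPAbet transcriptions in the CMU style and ignores syllable boundaries, so a shape like CV-CVC would appear here as CVCVC.

```python
from dataclasses import dataclass

# ARPAbet vowel symbols (stress digits removed), as used in CMU-style transcriptions.
VOWELS = {"AA", "AE", "AH", "AO", "AW", "AY", "EH", "ER",
          "EY", "IH", "IY", "OW", "OY", "UH", "UW"}


def strip_stress(phoneme):
    """Drop the trailing stress digit: 'AE1' -> 'AE'."""
    return phoneme.rstrip("012")


def cv_shape(phonemes):
    """Map a phoneme sequence to a flat C/V string (no syllable boundaries)."""
    return "".join("V" if strip_stress(p) in VOWELS else "C" for p in phonemes)


@dataclass(frozen=True)
class PatternRule:
    """Hypothetical rule: one pattern kind, one phoneme, include or exclude mode."""
    kind: str            # STARTS_WITH, ENDS_WITH, CONTAINS, CONTAINS_MEDIAL
    phoneme: str
    exclude: bool = False

    def matches(self, phonemes):
        ps = [strip_stress(p) for p in phonemes]
        if self.kind == "STARTS_WITH":
            hit = bool(ps) and ps[0] == self.phoneme
        elif self.kind == "ENDS_WITH":
            hit = bool(ps) and ps[-1] == self.phoneme
        elif self.kind == "CONTAINS":
            hit = self.phoneme in ps
        elif self.kind == "CONTAINS_MEDIAL":
            hit = self.phoneme in ps[1:-1]   # neither first nor last position
        else:
            raise ValueError(f"unknown rule kind: {self.kind}")
        # In exclude mode the rule passes only when the pattern is absent.
        return hit != self.exclude


def build_word_list(lexicon, rules, shape=None):
    """AND logic: a word is kept only if every rule (and the shape) passes."""
    return sorted(
        word for word, ps in lexicon.items()
        if all(rule.matches(ps) for rule in rules)
        and (shape is None or cv_shape(ps) == shape)
    )


# Toy lexicon for illustration only.
LEXICON = {
    "cat": ["K", "AE1", "T"],
    "kit": ["K", "IH1", "T"],
    "tack": ["T", "AE1", "K"],
    "stop": ["S", "T", "AA1", "P"],
    "sky": ["S", "K", "AY1"],
}

rules = [PatternRule("STARTS_WITH", "K"), PatternRule("CONTAINS", "AE", exclude=True)]
print(build_word_list(LEXICON, rules, shape="CVC"))
```

Treating exclude mode as a negation of the same match keeps each rule a single predicate, so joint application reduces to `all(...)`.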

Text Analysis

Analyze passages with per-word property overlay:

  • Aggregate statistics for the active property, computed across the passage
  • Click-to-open word profile for any word in the passage
  • Color-gradient highlighting by selected property (frequency, AoA, concreteness, etc.)
  • Coverage tracking — which words are in-vocab vs. out-of-vocab
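
As a rough sketch of how coverage and an aggregate statistic relate, the snippet below tokenizes a passage, splits tokens into in-vocab and out-of-vocab, and averages one property over the in-vocab tokens. The function name `analyze_passage`, the tokenizer, and the `TOY_AOA` values are assumptions made up for illustration; they are not real norms and not PhonoLex's API.

```python
import re
from statistics import mean

# Invented age-of-acquisition values, for illustration only (not real norms).
TOY_AOA = {"the": 2.0, "dog": 3.0, "chased": 5.0}


def analyze_passage(passage, norms):
    """Return token count, vocabulary coverage, and the mean of one property."""
    tokens = re.findall(r"[a-z']+", passage.lower())
    in_vocab = [t for t in tokens if t in norms]
    out_of_vocab = [t for t in tokens if t not in norms]
    return {
        "n_tokens": len(tokens),
        # Share of tokens that have a value for the active property.
        "coverage": len(in_vocab) / len(tokens) if tokens else 0.0,
        # The aggregate is computed over in-vocab tokens only.
        "mean": mean(norms[t] for t in in_vocab) if in_vocab else None,
        "out_of_vocab": out_of_vocab,
    }


report = analyze_passage("The dog chased the zyx.", TOY_AOA)
print(report)
```

Reporting coverage next to the aggregate matters because the mean silently skips out-of-vocab words; a low coverage value signals that the statistic describes only part of the passage.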

Contrast Sets

Browse evidence-based contrastive intervention sets across the ~642K minimal pairs in the lexicon:

  • Minimal pairs: two words that differ by exactly one phoneme at one position
  • Maximal opposition (Gierut 1989): a minimal pair whose two contrasting phonemes also differ in sonorant class (one sonorant, one obstruent)
  • Multiple opposition (Williams 2000): one substitute phoneme contrasted with N target phonemes

Filter by position (initial / medial / final / any) and inspect each pair's feature distance and sonorant difference.
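
The minimal-pair and sonorant-class definitions above translate directly into code. The sketch below is an assumption-laden illustration rather than the tool's implementation: `find_contrast`, `crosses_sonorant_class`, and the `Contrast` tuple are hypothetical names, phonemes are stress-free ARPAbet symbols, and vowels are counted as sonorants.

```python
from typing import NamedTuple, Optional, Sequence

VOWELS = {"AA", "AE", "AH", "AO", "AW", "AY", "EH", "ER",
          "EY", "IH", "IY", "OW", "OY", "UH", "UW"}
# Nasals, liquids, and glides, plus vowels (assumed sonorant here).
SONORANTS = {"M", "N", "NG", "L", "R", "W", "Y"} | VOWELS


class Contrast(NamedTuple):
    index: int       # position of the differing phoneme
    a: str           # phoneme in the first word
    b: str           # phoneme in the second word
    position: str    # "initial", "medial", or "final"


def find_contrast(word_a: Sequence[str], word_b: Sequence[str]) -> Optional[Contrast]:
    """Return the single contrast if the words form a minimal pair, else None."""
    if len(word_a) != len(word_b):
        return None
    diffs = [i for i, (x, y) in enumerate(zip(word_a, word_b)) if x != y]
    if len(diffs) != 1:
        return None
    i = diffs[0]
    if i == 0:
        position = "initial"
    elif i == len(word_a) - 1:
        position = "final"
    else:
        position = "medial"
    return Contrast(i, word_a[i], word_b[i], position)


def crosses_sonorant_class(contrast: Contrast) -> bool:
    """True when exactly one of the two contrasting phonemes is a sonorant."""
    return (contrast.a in SONORANTS) != (contrast.b in SONORANTS)


bat, pat, mat, sat = (["B", "AE", "T"], ["P", "AE", "T"],
                      ["M", "AE", "T"], ["S", "AE", "T"])
print(find_contrast(bat, pat))                           # two obstruents
print(crosses_sonorant_class(find_contrast(mat, sat)))   # sonorant vs. obstruent
```

A position filter then amounts to comparing `Contrast.position` against the requested value, with "any" skipping the check.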

Lookup

Per-word profile view:

  • Full phonological + psycholinguistic profile (~150 norm columns where available)
  • Phoneme inventory with the 26-d learned feature vectors
  • Phoneme-by-phoneme feature comparison
  • Neighboring words via Qwensim semantic similarity (~1.6M edges)
  • Phonologically-similar words via soft-Levenshtein on learned vectors
  • Per-property percentile ranks

Targets the full ~125K CMU-phonology vocabulary.
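
One plausible reading of "soft-Levenshtein on learned vectors" is an edit distance whose substitution cost is graded by how far apart two phonemes' vectors are, instead of a flat cost of 1. The sketch below follows that reading under stated assumptions: substitution cost is cosine distance clipped to [0, 1], insertions and deletions cost 1, and the 3-dimensional `TOY_VECTORS` are invented stand-ins for the 26-d learned vectors. The actual cost function PhonoLex uses is not specified here.

```python
import math

# Invented 3-d vectors standing in for learned phoneme vectors (illustration only).
TOY_VECTORS = {
    "P": (1.0, 0.0, 0.0),
    "B": (0.8, 0.6, 0.0),   # close to P
    "M": (0.0, 1.0, 0.0),   # far from P
    "AE": (0.0, 0.0, 1.0),
    "T": (0.6, 0.0, 0.8),
}


def substitution_cost(u, v):
    """Cosine distance between two vectors, clipped to the range [0, 1]."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return min(1.0, max(0.0, 1.0 - dot / (norm_u * norm_v)))


def soft_levenshtein(seq_a, seq_b, vectors, indel=1.0):
    """Edit distance with vector-graded substitution and fixed indel cost."""
    prev = [j * indel for j in range(len(seq_b) + 1)]
    for i, pa in enumerate(seq_a, start=1):
        cur = [i * indel]
        for j, pb in enumerate(seq_b, start=1):
            substitute = prev[j - 1] + substitution_cost(vectors[pa], vectors[pb])
            delete = prev[j] + indel
            insert = cur[j - 1] + indel
            cur.append(min(substitute, delete, insert))
        prev = cur
    return prev[-1]


pat, bat, mat = ["P", "AE", "T"], ["B", "AE", "T"], ["M", "AE", "T"]
print(soft_levenshtein(pat, bat, TOY_VECTORS))
print(soft_levenshtein(pat, mat, TOY_VECTORS))
```

With these toy vectors, "bat" comes out closer to "pat" than "mat" does, which is the kind of graded ranking a flat Levenshtein distance cannot express, since it scores both as one substitution.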

Sentences

Retrieve naturalistic English sentences that satisfy the same constraint vocabulary as the other tools, drawn from a curated ~236K-sentence corpus (CoLA, UD-EWT, GUM, Tatoeba, OpenSubtitles) gated for SLP suitability:

  • Phoneme patterns / CV shapes / norm bounds — same rules as Custom Word Lists, applied at the sentence level
  • Contrastive pairs — sentence must contain BOTH members of at least one minimal pair / max-opposition / multiple-opposition witness
  • Per-result word-highlight overlay — pair witnesses and include-rule hits underlined; click any word to open its profile
  • Tiered ranking — sentences with more constraint-satisfying words rank higher, with rarity as the within-tier tiebreaker
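
The contrastive-pair gate and tiered ranking can be sketched together. Everything named here is hypothetical: `has_pair_witness`, `rank_sentences`, the `is_hit` predicate, and the per-sentence `rarity` scores are invented for illustration. The sketch also assumes that a higher rarity score ranks first within a tier; the source does not say which direction the tiebreaker runs.

```python
import re


def tokenize(sentence):
    """Lowercase word tokens; punctuation is dropped."""
    return re.findall(r"[a-z']+", sentence.lower())


def has_pair_witness(tokens, pairs):
    """True if BOTH members of at least one pair occur in the sentence."""
    present = set(tokens)
    return any(a in present and b in present for a, b in pairs)


def rank_sentences(sentences, pairs, is_hit, rarity):
    """Gate on pair witnesses, then sort by hit count, then by rarity."""
    scored = []
    for sentence in sentences:
        tokens = tokenize(sentence)
        if not has_pair_witness(tokens, pairs):
            continue  # fails the contrastive-pair constraint
        hits = sum(1 for t in tokens if is_hit(t))
        # Negate both keys so that more hits and higher rarity sort first.
        scored.append((-hits, -rarity.get(sentence, 0.0), sentence))
    return [sentence for _, _, sentence in sorted(scored)]


PAIRS = [("bat", "mat"), ("pat", "bat")]
TARGETS = {"bat", "mat", "sat", "pat"}
SENTENCES = [
    "The bat sat on the mat.",
    "Pat the bat.",
    "A bat flew by.",
    "The mat hid a bat that sat.",
]
# Invented rarity scores (assumption: higher means rarer and ranks first).
RARITY = {"The bat sat on the mat.": 0.4, "The mat hid a bat that sat.": 0.9}

ranked = rank_sentences(SENTENCES, PAIRS, lambda w: w in TARGETS, RARITY)
for sentence in ranked:
    print(sentence)
```

Sorting on a tuple gives the tiering for free: the hit count decides the tier, and rarity is consulted only when two sentences tie on hits.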

See the Sentences guide for the full constraint catalog.