API Reference¶

PhonoLex provides a public REST API for programmatic access to its full dataset. No API key required.

Base URL: https://phonolex.com/api

Interactive docs: Available at /docs (Swagger UI) and /redoc (ReDoc).

Deployment: Cloudflare Workers + D1 (edge-deployed).

Quick Examples¶

Python¶

import requests

BASE = "https://phonolex.com/api"

# Look up a word
word = requests.get(f"{BASE}/words/cat").json()
print(word["ipa"], word["frequency"], word["concreteness"])

# Search for frequent one-syllable words that start with /k/
results = requests.post(f"{BASE}/words/search", json={
    "patterns": [{"type": "STARTS_WITH", "phoneme": "k"}],
    "filters": {"min_frequency": 50, "max_syllable_count": 1},
    "sort_by": "frequency",
    "limit": 20
}).json()
for w in results["items"]:
    print(w["word"], w["frequency"])

curl¶

# Health check
curl https://phonolex.com/api/health

# Word lookup
curl https://phonolex.com/api/words/cat

# Search
curl -X POST https://phonolex.com/api/words/search \
  -H "Content-Type: application/json" \
  -d '{"filters": {"min_concreteness": 4.5, "max_syllable_count": 2}, "limit": 10}'

R¶

library(httr)
library(jsonlite)

base <- "https://phonolex.com/api"

# Word lookup
word <- fromJSON(content(GET(paste0(base, "/words/cat")), "text"))

# Batch lookup
batch <- fromJSON(content(POST(
  paste0(base, "/words/batch"),
  body = toJSON(list(words = c("cat", "dog", "fish")), auto_unbox = TRUE),
  content_type_json()
), "text"))

Endpoints¶

Meta¶

`GET /api/health`¶

Health check with vocabulary stats.

{
  "status": "healthy",
  "vocabulary_size": 44011,
  "total_edges": 1012327
}

`GET /api/stats`¶

Full statistics including edge type counts and property coverage.

`GET /api/property-metadata`¶

Property definitions with labels, categories, sources, and display configuration. Use this to dynamically build UIs or understand what each property means.

`GET /api/property-ranges`¶

Min/max values for all numeric properties. Useful for building filter sliders.

`GET /api/edge-types`¶

Edge type definitions with labels and descriptions for the 7 relationship types.

Words¶

`GET /api/words/{word}`¶

Get full word data with all properties and percentile ranks.

Example: GET /api/words/cat

{
  "word": "cat",
  "ipa": "kæt",
  "phonemes": ["k", "æ", "t"],
  "syllables": [{"onset": ["k"], "nucleus": "æ", "coda": ["t"], "stress": 1}],
  "phoneme_count": 3,
  "syllable_count": 1,
  "frequency": 57.39,
  "frequency_percentile": 95.8,
  "concreteness": 5.0,
  "concreteness_percentile": 91.2,
  "valence": 6.34,
  "aoa": 3.72,
  ...
}

All properties are returned (null if unavailable for that word), plus {property}_percentile fields (0–100, cumulative percentile rank). 35 properties are filterable via the API; additional structural/derived fields are also included.
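Because unavailable properties come back as null, client code usually has to skip missing entries and keep raw values apart from their percentile companions. The following is a minimal sketch, assuming the response shape shown above. `split_properties` is a hypothetical helper (not part of the API), and `imageability` is set to null here purely for illustration.

```python
def split_properties(word_obj):
    """Split a word object into available raw values and percentile ranks.

    Keys ending in "_percentile" go to `percentiles` (keyed by the bare
    property name); every other numeric field goes to `values`.
    Null (None) and non-numeric fields are skipped.
    """
    values, percentiles = {}, {}
    suffix = "_percentile"
    for key, val in word_obj.items():
        # bool is a subclass of int, so exclude it explicitly.
        if isinstance(val, bool) or not isinstance(val, (int, float)):
            continue
        if key.endswith(suffix):
            percentiles[key[: -len(suffix)]] = val
        else:
            values[key] = val
    return values, percentiles


# Trimmed copy of the example response, with one null property added.
cat = {
    "word": "cat",
    "frequency": 57.39,
    "frequency_percentile": 95.8,
    "concreteness": 5.0,
    "concreteness_percentile": 91.2,
    "imageability": None,  # illustrative: property unavailable for this word
}

values, percentiles = split_properties(cat)
print(values)       # {'frequency': 57.39, 'concreteness': 5.0}
print(percentiles)  # {'frequency': 95.8, 'concreteness': 91.2}
```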

`GET /api/words`¶

Browse the vocabulary with pagination and sorting.

Parameter	Type	Default	Description
`sort_by`	string	null	Property to sort by (e.g. `frequency`, `aoa`)
`sort_order`	string	`desc`	`asc` or `desc`
`limit`	int	50	Max items (1–5000)
`offset`	int	0	Items to skip

Response: {"items": [...], "total": 44011, "offset": 0, "limit": 50}
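Walking the whole vocabulary means repeating the request with a growing `offset` until `total` items have been seen. The sketch below assumes the paginated response fields listed above. `iter_words` and `fake_fetch` are hypothetical names, and the fake fetcher stands in for the network call so the example runs offline.

```python
def iter_words(fetch_page, page_size=50):
    """Yield every item from a paginated endpoint.

    `fetch_page(limit, offset)` must return a dict with at least the
    "items" and "total" fields of the paginated response.
    """
    offset = 0
    while True:
        page = fetch_page(limit=page_size, offset=offset)
        items = page["items"]
        yield from items
        offset += len(items)
        # Stop when the server returns nothing or everything has been seen.
        if not items or offset >= page["total"]:
            break


# Stand-in for the HTTP request (illustrative data only).
FAKE_VOCAB = [{"word": w} for w in ["cat", "dog", "fish", "bird", "frog"]]


def fake_fetch(limit, offset):
    return {
        "items": FAKE_VOCAB[offset:offset + limit],
        "total": len(FAKE_VOCAB),
        "offset": offset,
        "limit": limit,
    }


print([w["word"] for w in iter_words(fake_fetch, page_size=2)])
# ['cat', 'dog', 'fish', 'bird', 'frog']
```

With `requests`, the fetcher could be as small as `lambda limit, offset: requests.get(f"{BASE}/words", params={"limit": limit, "offset": offset}).json()`.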

`POST /api/words/search`¶

Unified search combining phoneme patterns, property filters, exclusion rules, sorting, and pagination. This is the primary search endpoint.

Request body:

{
  "patterns": [
    {"type": "STARTS_WITH", "phoneme": "k"},
    {"type": "ENDS_WITH", "phoneme": "t"}
  ],
  "filters": {
    "min_frequency": 10,
    "max_syllable_count": 2,
    "min_concreteness": 3.0
  },
  "exclude_phonemes": ["ʃ", "ʒ"],
  "sort_by": "frequency",
  "sort_order": "desc",
  "limit": 50,
  "offset": 0
}

Pattern types:

Type	Description	Example
`STARTS_WITH`	Word begins with phoneme(s)	`"k"` matches cat, keep, kind
`ENDS_WITH`	Word ends with phoneme(s)	`"t"` matches cat, sit, want
`CONTAINS`	Word contains phoneme(s) anywhere	`"æ"` matches cat, bat, happy
`CONTAINS_MEDIAL`	Contains phoneme(s) in medial position	`"æ"` matches happy (not cat)

Phonemes use IPA notation. Multiple phonemes in a sequence are space-separated: "s t" matches words containing the /st/ cluster.

Filter fields: min_{property} and max_{property} for any of the 35 filterable properties. Multiple filters use AND logic. See GET /api/property-metadata for the full list.

Response: Same paginated format as GET /api/words.
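Request bodies for this endpoint are easy to get subtly wrong (a misspelled pattern type, a phoneme sequence without spaces), so it can help to assemble them with small helpers. This is a sketch under the field names documented above. `pattern` and `search_body` are hypothetical helpers, and leaving out empty `patterns` or `filters` assumes those fields are optional, as the curl example suggests.

```python
PATTERN_TYPES = {"STARTS_WITH", "ENDS_WITH", "CONTAINS", "CONTAINS_MEDIAL"}


def pattern(kind, *phonemes):
    """Build one pattern entry; a phoneme sequence is joined with spaces."""
    if kind not in PATTERN_TYPES:
        raise ValueError(f"unknown pattern type: {kind!r}")
    if not phonemes:
        raise ValueError("at least one phoneme is required")
    return {"type": kind, "phoneme": " ".join(phonemes)}


def search_body(patterns=(), exclude=(), sort_by=None, limit=50, offset=0, **ranges):
    """Assemble a search request body.

    Each keyword in `ranges` maps a property name to a (min, max) tuple;
    either bound may be None to leave that side open.
    """
    filters = {}
    for prop, (low, high) in ranges.items():
        if low is not None:
            filters[f"min_{prop}"] = low
        if high is not None:
            filters[f"max_{prop}"] = high

    body = {"limit": limit, "offset": offset}
    if patterns:
        body["patterns"] = list(patterns)
    if filters:
        body["filters"] = filters
    if exclude:
        body["exclude_phonemes"] = list(exclude)
    if sort_by:
        body["sort_by"] = sort_by
    return body


# Words starting with the /st/ cluster, frequency >= 10, at most 2 syllables.
body = search_body(
    patterns=[pattern("STARTS_WITH", "s", "t")],
    exclude=["ʃ"],
    sort_by="frequency",
    frequency=(10, None),
    syllable_count=(None, 2),
)
print(body)
```

The resulting dict can be sent as `requests.post(f"{BASE}/words/search", json=body)`.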

`POST /api/words/batch`¶

Look up multiple words at once. Unknown words are silently omitted.

{"words": ["cat", "dog", "fish", "xyzzy"]}

Returns an array of word objects (max 1000 words per request).
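Two client-side chores follow from the rules above: splitting long lists into requests of at most 1000 words, and working out which words were silently dropped. A minimal sketch with hypothetical helper names, assuming each returned object carries the `word` field shown in the lookup example:

```python
MAX_BATCH = 1000  # documented per-request limit


def chunked(words, size=MAX_BATCH):
    """Split a word list into consecutive chunks of at most `size` items."""
    return [words[i:i + size] for i in range(0, len(words), size)]


def missing_words(requested, returned):
    """Return the requested words that are absent from the batch response."""
    found = {obj["word"] for obj in returned}
    return [w for w in requested if w not in found]


requested = ["cat", "dog", "fish", "xyzzy"]
# Illustrative response: the unknown word is simply not present.
returned = [{"word": "cat"}, {"word": "dog"}, {"word": "fish"}]

print(missing_words(requested, returned))
# ['xyzzy']
print([len(c) for c in chunked([f"w{i}" for i in range(2500)])])
# [1000, 1000, 500]
```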

Similarity¶

`POST /api/similarity/search`¶

Find phonologically similar words using soft Levenshtein distance on learned feature vectors.

Request body:

{
  "word": "cat",
  "threshold": 0.7,
  "limit": 20,
  "onset_weight": 0.33,
  "nucleus_weight": 0.33,
  "coda_weight": 0.33
}

Parameter	Type	Default	Description
`word`	string	required	Target word
`threshold`	float	0.7	Minimum similarity (0–1)
`limit`	int	50	Max results (1–500)
`onset_weight`	float	0.33	Weight for onset similarity
`nucleus_weight`	float	0.33	Weight for nucleus similarity
`coda_weight`	float	0.33	Weight for coda similarity

Weight presets:

Preset	Onset	Nucleus	Coda	Use case
Balanced	0.33	0.33	0.33	Overall similarity
Rhymes	0.0	0.5	0.5	Rhyming words
Alliteration	1.0	0.5	0.0	Same initial sound
Assonance	0.0	1.0	0.0	Matching vowels
Consonance	0.5	0.0	0.5	Matching consonants

Response:

[
  {
    "word": { "word": "bat", "ipa": "bæt", ... },
    "similarity": 0.92
  },
  ...
]
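The presets can be kept in a lookup table so callers pick a name instead of three numbers. The sketch below copies the weights from the presets table and checks the documented ranges before building the body. `similarity_body` and `top_matches` are hypothetical helpers, and the second sample result is made up for illustration.

```python
# (onset, nucleus, coda) weights copied from the presets table.
PRESETS = {
    "balanced": (0.33, 0.33, 0.33),
    "rhymes": (0.0, 0.5, 0.5),
    "alliteration": (1.0, 0.5, 0.0),
    "assonance": (0.0, 1.0, 0.0),
    "consonance": (0.5, 0.0, 0.5),
}


def similarity_body(word, preset="balanced", threshold=0.7, limit=50):
    """Build a similarity search body from a named weight preset."""
    if preset not in PRESETS:
        raise ValueError(f"unknown preset: {preset!r}")
    if not 0 <= threshold <= 1:
        raise ValueError("threshold must be between 0 and 1")
    if not 1 <= limit <= 500:
        raise ValueError("limit must be between 1 and 500")
    onset, nucleus, coda = PRESETS[preset]
    return {
        "word": word,
        "threshold": threshold,
        "limit": limit,
        "onset_weight": onset,
        "nucleus_weight": nucleus,
        "coda_weight": coda,
    }


def top_matches(results, n=5):
    """Return (word, similarity) pairs, highest similarity first."""
    ranked = sorted(results, key=lambda r: r["similarity"], reverse=True)
    return [(r["word"]["word"], r["similarity"]) for r in ranked[:n]]


print(similarity_body("cat", preset="rhymes", limit=20))

sample = [
    {"word": {"word": "bat", "ipa": "bæt"}, "similarity": 0.92},
    {"word": {"word": "hat"}, "similarity": 0.95},  # illustrative entry
]
print(top_matches(sample))
# [('hat', 0.95), ('bat', 0.92)]
```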

Associations¶

`GET /api/associations/{word}`¶

Get cognitive associations from the graph. Returns edges from up to 6 relationship types.

Parameter	Type	Default	Description
`edge_types`	string	all	Comma-separated types: `USF`, `MEN`, `ECCC`, `SPP`, `SimLex`, `WordSim`
`limit`	int	50	Max edges
`offset`	int	0	Pagination offset

Example: GET /api/associations/cat?edge_types=USF,ECCC&limit=10

Response:

{
  "word": "cat",
  "associations": [
    {
      "target": "dog",
      "edge_sources": ["USF"],
      "in_vocabulary": true,
      "usf_forward": 0.178
    },
    ...
  ],
  "total": 12,
  "edge_type_counts": {"USF": 5, "ECCC": 2}
}
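A short sketch of building the query string and regrouping the response by edge source, assuming the parameters and response fields shown above. `associations_url` and `targets_by_source` are hypothetical helpers, and the second association in the sample is invented for illustration.

```python
from collections import defaultdict
from urllib.parse import quote, urlencode

BASE = "https://phonolex.com/api"


def associations_url(word, edge_types=None, limit=50, offset=0):
    """Build the GET URL; `edge_types` is an iterable such as ["USF", "ECCC"]."""
    params = {"limit": limit, "offset": offset}
    if edge_types:
        params["edge_types"] = ",".join(edge_types)
    path_word = quote(word, safe="")
    # safe="," keeps the comma-separated list unescaped in the query string.
    return f"{BASE}/associations/{path_word}?{urlencode(params, safe=',')}"


def targets_by_source(response):
    """Map each edge source to the list of targets it contributed."""
    grouped = defaultdict(list)
    for edge in response["associations"]:
        for source in edge["edge_sources"]:
            grouped[source].append(edge["target"])
    return dict(grouped)


print(associations_url("cat", ["USF", "ECCC"], limit=10))
# https://phonolex.com/api/associations/cat?limit=10&offset=0&edge_types=USF,ECCC

sample = {
    "word": "cat",
    "associations": [
        {"target": "dog", "edge_sources": ["USF"], "usf_forward": 0.178},
        {"target": "cap", "edge_sources": ["ECCC"]},  # illustrative entry
    ],
}
print(targets_by_source(sample))
# {'USF': ['dog'], 'ECCC': ['cap']}
```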

`GET /api/associations/{word}/confusability`¶

Get ECCC perceptual confusability edges only (words confused in noise).

`GET /api/associations/compare`¶

Compare shared associations between two words.

Parameter	Type	Description
`word1`	string	First word
`word2`	string	Second word

Returns shared targets, Jaccard similarity, and degree for each word.
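For readers unfamiliar with the measure, Jaccard similarity is the size of the intersection of two sets divided by the size of their union. The sketch below computes it for two made-up target sets. How the API treats edge types, weights, or empty sets is not specified here, so treat this as an illustration of the formula rather than a reimplementation of the endpoint.

```python
def jaccard(targets_a, targets_b):
    """|A ∩ B| / |A ∪ B|, defined here as 0.0 when both sets are empty."""
    a, b = set(targets_a), set(targets_b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Illustrative association targets (not real API output).
cat_targets = ["dog", "pet", "mouse", "fur"]
dog_targets = ["cat", "pet", "bone", "fur"]

shared = sorted(set(cat_targets) & set(dog_targets))
print(shared)                                        # ['fur', 'pet']
print(round(jaccard(cat_targets, dog_targets), 3))   # 0.333
```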

Phonemes¶

`GET /api/phonemes`¶

List all 39 English phonemes with their articulatory features.

`GET /api/phonemes/{ipa}`¶

Get features for a single phoneme (38 distinctive features). ASCII g is automatically normalized to IPA ɡ (U+0261).

Example: GET /api/phonemes/k

{
  "ipa": "k",
  "type": "consonant",
  "features": {
    "consonantal": "+",
    "sonorant": "-",
    "continuant": "-",
    "dorsal": "+",
    ...
  }
}
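Most IPA symbols are non-ASCII, so they need percent-encoding when placed in the URL path. The sketch below also applies the ASCII g to ɡ substitution on the client, which the server already does; repeating it locally is optional and only keeps cache keys and logs consistent. `phoneme_url` is a hypothetical helper.

```python
from urllib.parse import quote

BASE = "https://phonolex.com/api"
ASCII_G, IPA_G = "g", "\u0261"  # Latin small g vs. IPA script g


def phoneme_url(symbol):
    """Return the lookup URL for one phoneme, percent-encoding IPA as UTF-8."""
    symbol = symbol.replace(ASCII_G, IPA_G)
    return f"{BASE}/phonemes/{quote(symbol, safe='')}"


print(phoneme_url("k"))  # https://phonolex.com/api/phonemes/k
print(phoneme_url("g"))  # https://phonolex.com/api/phonemes/%C9%A1
print(phoneme_url("ʃ"))  # https://phonolex.com/api/phonemes/%CA%83
```

Libraries such as `requests` percent-encode non-ASCII path characters on their own, so the explicit `quote` call matters mainly when the URL is built by hand or logged.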

`POST /api/phonemes/compare`¶

Compare two phonemes feature by feature.

{"phoneme1": "k", "phoneme2": "ɡ"}

Returns shared features, differing features, and a similarity score.

`POST /api/phonemes/search`¶

Find phonemes matching specific feature values.

{"features": {"consonantal": "+", "dorsal": "+", "sonorant": "-"}}

Returns all phonemes matching the given feature constraints.
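Conceptually, a feature search keeps every phoneme whose feature values agree with all given constraints, and a comparison lists the features on which two phonemes differ. The sketch below mirrors that logic locally on a three-phoneme toy inventory. Only the /k/ values come from the example response above; the /m/ and /s/ values are illustrative textbook values, the helper names are hypothetical, and the API's similarity score is not reproduced because its formula is not documented here.

```python
# Toy inventory restricted to four features (illustrative, see lead-in).
INVENTORY = {
    "k": {"consonantal": "+", "sonorant": "-", "continuant": "-", "dorsal": "+"},
    "m": {"consonantal": "+", "sonorant": "+", "continuant": "-", "dorsal": "-"},
    "s": {"consonantal": "+", "sonorant": "-", "continuant": "+", "dorsal": "-"},
}


def matches(features, constraints):
    """True if every constrained feature has exactly the requested value."""
    return all(features.get(name) == value for name, value in constraints.items())


def search(inventory, constraints):
    """Return the phonemes in `inventory` that satisfy all constraints."""
    return [ipa for ipa, feats in inventory.items() if matches(feats, constraints)]


def differing_features(feats1, feats2):
    """Sorted names of features whose values differ between two phonemes."""
    names = feats1.keys() | feats2.keys()
    return sorted(n for n in names if feats1.get(n) != feats2.get(n))


print(search(INVENTORY, {"consonantal": "+", "dorsal": "+", "sonorant": "-"}))
# ['k']
print(differing_features(INVENTORY["k"], INVENTORY["m"]))
# ['dorsal', 'sonorant']
```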

Contrastive Sets¶

`POST /api/contrastive/minimal-pairs`¶

Find minimal pairs for a phoneme contrast.

{
  "phoneme1": "k",
  "phoneme2": "ɡ",
  "position": "initial",
  "limit": 20
}

Parameter	Type	Default	Description
`phoneme1`	string	required	First phoneme (IPA)
`phoneme2`	string	required	Second phoneme (IPA)
`position`	string	null	`initial`, `medial`, `final`, or null for any
`limit`	int	50	Max pairs (1–500)

Response:

[
  {
    "word1": { "word": "cap", ... },
    "word2": { "word": "gap", ... },
    "position": 0,
    "phoneme1": "k",
    "phoneme2": "ɡ"
  },
  ...
]
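A minimal pair is two words whose transcriptions have the same length and differ in exactly one phoneme. The sketch below checks that property on two phoneme lists and returns the zero-based index of the contrast, which is how the `position` value in the example response can be read. Mapping an index to `initial`, `medial`, or `final` as shown is an assumption about how the `position` filter is defined, and both helpers are hypothetical.

```python
def contrast_position(phonemes1, phonemes2):
    """Return the index of the single differing phoneme, or None if the
    two transcriptions do not form a minimal pair."""
    if len(phonemes1) != len(phonemes2):
        return None
    diffs = [i for i, (a, b) in enumerate(zip(phonemes1, phonemes2)) if a != b]
    return diffs[0] if len(diffs) == 1 else None


def position_label(index, length):
    """Name a zero-based index as initial, medial, or final (assumed mapping)."""
    if index == 0:
        return "initial"
    if index == length - 1:
        return "final"
    return "medial"


cap = ["k", "æ", "p"]
gap = ["ɡ", "æ", "p"]
cat = ["k", "æ", "t"]

idx = contrast_position(cap, gap)
print(idx, position_label(idx, len(cap)))  # 0 initial
print(contrast_position(gap, cat))         # None (two phonemes differ)
```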

`POST /api/contrastive/maximal-opposition/pairs`¶

Generate maximally opposed phoneme pairs from a list of unknown phonemes (Gierut 1989–1992). Returns pairs ranked by feature distance.

{
  "unknown_phonemes": ["k", "ɡ", "t", "d"],
  "top_n": 5
}
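To illustrate what ranking by feature distance means, the sketch below scores every pair from the request above by the number of features on which the two phonemes differ and keeps the `top_n` largest. The three-feature table (`voice`, `dorsal`, `coronal`) is a hypothetical simplification, and a plain count of differing features stands in for whatever distance the API actually uses.

```python
from itertools import combinations

# Hypothetical three-feature table for the four phonemes in the request.
FEATURES = {
    "k": {"voice": "-", "dorsal": "+", "coronal": "-"},
    "ɡ": {"voice": "+", "dorsal": "+", "coronal": "-"},
    "t": {"voice": "-", "dorsal": "-", "coronal": "+"},
    "d": {"voice": "+", "dorsal": "-", "coronal": "+"},
}


def feature_distance(p1, p2, features=FEATURES):
    """Number of features whose values differ between two phonemes."""
    f1, f2 = features[p1], features[p2]
    return sum(1 for name in f1.keys() | f2.keys() if f1.get(name) != f2.get(name))


def rank_oppositions(phonemes, top_n=5, features=FEATURES):
    """Return (phoneme1, phoneme2, distance) tuples, largest distance first."""
    pairs = [
        (a, b, feature_distance(a, b, features))
        for a, b in combinations(phonemes, 2)
    ]
    # sorted() is stable, so ties keep their original pairing order.
    return sorted(pairs, key=lambda p: p[2], reverse=True)[:top_n]


for p1, p2, dist in rank_oppositions(["k", "ɡ", "t", "d"], top_n=2):
    print(p1, p2, dist)
```

In this toy table the two most distant pairs are the ones that differ in voicing and place at once.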

`POST /api/contrastive/maximal-opposition/word-lists`¶

Find word pairs for a specific maximal opposition phoneme pair.

{
  "phoneme1": "k",
  "phoneme2": "m",
  "position": "initial",
  "max_pairs": 10
}

`POST /api/contrastive/multiple-opposition/targets`¶

Select representative target phonemes for multiple opposition therapy (Maximal Classification + Maximal Distinction).

{
  "substitute_phoneme": "t",
  "target_phonemes": ["k", "ɡ", "d", "s"],
  "count": 3
}

`POST /api/contrastive/multiple-opposition/sets`¶

Generate minimal sets (triplets/quadruplets) for multiple opposition therapy.

{
  "substitute_phoneme": "t",
  "target_phonemes": ["k", "d"],
  "position": "initial",
  "max_sets": 10
}

Text Analysis¶

`POST /api/text/analyze`¶

Analyze a passage for phonological and psycholinguistic properties.

{"text": "The quick brown fox jumps over the lazy dog."}

Response:

{
  "total_words": 9,
  "analyzed_words": 9,
  "unknown_words": [],
  "coverage_percent": 100.0,
  "aggregate_percentiles": {
    "frequency_percentile": 89.2,
    "concreteness_percentile": 54.1,
    "aoa_percentile": 31.7,
    ...
  },
  "word_details": [
    {
      "word": "quick",
      "percentiles": {
        "frequency_percentile": 82.1,
        "concreteness_percentile": 32.5,
        ...
      }
    },
    ...
  ]
}

aggregate_percentiles are weighted averages across all analyzed words. word_details gives per-word percentiles for highlighting and drill-down.
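A typical use of `word_details` is flagging words that fall below a cutoff on some property, for example low-concreteness words in a reading passage. The sketch below assumes the response shape shown above. The helper names are hypothetical, the `fox` entry is invented for illustration, and the mean computed here is unweighted, so it need not match `aggregate_percentiles`, whose weighting is not specified.

```python
def mean_percentile(word_details, prop):
    """Unweighted mean of one percentile over the words that have it."""
    key = f"{prop}_percentile"
    scores = [
        d["percentiles"][key]
        for d in word_details
        if d["percentiles"].get(key) is not None
    ]
    return sum(scores) / len(scores) if scores else None


def words_below(word_details, prop, cutoff):
    """Words whose percentile for `prop` is present and below `cutoff`."""
    key = f"{prop}_percentile"
    return [
        d["word"]
        for d in word_details
        if d["percentiles"].get(key) is not None and d["percentiles"][key] < cutoff
    ]


details = [
    {"word": "quick",
     "percentiles": {"frequency_percentile": 82.1, "concreteness_percentile": 32.5}},
    {"word": "fox",  # illustrative values
     "percentiles": {"frequency_percentile": 60.0, "concreteness_percentile": 95.0}},
]

print(words_below(details, "concreteness", 50))          # ['quick']
print(round(mean_percentile(details, "frequency"), 2))   # 71.05
```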

Properties¶

Properties available on word objects, grouped by category (35 are filterable via the API; additional derived fields are included in responses):

Category	Properties
Phonological Complexity	`syllable_count`, `phoneme_count`, `wcm_score`
Phonotactic Probability	`phono_prob_avg`, `positional_prob_avg`
Lexical	`frequency`, `log_frequency`, `contextual_diversity`, `prevalence`, `aoa`, `aoa_kuperman`, `elp_lexical_decision_rt`
Semantic	`imageability`, `familiarity`, `concreteness`, `size`
Affective	`valence`, `arousal`, `dominance`
Cognitive / Embodied	`iconicity`, `boi`, `socialness`
Sensorimotor — Perceptual	`auditory`, `visual`, `haptic`, `gustatory`, `olfactory`, `interoceptive`
Sensorimotor — Action	`hand_arm`, `foot_leg`, `head`, `mouth`, `torso`
Morphological	`morpheme_count`, `is_monomorphemic`, `n_prefixes`, `n_suffixes`

Each numeric property also has a {property}_percentile field (0–100) representing the cumulative percentile rank within the vocabulary.
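As a worked illustration of a cumulative percentile rank, the sketch below reports the percentage of values in a sorted list that are less than or equal to a given value. The tie handling and rounding used by the API are not documented here, so this shows one common definition rather than the exact computation behind the `_percentile` fields.

```python
from bisect import bisect_right


def percentile_rank(sorted_values, x):
    """Percentage (0-100) of values in `sorted_values` that are <= x."""
    if not sorted_values:
        raise ValueError("sorted_values must not be empty")
    return 100.0 * bisect_right(sorted_values, x) / len(sorted_values)


# Illustrative property values for a five-word vocabulary, sorted ascending.
values = [1.0, 2.0, 2.0, 3.5, 5.0]

print(percentile_rank(values, 2.0))  # 60.0
print(percentile_rank(values, 5.0))  # 100.0
print(percentile_rank(values, 0.5))  # 0.0
```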

Error Handling¶

Status	Meaning
`200`	Success
`404`	Word or phoneme not found
`422`	Validation error (bad request body)
`429`	Rate limit exceeded (check `Retry-After` header)
`500`	Server error

Error responses include a detail field with a human-readable message.
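A client can turn these rules into one wrapper: return the body on 200, wait and retry on 429, and raise with the `detail` message otherwise. The sketch below takes the HTTP call as an injected function so it runs offline. `ApiError` and `call_with_retry` are hypothetical names, and `Retry-After` is assumed to be a number of seconds (the header may also be an HTTP date, which this sketch does not handle).

```python
import time


class ApiError(Exception):
    """Raised for any non-200 response that is not retried."""

    def __init__(self, status, detail):
        super().__init__(f"{status}: {detail}")
        self.status = status
        self.detail = detail


def call_with_retry(send, max_attempts=3, sleep=time.sleep):
    """Call `send()` until it succeeds, honoring Retry-After on 429.

    `send` returns (status, headers, body), where `body` is parsed JSON.
    """
    for attempt in range(1, max_attempts + 1):
        status, headers, body = send()
        if status == 200:
            return body
        if status == 429 and attempt < max_attempts:
            # Assumed to be seconds; fall back to 1 second if absent.
            sleep(float(headers.get("Retry-After", 1)))
            continue
        detail = body.get("detail", "") if isinstance(body, dict) else ""
        raise ApiError(status, detail)


# Offline demonstration with canned responses (illustrative data).
responses = iter([
    (429, {"Retry-After": "2"}, {"detail": "Rate limit exceeded"}),
    (200, {}, {"status": "healthy"}),
])
waits = []
result = call_with_retry(lambda: next(responses), sleep=waits.append)
print(result, waits)  # {'status': 'healthy'} [2.0]
```

With `requests`, `send` could wrap a call such as `r = requests.get(url)` and return `(r.status_code, r.headers, r.json())`.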
