Pronunciation dictionary
OpenAI’s Realtime voices guess pronunciation from spelling. That works well for common English words. For brand names, SKUs, acronyms, and foreign words, the guess often fails, sometimes badly. The pronunciation dictionary lets you override specific words with phonetic spellings.
How it works
Each entry is a pair:
```typescript
interface PronunciationEntry {
  word: string;   // the word to override
  say_as: string; // the phonetic spelling
}
```

At session start, Spelo embeds your dictionary into the system prompt:
When you say the word “Hodos360”, pronounce it as “HOE-dose three-sixty”. When you say “Pacific Realty”, pronounce it as “Pacific Reel-tee” (two syllables, not three).
The model follows these instructions reliably in most cases; the exceptions are covered under “When pronunciation doesn’t stick”.
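To make the embedding step concrete, here is a minimal sketch of how a dictionary could be rendered into prompt text. The helper name `buildPronunciationPrompt` and the exact sentence template are assumptions for illustration; the wording Spelo actually generates may differ (for example, it may include extra notes such as syllable counts).

```typescript
// Shape of one dictionary entry, as defined on this page.
interface PronunciationEntry {
  word: string;   // the word to override
  say_as: string; // the phonetic spelling
}

// Hypothetical helper: renders each entry as one instruction sentence.
// The template is an assumption, not Spelo's documented output.
function buildPronunciationPrompt(entries: PronunciationEntry[]): string {
  return entries
    .map((e) => `When you say “${e.word}”, pronounce it as “${e.say_as}”.`)
    .join(" ");
}

const promptText = buildPronunciationPrompt([
  { word: "Hodos360", say_as: "HOE-dose three-sixty" },
  { word: "EMR", say_as: "ee-em-are" },
]);
console.log(promptText);
```

Each entry adds one sentence to the prompt, which is why the size of the dictionary affects the token cost of every session.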
Writing phonetic spellings
You don’t need IPA. Write it the way you’d spell out a name phonetically in an email to someone who has never heard it:
| Word | Say as | Why |
|---|---|---|
| Hodos360 | HOE-dose three-sixty | The AI otherwise says “Ha-doss three hundred sixty” |
| Atlas Legal | AT-las LEE-gul | Distinguishes from Atlas-as-in-map |
| Pho | fuh | Vietnamese noodle soup, not “foe” |
| macOS | mac-oh-ess | Not “mac-oss” |
| Réalité | ray-ah-lee-tay | French brand name |
| EMR | ee-em-are | Acronym; prevent the AI from pronouncing as a word |
Guidelines:
- Use hyphens between syllables. They become micro-pauses in speech.
- Capitalize stressed syllables: `HOE-dose`, not `hoe-DOSE`.
- Don’t use IPA symbols. The AI gets confused. Plain English approximations work better.
- Test out loud. Read your `say_as` aloud; if it sounds right, it’ll sound right to the AI.
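Some of these guidelines can be checked mechanically before you save an entry. The sketch below is a hypothetical lint, not part of Spelo: `lintSayAs` and its thresholds are assumptions. It flags characters from the Unicode blocks that hold many IPA symbols, and long all-caps tokens that have no hyphens to mark syllable breaks.

```typescript
// Hypothetical lint for a say_as string, based on the guidelines above.
// Returns human-readable warnings; an empty array means nothing was flagged.
function lintSayAs(sayAs: string): string[] {
  const warnings: string[] = [];

  // U+0250–U+02AF (IPA Extensions) and U+02B0–U+02FF (spacing modifier letters)
  // cover many IPA symbols and stress marks, e.g. ə, ʃ, ʊ, ˈ, ː. Not exhaustive.
  if (/[\u0250-\u02FF]/.test(sayAs)) {
    warnings.push("contains IPA symbols; use a plain English approximation");
  }

  // A long all-caps token with no hyphens (e.g. "HODOSE") gives no syllable breaks.
  // The 5-letter threshold is an arbitrary choice so short acronyms are not flagged.
  for (const token of sayAs.split(/\s+/)) {
    if (/^[A-Z]{5,}$/.test(token)) {
      warnings.push(`"${token}" has no hyphens between syllables`);
    }
  }

  return warnings;
}

console.log(lintSayAs("HOE-dose three-sixty")); // no warnings
console.log(lintSayAs("HODOSE"));               // one warning about hyphens
```

A check like this does not replace reading the spelling aloud; it only catches the two mistakes that are easy to detect from the text alone.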
Editing in the dashboard
- Dashboard → Voice → Pronunciations
- Click Add entry
- Fill in `word` (the thing in your data) and `say_as` (how to pronounce it)
- Save
- Test by clicking Preview voice and typing a sentence that contains the word
The preview uses the same system prompt as live sessions, so what you hear is what visitors will hear.
Bulk import
For dozens of entries (e.g. a real-estate firm with 200 street names), the dashboard has a Bulk import button that accepts CSV:
```csv
word,say_as
Hodos360,HOE-dose three-sixty
La Cienega,lah see-EN-uh-guh
Sepulveda,suh-PUHL-vuh-duh
Los Feliz,los FEE-liss
```

Programmatic updates
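When your entries already live in a CSV file in the bulk-import format, a short script can convert them into a `pronunciations` array for an API request. The following is a minimal sketch, not Spelo’s importer: `parsePronunciationCsv` is a hypothetical helper, and it assumes a `word,say_as` header, no quoted fields, and no commas inside `word`.

```typescript
interface PronunciationEntry {
  word: string;
  say_as: string;
}

// Minimal parser for the two-column bulk-import CSV.
// Assumption: rows are split on the first comma, so `word` cannot contain one.
function parsePronunciationCsv(csv: string): PronunciationEntry[] {
  const lines = csv
    .split(/\r?\n/)
    .map((l) => l.trim())
    .filter((l) => l.length > 0);
  if (lines.length === 0 || lines[0] !== "word,say_as") {
    throw new Error('expected header "word,say_as"');
  }
  return lines.slice(1).map((line, i) => {
    const comma = line.indexOf(",");
    if (comma <= 0 || comma === line.length - 1) {
      // i + 2 counts the header and is 1-based; blank lines are not counted.
      throw new Error(`row ${i + 2}: expected "word,say_as"`);
    }
    return {
      word: line.slice(0, comma).trim(),
      say_as: line.slice(comma + 1).trim(),
    };
  });
}

const csvText = `word,say_as
Hodos360,HOE-dose three-sixty
La Cienega,lah see-EN-uh-guh`;

// JSON body with a top-level "pronunciations" array of { word, say_as } objects.
const requestBody = JSON.stringify(
  { pronunciations: parsePronunciationCsv(csvText) },
  null,
  2
);
console.log(requestBody);
```

Failing loudly on a malformed row is a deliberate choice here: a silently skipped row would leave a word mispronounced with no indication why.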
Update via the Sites API:
```bash
curl -X PATCH https://api.spelo.ai/v1/sites/<site_id> \
  -H "Authorization: Bearer $SPELO_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "pronunciations": [
      { "word": "Hodos360", "say_as": "HOE-dose three-sixty" },
      { "word": "La Cienega", "say_as": "lah see-EN-uh-guh" }
    ]
  }'
```

Scope and limits
- Entries are per-site. If you operate multiple sites, pronunciations don’t leak across.
- Maximum 500 entries per site (Pro plan and above). Starter plan caps at 50.
- The dictionary is injected into the system prompt, which consumes tokens. At 500 entries you’re spending ~3K tokens per session on pronunciations alone. If you hit the cap, we automatically deduplicate entries and prioritize the ones that appear most often in your content.
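The figures above imply roughly 6 tokens per entry (3,000 / 500). The sketch below turns that into a quick budget check. It is an approximation under stated assumptions: the per-entry cost is derived only from the numbers on this page, real usage depends on the tokenizer and on entry length, and the function names are hypothetical. Only the Starter and Pro caps are modeled, since limits for higher plans are not specified here beyond “Pro plan and above”.

```typescript
// Plan caps as stated on this page.
const ENTRY_CAPS = { starter: 50, pro: 500 } as const;

// Rough per-entry cost implied by "500 entries ≈ 3K tokens".
const APPROX_TOKENS_PER_ENTRY = 3000 / 500; // 6

// Estimated prompt tokens spent on pronunciations per session.
function estimatePronunciationTokens(entryCount: number): number {
  return Math.round(entryCount * APPROX_TOKENS_PER_ENTRY);
}

// How many more entries fit under the plan cap (negative if over the cap).
function remainingEntries(plan: keyof typeof ENTRY_CAPS, entryCount: number): number {
  return ENTRY_CAPS[plan] - entryCount;
}

console.log(estimatePronunciationTokens(200)); // 1200
console.log(remainingEntries("starter", 42));  // 8
```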
When pronunciation doesn’t stick
Occasionally the AI will still mispronounce a word despite the dictionary. The cause is usually one of the following:
- The word is embedded in a URL or code. The AI spells out URLs character-by-character; pronunciations don’t apply.
- The phonetic spelling itself is ambiguous. Try a different spelling. `HOE-dose` works; `HODOSE` becomes “ho-dose-ee”.
- Model limitation. Rarely, the model just gets stuck. Report at github.com/spelo/spelo/issues. We maintain a list of known problematic names and lobby OpenAI to improve them.
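The first cause can be checked against your own content. As a rough diagnostic sketch (the function `wordsOnlyInUrls` and the simple `http(s)://` URL pattern are assumptions, not a Spelo feature), the code below lists dictionary words that occur in a piece of text only inside URLs, where the pronunciation override would not apply.

```typescript
// Hypothetical diagnostic: which dictionary words appear in `text` only inside URLs?
// Matching is case-insensitive substring matching, which is deliberately simple.
function wordsOnlyInUrls(text: string, words: string[]): string[] {
  const urlPattern = /https?:\/\/\S+/gi;
  const urls = text.match(urlPattern) ?? [];
  // Remove URLs so the remaining text can be searched separately.
  const textWithoutUrls = text.replace(urlPattern, " ").toLowerCase();

  return words.filter((w) => {
    const needle = w.toLowerCase();
    const inUrl = urls.some((u) => u.toLowerCase().includes(needle));
    return inUrl && !textWithoutUrls.includes(needle);
  });
}

const sample = "Visit https://hodos360.com/pricing for details about EMR.";
console.log(wordsOnlyInUrls(sample, ["Hodos360", "EMR"])); // ["Hodos360"]
```

If a word shows up in this list, adding the name in plain text next to the link gives the model a place where the dictionary entry can take effect.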