Guide · 2026

How to add voice to your website

There are four realistic ways to make your website talk in 2026. Two of them are overkill for most sites, one is free but limited to read-aloud, and one has a free tier. Here is the honest comparison, so you pick the right tool instead of the loudest one.

What “add voice” actually means

The phrase is ambiguous. It can mean three different things:

  • Voice output (read-aloud): a button that narrates your page content, usually for accessibility.
  • Voice input (dictation): a mic icon in a form that transcribes what the user says into text.
  • Voice conversation (AI agent): a two-way spoken exchange — visitor talks, AI answers, page updates in real time. This is what Spelo does.

Most businesses want the third but implement the first, then wonder why engagement did not change. Read-aloud is table stakes. Conversation is what converts.

The four approaches, compared

Approach | Cost | Voice quality | Interactivity | Setup | Best for
Browser text-to-speech | Free | Robotic | None (read-only) | ~10 lines of JavaScript | Accessibility read-aloud only
Chatbot + TTS add-on | $20–$100/mo | OK | Text in, audio out | Two tools to maintain | Teams already using a chatbot
Hosted voice widget (e.g., Spelo) | $0–$499/mo | Natural | Full-duplex voice | One script tag | Any site that wants a real conversation
Custom voice agent | $5,000+ build, $500+/mo run | Natural | Whatever you build | Weeks of engineering | Enterprises with unique requirements

Option 1 — browser text-to-speech

Every modern browser ships with the SpeechSynthesis API. About ten lines of JavaScript make any page read itself aloud. It is free, it works offline when the browser has locally installed voices, and it sounds like a 2010-era GPS. Use it if your only goal is a read-aloud accessibility button.
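If read-aloud is all you need, a minimal sketch might look like the following. It assumes a button with the id read-aloud and a main element that holds the content; both are placeholders for illustration. The synth and Utterance options are not part of the browser API and exist only so the function can be exercised outside a browser.

```javascript
// Read-aloud sketch built on the browser's SpeechSynthesis API.
// `options.synth` and `options.Utterance` are injectable stand-ins (assumption,
// for testing only); in a browser the built-in globals are used.
function readAloud(text, options = {}) {
  const synth = options.synth ?? globalThis.speechSynthesis;
  const Utterance = options.Utterance ?? globalThis.SpeechSynthesisUtterance;
  if (!synth || !Utterance) return false; // API not available in this runtime

  synth.cancel(); // stop anything that is already being spoken
  const utterance = new Utterance(text);
  utterance.lang = options.lang ?? "en-US";
  utterance.rate = options.rate ?? 1; // 1 is the default speaking rate
  synth.speak(utterance);
  return true;
}

// Browser wiring. The element id and the <main> selector are hypothetical.
if (typeof document !== "undefined") {
  document.getElementById("read-aloud")?.addEventListener("click", () => {
    const content = document.querySelector("main") ?? document.body;
    readAloud(content.innerText);
  });
}
```

The function returns false when the API is missing, so you can hide the button instead of showing one that does nothing.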

Option 2 — chatbot plus a TTS add-on

Take an existing text chatbot (Intercom, Drift, Tidio) and layer a TTS service on top so the bot can speak its replies. You get a voice that sounds better than browser TTS and a familiar chat UI. You also get two systems to pay for, two consoles to manage, and a conversation that still feels like typing in disguise — the visitor has to wait for the bot to write before it speaks.
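The glue between the two systems can be sketched as follows. This is an illustration under assumptions, not any vendor's API: speak is a hypothetical function standing in for whatever call your TTS service exposes, and the sentence splitter is deliberately naive. Speaking sentence by sentence, rather than waiting for the full reply, is one way to shorten the pause before the first audio.

```javascript
// Splits a chatbot reply into sentences so each can be sent to TTS on its own.
// Naive rule: break after ".", "!" or "?" when followed by whitespace.
function splitIntoSentences(text) {
  return text
    .split(/(?<=[.!?])\s+/)
    .map((sentence) => sentence.trim())
    .filter(Boolean);
}

// `speak` is a hypothetical placeholder: a function that takes a string and
// returns a promise that resolves when the audio for it has finished.
async function speakReply(replyText, speak) {
  for (const sentence of splitIntoSentences(replyText)) {
    await speak(sentence); // keep sentences in order, one at a time
  }
}
```

You would call speakReply from whatever callback your chatbot fires when a reply arrives; that hook differs per product and is not shown here.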

Option 3 — a hosted voice widget

Paste one script tag and get full-duplex voice: the visitor can interrupt the agent, and the agent can scroll the page, apply filters, and answer in context. This is what Spelo does. There is a free tier for small sites and usage-based paid plans for growing businesses.

The install is literally this:

<script src="https://spelo.ai/widget.js"
        data-site-id="YOUR_SITE_ID"
        async></script>

That is the whole integration. The widget reads your page content on demand — no scheduled indexing, no database sync jobs. When you update a product or publish a blog post, the voice agent knows about it instantly.
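If your site injects scripts from JavaScript (a single-page app or a tag manager, for example), the same tag can be built in code. The sketch below reuses the URL and the data-site-id attribute from the snippet above. Whether the widget picks up its configuration when loaded this way is an assumption, so check it against Spelo's own documentation. The function name and the doc parameter are hypothetical.

```javascript
// Builds the same <script> element as the static snippet and appends it to <head>.
// `doc` is injectable (assumption, for testing) and defaults to the page document.
function loadVoiceWidget(siteId, doc = globalThis.document) {
  if (!doc) return null; // not running in a browser
  const script = doc.createElement("script");
  script.src = "https://spelo.ai/widget.js";
  script.async = true;
  script.dataset.siteId = siteId; // serialised as data-site-id="..."
  doc.head.appendChild(script);
  return script;
}

// Usage in a browser: loadVoiceWidget("YOUR_SITE_ID");
```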

Option 4 — custom voice agent

Build the whole thing yourself: a speech-to-text model, a large language model, a text-to-speech model, and realtime orchestration so the visitor and the agent can interrupt each other naturally. Expect six to twelve weeks of engineering, a four-figure monthly cloud bill, and a round-trip latency budget under 300 ms if the conversation is to feel human. Worth it for enterprises with unique requirements, overkill for a small business.
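To see how tight that budget is, it helps to add up the stages. The sketch below checks a per-stage estimate against the 300 ms figure. The stage names and the millisecond values are illustrative placeholders, not measurements of any real system.

```javascript
// Round-trip budget from the text above.
const BUDGET_MS = 300;

// Sums per-stage latencies and reports how much of the budget is left.
function latencyReport(stagesMs, budgetMs = BUDGET_MS) {
  const totalMs = Object.values(stagesMs).reduce((sum, ms) => sum + ms, 0);
  return {
    totalMs,
    headroomMs: budgetMs - totalMs,
    withinBudget: totalMs <= budgetMs,
  };
}

// Hypothetical numbers, chosen only to show the arithmetic.
const example = latencyReport({
  speechToText: 80,
  languageModelFirstToken: 120,
  textToSpeechFirstAudio: 60,
  network: 30,
});
console.log(example); // { totalMs: 290, headroomMs: 10, withinBudget: true }
```

With only 10 ms of headroom in this made-up case, one slow stage is enough to break the budget, which is why the orchestration layer takes most of the engineering time.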

Which one should you pick?

  • Accessibility only: browser TTS. Free, no vendor.
  • Small business site, blog, or landing page: hosted voice widget. Spelo free tier covers most small sites.
  • E-commerce or lead-gen site: hosted voice widget with database connection. Spelo Starter or Pro.
  • Enterprise with unusual rules (healthcare, fintech): custom build or Spelo Enterprise with dedicated infrastructure.

Common questions

Still curious? Email hello@spelo.ai — usually a same-day reply.

Can AI help me with my website?

Yes, in three main ways: a chatbot that handles typed questions, a text-to-speech reader that narrates pages, or a full voice assistant like Spelo that talks back, searches your content, and takes actions. Voice assistants can convert better because they answer faster and feel more personal.

How do I create an AI assistant for a website?

The fastest path is a hosted widget you embed with one script tag — Spelo takes about two minutes. Building from scratch requires a speech-to-text model, a large language model, a text-to-speech model, and realtime orchestration — typically weeks of engineering.

How do I get a website to read aloud?

For plain read-aloud, browsers include the SpeechSynthesis API for free — but it sounds robotic and does not answer questions. For natural voice plus conversation, use an AI voice widget like Spelo.

How do I add voice input in HTML?

The Web Speech API supports voice input in Chrome, Edge, and Safari with a few lines of JavaScript. It handles raw transcription but not conversation. Spelo handles transcription, response, and page actions in one drop-in widget.
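A minimal dictation sketch, for reference. Chrome and Safari expose the recognizer under the prefixed name webkitSpeechRecognition, so the code checks both names. The element ids mic and search are hypothetical, and the Recognition parameter exists only so the function can be exercised outside a browser.

```javascript
// Transcribes one spoken phrase into a form field using SpeechRecognition.
// `Recognition` defaults to the browser constructor, prefixed or unprefixed.
function startDictation(
  input,
  Recognition = globalThis.SpeechRecognition ?? globalThis.webkitSpeechRecognition
) {
  if (!Recognition) return null; // unsupported browser or non-browser runtime
  const recognition = new Recognition();
  recognition.lang = "en-US";
  recognition.interimResults = false; // deliver only the final transcript
  recognition.onresult = (event) => {
    input.value = event.results[0][0].transcript; // best guess for the first phrase
  };
  recognition.onerror = (event) => {
    console.warn("Speech recognition error:", event.error);
  };
  recognition.start(); // the browser asks for microphone permission here
  return recognition;
}

// Browser wiring. Both ids are placeholders for your own markup.
if (typeof document !== "undefined") {
  const micButton = document.getElementById("mic");
  const searchField = document.getElementById("search");
  micButton?.addEventListener("click", () => startDictation(searchField));
}
```

A null return value means the browser has no recognizer, so the mic icon can be hidden for those visitors.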

What is the cheapest way to add voice to a website?

Free: the browser's Web Speech API, which provides SpeechSynthesis for read-aloud and SpeechRecognition for input. Those get you robot-voice narration and transcription but no intelligence. For full voice conversation, Spelo starts at $0 for small sites, with paid monthly plans above that.

Is Spelo WCAG accessible?

Yes. Voice input is a major accessibility win for visitors with motor or vision impairments. Spelo also falls back to a text chat for users who prefer it, and respects prefers-reduced-motion for the orb animation.

Try the voice widget on your site today.

Paste one script tag, talk to your own website. Free tier, no credit card.