
Adapter failures and failover

A connected data adapter is a live dependency. Real databases go down — replicas lag, connection pools exhaust, cloud providers have incidents. This page covers what happens in a Spelo voice call when that occurs and what you can do about it.

What the visitor experiences when the adapter is down

The agent does not crash. It degrades gracefully to DOM and knowledge-base mode: answers still come from the current page's content (read_page) and the crawled knowledge base (search_knowledge_base), while database queries fail soft and return an error the agent can explain.

A typical visible failure mode:

Visitor: "Show me 2-bedroom apartments in West Hollywood."
Agent: "I'm having trouble pulling live listings right now, but
our latest published list is on our /apartments page —
I can take you there now."

The agent knows the tool failed (it sees the error in the tool-result envelope), explains it in human terms, and offers a fallback. No technical jargon, no crash, no dropped call.

The failure detection path

1. The visitor speaks.
2. The AI calls search_database (or /v1/<siteId>/query).
3. The Spelo API calls your adapter's search() function.
4. The adapter throws, times out, or returns a malformed result.
5. The Spelo API returns:
   { "success": true,
     "data": { "success": false, "error": "Connection refused" } }
6. The browser widget forwards the result to the model.
7. The model's system prompt says: "If a tool returns success:false, apologize briefly + offer the DOM/KB fallback + don't keep retrying."

The wrapper-level success: true means “the API call itself succeeded” — the inner data.success: false means “the adapter said no.” This distinction lets your monitoring pick out adapter outages from API outages.
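
One way your monitoring might separate the two cases is to inspect both levels of the envelope. The Python sketch below assumes only the envelope shape shown above. The function name classify_envelope and the three label strings are illustrative and not part of the Spelo API, and it treats a response with no inner success field as a normal result, which is an assumption.

```python
from typing import Any, Mapping


def classify_envelope(envelope: Mapping[str, Any]) -> str:
    """Label a tool-result envelope as 'ok', 'adapter_failure', or 'api_failure'."""
    # Outer success missing or not true: the API call itself did not succeed.
    if envelope.get("success") is not True:
        return "api_failure"
    data = envelope.get("data")
    # Inner success is false: the API worked, but the adapter reported an error.
    if isinstance(data, Mapping) and data.get("success") is False:
        return "adapter_failure"
    # Assumption: anything else is a normal result.
    return "ok"


example = {"success": True, "data": {"success": False, "error": "Connection refused"}}
print(classify_envelope(example))
```

Counting the two failure labels separately gives you one signal for your database and one for the Spelo API.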

Retry behavior

At the adapter layer:

| Adapter | Default retries | Backoff | Timeout per attempt |
| --- | --- | --- | --- |
| Postgres, MySQL, MSSQL | 1 retry | 200 ms fixed | 5 s |
| MongoDB | 2 retries | 100 ms, 400 ms | 4 s |
| Shopify (Storefront API) | 1 retry on 5xx only | 500 ms | 6 s |
| Airtable | 2 retries on 429 (honors Retry-After) | per header | 5 s |
| Google Sheets | 1 retry | 1 s | 8 s |
| Firebase | 1 retry | 500 ms | 4 s |
| REST adapter | 1 retry on 5xx + network errors | 500 ms | depends on config timeout_ms |
| Webhook adapter | 0 retries by default (your endpoint controls this) | n/a | depends on config |

After retries exhaust, the API returns data.success: false with the last error.
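
If you are writing your own adapter or webhook endpoint and want similar behavior, the retry loop can be sketched roughly as follows. This is not Spelo's implementation. The name call_with_retries, the result key, and the injectable sleep parameter are all hypothetical, and the per-attempt timeout is left out for brevity.

```python
import time
from typing import Any, Callable, Dict, Sequence


def call_with_retries(
    search: Callable[[], Any],
    backoffs_s: Sequence[float],
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Call search() once, then once more per backoff entry, waiting before each retry."""
    last_error = "unknown error"
    for attempt in range(len(backoffs_s) + 1):
        if attempt > 0:
            # Wait for the configured backoff before this retry.
            sleep(backoffs_s[attempt - 1])
        try:
            # "result" is a hypothetical key for the successful payload.
            return {"success": True, "result": search()}
        except Exception as exc:
            last_error = str(exc)
    # Retries exhausted: report the last error, as data.success: false does.
    return {"success": False, "error": last_error}
```

With the MongoDB row above, backoffs_s would be [0.1, 0.4]: one initial attempt plus two retries.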

At the model layer:

The system prompt instructs the AI to not call search_database again in the same turn after a failure. This avoids tying up the call with retries while the visitor waits in silence. The AI moves on to fallback paths immediately.

Monitoring — the kb-health endpoint

Spelo exposes a health endpoint per site that summarizes adapter status:

curl https://api.spelo.ai/v1/sites/:id/kb-health \
-H "Authorization: Bearer vk_live_..."

Returns:

{
"success": true,
"data": {
"site_id": "ab1c2d3e",
"provider": "openai",
"adapter": {
"type": "postgres",
"status": "healthy",
"last_check": "2026-05-15T07:30:00Z",
"last_success": "2026-05-15T07:30:00Z",
"consecutive_failures": 0
},
"knowledge_base": {
"total_pages": 24,
"total_chunks": 412,
"last_crawl": "2026-05-14T22:00:00Z"
},
"voice_relay_deployed": true,
"gemini_runtime_broken": false
}
}

Hook this into your monitoring (Datadog, Grafana, BetterStack, internal). Alert on adapter.status != "healthy" or consecutive_failures > 3.

status values:

| Value | Meaning |
| --- | --- |
| healthy | Last ping succeeded within the last 5 minutes |
| degraded | 1-2 consecutive failures, retries pending |
| unhealthy | 3+ consecutive failures, adapter is currently failing |
| unknown | Never been pinged (just configured) |

The health check runs every 5 minutes per site automatically.
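
The alert rule above can be expressed as a small check over the parsed kb-health response. The sketch below assumes the response shape shown earlier. The function name alert_reasons and the max_failures parameter are illustrative, and fetching the JSON is left to your own HTTP client.

```python
from typing import Any, List, Mapping


def alert_reasons(health: Mapping[str, Any], max_failures: int = 3) -> List[str]:
    """Return the reasons to alert for one kb-health response, or [] if there are none."""
    adapter = health.get("data", {}).get("adapter", {})
    reasons: List[str] = []
    # Rule 1: any status other than "healthy".
    status = adapter.get("status", "unknown")
    if status != "healthy":
        reasons.append(f"adapter status is {status}")
    # Rule 2: more than max_failures consecutive failures.
    failures = adapter.get("consecutive_failures", 0)
    if failures > max_failures:
        reasons.append(f"{failures} consecutive failures")
    return reasons
```

Applied literally, this rule also alerts on unknown, the status of a just-configured adapter, so you may want to exempt that value for new sites.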

Webhook events for failures

Subscribe to the adapter.failed event to get notified the moment an adapter starts failing:

curl -X POST https://api.spelo.ai/v1/webhooks \
-H "Authorization: Bearer vk_live_..." \
-d '{
"url": "https://yourapp.com/spelo-adapter-alert",
"events": ["adapter.failed", "adapter.recovered"]
}'

Payload:

{
"event": "adapter.failed",
"site_id": "ab1c2d3e",
"adapter_type": "postgres",
"error_code": "connection_refused",
"error_message": "ECONNREFUSED 10.0.1.50:5432",
"first_failure_at": "2026-05-15T07:45:00Z",
"consecutive_failures": 3
}

A matching adapter.recovered fires when the next health check succeeds.
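
On the receiving side, a handler might pair the two events so each outage opens and closes one incident. The sketch below is only an illustration. The function name apply_event and the in-memory dictionary are hypothetical, it assumes the adapter.recovered payload carries at least event and site_id (its shape is not shown here), and it suppresses repeated adapter.failed deliveries defensively rather than because they are known to occur.

```python
from typing import Any, Dict, Mapping, Optional


def apply_event(
    open_incidents: Dict[str, Dict[str, Any]], event: Mapping[str, Any]
) -> Optional[str]:
    """Update open incidents (keyed by site_id) and return a message to send, if any."""
    site_id = event["site_id"]
    kind = event.get("event")
    if kind == "adapter.failed":
        already_open = site_id in open_incidents
        open_incidents[site_id] = dict(event)
        if already_open:
            return None  # same outage, so do not notify again
        return (
            f"{event.get('adapter_type')} adapter failing on site {site_id}: "
            f"{event.get('error_code')}"
        )
    if kind == "adapter.recovered":
        # Only announce recovery if a failure was recorded for this site.
        if open_incidents.pop(site_id, None) is not None:
            return f"adapter recovered on site {site_id}"
    return None
```

In production you would keep open_incidents in a shared store rather than process memory, so that restarts do not lose open incidents.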

Recovery patterns

Pattern A — restart the adapter pool

For Postgres / MySQL / MongoDB, the Spelo API can still hold stale connections from the outage after the database itself has recovered. Force a fresh pool:

curl -X POST https://api.spelo.ai/v1/sites/:id/adapter/reset-pool \
-H "Authorization: Bearer vk_live_..."

Returns immediately. The next adapter call will negotiate a fresh connection.

Pattern B — re-validate credentials

If your DB credentials rotated and the adapter is failing with auth errors:

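# The payload is double-quoted so the shell expands NEW_PASSWORD;
# inside single quotes the literal text ${NEW_PASSWORD} would be sent.
curl -X PATCH https://api.spelo.ai/v1/sites/:id \
-H "Authorization: Bearer vk_live_..." \
-d "{ \"adapter\": { \"config\": { \"password\": \"${NEW_PASSWORD}\" } } }"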

Then verify with POST /v1/sites/:id/adapter/test.

Pattern C — temporarily disable the adapter

If your DB is going to be down for an extended planned maintenance window, disable the adapter so the AI doesn’t try at all (cleaner UX than letting every call fail soft):

curl -X PATCH https://api.spelo.ai/v1/sites/:id \
-H "Authorization: Bearer vk_live_..." \
-d '{ "adapter": { "enabled": false } }'

The AI falls back to DOM-only + knowledge-base mode for the duration. Re-enable when maintenance is done.
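
If you script your maintenance, one way to make sure the adapter is re-enabled afterwards is to wrap the window in a context manager. This is a sketch under assumptions: adapter_disabled and the send callable are hypothetical (send stands in for whatever HTTP client you use with your API key), and the re-enable body simply mirrors the disable request above with enabled set to true.

```python
from contextlib import contextmanager
from typing import Any, Callable, Dict, Iterator

# Hypothetical transport: send(method, path, body) performs the authenticated HTTP call.
Send = Callable[[str, str, Dict[str, Any]], None]


@contextmanager
def adapter_disabled(site_id: str, send: Send) -> Iterator[None]:
    """Disable the adapter for the duration of the block, then re-enable it."""
    path = f"/v1/sites/{site_id}"
    send("PATCH", path, {"adapter": {"enabled": False}})
    try:
        yield
    finally:
        # Runs even if the maintenance step raises, so the adapter is not left off.
        # Assumption: re-enabling uses the same endpoint with enabled set to true.
        send("PATCH", path, {"adapter": {"enabled": True}})
```

Usage would look like `with adapter_disabled("ab1c2d3e", send): run_migration()`, where run_migration is your own maintenance step.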

What customers should NOT do

Common failure modes by adapter

| Adapter | Most common cause | Fix |
| --- | --- | --- |
| Postgres / MySQL | Connection pool exhausted | Increase max_connections on the DB, or reduce concurrent voice sessions |
| Postgres / MySQL | SSL cert expired | Renew the certificate and update ssl.ca in config |
| MongoDB | Replica set primary changed | Auto-recovers via the driver; you may see 1-2 failures during the election |
| Shopify | Storefront API rate-limited (60 req/min per token) | Use multiple tokens, or upgrade to Plus |
| Airtable | Per-base rate limit (5 req/sec) | Same as Shopify: use multiple PATs, or aggregate queries |
| Google Sheets | Token expired | OAuth refresh; see OAuth → refresh UX |
| REST | Endpoint changed shape | Re-check that collections.*.searchable_fields matches the new response shape |
| Webhook | Your endpoint is down | The whole point of using a webhook is that you control it, so fix your endpoint |

See also