Site intelligence endpoint
When the widget initializes, it fetches a site intelligence blob that describes your site’s pages, your data connection’s schema, and the resolved personality prompt. This is what lets the AI answer “what pages do you have?” and “what kinds of things can I search for?” without any prior context.
Endpoint
GET /v1/:siteId/site-intelligence
Auth
Origin-based (same as /token and /query). The widget calls this once per session on page load.
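A minimal sketch of how a client might issue this request. The host `https://api.example.com`, the `FetchLike` type, and both function names are assumptions made for illustration, since the real API host is not stated here. No token is attached because auth is Origin-based, and a browser adds the Origin header to cross-origin requests on its own.

```typescript
// Hypothetical base URL: the real API host is not given in this document.
const API_BASE = "https://api.example.com";

// Narrow fetch-like signature so the sketch can run without a network.
type FetchLike = (
  url: string,
  init?: { method?: string }
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

// Build the endpoint URL for a site id, escaping unsafe characters.
function siteIntelligenceUrl(siteId: string, base: string = API_BASE): string {
  return `${base}/v1/${encodeURIComponent(siteId)}/site-intelligence`;
}

// Issue the GET and return the parsed body, or throw on a non-2xx status.
async function fetchSiteIntelligence(
  siteId: string,
  fetchImpl: FetchLike
): Promise<unknown> {
  const res = await fetchImpl(siteIntelligenceUrl(siteId), { method: "GET" });
  if (!res.ok) {
    throw new Error(`site-intelligence request failed with HTTP ${res.status}`);
  }
  return res.json();
}
```

In a browser, the global `fetch` can be passed as `fetchImpl`. Injecting it keeps the function easy to exercise with a stub.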
Response 200
    interface SiteIntelligenceResponse {
      generatedAt: string; // ISO timestamp
      site?: {
        origin: string;
        totalPages: number;
        pages: Array<{
          url: string;
          title: string;
          description: string;
          sectionIds: string[]; // id= attributes on headings, for scroll_to
        }>;
      };
      data?: {
        collections: Array<{
          name: string;
          source: string;
          recordCount?: number;
          description?: string;
          fields: Array<{
            name: string;
            type: 'string' | 'number' | 'boolean' | 'date' | 'array' | 'object' | 'unknown';
            sample?: unknown;
          }>;
        }>;
      };
      personality_prompt?: string; // resolved system prompt
    }
Example response
    {
      "success": true,
      "data": {
        "generatedAt": "2026-04-17T14:00:00.000Z",
        "site": {
          "origin": "https://emberandoak.com",
          "totalPages": 12,
          "pages": [
            {
              "url": "/",
              "title": "Ember & Oak — Farm-to-table in downtown LA",
              "description": "Chef-driven dinner menu featuring locally-sourced ingredients.",
              "sectionIds": ["about", "menu", "reservations", "hours"]
            },
            {
              "url": "/menu",
              "title": "Menu — Ember & Oak",
              "description": "Current dinner menu. Updated seasonally.",
              "sectionIds": ["starters", "entrees", "desserts", "wine"]
            }
          ]
        },
        "data": {
          "collections": [
            {
              "name": "menu_items",
              "source": "menu_items",
              "recordCount": 48,
              "description": "Current dinner menu items",
              "fields": [
                { "name": "id", "type": "string" },
                { "name": "name", "type": "string" },
                { "name": "price", "type": "number" },
                { "name": "category", "type": "string", "sample": "entree" },
                { "name": "vegan", "type": "boolean" }
              ]
            }
          ]
        },
        "personality_prompt": "You are Maya, the AI host at Ember & Oak..."
      }
    }
How site.pages is built
The pages array is generated by a server-side crawl. When you add a domain to your site, Spelo’s crawler:
- Fetches your /sitemap.xml (if present)
- Otherwise, crawls starting from / following internal links, respecting robots.txt
- Extracts <title>, <meta name="description">, and all id= attributes on headings
- Stores the result in our DB
The crawl runs nightly and on-demand when you click Refresh in the dashboard. Custom enabled_pages / disabled_pages patterns are respected.
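The extraction step can be illustrated with a small sketch. This is not Spelo's crawler: `PageSummary` and `extractPageSummary` are hypothetical names, and the regular expressions are a simplification that assumes well-formed markup with `name` placed before `content` in the meta tag. A production crawler would more likely use a real HTML parser.

```typescript
// Hypothetical shape holding the three things the crawl extracts per page.
interface PageSummary {
  title: string;
  description: string;
  sectionIds: string[];
}

function extractPageSummary(html: string): PageSummary {
  // Text inside the first <title> element, trimmed.
  const title = /<title[^>]*>([\s\S]*?)<\/title>/i.exec(html)?.[1]?.trim() ?? "";

  // content= value of <meta name="description">; the backreference \1
  // ensures the closing quote matches the opening one.
  const description =
    /<meta\s+name=["']description["']\s+content=(["'])([\s\S]*?)\1/i.exec(html)?.[2] ?? "";

  // id= attributes on h1..h6 only; ids on other elements are ignored.
  const sectionIds: string[] = [];
  const headingRe = /<h[1-6]\b[^>]*\sid=["']([^"']+)["'][^>]*>/gi;
  let match: RegExpExecArray | null;
  while ((match = headingRe.exec(html)) !== null) {
    sectionIds.push(match[1]);
  }

  return { title, description, sectionIds };
}
```

The `sectionIds` collected this way correspond to the values the `scroll_to` tool later targets.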
How data.collections is built
Built from schema introspection: each connected adapter’s describeSchema() result is stored and served here.
How personality_prompt is built
Assembled from the template, pronunciations, time-zone awareness, restricted topics, and custom instructions.
When the widget uses this
At session start, the widget injects site_intelligence.site.pages into the AI’s context so the model can:
- Answer “where do I find X?” with a specific URL
- Call the navigate tool with the right path
- Call scroll_to with the right section id
Likewise, data.collections tells the AI which fields it can query through search_database.
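As a rough illustration of this step, the sketch below turns pages and collections into a plain-text block that could be placed in a model's context. The output format and the names `buildContext`, `Page`, `Field`, and `Collection` are assumptions; the widget's actual prompt layout is not documented here.

```typescript
// Trimmed-down, hypothetical versions of the response types.
interface Page {
  url: string;
  title: string;
  description: string;
  sectionIds: string[];
}
interface Field {
  name: string;
  type: string;
}
interface Collection {
  name: string;
  fields: Field[];
}

// Render one line per page (with its section ids) and one per collection
// (with its fields), so the model can pick URLs, anchors, and field names.
function buildContext(pages: Page[], collections: Collection[]): string {
  const pageLines = pages.map((p) => {
    const sections =
      p.sectionIds.length > 0 ? ` [sections: ${p.sectionIds.join(", ")}]` : "";
    return `- ${p.url}: ${p.title}${sections}`;
  });
  const collectionLines = collections.map((c) => {
    const fields = c.fields.map((f) => `${f.name}: ${f.type}`).join(", ");
    return `- ${c.name}(${fields})`;
  });
  return ["Pages:", ...pageLines, "Collections:", ...collectionLines].join("\n");
}
```

Listing section ids next to each URL is what would let the model pair a `navigate` call with a follow-up `scroll_to`.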
Caching
Cached aggressively at our edge:
- site.pages: TTL 60 minutes (or until you click Refresh)
- data.collections: TTL 60 minutes
- personality_prompt: TTL 5 minutes (so small edits propagate quickly)
Page regeneration happens in the background; the stale cached copy is served while it rebuilds.
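The serve-stale-while-rebuilding behavior can be sketched as a small in-memory cache. This only illustrates the pattern and is not the edge implementation: the class name, the injectable clock, and the `rebuild` callback are all hypothetical. Only the TTL values come from the list above.

```typescript
// TTLs from the caching list, in milliseconds.
const TTL_MS = {
  "site.pages": 60 * 60 * 1000,
  "data.collections": 60 * 60 * 1000,
  personality_prompt: 5 * 60 * 1000,
} as const;

type CacheKey = keyof typeof TTL_MS;

interface CacheEntry<T> {
  value: T;
  storedAt: number;
}

class StaleWhileRevalidateCache<T> {
  private entries = new Map<CacheKey, CacheEntry<T>>();
  private refreshing = new Set<CacheKey>();
  private now: () => number;

  // The clock is injectable so expiry can be simulated.
  constructor(now: () => number = Date.now) {
    this.now = now;
  }

  set(key: CacheKey, value: T): void {
    this.entries.set(key, { value, storedAt: this.now() });
  }

  // Return whatever is cached, even if stale. When the TTL has elapsed,
  // start at most one background rebuild per key.
  get(key: CacheKey, rebuild: () => Promise<T>): T | undefined {
    const entry = this.entries.get(key);
    const expired =
      entry === undefined || this.now() - entry.storedAt >= TTL_MS[key];
    if (expired && !this.refreshing.has(key)) {
      this.refreshing.add(key);
      rebuild()
        .then((value) => this.set(key, value))
        .catch(() => {
          // Rebuild failed: keep serving the stale value.
        })
        .finally(() => {
          this.refreshing.delete(key);
        });
    }
    return entry?.value;
  }
}
```

Callers never wait on a rebuild. The first `get` after expiry still returns the old value, and later calls see the new one once the rebuild resolves.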
Errors
| HTTP | Code | Cause |
|---|---|---|
| 403 | origin_not_allowed | Bad Origin |
| 404 | site_not_found | Unknown site_id |
| 503 | crawler_unavailable | First-time crawl still running — retry in 30s |
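One way a client might act on these codes is sketched below. `ErrorAction` and `actionForError` are hypothetical names. The function takes the status and code as plain arguments because the shape of the error body is not specified in this document.

```typescript
// Hypothetical description of what a caller should do next.
interface ErrorAction {
  retry: boolean;
  delayMs: number;
  reason: string;
}

// Map the documented (HTTP status, code) pairs to a next step.
// Only crawler_unavailable is retried, after the suggested 30 seconds.
function actionForError(status: number, code: string): ErrorAction {
  if (status === 503 && code === "crawler_unavailable") {
    return { retry: true, delayMs: 30000, reason: "first-time crawl still running" };
  }
  if (status === 403 && code === "origin_not_allowed") {
    return { retry: false, delayMs: 0, reason: "request Origin was rejected" };
  }
  if (status === 404 && code === "site_not_found") {
    return { retry: false, delayMs: 0, reason: "unknown site_id" };
  }
  // Anything else is outside the table above, so do not retry blindly.
  return { retry: false, delayMs: 0, reason: `unexpected error ${status} ${code}` };
}
```

The 403 and 404 cases are treated as permanent, since retrying will not change the Origin or the site id.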
Rate limits
Per site_id: 120 requests / minute. Because the widget calls the endpoint only once per session, it throttles itself naturally.
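The once-per-session behavior can be approximated in custom client code by memoizing the in-flight request. This is a sketch of the general pattern, not the widget's actual code, and `oncePerSession` is a hypothetical helper name.

```typescript
// Wrap a loader so that repeated calls share one promise.
// The underlying request is issued at most once per wrapper instance.
function oncePerSession<T>(load: () => Promise<T>): () => Promise<T> {
  let pending: Promise<T> | undefined;
  return () => {
    if (pending === undefined) {
      pending = load();
    }
    return pending;
  };
}
```

Creating the wrapper once at page load and calling it wherever the blob is needed keeps a session at one request, far below the 120 per minute limit.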
Custom override
Enterprise plans can upload a hand-curated site map + schema instead of relying on the crawler. Contact sales.