Schema introspection

When you save a data connection in the dashboard, Spelo asks the adapter “what does your data look like?” and feeds the answer into the AI’s system prompt. This is schema introspection, and it’s what lets the AI decide which collection and which fields to query without you hand-writing prompts.

What gets introspected

The describeSchema() method returns a list of collections, each with fields:

interface SchemaCollection {
  name: string;          // e.g. "properties"
  source: string;        // e.g. "listings"
  recordCount?: number;
  description?: string;
  fields: {
    name: string;
    type: 'string' | 'number' | 'boolean' | 'date' | 'array' | 'object' | 'unknown';
    sample?: unknown;
    nullable?: boolean;
  }[];
}
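
As an illustration, a value conforming to this interface might look like the following. Only the interface shape comes from the definition above; the field names, sample values, and record count are invented for the example.

```typescript
type FieldType = 'string' | 'number' | 'boolean' | 'date' | 'array' | 'object' | 'unknown';

interface SchemaField {
  name: string;
  type: FieldType;
  sample?: unknown;
  nullable?: boolean;
}

interface SchemaCollection {
  name: string;
  source: string;
  recordCount?: number;
  description?: string;
  fields: SchemaField[];
}

// Hypothetical data: a "listings" source exposed under the name "properties".
const properties: SchemaCollection = {
  name: 'properties',
  source: 'listings',
  recordCount: 1240,
  description: 'Rental listings available in the next 30 days',
  fields: [
    { name: 'title', type: 'string', sample: 'Sunny 2-bed near the park' },
    { name: 'price', type: 'number', sample: 2800 },
    { name: 'amenities', type: 'array', sample: ['pool', 'gym'] },
    { name: 'available_from', type: 'date', nullable: true },
  ],
};

// Field names in declaration order.
const fieldNames = properties.fields.map((f) => f.name);
```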

Which adapters support it

| Adapter | describeSchema | How it discovers fields |
| --- | --- | --- |
| Postgres | | information_schema.columns |
| MySQL | | information_schema.columns |
| MS SQL Server | | INFORMATION_SCHEMA.COLUMNS |
| MongoDB | | Sampled find() on 25 docs; types inferred |
| Supabase | | PostgREST OpenAPI spec |
| Firebase | | Sampled .limit(25); types inferred |
| Airtable | | GET /meta/bases/:baseId/tables (requires schema.bases:read scope) |
| Google Sheets | | First row as field names; cell types sampled |
| Shopify | | Hard-coded: the Storefront API schema is fixed |
| WooCommerce | | Hard-coded: the WC REST schema is stable |
| REST API | | Unknown: we have no way to introspect generic REST |
| Webhook | | You describe your schema in the collection config |
| Custom adapter | optional | You implement it yourself |
| JSON | | Sample first 25 records |
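
For the sampling-based adapters, inference can be pictured roughly as below. This is a minimal sketch, not Spelo's actual implementation: `classifyValue` and `inferFields` are hypothetical names, and taking the type of the first non-null sample is a simplifying assumption.

```typescript
type FieldType = 'string' | 'number' | 'boolean' | 'date' | 'array' | 'object' | 'unknown';

interface InferredField {
  name: string;
  type: FieldType;
  sample?: unknown;
  nullable?: boolean;
}

// Map a single JavaScript value to one of the schema field types.
function classifyValue(value: unknown): FieldType {
  if (value === null || value === undefined) return 'unknown';
  if (Array.isArray(value)) return 'array';
  if (value instanceof Date) return 'date';
  switch (typeof value) {
    case 'string': return 'string';
    case 'number': return 'number';
    case 'boolean': return 'boolean';
    case 'object': return 'object';
    default: return 'unknown';
  }
}

// Look at the first `sampleSize` records only (25 mirrors the table above).
function inferFields(records: Record<string, unknown>[], sampleSize = 25): InferredField[] {
  const sample = records.slice(0, sampleSize);

  // Collect every key seen in the sample, in first-seen order.
  const names: string[] = [];
  for (const record of sample) {
    for (const key of Object.keys(record)) {
      if (names.indexOf(key) === -1) names.push(key);
    }
  }

  return names.map((name) => {
    const values = sample.map((r) => r[name]);
    const present = values.filter((v) => v !== null && v !== undefined);
    const field: InferredField = { name, type: classifyValue(present[0]) };
    if (present.length > 0) field.sample = present[0];
    // Assumption: a field missing or null in any sampled record is nullable.
    if (present.length < values.length) field.nullable = true;
    return field;
  });
}

const inferred = inferFields([
  { id: 1, title: 'Loft', tags: ['pool'] },
  { id: 2, title: null, tags: [] },
]);
```

Here `inferred` describes `id` as a number, `title` as a nullable string, and `tags` as an array.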

Where the schema ends up

Three places:

  1. Dashboard site detail page. You see the discovered fields and can toggle which ones are searchable / filterable / displayable.
  2. Site intelligence endpoint. Fetched by the widget at session start. The AI receives a summarized description.
  3. Site config page map. Combined with the page structure crawled from your site.

See Site intelligence endpoint for the full API shape.
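
To give a feel for what a "summarized description" could look like, here is a minimal sketch that turns a list of collections into plain text. The `summarizeSchema` function and its output format are assumptions made for illustration; the actual summary produced by Spelo is not specified on this page.

```typescript
interface SummaryCollection {
  name: string;
  source: string;
  description?: string;
  fields: { name: string; type: string }[];
}

// Hypothetical format: one header line per collection, then its fields.
function summarizeSchema(collections: SummaryCollection[]): string {
  return collections
    .map((c) => {
      const header = c.description ? `${c.name}: ${c.description}` : c.name;
      const fields = c.fields.map((f) => `${f.name} (${f.type})`).join(', ');
      return `${header}\n  fields: ${fields}`;
    })
    .join('\n');
}

const summary = summarizeSchema([
  {
    name: 'properties',
    source: 'listings',
    description: 'Rental listings',
    fields: [
      { name: 'price', type: 'number' },
      { name: 'city', type: 'string' },
    ],
  },
]);
```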

Manual overrides

Introspection is a starting point. You can always:

  • Rename a collection (listings → properties)
  • Add a description that the AI reads (“Rental listings available in the next 30 days”)
  • Restrict filterable_fields to a subset of the actual columns (e.g. hide internal IDs)
  • Restrict display_fields to slim responses

All of these are edited in the dashboard per-collection and override the introspected defaults.
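
Conceptually, an override is layered on top of the introspected defaults. The sketch below assumes a hypothetical `CollectionOverride` shape and `applyOverride` helper; only the `filterable_fields` and `display_fields` names come from the list above, and dropping unknown field names is an assumption.

```typescript
interface IntrospectedCollection {
  name: string;
  source: string;
  description?: string;
  fields: { name: string; type: string }[];
}

// Hypothetical shape of the per-collection settings edited in the dashboard.
interface CollectionOverride {
  name?: string;
  description?: string;
  filterable_fields?: string[];
  display_fields?: string[];
}

interface EffectiveCollection {
  name: string;
  source: string;
  description?: string;
  filterable_fields: string[];
  display_fields: string[];
}

function applyOverride(
  introspected: IntrospectedCollection,
  override: CollectionOverride = {},
): EffectiveCollection {
  const allFields = introspected.fields.map((f) => f.name);
  // An override can only narrow the field list; names that are not real
  // fields are dropped. With no override, every introspected field is kept.
  const restrict = (subset?: string[]): string[] =>
    subset ? subset.filter((n) => allFields.indexOf(n) !== -1) : allFields;

  return {
    name: override.name ?? introspected.name,
    source: introspected.source, // the underlying source is never renamed
    description: override.description ?? introspected.description,
    filterable_fields: restrict(override.filterable_fields),
    display_fields: restrict(override.display_fields),
  };
}

const effective = applyOverride(
  {
    name: 'listings',
    source: 'listings',
    fields: [
      { name: 'id', type: 'number' },
      { name: 'price', type: 'number' },
      { name: 'city', type: 'string' },
    ],
  },
  { name: 'properties', filterable_fields: ['price', 'city'] },
);
```

In this example the collection is renamed to `properties`, the internal `id` is hidden from filtering, and `display_fields` falls back to all introspected fields.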

Type hints matter

The AI uses field types to decide which operators to try:

  • number fields get gt, gte, lt, lte, range prompts (“under $3,000”)
  • string fields get eq, neq, contains
  • array fields get contains (“has a pool” → contains amenities = pool)
  • boolean fields get eq only
  • date fields get range comparisons
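
The list above can be read as a lookup table from field type to operators. The sketch below is an illustration only: `OPERATORS_BY_TYPE` and `isOperatorAllowed` are hypothetical names, mapping date fields to `range` alone is one reading of "range comparisons", and leaving `object` and `unknown` without operators is an assumption.

```typescript
type FieldType = 'string' | 'number' | 'boolean' | 'date' | 'array' | 'object' | 'unknown';
type Operator = 'eq' | 'neq' | 'contains' | 'gt' | 'gte' | 'lt' | 'lte' | 'range';

const OPERATORS_BY_TYPE: Record<FieldType, Operator[]> = {
  number: ['gt', 'gte', 'lt', 'lte', 'range'],
  string: ['eq', 'neq', 'contains'],
  array: ['contains'],
  boolean: ['eq'],
  date: ['range'],
  object: [],  // assumption: not covered by the list above
  unknown: [], // assumption: not covered by the list above
};

function isOperatorAllowed(type: FieldType, operator: Operator): boolean {
  return OPERATORS_BY_TYPE[type].indexOf(operator) !== -1;
}

// "under $3,000" on a number field maps naturally to lt.
const priceUnder = isOperatorAllowed('number', 'lt');
// A boolean field only supports eq, so contains is rejected.
const flagContains = isOperatorAllowed('boolean', 'contains');
```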

If introspection is ambiguous (e.g. a MongoDB field that’s a number in 80% of docs and a string in 20%), the adapter picks the dominant type and flags the field as nullable: true. You can override in the dashboard.
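
The dominant-type rule can be sketched as a simple vote over the sampled values. `pickDominantType` is a hypothetical helper, and the tie-breaking behavior (the first type to reach the highest count wins) is an assumption; the page only states that the dominant type is chosen and the field is flagged `nullable: true`.

```typescript
type FieldType = 'string' | 'number' | 'boolean' | 'date' | 'array' | 'object' | 'unknown';

function pickDominantType(observed: FieldType[]): { type: FieldType; nullable: boolean } {
  const counts: Partial<Record<FieldType, number>> = {};
  let best: FieldType = 'unknown';
  let bestCount = 0;
  let distinct = 0;

  for (const t of observed) {
    const next = (counts[t] ?? 0) + 1;
    if (next === 1) distinct += 1;
    counts[t] = next;
    // Strict comparison: on a tie, the type that reached the count first wins.
    if (next > bestCount) {
      best = t;
      bestCount = next;
    }
  }

  // More than one observed type means the field was ambiguous.
  return { type: best, nullable: distinct > 1 };
}

// 80% numbers, 20% strings, as in the example above.
const mixedTypes: FieldType[] = [
  'number', 'number', 'number', 'number',
  'string',
  'number', 'number', 'number', 'number',
  'string',
];
const mixed = pickDominantType(mixedTypes);
```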

Refreshing the schema

Schemas change over time: you rename a column, add a table, drop an index. Spelo re-introspects:

  • Automatically every 24 hours (cron job)
  • Whenever you click Refresh schema in the dashboard
  • Whenever you save the data connection
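
The three triggers above could be modeled as follows. This is only a sketch: the `RefreshTrigger` names and the `shouldIntrospect` function are hypothetical, and how Spelo's cron job actually decides staleness is not documented here.

```typescript
const REFRESH_INTERVAL_MS = 24 * 60 * 60 * 1000; // 24 hours

type RefreshTrigger = 'cron' | 'manual' | 'connection_saved';

function shouldIntrospect(
  trigger: RefreshTrigger,
  lastIntrospectedAt: Date | null,
  now: Date,
): boolean {
  // Clicking "Refresh schema" or saving the connection always re-introspects.
  if (trigger !== 'cron') return true;
  // Assumption: the scheduled job skips schemas refreshed less than 24h ago.
  if (lastIntrospectedAt === null) return true;
  return now.getTime() - lastIntrospectedAt.getTime() >= REFRESH_INTERVAL_MS;
}

const now = new Date('2024-01-02T12:00:00Z');
const recent = new Date('2024-01-02T06:00:00Z'); // 6 hours earlier
const stale = new Date('2024-01-01T11:00:00Z');  // 25 hours earlier

const cronOnRecent = shouldIntrospect('cron', recent, now);
const cronOnStale = shouldIntrospect('cron', stale, now);
const manualOnRecent = shouldIntrospect('manual', recent, now);
```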

Privacy considerations

Introspection reads metadata (column names, types) plus up to 25 sampled rows to infer types. Sampled rows are not stored — they’re only used to determine types and then discarded.

If your column names are PII-like (e.g. social_security_number), the field appears in the schema but Spelo’s default safety rules prevent it from being exposed to the AI. Configure sensitive fields explicitly by naming them in restricted topics.
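
As a rough mental model, the filtering could look like the sketch below. The `PII_LIKE` pattern and the `visibleToAi` helper are invented for illustration and are not Spelo's actual safety rules; the explicit `restricted` list stands in for fields you name yourself.

```typescript
// Hypothetical default pattern; the real rules are not listed on this page.
const PII_LIKE = /(ssn|social_security|passport|password|credit_card)/i;

function visibleToAi(fieldNames: string[], restricted: string[] = []): string[] {
  return fieldNames.filter(
    (name) => !PII_LIKE.test(name) && restricted.indexOf(name) === -1,
  );
}

const exposed = visibleToAi(
  ['price', 'social_security_number', 'internal_notes'],
  ['internal_notes'], // explicitly restricted by the site owner
);
```

Here only `price` remains: `social_security_number` matches the default pattern and `internal_notes` is restricted explicitly.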

Limitations

  • Nested JSON. For MongoDB or Postgres jsonb, introspection lists the top-level keys. Deeply nested fields aren’t flattened. Use the Webhook adapter if you need to query deep structures.
  • Views vs. tables. Most SQL introspection includes views. If your security policy forbids exposing views, exclude them manually in the dashboard.
  • Cross-collection joins. Introspection describes each collection independently. Joins must be modeled as views (SQL) or application logic (webhook).

Debugging introspection

In the dashboard, click the Inspect button on your data connection. You’ll see:

  • The raw SchemaDescription returned by the adapter
  • Any errors (e.g. “permission denied on information_schema.columns”)
  • A “test query” button that runs a sample search

If you see unexpected fields or types, that’s usually a sign your read-only user doesn’t have the schema-introspection privileges it needs. See the adapter-specific docs for the exact grants required.