Every namespace has an attributes schema that controls how document fields are stored, searched, and filtered. Your schema determines what the search agent can see and how it retrieves your data.

Defining a schema

Pass a schema object when uploading documents. Each key is an attribute name, and the value configures its type and indexing behavior.
curl -X POST https://api.withcharcoal.com/v1/namespaces/support-tickets/documents \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "documents": [
      {
        "id": "ticket-4521",
        "title": "Login page returns 500 error",
        "body": "Users report seeing a 500 error when attempting to log in via SSO...",
        "priority": "high",
        "created_at": "2025-03-10T14:22:00Z",
        "tags": ["auth", "sso", "production"]
      }
    ],
    "schema": {
      "title": { "type": "string", "is_searchable": true },
      "body": { "type": "string", "is_searchable": true },
      "priority": { "type": "string", "is_filterable": true },
      "created_at": { "type": "datetime", "is_filterable": true },
      "tags": { "type": "[]string", "is_filterable": true }
    }
  }'
If you omit the schema, Charcoal infers types from the document data. Inference works well for simple cases, but you should always define an explicit schema when you need searchable or filterable attributes — inference will not enable these for you.
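One way to avoid relying on inference by accident is to build the request body in code and flag any document field that has no explicit schema entry. The following is a minimal sketch, not part of the Charcoal API: the helper name `build_upload_body` is hypothetical, and it assumes that `id` is the document identifier and therefore needs no schema entry (as in the request above). How Charcoal treats fields missing from a partially specified schema is not stated here, so the helper simply refuses to build such a body.

```python
import json


def build_upload_body(documents, schema):
    """Return the JSON request body for a document upload with an explicit schema.

    Hypothetical client-side helper. Assumes "id" is the document identifier
    and is not declared in the schema.
    """
    for doc in documents:
        # Collect fields that would not be covered by the explicit schema.
        missing = [key for key in doc if key != "id" and key not in schema]
        if missing:
            raise ValueError(
                f"document {doc.get('id')!r} has fields without schema entries: {missing}"
            )
    return json.dumps({"documents": documents, "schema": schema})


body = build_upload_body(
    [{"id": "ticket-4521", "title": "Login page returns 500 error", "priority": "high"}],
    {
        "title": {"type": "string", "is_searchable": True},
        "priority": {"type": "string", "is_filterable": True},
    },
)
```

The resulting string can be sent as the `-d` payload of the upload request shown above.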

Choosing what to index

Every attribute in your schema falls into one of three categories.

Searchable

The search agent runs BM25 full-text search queries against your searchable fields. It chooses which field to query and which search terms to use, then iterates with different queries until it finds what it needs. Searchable fields are the agent’s primary way of discovering relevant documents. Make a field searchable when:
  • It contains natural language text that describes the document’s content
  • A human would use Ctrl+F to find information in it
  • The field has enough textual content for keyword matching to be useful
Common searchable fields: title, body, description, content, summary, comments, notes

Do not make a field searchable when:
  • It contains structured data (IDs, enums, dates, numbers) — use filterable instead
  • It contains very short values (single words, codes) — filtering is more precise
  • It duplicates content from another searchable field — this adds cost without improving recall
Only string and []string attributes can be searchable.
{
  "title": { "type": "string", "is_searchable": true },
  "body": { "type": "string", "is_searchable": true }
}
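The type restriction above can be checked before uploading. This is a small client-side sketch; the function name `invalid_searchable` is hypothetical and the check only mirrors the rule that string and []string are the searchable types.

```python
SEARCHABLE_TYPES = {"string", "[]string"}


def invalid_searchable(schema):
    """Return the names of attributes marked searchable with a non-text type."""
    return sorted(
        name
        for name, config in schema.items()
        if config.get("is_searchable") and config.get("type") not in SEARCHABLE_TYPES
    )


problems = invalid_searchable(
    {
        "title": {"type": "string", "is_searchable": True},
        "tags": {"type": "[]string", "is_searchable": True},
        "created_at": {"type": "datetime", "is_searchable": True},
    }
)
# problems == ["created_at"]
```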

Filterable

Filterable attributes can be used in filter expressions — both by your API callers and by the search agent itself. The agent can autonomously apply filters on these attributes during its retrieval loop to narrow down results. Make a field filterable when:
  • It has a bounded set of values (status, priority, category, type)
  • You want to scope searches by time range, numeric threshold, or tag membership
  • The agent should be able to narrow its search autonomously (e.g., filtering by status: "open")
Common filterable fields: status, priority, category, created_at, updated_at, tags, type, author

Do not make a field filterable when:
  • It contains unique or high-cardinality text (document bodies, descriptions) — use searchable instead
  • You never need to narrow results by this field’s value
{
  "priority": { "type": "string", "is_filterable": true },
  "created_at": { "type": "datetime", "is_filterable": true },
  "tags": { "type": "[]string", "is_filterable": true }
}

Both searchable and filterable

Some fields benefit from both. A title field might be searched for keywords and also filtered for exact matches. Enabling both indexes is billed at 2x the attribute size, double the cost of a single index.
{
  "title": { "type": "string", "is_searchable": true, "is_filterable": true }
}

Stored only

Attributes without is_searchable or is_filterable are stored and returned in results, but the agent cannot query or filter against them. Use this for metadata you want to read but never search on — IDs, URLs, internal references, raw data blobs.

Example: choosing a schema

Consider a knowledge base with support articles:
{
  "schema": {
    "title":       { "type": "string",     "is_searchable": true },
    "body":        { "type": "string",     "is_searchable": true },
    "category":    { "type": "string",     "is_filterable": true },
    "product":     { "type": "string",     "is_filterable": true },
    "tags":        { "type": "[]string",   "is_filterable": true },
    "updated_at":  { "type": "datetime",   "is_filterable": true },
    "author":      { "type": "string" },
    "source_url":  { "type": "string" }
  }
}
  • title and body are searchable — the agent needs to find articles by their content.
  • category, product, tags, and updated_at are filterable — the agent and your callers can narrow searches to specific products, categories, or time ranges.
  • author and source_url are stored only — useful in results but not worth indexing.
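To review a schema at a glance, the attributes can be grouped into the categories described in this section. The sketch below is illustrative only; `classify_attributes` is a hypothetical helper that reads the same `is_searchable` and `is_filterable` flags used in the schema.

```python
def classify_attributes(schema):
    """Group attribute names by how they are indexed."""
    groups = {"searchable": [], "filterable": [], "both": [], "stored_only": []}
    for name, config in schema.items():
        searchable = bool(config.get("is_searchable"))
        filterable = bool(config.get("is_filterable"))
        if searchable and filterable:
            groups["both"].append(name)
        elif searchable:
            groups["searchable"].append(name)
        elif filterable:
            groups["filterable"].append(name)
        else:
            groups["stored_only"].append(name)
    return groups


knowledge_base_schema = {
    "title": {"type": "string", "is_searchable": True},
    "body": {"type": "string", "is_searchable": True},
    "category": {"type": "string", "is_filterable": True},
    "product": {"type": "string", "is_filterable": True},
    "tags": {"type": "[]string", "is_filterable": True},
    "updated_at": {"type": "datetime", "is_filterable": True},
    "author": {"type": "string"},
    "source_url": {"type": "string"},
}
groups = classify_attributes(knowledge_base_schema)
# groups["stored_only"] == ["author", "source_url"]
```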

Attribute types

Type          Description                   Example
string        Text                          "hello"
int           Signed integer                42
uint          Unsigned integer              100
float         Floating point number         3.14
bool          Boolean                       true
uuid          UUID string                   "550e8400-e29b-41d4-a716-446655440000"
datetime      ISO 8601 datetime             "2025-01-15T10:30:00Z"
[]string      Array of strings              ["a", "b"]
[]int         Array of integers             [1, 2, 3]
[]uint        Array of unsigned integers    [1, 2, 3]
[]float       Array of floats               [1.5, 2.5]
[]bool        Array of booleans             [true, false]
[]uuid        Array of UUIDs                ["550e8400-..."]
[]datetime    Array of datetimes            ["2025-01-15T10:30:00Z"]
All attributes are nullable. Types like uuid, uint, and datetime must be declared explicitly in the schema — they cannot be inferred.
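Because types such as uuid, uint, and datetime have to be declared by hand, it can help to confirm that document values actually fit the declared type before uploading. The following is a rough client-side sketch, not Charcoal's own validation: `matches_type` is a hypothetical helper, `datetime.fromisoformat` covers only a subset of ISO 8601, and treating whole numbers as valid floats and null array elements as invalid are assumptions.

```python
import uuid
from datetime import datetime


def _is_int(value):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool)


def _is_uuid(value):
    if not isinstance(value, str):
        return False
    try:
        uuid.UUID(value)
        return True
    except ValueError:
        return False


def _is_datetime(value):
    if not isinstance(value, str):
        return False
    # Older Python versions do not accept a trailing "Z", so normalise it.
    text = value[:-1] + "+00:00" if value.endswith("Z") else value
    try:
        datetime.fromisoformat(text)
        return True
    except ValueError:
        return False


SCALAR_CHECKS = {
    "string": lambda v: isinstance(v, str),
    "int": _is_int,
    "uint": lambda v: _is_int(v) and v >= 0,
    "float": lambda v: isinstance(v, (int, float)) and not isinstance(v, bool),
    "bool": lambda v: isinstance(v, bool),
    "uuid": _is_uuid,
    "datetime": _is_datetime,
}


def matches_type(value, type_name):
    """Return True if value is plausible for the declared attribute type."""
    if value is None:  # all attributes are nullable
        return True
    if type_name.startswith("[]"):
        check = SCALAR_CHECKS[type_name[2:]]
        return isinstance(value, list) and all(check(item) for item in value)
    return SCALAR_CHECKS[type_name](value)
```

For example, `matches_type("2025-01-15T10:30:00Z", "datetime")` passes, while `matches_type(-1, "uint")` does not.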

Updating attributes

You can modify the following properties on existing attributes in-place:
  • is_searchable — enable or disable full-text search indexing
  • is_filterable — enable or disable filter indexing
You can also add new attributes to an existing schema. New attributes default to null for documents that were uploaded before the attribute was added.

To update a schema, submit the updated schema in a document upload request:
curl -X POST https://api.withcharcoal.com/v1/namespaces/support-tickets/documents \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "documents": [],
    "schema": {
      "priority": { "type": "string", "is_filterable": true, "is_searchable": true }
    }
  }'
After enabling indexing on an existing attribute, the index needs time to build before queries against it return complete results.

Limitations

  • Types are immutable. Once an attribute’s type is set, it cannot be changed. If you need a different type, create a new attribute.
  • Attributes cannot be deleted in-place. To remove an attribute, re-upsert all documents without it.
  • Attribute names must be 128 bytes or fewer.
  • Max attributes per namespace: 256.
  • Max attribute value size: 8 MiB (4 KiB for filterable attributes).
  • Max document size: 64 MiB (across all attributes).
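Several of these limits can be checked locally before a schema update is submitted. The sketch below is an illustration under stated assumptions: `schema_change_errors` is a hypothetical helper, attribute names are measured as UTF-8 bytes, and a submitted schema is assumed to be merged into the existing one rather than replacing it. Value and document size limits are not covered.

```python
MAX_NAME_BYTES = 128
MAX_ATTRIBUTES = 256


def schema_change_errors(current, proposed):
    """Return a list of problems with applying `proposed` on top of `current`."""
    errors = []
    # Assumption: existing attributes stay in place, new ones are added.
    merged = {**current, **proposed}
    if len(merged) > MAX_ATTRIBUTES:
        errors.append(f"namespace would have {len(merged)} attributes (max {MAX_ATTRIBUTES})")
    for name, config in proposed.items():
        # Assumption: the 128-byte limit applies to the UTF-8 encoded name.
        if len(name.encode("utf-8")) > MAX_NAME_BYTES:
            errors.append(f"{name!r}: name exceeds {MAX_NAME_BYTES} bytes")
        existing = current.get(name)
        if existing is not None and existing["type"] != config["type"]:
            errors.append(
                f"{name!r}: type cannot change from {existing['type']} to {config['type']}"
            )
    return errors


current_schema = {"priority": {"type": "string", "is_filterable": True}}
ok = schema_change_errors(
    current_schema,
    {"priority": {"type": "string", "is_filterable": True, "is_searchable": True}},
)
bad = schema_change_errors(current_schema, {"priority": {"type": "int"}})
# ok == [] and bad reports the attempted type change
```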

Billing

Attribute storage is billed based on indexing:
Configuration                              Cost
Unindexed (no searchable or filterable)    0.5x attribute size
Filterable only                            1x attribute size
Searchable only (full-text search)         1x attribute size
Searchable + filterable                    2x attribute size
Only enable is_searchable and is_filterable on attributes you actively need — each index doubles the storage cost of that attribute.
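The multipliers in the table can be applied to estimate how indexing choices affect billed storage. This is a back-of-the-envelope sketch: the function names are hypothetical, the attribute sizes are made-up inputs in bytes, and how Charcoal actually measures attribute size is not specified here.

```python
def storage_multiplier(config):
    """Return the billing multiplier for one attribute, per the table above."""
    searchable = bool(config.get("is_searchable"))
    filterable = bool(config.get("is_filterable"))
    if searchable and filterable:
        return 2.0
    if searchable or filterable:
        return 1.0
    return 0.5


def billed_size(schema, attribute_sizes):
    """Sum attribute sizes weighted by their indexing multiplier."""
    return sum(
        storage_multiplier(schema[name]) * size
        for name, size in attribute_sizes.items()
    )


schema = {
    "title": {"type": "string", "is_searchable": True},
    "author": {"type": "string"},
}
total = billed_size(schema, {"title": 1000, "author": 200})
# total == 1100.0 (1000 * 1.0 for title, 200 * 0.5 for author)
```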