Every namespace has an attributes schema that controls how document fields are stored, searched, and filtered. Your schema determines what the search agent can see and how it retrieves your data.
Defining a schema
Pass a schema object when uploading documents. Each key is an attribute name, and the value configures its type and indexing behavior.
curl -X POST https://api.withcharcoal.com/v1/namespaces/support-tickets/documents \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "documents": [
      {
        "id": "ticket-4521",
        "title": "Login page returns 500 error",
        "body": "Users report seeing a 500 error when attempting to log in via SSO...",
        "priority": "high",
        "created_at": "2025-03-10T14:22:00Z",
        "tags": ["auth", "sso", "production"]
      }
    ],
    "schema": {
      "title": { "type": "string", "is_searchable": true },
      "body": { "type": "string", "is_searchable": true },
      "priority": { "type": "string", "is_filterable": true },
      "created_at": { "type": "datetime", "is_filterable": true },
      "tags": { "type": "[]string", "is_filterable": true }
    }
  }'
If you omit the schema, Charcoal infers types from the document data. Inference works well for simple cases, but you should always define an explicit schema when you need searchable or filterable attributes — inference will not enable these for you.
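The same upload can be assembled programmatically. The following is a minimal Python sketch using only the standard library. The endpoint and body shape are taken from the curl example above; the helper name `build_upload_request` is hypothetical, and response handling is left out because the response format is not described here.

```python
import json
import urllib.request

API_BASE = "https://api.withcharcoal.com/v1"  # base URL taken from the curl example


def build_upload_request(namespace, documents, schema, api_key):
    """Build (but do not send) a document upload request with an explicit schema.

    Hypothetical helper: the name and signature are illustrative only.
    """
    body = json.dumps({"documents": documents, "schema": schema}).encode("utf-8")
    return urllib.request.Request(
        url=f"{API_BASE}/namespaces/{namespace}/documents",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


documents = [
    {
        "id": "ticket-4521",
        "title": "Login page returns 500 error",
        "priority": "high",
        "created_at": "2025-03-10T14:22:00Z",
        "tags": ["auth", "sso", "production"],
    }
]
schema = {
    "title": {"type": "string", "is_searchable": True},
    "priority": {"type": "string", "is_filterable": True},
    "created_at": {"type": "datetime", "is_filterable": True},
    "tags": {"type": "[]string", "is_filterable": True},
}

request = build_upload_request("support-tickets", documents, schema, "YOUR_API_KEY")
# Sending is left to the caller, e.g. urllib.request.urlopen(request).
```

Keeping the schema in one place in your code, rather than relying on inference, makes it easy to see which attributes are indexed.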
Choosing what to index
Every attribute in your schema falls into one of three categories.
Searchable
The search agent runs BM25 full-text search queries against your searchable fields. It chooses which field to query and which search terms to use, then iterates with different queries until it finds what it needs. Searchable fields are the agent’s primary way of discovering relevant documents.
Make a field searchable when:
- It contains natural language text that describes the document’s content
- A human would ctrl+F through it to find information
- The field has enough textual content for keyword matching to be useful
Common searchable fields: title, body, description, content, summary, comments, notes
Do not make a field searchable when:
- It contains structured data (IDs, enums, dates, numbers) — use filterable instead
- It contains very short values (single words, codes) — filtering is more precise
- It duplicates content from another searchable field — this adds cost without improving recall
Only string and []string attributes can be searchable.
{
  "title": { "type": "string", "is_searchable": true },
  "body": { "type": "string", "is_searchable": true }
}
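Because only string and []string attributes can be searchable, it can help to check a schema before uploading it. The sketch below is a hypothetical client-side pre-check, not part of the Charcoal API; how the service itself responds to an invalid schema is not stated here.

```python
SEARCHABLE_TYPES = {"string", "[]string"}  # the only types that may set is_searchable


def invalid_searchable_attributes(schema):
    """Return the names of attributes that set is_searchable on a non-text type."""
    return sorted(
        name
        for name, config in schema.items()
        if config.get("is_searchable") and config.get("type") not in SEARCHABLE_TYPES
    )


schema = {
    "title": {"type": "string", "is_searchable": True},
    "tags": {"type": "[]string", "is_searchable": True},
    "view_count": {"type": "int", "is_searchable": True},  # hypothetical attribute, not allowed
    "created_at": {"type": "datetime", "is_filterable": True},
}

problems = invalid_searchable_attributes(schema)
print(problems)  # ['view_count']
```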
Filterable
Filterable attributes can be used in filter expressions — both by your API callers and by the search agent itself. The agent can autonomously apply filters on these attributes during its retrieval loop to narrow down results.
Make a field filterable when:
- It has a bounded set of values (status, priority, category, type)
- You want to scope searches by time range, numeric threshold, or tag membership
- The agent should be able to narrow its search autonomously (e.g., filtering by status: "open")
Common filterable fields: status, priority, category, created_at, updated_at, tags, type, author
Do not make a field filterable when:
- It contains unique or high-cardinality text (document bodies, descriptions) — use searchable instead
- You never need to narrow results by this field’s value
{
  "priority": { "type": "string", "is_filterable": true },
  "created_at": { "type": "datetime", "is_filterable": true },
  "tags": { "type": "[]string", "is_filterable": true }
}
Both searchable and filterable
Some fields benefit from both. A title field might be searched for keywords and also filtered for exact matches. Enabling both indexes bills the attribute at 2x its size, double the cost of either index alone.
{
  "title": { "type": "string", "is_searchable": true, "is_filterable": true }
}
Stored only
Attributes without is_searchable or is_filterable are stored and returned in results, but the agent cannot query or filter against them. Use this for metadata you want to read but never search on — IDs, URLs, internal references, raw data blobs.
Example: choosing a schema
Consider a knowledge base with support articles:
{
  "schema": {
    "title": { "type": "string", "is_searchable": true },
    "body": { "type": "string", "is_searchable": true },
    "category": { "type": "string", "is_filterable": true },
    "product": { "type": "string", "is_filterable": true },
    "tags": { "type": "[]string", "is_filterable": true },
    "updated_at": { "type": "datetime", "is_filterable": true },
    "author": { "type": "string" },
    "source_url": { "type": "string" }
  }
}
title and body are searchable — the agent needs to find articles by their content.
category, product, tags, and updated_at are filterable — the agent and your callers can narrow searches to specific products, categories, or time ranges.
author and source_url are stored only — useful in results but not worth indexing.
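The grouping above can be derived mechanically from the schema. A small Python sketch, with hypothetical helper names, that sorts each attribute into the indexing categories described in this section:

```python
def indexing_category(config):
    """Map one attribute's configuration to its indexing category."""
    searchable = bool(config.get("is_searchable"))
    filterable = bool(config.get("is_filterable"))
    if searchable and filterable:
        return "searchable + filterable"
    if searchable:
        return "searchable"
    if filterable:
        return "filterable"
    return "stored only"


def group_by_category(schema):
    """Group attribute names by category, preserving schema order."""
    groups = {}
    for name, config in schema.items():
        groups.setdefault(indexing_category(config), []).append(name)
    return groups


schema = {
    "title": {"type": "string", "is_searchable": True},
    "body": {"type": "string", "is_searchable": True},
    "category": {"type": "string", "is_filterable": True},
    "product": {"type": "string", "is_filterable": True},
    "tags": {"type": "[]string", "is_filterable": True},
    "updated_at": {"type": "datetime", "is_filterable": True},
    "author": {"type": "string"},
    "source_url": {"type": "string"},
}

groups = group_by_category(schema)
for category, names in groups.items():
    print(f"{category}: {', '.join(names)}")
```

A listing like this is a quick way to review a schema before upload and to spot attributes that are indexed without a clear need.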
Attribute types
| Type | Description | Example |
|---|---|---|
| string | Text | "hello" |
| int | Signed integer | 42 |
| uint | Unsigned integer | 100 |
| float | Floating point number | 3.14 |
| bool | Boolean | true |
| uuid | UUID string | "550e8400-e29b-41d4-a716-446655440000" |
| datetime | ISO 8601 datetime | "2025-01-15T10:30:00Z" |
| []string | Array of strings | ["a", "b"] |
| []int | Array of integers | [1, 2, 3] |
| []uint | Array of unsigned integers | [1, 2, 3] |
| []float | Array of floats | [1.5, 2.5] |
| []bool | Array of booleans | [true, false] |
| []uuid | Array of UUIDs | ["550e8400-..."] |
| []datetime | Array of datetimes | ["2025-01-15T10:30:00Z"] |
All attributes are nullable. Types like uuid, uint, and datetime must be declared explicitly in the schema — they cannot be inferred.
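To catch type mismatches before uploading, values can be checked locally against the declared types. The sketch below is a hypothetical client-side check, not Charcoal's own validation. It assumes that integer JSON numbers are acceptable for float, that array elements may not be null, and it ignores integer range limits, none of which are specified here.

```python
import uuid
from datetime import datetime


def _is_int(value):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool)


def _is_uuid(value):
    if not isinstance(value, str):
        return False
    try:
        uuid.UUID(value)  # lenient: also accepts forms without hyphens
        return True
    except ValueError:
        return False


def _is_datetime(value):
    if not isinstance(value, str):
        return False
    try:
        # fromisoformat only accepts a trailing "Z" from Python 3.11 on,
        # so normalize it; older versions parse only a subset of ISO 8601.
        datetime.fromisoformat(value.replace("Z", "+00:00"))
        return True
    except ValueError:
        return False


SCALAR_CHECKS = {
    "string": lambda v: isinstance(v, str),
    "int": _is_int,
    "uint": lambda v: _is_int(v) and v >= 0,
    "float": lambda v: isinstance(v, float) or _is_int(v),  # assumption: ints allowed
    "bool": lambda v: isinstance(v, bool),
    "uuid": _is_uuid,
    "datetime": _is_datetime,
}


def matches_type(value, type_name):
    """Return True if value is acceptable for the declared attribute type."""
    if value is None:
        return True  # all attributes are nullable
    if type_name.startswith("[]"):
        check = SCALAR_CHECKS[type_name[2:]]
        return isinstance(value, list) and all(check(item) for item in value)
    return SCALAR_CHECKS[type_name](value)


print(matches_type("2025-01-15T10:30:00Z", "datetime"))  # True
print(matches_type(-1, "uint"))  # False
```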
Updating attributes
You can modify the following properties on existing attributes in-place:
- is_searchable: enable or disable full-text search indexing
- is_filterable: enable or disable filter indexing
You can also add new attributes to an existing schema. New attributes default to null for documents that were uploaded before the attribute was added.
To update a schema, submit the updated schema in a document upload request:
curl -X POST https://api.withcharcoal.com/v1/namespaces/support-tickets/documents \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "documents": [],
    "schema": {
      "priority": { "type": "string", "is_filterable": true, "is_searchable": true }
    }
  }'
After enabling indexing on an existing attribute, the index needs time to build before queries against it return complete results.
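Before submitting an update, it can be useful to see what will change. The following Python sketch compares a current schema with an updated one and lists the additions and index changes, refusing type changes because types are immutable (see Limitations). The helper and the `assignee` attribute are hypothetical, and the sketch assumes that an omitted flag means false; how the API treats omitted flags on existing attributes is not stated here.

```python
INDEX_FLAGS = ("is_searchable", "is_filterable")


def describe_schema_changes(current, updated):
    """List the changes an updated schema would make relative to the current one."""
    changes = []
    for name, config in updated.items():
        if name not in current:
            changes.append(f"add {name} ({config['type']})")
            continue
        old_type = current[name]["type"]
        if config["type"] != old_type:
            raise ValueError(
                f"{name}: type cannot change from {old_type} to {config['type']}"
            )
        for flag in INDEX_FLAGS:
            before = bool(current[name].get(flag))  # assumption: omitted means false
            after = bool(config.get(flag))
            if before != after:
                action = "enable" if after else "disable"
                changes.append(f"{action} {flag} on {name}")
    return changes


current = {"priority": {"type": "string", "is_filterable": True}}
updated = {
    "priority": {"type": "string", "is_filterable": True, "is_searchable": True},
    "assignee": {"type": "string", "is_filterable": True},  # hypothetical new attribute
}

changes = describe_schema_changes(current, updated)
print(changes)
# ['enable is_searchable on priority', 'add assignee (string)']
```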
Limitations
- Types are immutable. Once an attribute’s type is set, it cannot be changed. If you need a different type, create a new attribute.
- Attributes cannot be deleted in-place. To remove an attribute, re-upsert all documents without it.
- Attribute names must be 128 bytes or fewer.
- Max attributes per namespace: 256.
- Max attribute value size: 8 MiB (4 KiB for filterable attributes).
- Max document size: 64 MiB (across all attributes).
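The numeric limits above can be checked on the client before an upload. This is a rough Python sketch with hypothetical helper names. It assumes sizes are measured as the UTF-8 length of the JSON-encoded value; the way Charcoal actually measures size is not specified here, so treat the results as an approximation.

```python
import json

MAX_NAME_BYTES = 128
MAX_ATTRIBUTES = 256
MAX_VALUE_BYTES = 8 * 1024 * 1024          # 8 MiB
MAX_FILTERABLE_VALUE_BYTES = 4 * 1024      # 4 KiB
MAX_DOCUMENT_BYTES = 64 * 1024 * 1024      # 64 MiB


def value_size(value):
    """Approximate a value's size as its UTF-8 JSON encoding (an assumption)."""
    return len(json.dumps(value, ensure_ascii=False).encode("utf-8"))


def check_limits(schema, document):
    """Return a list of human-readable limit violations for one document."""
    errors = []
    if len(schema) > MAX_ATTRIBUTES:
        errors.append(f"schema has {len(schema)} attributes (max {MAX_ATTRIBUTES})")
    total = 0
    for name, config in schema.items():
        if len(name.encode("utf-8")) > MAX_NAME_BYTES:
            errors.append(f"attribute name exceeds {MAX_NAME_BYTES} bytes: {name[:20]}...")
        if document.get(name) is None:
            continue  # missing or null values take no space in this estimate
        size = value_size(document[name])
        total += size
        if config.get("is_filterable"):
            limit = MAX_FILTERABLE_VALUE_BYTES
        else:
            limit = MAX_VALUE_BYTES
        if size > limit:
            errors.append(f"{name}: {size} bytes exceeds the {limit}-byte limit")
    if total > MAX_DOCUMENT_BYTES:
        errors.append(f"document is {total} bytes (max {MAX_DOCUMENT_BYTES})")
    return errors


schema = {
    "body": {"type": "string", "is_searchable": True},
    "notes": {"type": "string", "is_filterable": True},
}
document = {"body": "x" * 10_000, "notes": "y" * 5_000}

errors = check_limits(schema, document)
for error in errors:
    print(error)
```

Here the 5,000-character notes value is flagged because it is filterable, while the larger body value passes under the 8 MiB limit for non-filterable attributes.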
Billing
Attribute storage is billed based on indexing:
| Configuration | Cost |
|---|---|
| Unindexed (no searchable or filterable) | 0.5x attribute size |
| Filterable only | 1x attribute size |
| Searchable only (full-text search) | 1x attribute size |
| Searchable + filterable | 2x attribute size |
Only enable is_searchable and is_filterable on attributes you actively need — each index doubles the storage cost of that attribute.
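The multipliers in the table can be applied to estimate billed storage for a document. A minimal Python sketch follows; the helper names and the per-attribute sizes are hypothetical, and actual billing may measure attribute size differently.

```python
def storage_multiplier(is_searchable=False, is_filterable=False):
    """Return the billing multiplier from the table above."""
    if is_searchable and is_filterable:
        return 2.0
    if is_searchable or is_filterable:
        return 1.0
    return 0.5


def billed_bytes(schema, attribute_sizes):
    """Sum multiplier * size over all attributes; sizes are in bytes."""
    return sum(
        storage_multiplier(
            config.get("is_searchable", False), config.get("is_filterable", False)
        )
        * attribute_sizes.get(name, 0)
        for name, config in schema.items()
    )


schema = {
    "title": {"type": "string", "is_searchable": True},
    "body": {"type": "string", "is_searchable": True},
    "category": {"type": "string", "is_filterable": True},
    "source_url": {"type": "string"},
}
sizes = {"title": 100, "body": 4000, "category": 20, "source_url": 80}  # made-up sizes

total = billed_bytes(schema, sizes)
print(total)  # 4160.0

# Making title filterable as well raises its multiplier from 1x to 2x.
schema["title"]["is_filterable"] = True
total_with_both = billed_bytes(schema, sizes)
print(total_with_both)  # 4260.0
```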