Spice Caching
Configure in-memory caching for SQL query results, search results, and embeddings in the Spice runtime.
Overview
Spice caches results from SQL queries (/v1/sql), search (/v1/search), and embeddings requests. All three caches are enabled by default with a 1-second TTL and 128 MiB max size. Caching applies to HTTP and Arrow Flight APIs.
Configuration
Caching is configured under runtime.caching in spicepod.yaml:
version: v1
kind: Spicepod
name: app
runtime:
caching:
sql_results:
enabled: true
max_size: 1GiB # Default 128MiB
item_ttl: 1m # Default 1s
eviction_policy: lru # lru | tiny_lfu
hashing_algorithm: xxh3
cache_key_type: plan # plan | sql
encoding: none # none | zstd
stale_while_revalidate_ttl: 30s # Default 0s (disabled)
search_results:
enabled: true
max_size: 1GiB
item_ttl: 1m
eviction_policy: lru
embeddings:
enabled: true
max_size: 128MiB
item_ttl: 1m
Common Parameters (All Cache Types)
| Parameter | Default | Description |
|---|---|---|
enabled | true | Enable/disable the cache |
max_size | 128MiB | Maximum cache size |
eviction_policy | lru | lru (Least Recently Used) or tiny_lfu (higher hit rate for skewed access) |
item_ttl | 1s | Cache entry TTL (Time to Live) |
hashing_algorithm | xxh3 | Hash for cache keys: xxh3, ahash, siphash, blake3, xxh32, xxh64, xxh128 |
SQL Results Extra Parameters
| Parameter | Default | Description |
|---|---|---|
cache_key_type | plan | plan = logical plan (matches semantically equivalent queries); sql = raw SQL string (faster, exact match only) |
encoding | none | none or zstd (compresses cached results, 50-90% reduction) |
stale_while_revalidate_ttl | 0s | Serve stale entries while refreshing in background. 0s = disabled |
Choosing Parameters
cache_key_type
plan(default): Matches semantically equivalent queries even with different SQL syntax. Requires query parsing overhead.sql: Faster lookups, exact string match. Avoid with dynamic functions likeNOW().
eviction_policy
lru(default): Good general-purpose policy.tiny_lfu: Better hit rate when some queries are accessed much more frequently than others.
encoding
none(default): Zero compression overhead, uses more memory.zstd: High compression (50-90% reduction) with fast decompression. Use for large result sets.
hashing_algorithm
xxh3(default): Fastest general-purpose.ahash/xxh64/xxh128: Lower collision probability for many cached queries.blake3: Cryptographic security required.siphash: Protection against hash-flooding DoS attacks.
Stale-While-Revalidate
When stale_while_revalidate_ttl is set to a non-zero value:
- Cache entries are served normally until
item_ttlexpires. - After
item_ttlexpires but beforeitem_ttl + stale_while_revalidate_ttl, the stale entry is served immediately withSTALEstatus. - A background task refreshes the cache entry.
- After
item_ttl + stale_while_revalidate_ttl, the entry is evicted.
runtime:
caching:
sql_results:
enabled: true
item_ttl: 10s
stale_while_revalidate_ttl: 10s
# Fresh for 10s → Stale (served while refreshing) for 10s → Evicted
Conflict warning: When using
refresh_mode: cachingon a dataset, do not configure bothruntime.caching.sql_results.stale_while_revalidate_ttlandacceleration.params.caching_stale_while_revalidate_ttlfor the same dataset. Choose one approach.
Cache Control Headers
HTTP API
Use the standard Cache-Control header with /v1/sql and /v1/search:
| Directive | Description |
|---|---|
no-cache | Skip cache for this request; cache the result for future requests |
min-fresh=N | Require cached entry to remain fresh for at least N seconds |
max-stale=N | Accept stale responses up to N seconds old |
only-if-cached | Return only cached responses; error on cache miss |
stale-if-error=N | Serve stale cache (up to N seconds) if fetching fresh data fails |
# Skip cache for this query
curl -H "cache-control: no-cache" -XPOST http://localhost:8090/v1/sql -d 'SELECT 1'
# Only accept fresh results (at least 30s remaining)
curl -H "cache-control: min-fresh=30" -XPOST http://localhost:8090/v1/sql -d 'SELECT 1'
# Accept stale up to 60s
curl -H "cache-control: max-stale=60" -XPOST http://localhost:8090/v1/sql -d 'SELECT 1'
# Only return if cached
curl -H "cache-control: only-if-cached" -XPOST http://localhost:8090/v1/sql -d 'SELECT 1'
Spice CLI
spice sql --cache-control no-cache
spice sql --cache-control min-fresh=30
spice sql --cache-control max-stale=60
spice sql --cache-control only-if-cached
spice search --cache-control no-cache
Arrow FlightSQL
Set cache-control in request metadata:
let mut request = FlightDescriptor::new_cmd(sql_command_bytes).into_request();
request.metadata_mut().insert("cache-control", "no-cache");
JDBC:
Properties props = new Properties();
props.setProperty("cache-control", "no-cache");
Connection conn = DriverManager.getConnection("jdbc:arrow-flight-sql://localhost:50051", props);
Custom Cache Keys
Set the Spice-Cache-Key header to share cache entries across semantically equivalent but syntactically different queries. Valid keys: up to 128 alphanumeric characters plus - and _. Custom keys take precedence over cache_key_type.
# First query — cache MISS
curl -XPOST http://localhost:8090/v1/sql \
-H "spice-cache-key: users_spiceai" \
-d "select * from users where org_id = 1;"
# Different query, same cache key — cache HIT
curl -XPOST http://localhost:8090/v1/sql \
-H "spice-cache-key: users_spiceai" \
-d "select * from users where split_part(email, '@', 2) = 'spice.ai';"
Warning: Ensure queries sharing a cache key are truly semantically equivalent. The runtime will return the cached result regardless of the actual query.
Response Headers
Responses include a header indicating cache status:
| Cache Type | Response Header |
|---|---|
sql_results | Results-Cache-Status |
search_results | Search-Results-Cache-Status |
| Status | Meaning |
|---|---|
HIT | Served from cache |
MISS | Cache checked, result not found |
BYPASS | Cache bypassed (e.g., cache-control: no-cache) |
STALE | Stale entry served while revalidating |
| (absent) | Cache did not apply (disabled or system table query) |
Monitoring / Metrics
Cache metrics are available at the Prometheus-compatible metrics endpoint. Prefix by cache type: results_*, search_results_*, embeddings_*.
| Metric | Type | Description |
|---|---|---|
*_cache_max_size_bytes | Gauge | Configured max cache size |
*_cache_requests | Counter | Total cache lookups |
*_cache_hits | Counter | Total cache hits |
*_cache_items_count | Gauge | Current items in cache |
*_cache_size_bytes | Gauge | Current cache size |
*_cache_evictions | Counter | Total evictions |
*_cache_hit_ratio | Gauge | Hit ratio (hits / total) |
Common Recipes
High-throughput Dashboard (Maximize Hit Rate)
runtime:
caching:
sql_results:
item_ttl: 30s
max_size: 2GiB
eviction_policy: tiny_lfu
encoding: zstd
stale_while_revalidate_ttl: 30s
Low-Latency API (Exact Queries, Fast Lookups)
runtime:
caching:
sql_results:
item_ttl: 5s
cache_key_type: sql
hashing_algorithm: xxh3
Disable Caching Entirely
runtime:
caching:
sql_results:
enabled: false
search_results:
enabled: false
embeddings:
enabled: false
Troubleshooting
| Issue | Solution |
|---|---|
Always getting MISS | Check item_ttl is long enough; verify cache_key_type (plan matches equivalent queries, sql requires exact strings) |
| Cache filling up quickly | Increase max_size, enable zstd encoding, or reduce item_ttl |
| Stale data being served | Reduce item_ttl or stale_while_revalidate_ttl; use cache-control: no-cache for specific queries |
Dynamic functions (NOW()) returning cached results | Switch to cache_key_type: plan or use cache-control: no-cache |
| SWR conflict error | Don't set both runtime.caching.sql_results.stale_while_revalidate_ttl and acceleration.params.caching_stale_while_revalidate_ttl for the same dataset |