Spice Caching

Configure in-memory caching for SQL query results, search results, and embeddings in the Spice runtime.

Overview

Spice caches results from SQL queries (/v1/sql), search (/v1/search), and embeddings requests. All three caches are enabled by default with a 1-second TTL and 128 MiB max size. Caching applies to HTTP and Arrow Flight APIs.

Configuration

Caching is configured under runtime.caching in spicepod.yaml:

version: v1
kind: Spicepod
name: app

runtime:
  caching:
    sql_results:
      enabled: true
      max_size: 1GiB # Default 128MiB
      item_ttl: 1m # Default 1s
      eviction_policy: lru # lru | tiny_lfu
      hashing_algorithm: xxh3
      cache_key_type: plan # plan | sql
      encoding: none # none | zstd
      stale_while_revalidate_ttl: 30s # Default 0s (disabled)
    search_results:
      enabled: true
      max_size: 1GiB
      item_ttl: 1m
      eviction_policy: lru
    embeddings:
      enabled: true
      max_size: 128MiB
      item_ttl: 1m

Common Parameters (All Cache Types)

Parameter	Default	Description
`enabled`	`true`	Enable/disable the cache
`max_size`	`128MiB`	Maximum cache size
`eviction_policy`	`lru`	`lru` (Least Recently Used) or `tiny_lfu` (higher hit rate for skewed access)
`item_ttl`	`1s`	Cache entry TTL (Time to Live)
`hashing_algorithm`	`xxh3`	Hash for cache keys: `xxh3`, `ahash`, `siphash`, `blake3`, `xxh32`, `xxh64`, `xxh128`

SQL Results Extra Parameters

Parameter	Default	Description
`cache_key_type`	`plan`	`plan` = logical plan (matches semantically equivalent queries); `sql` = raw SQL string (faster, exact match only)
`encoding`	`none`	`none` or `zstd` (compresses cached results, 50-90% reduction)
`stale_while_revalidate_ttl`	`0s`	Serve stale entries while refreshing in background. `0s` = disabled

Choosing Parameters

`cache_key_type`

plan (default): Matches semantically equivalent queries even when the SQL text differs. Adds the overhead of parsing and planning the query before the lookup.
sql: Faster lookups, but only an exact string match hits the cache. Avoid for queries that use dynamic functions such as NOW().
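
To make the exact-match behaviour of `sql` keys concrete, here is a minimal sketch in Python. It is an illustration only: `sql_cache_key` is a hypothetical helper, and `blake2b` from the standard library stands in for the runtime's configurable hashing algorithm, which this sketch does not reproduce.

```python
import hashlib


def sql_cache_key(sql: str) -> str:
    """Derive a key from the raw SQL text only (illustrative stand-in hash)."""
    return hashlib.blake2b(sql.encode("utf-8"), digest_size=8).hexdigest()


# Same meaning, different text: a raw-string key treats these as two entries.
upper = sql_cache_key("SELECT id FROM users WHERE org_id = 1")
lower = sql_cache_key("select id from users where org_id = 1")
print(upper == lower)  # False
```

With `cache_key_type: plan`, queries like these two are the kind the text describes as semantically equivalent, so they would be expected to share one entry.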

`eviction_policy`

lru (default): Good general-purpose policy.
tiny_lfu: Better hit rate when some queries are accessed much more frequently than others.

`encoding`

none (default): Zero compression overhead, uses more memory.
zstd: High compression (50-90% reduction) with fast decompression. Use for large result sets.

`hashing_algorithm`

xxh3 (default): Fastest general-purpose option.
ahash / xxh64 / xxh128: Lower collision probability when many queries are cached.
blake3: Use when a cryptographic hash is required.
siphash: Use for protection against hash-flooding DoS attacks.

Stale-While-Revalidate

When stale_while_revalidate_ttl is set to a non-zero value:

Cache entries are served normally until item_ttl expires.
After item_ttl expires but before item_ttl + stale_while_revalidate_ttl, the stale entry is served immediately with STALE status.
A background task refreshes the cache entry.
After item_ttl + stale_while_revalidate_ttl, the entry is evicted.

runtime:
  caching:
    sql_results:
      enabled: true
      item_ttl: 10s
      stale_while_revalidate_ttl: 10s
      # Fresh for 10s → Stale (served while refreshing) for 10s → Evicted
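
The fresh, stale, and evicted windows described above can be sketched as a small Python function. This is a model of the documented timing, not the runtime's code: `entry_state` is a hypothetical name, and treating an entry whose age is exactly `item_ttl` as already stale is an assumption about the boundary.

```python
from datetime import timedelta


def entry_state(age: timedelta, item_ttl: timedelta, swr_ttl: timedelta) -> str:
    """Classify a cache entry by its age using the two TTL windows."""
    if age < item_ttl:
        return "FRESH"    # served normally
    if age < item_ttl + swr_ttl:
        return "STALE"    # served immediately while a background refresh runs
    return "EVICTED"      # past both windows


ttl = timedelta(seconds=10)
swr = timedelta(seconds=10)
for seconds in (3, 12, 25):
    print(seconds, entry_state(timedelta(seconds=seconds), ttl, swr))
```

With `swr_ttl` set to zero the stale window is empty, which matches the documented default of `0s` meaning disabled.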

Conflict warning: When using refresh_mode: caching on a dataset, do not configure both runtime.caching.sql_results.stale_while_revalidate_ttl and acceleration.params.caching_stale_while_revalidate_ttl for the same dataset. Choose one approach.

Cache Control Headers

HTTP API

Use the standard Cache-Control header with /v1/sql and /v1/search:

Directive	Description
`no-cache`	Skip cache for this request; cache the result for future requests
`min-fresh=N`	Require cached entry to remain fresh for at least N seconds
`max-stale=N`	Accept a stale response that expired no more than N seconds ago
`only-if-cached`	Return only cached responses; error on cache miss
`stale-if-error=N`	Serve stale cache (up to N seconds) if fetching fresh data fails

# Skip cache for this query
curl -H "cache-control: no-cache" -XPOST http://localhost:8090/v1/sql -d 'SELECT 1'

# Only accept fresh results (at least 30s remaining)
curl -H "cache-control: min-fresh=30" -XPOST http://localhost:8090/v1/sql -d 'SELECT 1'

# Accept stale up to 60s
curl -H "cache-control: max-stale=60" -XPOST http://localhost:8090/v1/sql -d 'SELECT 1'

# Only return if cached
curl -H "cache-control: only-if-cached" -XPOST http://localhost:8090/v1/sql -d 'SELECT 1'
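
The same requests can be issued from a script. The sketch below uses only Python's standard library and the endpoint from the curl examples; `build_sql_request` and `run_sql` are hypothetical helper names, and the response header read in `run_sql` is the `Results-Cache-Status` header this page lists for `sql_results`.

```python
import urllib.request

SPICE_SQL_URL = "http://localhost:8090/v1/sql"  # same endpoint as the curl examples


def build_sql_request(sql, cache_control=None):
    """Build (but do not send) a POST to /v1/sql with an optional directive."""
    request = urllib.request.Request(
        SPICE_SQL_URL, data=sql.encode("utf-8"), method="POST"
    )
    if cache_control is not None:
        request.add_header("Cache-Control", cache_control)
    return request


def run_sql(sql, cache_control=None):
    """Send the query; return (cache status header or None, raw response body)."""
    with urllib.request.urlopen(build_sql_request(sql, cache_control)) as response:
        return response.headers.get("Results-Cache-Status"), response.read()
```

Calling `run_sql("SELECT 1", cache_control="max-stale=60")` requires a running Spice runtime on the default HTTP port; `build_sql_request` can be inspected without one.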

Spice CLI

spice sql --cache-control no-cache
spice sql --cache-control min-fresh=30
spice sql --cache-control max-stale=60
spice sql --cache-control only-if-cached
spice search --cache-control no-cache

Arrow FlightSQL

Set cache-control in request metadata:

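use arrow_flight::FlightDescriptor;
use tonic::{metadata::MetadataValue, IntoRequest};
let mut request = FlightDescriptor::new_cmd(sql_command_bytes).into_request();
request.metadata_mut().insert("cache-control", MetadataValue::from_static("no-cache"));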

JDBC:

Properties props = new Properties();
props.setProperty("cache-control", "no-cache");
Connection conn = DriverManager.getConnection("jdbc:arrow-flight-sql://localhost:50051", props);

Custom Cache Keys

Set the Spice-Cache-Key header to share cache entries across semantically equivalent but syntactically different queries. Valid keys: up to 128 alphanumeric characters plus - and _. Custom keys take precedence over cache_key_type.

# First query — cache MISS
curl -XPOST http://localhost:8090/v1/sql \
  -H "spice-cache-key: users_spiceai" \
  -d "select * from users where org_id = 1;"

# Different query, same cache key — cache HIT
curl -XPOST http://localhost:8090/v1/sql \
  -H "spice-cache-key: users_spiceai" \
  -d "select * from users where split_part(email, '@', 2) = 'spice.ai';"

Warning: Ensure queries sharing a cache key are truly semantically equivalent. The runtime will return the cached result regardless of the actual query.
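
A client can check a key against the stated format before sending it. This is a minimal sketch in Python; `is_valid_cache_key` is a hypothetical helper, and two details are assumptions because the text gives only the upper bound and the character set: an empty key is rejected, and "alphanumeric" is read as ASCII letters and digits.

```python
import re

# Up to 128 characters drawn from letters, digits, "-" and "_".
# Assumption: at least one character is required.
_CACHE_KEY = re.compile(r"[A-Za-z0-9_-]{1,128}")


def is_valid_cache_key(key: str) -> bool:
    """Return True when the whole key matches the documented format."""
    return _CACHE_KEY.fullmatch(key) is not None


print(is_valid_cache_key("users_spiceai"))  # True
print(is_valid_cache_key("users spiceai"))  # False: space is not allowed
```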

Response Headers

Responses include a header indicating cache status:

Cache Type	Response Header
`sql_results`	`Results-Cache-Status`
`search_results`	`Search-Results-Cache-Status`

Status	Meaning
`HIT`	Served from cache
`MISS`	Cache checked, result not found
`BYPASS`	Cache bypassed (e.g., `cache-control: no-cache`)
`STALE`	Stale entry served while revalidating
(absent)	Cache did not apply (disabled or system table query)
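
The two tables above can be turned into a small lookup on the client side. The sketch below is in Python with hypothetical helper names; it takes any mapping of response headers and treats a missing header as "cache did not apply", as in the last row of the status table.

```python
STATUS_HEADERS = {
    "sql_results": "Results-Cache-Status",
    "search_results": "Search-Results-Cache-Status",
}


def cache_status(headers, cache_type="sql_results"):
    """Return HIT, MISS, BYPASS or STALE, or None when the header is absent."""
    wanted = STATUS_HEADERS[cache_type].lower()
    for name, value in headers.items():
        if name.lower() == wanted:  # HTTP header names are case-insensitive
            return value.strip().upper()
    return None


def served_from_cache(status):
    """True for the two statuses that mean the response came from the cache."""
    return status in ("HIT", "STALE")
```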

Monitoring / Metrics

Cache metrics are available at the Prometheus-compatible metrics endpoint. Each metric name is prefixed by cache type: results_*, search_results_*, or embeddings_*.

Metric	Type	Description
`*_cache_max_size_bytes`	Gauge	Configured max cache size
`*_cache_requests`	Counter	Total cache lookups
`*_cache_hits`	Counter	Total cache hits
`*_cache_items_count`	Gauge	Current items in cache
`*_cache_size_bytes`	Gauge	Current cache size
`*_cache_evictions`	Counter	Total evictions
`*_cache_hit_ratio`	Gauge	Hit ratio (hits / requests)
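
The hit ratio can also be recomputed from the two counters, for example to compare two scrapes. The Python sketch below parses unlabelled samples from Prometheus text output. The exact metric names (`results_cache_requests`, `results_cache_hits`) are an assumption built from the prefix and suffix patterns above, and the sample values are made up for illustration; check the names against the actual endpoint output.

```python
def parse_metrics(text):
    """Parse simple 'name value' samples; skip comments and anything else."""
    samples = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 2 or parts[0].startswith("#"):
            continue  # comment, blank, labelled, or timestamped line
        try:
            samples[parts[0]] = float(parts[1])
        except ValueError:
            continue
    return samples


def hit_ratio(samples, prefix="results"):
    """hits / requests for one cache type; 0.0 when there were no lookups."""
    requests = samples.get(f"{prefix}_cache_requests", 0.0)
    hits = samples.get(f"{prefix}_cache_hits", 0.0)
    return hits / requests if requests else 0.0


# Made-up scrape output, for illustration only.
scrape = """
# TYPE results_cache_requests counter
results_cache_requests 200
results_cache_hits 150
"""
print(hit_ratio(parse_metrics(scrape)))  # 0.75
```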

Common Recipes

High-throughput Dashboard (Maximize Hit Rate)

runtime:
  caching:
    sql_results:
      item_ttl: 30s
      max_size: 2GiB
      eviction_policy: tiny_lfu
      encoding: zstd
      stale_while_revalidate_ttl: 30s

Low-Latency API (Exact Queries, Fast Lookups)

runtime:
  caching:
    sql_results:
      item_ttl: 5s
      cache_key_type: sql
      hashing_algorithm: xxh3

Disable Caching Entirely

runtime:
  caching:
    sql_results:
      enabled: false
    search_results:
      enabled: false
    embeddings:
      enabled: false

Troubleshooting

Issue	Solution
Always getting `MISS`	Check that `item_ttl` is long enough (the default is `1s`); verify `cache_key_type` (`plan` matches equivalent queries, `sql` requires exact strings)
Cache filling up quickly	Increase `max_size`, enable `zstd` encoding, or reduce `item_ttl`
Stale data being served	Reduce `item_ttl` or `stale_while_revalidate_ttl`; use `cache-control: no-cache` for specific queries
Dynamic functions (`NOW()`) returning cached results	Switch to `cache_key_type: plan` or use `cache-control: no-cache`
SWR conflict error	Don't set both `runtime.caching.sql_results.stale_while_revalidate_ttl` and `acceleration.params.caching_stale_while_revalidate_ttl` for the same dataset
