Prompt caching lets you optimize API usage and reduce costs by caching frequently reused prompt prefixes.
Prompt caching requires a minimum cacheable prompt length, which varies by model:

| Model | Minimum cacheable prompt length |
|---|---|
| Claude Opus 4, Claude Sonnet 4, Claude Sonnet 3.7, Claude Sonnet 3.5, Claude Opus 3 | 1024 tokens |
| Claude Haiku 3.5, Claude Haiku 3 | 2048 tokens |
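Prompts shorter than these minimums are simply processed without caching. As a quick sketch (model names and thresholds taken directly from the table above), a lookup helper could encode the rule:

```python
# Minimum cacheable prompt length per model, per the table above.
MIN_CACHEABLE_TOKENS = {
    "Claude Opus 4": 1024,
    "Claude Sonnet 4": 1024,
    "Claude Sonnet 3.7": 1024,
    "Claude Sonnet 3.5": 1024,
    "Claude Opus 3": 1024,
    "Claude Haiku 3.5": 2048,
    "Claude Haiku 3": 2048,
}

def is_cacheable(model: str, prompt_tokens: int) -> bool:
    """True if a prompt of `prompt_tokens` tokens meets the model's caching minimum."""
    return prompt_tokens >= MIN_CACHEABLE_TOKENS[model]
```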
To cache a prompt prefix, add the `cache_control` parameter to any message content block you want to cache:
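For example, a request body with a cached system prompt might be structured as follows. This is a sketch: the model name and system text are illustrative placeholders, and the `cache_control` value of `{"type": "ephemeral"}` marks the block as a cache breakpoint:

```python
# Sketch of a Messages API request body with a cached system prompt.
# Model name and text are illustrative placeholders, not values from this document.
request_body = {
    "model": "claude-sonnet-4-20250514",
    "max_tokens": 1024,
    "system": [
        {
            "type": "text",
            "text": "You are an assistant working from the following long reference text...",
            # Marks this content block for caching.
            "cache_control": {"type": "ephemeral"},
        }
    ],
    "messages": [
        {"role": "user", "content": "Summarize the reference text."}
    ],
}
```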
Cache performance is reported in the `usage` field of the response (or the `message_start` event if streaming):

- `cache_creation_input_tokens`: Tokens written to the cache when creating a new entry.
- `cache_read_input_tokens`: Tokens retrieved from the cache for this request.
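These counters can be used to estimate the effective input-token cost of a request. In the sketch below, the 1.25× multiplier for cache writes and 0.1× multiplier for cache reads are assumptions about cache pricing, not values stated in this document:

```python
def cache_adjusted_input_tokens(usage: dict,
                                write_mult: float = 1.25,   # assumed cache-write surcharge
                                read_mult: float = 0.10) -> float:  # assumed cache-read discount
    """Effective billed input tokens, weighting cached reads/writes by assumed multipliers."""
    return (usage.get("input_tokens", 0)
            + write_mult * usage.get("cache_creation_input_tokens", 0)
            + read_mult * usage.get("cache_read_input_tokens", 0))

# A cache hit on a large prefix pays far less than reprocessing it in full.
first_call = {"input_tokens": 100, "cache_creation_input_tokens": 2000,
              "cache_read_input_tokens": 0}
later_call = {"input_tokens": 100, "cache_creation_input_tokens": 0,
              "cache_read_input_tokens": 2000}
```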