Add comprehensive prompt caching support with flexible control options

Features:
- ENV variable control (ENABLE_PROMPT_CACHING, default: false)
- Per-request control via extra_body.prompt_caching
- Pattern-based model detection (Claude, Nova)
- Token limit warnings (Nova 20K limit)
- OpenAI-compatible response format (prompt_tokens_details.cached_tokens)

Supported models:
- Claude 3+ models (anthropic.claude-*)
- Nova models (amazon.nova-*)
- Auto-detection prevents breaking unsupported models

Implementation:
- System prompt caching via extra_body.prompt_caching.system
- Message caching via extra_body.prompt_caching.messages
- Works in both non-streaming and streaming modes
- Compatible with reasoning, thinking, and tool calls
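The pattern-based detection and Nova token-limit warning could be sketched as below. This is a minimal illustration, not the project's actual code: the function names, the exact regexes, and the 20K constant placement are assumptions based on the model IDs and limit named above.

```python
import re

# Hypothetical patterns for the cache-capable model families listed above
# (anthropic.claude-* and amazon.nova-*); unmatched models are left alone
# so caching directives never break unsupported models.
CACHE_CAPABLE_PATTERNS = [
    re.compile(r"^anthropic\.claude-"),  # Claude 3+ models
    re.compile(r"^amazon\.nova-"),       # Nova models
]

NOVA_CACHE_TOKEN_LIMIT = 20_000  # Nova's documented cache limit (20K tokens)


def supports_prompt_caching(model_id: str) -> bool:
    """Return True only for model families known to support prompt caching."""
    return any(p.match(model_id) for p in CACHE_CAPABLE_PATTERNS)


def nova_limit_warning(model_id: str, prompt_tokens: int) -> str | None:
    """Return a warning string when a Nova prompt exceeds the cache limit."""
    if model_id.startswith("amazon.nova-") and prompt_tokens > NOVA_CACHE_TOKEN_LIMIT:
        return (f"prompt has {prompt_tokens} tokens; Nova caches at most "
                f"{NOVA_CACHE_TOKEN_LIMIT} tokens")
    return None
```

With this shape, the proxy can check `supports_prompt_caching` once per request and silently skip cache directives for anything else.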
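The per-request control surface and the OpenAI-compatible usage field could look roughly like this. The helper names and the exact dict shapes are illustrative assumptions; only the `extra_body.prompt_caching.{system,messages}` keys and `prompt_tokens_details.cached_tokens` come from the description above.

```python
def build_extra_body(cache_system: bool, cache_messages: bool) -> dict:
    """Assemble the extra_body payload that opts a single request in,
    overriding the ENABLE_PROMPT_CACHING default for that request."""
    return {
        "prompt_caching": {
            "system": cache_system,     # cache the system prompt
            "messages": cache_messages, # cache conversation messages
        }
    }


def cached_tokens_from_usage(usage: dict) -> int:
    """Read the OpenAI-compatible cached-token count from a usage object,
    defaulting to 0 when the details block is absent (caching off/miss)."""
    return usage.get("prompt_tokens_details", {}).get("cached_tokens", 0)


# Example: a usage object shaped like an OpenAI chat completion response.
usage = {
    "prompt_tokens": 1200,
    "completion_tokens": 80,
    "prompt_tokens_details": {"cached_tokens": 1024},
}
```

Because the response mirrors OpenAI's `usage` layout, existing clients can report cache hits without any proxy-specific parsing.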