Neil Mazumdar
66cb51bb36
feat: add Claude Sonnet 4.5 support with global cross-region inference (#180)
This commit adds comprehensive support for Claude Sonnet 4.5 (claude-sonnet-4-5-20250929),
Anthropic's most intelligent model with enhanced coding capabilities and complex agent support.
Changes:
- Added global cross-region inference profile discovery (global.anthropic.*)
- Fixed temperature/topP compatibility for Claude Sonnet 4.5 (model doesn't support both simultaneously)
- Fixed reasoning_effort parameter handling to prevent KeyError
- Added extended thinking/interleaved thinking support via extra_body parameter
- Updated documentation with Claude Sonnet 4.5 examples (English and Chinese)
- Updated README with Sonnet 4.5 announcement
Technical Details:
- src/api/models/bedrock.py: Added global profile support in list_bedrock_models()
- src/api/models/bedrock.py: Added Claude Sonnet 4.5 detection to remove topP parameter
- src/api/models/bedrock.py: Changed pop("topP") to pop("topP", None) to prevent KeyError
- docs/Usage.md: Added Chat Completions section with Sonnet 4.5 examples
- docs/Usage.md: Updated Interleaved thinking section with Sonnet 4.5 examples
- docs/Usage_CN.md: Added Chinese versions of all Sonnet 4.5 documentation
Model ID: global.anthropic.claude-sonnet-4-5-20250929-v1:0
2025-09-30 16:51:26 +08:00
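The topP handling described in the commit above can be sketched roughly as follows. This is a minimal illustration, not the code in src/api/models/bedrock.py: the function name `strip_conflicting_sampling_params`, the substring check on the model ID, and the choice to drop topP only when temperature is also present are all assumptions made for this example.

```python
def strip_conflicting_sampling_params(model_id: str, inference_config: dict) -> dict:
    """Return a copy of inference_config with topP removed where it would conflict.

    Hypothetical helper: the real detection logic in the repository may differ.
    """
    config = dict(inference_config)  # copy so the caller's dict is not mutated
    # Assumption: Sonnet 4.5 is detected by a substring of the model ID.
    if "claude-sonnet-4-5" in model_id and "temperature" in config:
        # pop() with a default never raises KeyError, even if topP is absent.
        config.pop("topP", None)
    return config
```

Calling `pop("topP")` without the default would raise `KeyError` whenever the request did not set topP, which is the failure mode the commit message mentions.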
Mengxin Zhu
371d11d101
chore: cleanup useless files
2025-09-30 16:08:56 +08:00
Mengxin Zhu
e3ee9a707f
docs: update deployment instructions and enhance ECR push script
2025-09-30 16:06:21 +08:00
Divyateja Pasupuleti
bdfa57c277
chore: update requirements to fix vulnerability (#177)
* chore: update requirements to fix vulnerability
* Update Python base image to version 3.13-slim
2025-09-19 16:15:32 +08:00
jbrockett
911dfe26d6
models: fix Application Inference Profiles mapping (#175)
* models: fix Application Inference Profiles mapping to include all profiles per model_id; switch to defaultdict(set) and emit all AIPs
* Fix rebase issue
---------
Co-authored-by: Jeremy Brockett <313937+jbrockett@users.noreply.github.com>
2025-08-14 15:21:14 +08:00
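The switch to `defaultdict(set)` mentioned in the commit above can be illustrated with a small sketch. The function name `group_profiles_by_model` and the `(model_id, profile_arn)` input shape are hypothetical and chosen only for this example; the repository's actual data structures may look different.

```python
from collections import defaultdict


def group_profiles_by_model(pairs):
    """Collect every profile ARN seen for each model ID.

    `pairs` is an iterable of (model_id, profile_arn) tuples (assumed shape).
    """
    mapping = defaultdict(set)
    for model_id, profile_arn in pairs:
        # A plain dict assignment (mapping[model_id] = profile_arn) would keep
        # only the last profile per model; a set keeps all of them.
        mapping[model_id].add(profile_arn)
    return mapping
```

With this shape, two application inference profiles that point at the same model are both retained instead of one overwriting the other.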
RizviR
a2110ff648
Add pagination to list_inference_profiles calls (#173)
Co-authored-by: Rizvi Rahim <rizvi@rizvir.com>
2025-08-13 10:26:34 +08:00
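Pagination of `list_inference_profiles`, as named in the commit above, might look like the sketch below. It assumes a boto3 Bedrock client whose responses carry `inferenceProfileSummaries` and an optional `nextToken`; the helper name `list_all_inference_profiles` is hypothetical, and the repository's own implementation may use a different approach.

```python
def list_all_inference_profiles(client, **kwargs):
    """Follow nextToken until every page of inference profiles has been read.

    `client` is expected to behave like boto3.client("bedrock"); any extra
    keyword arguments are passed through to each call unchanged.
    """
    profiles = []
    next_token = None
    while True:
        params = dict(kwargs)
        if next_token:
            params["nextToken"] = next_token
        response = client.list_inference_profiles(**params)
        profiles.extend(response.get("inferenceProfileSummaries", []))
        next_token = response.get("nextToken")
        if not next_token:
            return profiles
```

Without the loop, only the first page is read, so profiles beyond the page size would be silently missing from the model list.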
Fabian Franz
0cce2edab0
feat: update boto3 to version 1.40.4 (#169)
Updates boto3 from 1.37.0 to 1.40.4 and botocore from 1.37.0 to 1.40.4. This update enables support for AWS_BEARER_TOKEN_BEDROCK functionality and includes the latest AWS service features and bug fixes.
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-authored-by: Claude <noreply@anthropic.com>
2025-08-13 10:23:30 +08:00
heisenbergye
3f1b56a526
feat: support Claude 4 Interleaved thinking (beta) (#164)
2025-07-21 16:44:21 +08:00
Mengxin Zhu
76a3614f17
fix: properly handle tool_use messages in conversation
2025-06-30 00:14:26 +08:00
Gagan M
01836087b1
feat: add support to include application inference profiles as models (#131)
---------
Co-authored-by: Mengxin Zhu <843303+zxkane@users.noreply.github.com>
2025-06-23 22:49:27 +08:00
dependabot[bot]
dd191d7cd9
Bump requests from 2.32.3 to 2.32.4 in /src (#151)
Bumps [requests](https://github.com/psf/requests) from 2.32.3 to 2.32.4.
- [Release notes](https://github.com/psf/requests/releases)
- [Changelog](https://github.com/psf/requests/blob/main/HISTORY.md)
- [Commits](https://github.com/psf/requests/compare/v2.32.3...v2.32.4)
---
updated-dependencies:
- dependency-name: requests
dependency-version: 2.32.4
dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-06-20 17:50:19 +08:00
Zack Elias
844efec086
add titan G1 embeddings (#152)
2025-06-17 11:09:22 +08:00
UniMa007
aed57307bc
Add Titan Embeddings G2 (#94)
2025-05-27 21:52:15 +08:00
Aiden Dai
4e8a913e43
fix empty content issue
2025-04-20 09:21:47 +08:00
Aiden Dai
b27e83624f
fix typo
2025-03-26 13:10:07 +08:00
Aiden Dai
c98e123c8f
optimize error response in streaming
2025-03-26 11:32:39 +08:00
Aiden Dai
4f1a75b49f
fix potential process stuck issue
2025-03-22 18:39:08 +08:00
Aiden Dai
0ead770069
performance improvement
2025-03-13 18:24:08 +08:00
Aiden Dai
fa14ae8c05
apply ruff linter
2025-03-13 14:24:41 +08:00
Aiden Dai
879b8e2ac7
apply ruff linter
2025-03-13 13:58:18 +08:00
Aiden Dai
f21b9a2e84
apply ruff linter
2025-03-13 13:50:57 +08:00
Aiden Dai
33e8fcfd3b
fix potential bad request issue
2025-03-13 07:16:42 +08:00
Aiden Dai
5ff18c0acd
Update usage guide for deepseek-r1
2025-03-11 10:25:50 +08:00
Aiden Dai
fcbfa9fe3d
Update usage guide for deepseek-r1
2025-03-11 10:24:19 +08:00
Aiden Dai
1a9c0f461e
Update usage guide for deepseek-r1
2025-03-11 10:14:06 +08:00
Aiden Dai
66b8967d30
Update usage guide for deepseek-r1
2025-03-11 10:10:58 +08:00
Zhongsheng Ji
fcfebf9d9d
feat: Response 429 if ThrottlingException (#91)
2025-03-10 09:01:33 +08:00
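The idea behind the commit above, returning HTTP 429 when Bedrock throttles a request, can be sketched as a small status mapping. The function name `status_for_bedrock_error` is hypothetical, and the 500 fallback for other error codes is an assumption for illustration, not something the commit log states. The input follows the `response` attribute of botocore's `ClientError`, which nests the error code under `Error` and `Code`.

```python
def status_for_bedrock_error(error_response: dict) -> int:
    """Map a botocore-style error response to an HTTP status code.

    Hypothetical helper; only the ThrottlingException -> 429 mapping comes
    from the commit title.
    """
    code = error_response.get("Error", {}).get("Code", "")
    if code == "ThrottlingException":
        return 429  # Too Many Requests: lets clients back off and retry
    return 500  # assumed fallback for everything else
```

Surfacing throttling as 429 rather than a generic server error lets OpenAI-compatible clients apply their usual rate-limit retry logic.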
Aiden Dai
283115000a
Support of reasoning
2025-02-28 08:08:54 +08:00
Aiden Dai
4095c2e74e
Support of reasoning
2025-02-26 13:28:23 +08:00
Aiden Dai
a46e329c97
Support of reasoning
2025-02-26 12:25:38 +08:00
Omri Shaiko
54f4a2b017
Fix issue with toolResult error with Cursor. Use default DEFAULT_MODEL in ChatRequest (#110)
2025-02-26 10:43:44 +08:00
Aiden Dai
3ce47ff278
Partial support of reasoning
2025-02-25 16:23:06 +08:00
Sean Smith
b26ee3e9ea
Added troubleshooting guide and made buttons cool (#96)
Signed-off-by: Sean Smith <sean.smith@contextual.ai>
2025-02-11 12:40:27 +08:00
Aiden Dai
1cb8a6a603
Update readme
2025-02-10 15:48:34 +08:00
Aiden Dai
c39f6bc942
Use secrets manager for api key
2025-02-10 15:25:12 +08:00
Aiden Dai
74ca3b938e
Update architecture diagram
2025-02-10 10:02:43 +08:00
Aiden Dai
a6f3e1176b
fix secret access issue
2025-02-09 06:53:23 +08:00
Aiden Dai
4d88731233
Use secrets manager for api key
2025-02-08 21:36:59 +08:00
Sean Smith
48bf360456
Security Guide (#101)
Signed-off-by: Sean Smith <sean.smith@contextual.ai>
2025-02-08 11:40:24 +08:00
yytdfc
093c6fa586
add stop parameter (#86)
2024-12-31 11:15:24 +08:00
Aiden Dai
b2c187c716
Increase connect timeout
2024-12-19 16:45:18 +08:00
Aiden Dai
581638b794
Update docs
2024-12-17 17:38:21 +08:00
Aiden Dai
51bc727b38
Use readme
2024-12-16 17:11:54 +08:00
Aiden Dai
dc067affc0
Use yaml template
2024-12-16 16:33:37 +08:00
Aiden Dai
29621ae59c
Automatically detect model list
2024-12-16 16:15:09 +08:00
Aiden Dai
d4938a0af2
Automatically detect model list
2024-12-16 16:01:59 +08:00
Attila Szucs
cb38d328aa
Add environment variable for PORT (#47)
* Customizable port
* Fix CMD
2024-12-16 10:00:17 +08:00
Fabio Nonato
4fc0d3bc94
Image error fix (#80)
---------
Co-authored-by: Fabio Nonato <fnp@amazon.com>
2024-12-11 11:26:51 +08:00
Hans Knecht
241d5c0f3e
feat: allow the use of an ENV variable to set the API key if the ParameterStore isn't used. (#40)
2024-12-06 14:32:06 +08:00
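The fallback described in the commit above, using an environment variable for the API key when Parameter Store is not configured, could be sketched like this. The variable names `API_KEY_PARAM_NAME` and `API_KEY`, the function name `resolve_api_key`, and the `fetch_parameter` callable are all assumptions for this example rather than confirmed details of the repository.

```python
from typing import Callable, Mapping, Optional


def resolve_api_key(
    env: Mapping[str, str],
    fetch_parameter: Callable[[str], str],
) -> Optional[str]:
    """Pick the API key source based on which settings are present.

    `env` is typically os.environ; `fetch_parameter` stands in for a
    Parameter Store lookup (hypothetical) and is only called when a
    parameter name is configured.
    """
    param_name = env.get("API_KEY_PARAM_NAME")  # assumed variable name
    if param_name:
        return fetch_parameter(param_name)
    # No Parameter Store configured: fall back to the plain env variable.
    return env.get("API_KEY")  # assumed variable name; None if unset
```

Passing the environment and the lookup function as arguments keeps the sketch testable without AWS credentials.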
Fabian Fischer
25b3cfb146
feat: add amazon nova inference profiles in us (#79)
2024-12-06 13:52:50 +08:00