Commit Graph

34 Commits

Author SHA1 Message Date
Li Yi
9cea7f9314 chore: polish code with little update (#182)
- Run Docker container as non-root user (appuser) to minimize security risks
- Add Docker HEALTHCHECK for better container orchestration
- Make CORS configurable via ALLOWED_ORIGINS env var with security warning
- Replace assertions with proper error handling (TypeError/ValueError)
- Add 30s timeout to HTTP requests to prevent hanging connections
- Disable auto-reload in production uvicorn settings
2025-10-11 14:49:18 +08:00
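The CORS and error-handling items in this commit can be illustrated with a minimal sketch. The commit names only the `ALLOWED_ORIGINS` variable and the switch from assertions to `TypeError`/`ValueError`; the comma-separated format, the `*` default, and the helper name `parse_allowed_origins` are assumptions made for illustration, not the repository's actual code.

```python
import logging
import os

logger = logging.getLogger(__name__)


def parse_allowed_origins(raw):
    """Turn a comma-separated origin string into a list (format is assumed)."""
    # Explicit exceptions instead of assert, so checks survive `python -O`.
    if not isinstance(raw, str):
        raise TypeError(f"ALLOWED_ORIGINS must be a string, got {type(raw).__name__}")
    origins = [item.strip() for item in raw.split(",") if item.strip()]
    if not origins:
        raise ValueError("ALLOWED_ORIGINS must name at least one origin")
    if "*" in origins:
        # Hypothetical wording for the security warning the commit mentions.
        logger.warning("CORS allows any origin; restrict ALLOWED_ORIGINS in production")
    return origins


# Assumed default: allow every origin when the variable is unset or empty.
allowed_origins = parse_allowed_origins(os.environ.get("ALLOWED_ORIGINS") or "*")
```

The resulting list could then be handed to whatever CORS middleware the application uses.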
Neil Mazumdar
66cb51bb36 feat: add Claude Sonnet 4.5 support with global cross-region inference (#180)
This commit adds comprehensive support for Claude Sonnet 4.5 (claude-sonnet-4-5-20250929),
Anthropic's most intelligent model with enhanced coding capabilities and complex agent support.

Changes:
- Added global cross-region inference profile discovery (global.anthropic.*)
- Fixed temperature/topP compatibility for Claude Sonnet 4.5 (model doesn't support both simultaneously)
- Fixed reasoning_effort parameter handling to prevent KeyError
- Added extended thinking/interleaved thinking support via extra_body parameter
- Updated documentation with Claude Sonnet 4.5 examples (English and Chinese)
- Updated README with Sonnet 4.5 announcement

Technical Details:
- src/api/models/bedrock.py: Added global profile support in list_bedrock_models()
- src/api/models/bedrock.py: Added Claude Sonnet 4.5 detection to remove topP parameter
- src/api/models/bedrock.py: Changed pop("topP") to pop("topP", None) to prevent KeyError
- docs/Usage.md: Added Chat Completions section with Sonnet 4.5 examples
- docs/Usage.md: Updated Interleaved thinking section with Sonnet 4.5 examples
- docs/Usage_CN.md: Added Chinese versions of all Sonnet 4.5 documentation

Model ID: global.anthropic.claude-sonnet-4-5-20250929-v1:0
2025-09-30 16:51:26 +08:00
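The `topP` fix described in this commit comes down to one dictionary idiom: `pop` with a default never raises when the key is missing. Below is a minimal sketch, assuming the model is detected by a substring of its ID and that the inference parameters live in a plain dict; the function name `strip_top_p` is hypothetical and not taken from `src/api/models/bedrock.py`.

```python
def strip_top_p(model_id, inference_config):
    """Return a copy of inference_config without topP for Sonnet 4.5 model IDs."""
    config = dict(inference_config)  # copy, so the caller's dict is untouched
    # Assumed detection rule: substring match on the model ID.
    if "claude-sonnet-4-5" in model_id:
        # pop("topP") would raise KeyError when topP is absent;
        # pop("topP", None) silently does nothing in that case.
        config.pop("topP", None)
    return config


sonnet = "global.anthropic.claude-sonnet-4-5-20250929-v1:0"
print(strip_top_p(sonnet, {"temperature": 0.5, "topP": 0.9}))
```

Requests for other model IDs pass through unchanged, so only the model that rejects the temperature/topP combination is affected.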
Mengxin Zhu
371d11d101 chore: cleanup useless files 2025-09-30 16:08:56 +08:00
Mengxin Zhu
e3ee9a707f docs: update deployment instructions and enhance ECR push script 2025-09-30 16:06:21 +08:00
heisenbergye
3f1b56a526 feat: support Claude 4 Interleaved thinking (beta) (#164) 2025-07-21 16:44:21 +08:00
Gagan M
01836087b1 feat: add support to include application inference profiles as models (#131)
---------

Co-authored-by: Mengxin Zhu <843303+zxkane@users.noreply.github.com>
2025-06-23 22:49:27 +08:00
Aiden Dai
fcbfa9fe3d Update usage guide for deepseek-r1 2025-03-11 10:24:19 +08:00
Aiden Dai
4095c2e74e Support of reasoning 2025-02-26 13:28:23 +08:00
Sean Smith
b26ee3e9ea Added troubleshooting guide and made buttons cool (#96)
Signed-off-by: Sean Smith <sean.smith@contextual.ai>
2025-02-11 12:40:27 +08:00
Aiden Dai
1cb8a6a603 Update readme 2025-02-10 15:48:34 +08:00
Aiden Dai
74ca3b938e Update architecture diagram 2025-02-10 10:02:43 +08:00
Aiden Dai
a6f3e1176b fix secret access issue 2025-02-09 06:53:23 +08:00
Aiden Dai
4d88731233 Use secrets manager for api key 2025-02-08 21:36:59 +08:00
Aiden Dai
581638b794 Update docs 2024-12-17 17:38:21 +08:00
Aiden Dai
51bc727b38 Use readme 2024-12-16 17:11:54 +08:00
Hans Knecht
241d5c0f3e feat: allow the use of an ENV variable to set the API key if the ParameterStore isn't used. (#40) 2024-12-06 14:32:06 +08:00
mschfh
17503b032a Add cross-region inference profiles for Llama 3.2 models. (#75) 2024-12-05 11:22:11 +08:00
bkocik
6849ca828a Add cross-region inference profiles for Llama 3.1 models. (#72) 2024-11-20 09:57:35 +08:00
heisenbergye
5f7676608a support all Claude models Cross-Region Inference (#65) 2024-10-29 14:43:31 +08:00
Aiden Dai
f0ea117732 Refactor to use new Converse API 2024-06-04 16:20:25 +08:00
Didier Durand
ac87a3787d Fixing 1 typo in README.md 2024-05-31 18:16:53 +02:00
Aiden Dai
d7a26dcf8b Update README 2024-05-10 10:21:03 +08:00
Aiden Dai
27d253fddb Add Llama 3 support 2024-04-25 10:04:55 +08:00
yhx
7df5617037 Add support of anthropic.claude-3-opus-20240229-v1:0 model 2024-04-17 14:22:46 +08:00
Aiden Dai
2bee83a79a Add support of Sydney region 2024-04-12 13:51:57 +08:00
Aiden Dai
1888fa1c98 Update README 2024-04-08 13:30:46 +08:00
Aiden Dai
cee335486a Update README 2024-04-03 19:55:16 +08:00
Aiden Dai
1456c809d3 Update README 2024-04-03 12:50:46 +08:00
Aiden Dai
93080da2e3 Update README 2024-03-28 16:54:46 +08:00
Joao Galego
a81dd84cec Updated docs (EN only)
Changes:
* Fixed OpenAI env var names (`OPENAI_API_BASE` --> `OPENAI_BASE_URL`)
* Added openai<1.0.0 example with openai.ChatCompletion
* Corrected typos and fixed formatting
2024-03-27 12:03:02 +00:00
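The environment-variable rename in this commit can be sketched as a small lookup that prefers `OPENAI_BASE_URL` and falls back to the legacy `OPENAI_API_BASE` name. The helper `resolve_base_url`, the fallback behaviour, and the example URL are assumptions for illustration only; the commit itself changes the documentation, not code.

```python
import os


def resolve_base_url(env=os.environ):
    """Prefer OPENAI_BASE_URL; fall back to the legacy OPENAI_API_BASE name."""
    url = env.get("OPENAI_BASE_URL") or env.get("OPENAI_API_BASE")
    if not url:
        raise ValueError("set OPENAI_BASE_URL to the gateway endpoint")
    # Drop trailing slashes so paths can be appended consistently.
    return url.rstrip("/")


# Hypothetical endpoint, passed as a plain dict instead of the real environment.
print(resolve_base_url({"OPENAI_BASE_URL": "http://localhost:8000/api/v1/"}))
```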
Aiden Dai
a26b8e6833 Update README 2024-03-27 17:31:02 +08:00
Aiden Dai
951ecfc726 Update README 2024-03-27 17:27:25 +08:00
Aiden Dai
f974cb2728 Initial commit 2024-03-27 15:20:24 +08:00
Amazon GitHub Automation
f77df2c536 Initial commit 2024-03-26 23:57:20 -07:00