Commit Graph

  • 737cf076a0 fix: Fix ImageContent schema to use proper default value (#234) main Donghee Na 2026-03-13 11:42:22 +09:00
  • 6ae73c0c69 fix: merge additionalModelRequestFields instead of overwriting Kane Zhu 2026-03-10 16:41:52 +08:00
  • d1dc4ed164 fix: Support reasoning_tokens at bedrock streaming response (#223) Donghee Na 2026-02-26 12:48:05 +09:00
  • d14596ff47 feat: add Amazon Nova 2 multimodal embeddings support (#222) Gabriel Koo 2026-02-26 11:41:17 +08:00
  • a1844f95d4 Preload tiktoken encoding in Dockerfile (Lambda) (#220) mjkam 2026-02-19 18:00:05 +09:00
  • a150f7bb1c fix: support continue response for claude opus 4.6 (#219) Hooman Yar 2026-02-11 23:21:50 -08:00
  • 9b3da3a5c8 fix(deps): update fastapi and starlette for CVE-2025-62727 (#216) Mengxin Zhu 2026-01-19 11:57:01 +08:00
  • 1a7f55b89b Add support for 'developer' role in chat messages (#209) Angélica de Oliveira 2025-12-09 00:26:10 -03:00
  • b41633b826 feat(apigw): add API Gateway response streaming support (#207) Mengxin Zhu 2025-12-05 10:54:13 +08:00
  • 0411454b3a feat: add claude-opus-4-5 to TEMPERATURE_TOPP_CONFLICT_MODELS set (#208) Hooman Yar 2025-12-04 17:22:37 -08:00
  • 2c518bbd70 fix(docker): add --provenance=false --sbom=false for Lambda compatibility Kane Zhu 2025-11-27 18:53:39 +08:00
  • 37374e79ba fix: Allow the push-to-ecr.sh script to run from anywhere instead of requiring the user to cd manually (#202) Justin Dray 2025-11-19 22:33:43 -08:00
  • b3c1c82367 Fix healthcheck in Dockerfile_ecs (#199) Viktor Isaev 2025-11-20 10:30:00 +04:00
  • ce4cfabb21 Fixed <think> </think> tags for GPT-OSS in bedrock.py (#200) user-error1 2025-11-20 01:29:20 -05:00
  • 7e03ab062d fix: Fix invalid cache_creation_tokens metric key (#195) Donghee Na 2025-10-27 15:31:21 +09:00
  • 18b68bd3a7 🐳 preload tiktoken encoding in Dockerfile_ecs (#193) Shion Ichikawa 2025-10-22 23:28:40 +09:00
  • d86e64eed3 refactor(bedrock): unify inference profile metadata handling and cleanup Kane Zhu 2025-10-16 15:24:02 +08:00
  • b4800c54a0 feat: add prompt caching support for Claude and Nova models Kane Zhu 2025-10-11 14:08:22 +08:00
  • 7756532b4c fix: ECS container /health endpoint does not require API_KEY Bearer Token (#184) Scott Baxter 2025-10-12 22:59:42 -05:00
  • 9cea7f9314 chore: polish code with little update (#182) Li Yi 2025-10-11 14:49:18 +08:00
  • 8177876e5e Support <think> tags (#117) Fabian Franz 2025-09-30 13:29:19 +01:00
  • 66cb51bb36 feat: add Claude Sonnet 4.5 support with global cross-region inference (#180) Neil Mazumdar 2025-09-30 18:21:26 +09:30
  • 371d11d101 chore: cleanup useless files Mengxin Zhu 2025-09-30 16:08:56 +08:00
  • e3ee9a707f docs: update deployment instructions and enhance ECR push script Mengxin Zhu 2025-09-30 16:06:21 +08:00
  • bdfa57c277 chore: update requirements to fix vulnerability (#177) Divyateja Pasupuleti 2025-09-19 13:45:32 +05:30
  • 911dfe26d6 models: fix Application Inference Profiles mapping (#175) jbrockett 2025-08-14 03:21:14 -04:00
  • a2110ff648 Add pagination to list_inference_profiles calls (#173) RizviR 2025-08-13 08:26:34 +06:00
  • 0cce2edab0 feat: update boto3 to version 1.40.4 (#169) Fabian Franz 2025-08-13 03:23:30 +01:00
  • 3f1b56a526 feat: support Claude 4 Interleaved thinking (beta) (#164) heisenbergye 2025-07-21 16:44:21 +08:00
  • 76a3614f17 fix: properly handle tool_use messages in conversation Mengxin Zhu 2025-06-30 00:14:07 +08:00
  • 01836087b1 feat: add support to include application inference profiles as models (#131) Gagan M 2025-06-23 20:19:27 +05:30
  • dd191d7cd9 Bump requests from 2.32.3 to 2.32.4 in /src (#151) dependabot[bot] 2025-06-20 17:50:19 +08:00
  • 844efec086 add titan G1 embeddings (#152) Zack Elias 2025-06-16 23:09:22 -04:00
  • aed57307bc Add Titan Embeddings G2 (#94) UniMa007 2025-05-27 15:52:15 +02:00
  • 4e8a913e43 fix empty content issue Aiden Dai 2025-04-20 09:21:47 +08:00
  • b27e83624f fix typo Aiden Dai 2025-03-26 13:10:07 +08:00
  • c98e123c8f optimize error response in streaming Aiden Dai 2025-03-26 11:32:39 +08:00
  • 4f1a75b49f fix potential process stuck issue Aiden Dai 2025-03-22 18:39:08 +08:00
  • 0ead770069 performance improvement Aiden Dai 2025-03-13 18:24:08 +08:00
  • fa14ae8c05 apply ruff linter Aiden Dai 2025-03-13 14:24:41 +08:00
  • 879b8e2ac7 apply ruff linter Aiden Dai 2025-03-13 13:58:18 +08:00
  • f21b9a2e84 apply ruff linter Aiden Dai 2025-03-13 13:50:57 +08:00
  • 33e8fcfd3b fix potential bad request issue Aiden Dai 2025-03-13 07:16:42 +08:00
  • 5ff18c0acd Update usage guide for deepseek-r1 Aiden Dai 2025-03-11 10:25:50 +08:00
  • fcbfa9fe3d Update usage guide for deepseek-r1 Aiden Dai 2025-03-11 10:24:19 +08:00
  • 1a9c0f461e Update usage guide for deepseek-r1 Aiden Dai 2025-03-11 10:14:06 +08:00
  • 66b8967d30 Update usage guide for deepseek-r1 Aiden Dai 2025-03-11 10:10:58 +08:00
  • fcfebf9d9d feat: Response 429 if ThrottlingException (#91) Zhongsheng Ji 2025-03-10 09:01:33 +08:00
  • 283115000a Support of reasoning Aiden Dai 2025-02-28 08:08:54 +08:00
  • 4095c2e74e Support of reasoning Aiden Dai 2025-02-26 13:28:23 +08:00
  • a46e329c97 Support of reasoning Aiden Dai 2025-02-26 12:25:38 +08:00
  • 54f4a2b017 Fix issue with toolResult error with Cursor. Use default DEFAULT_MODEL in ChatRequest (#110) Omri Shaiko 2025-02-26 04:43:44 +02:00
  • 3ce47ff278 Partial support of reasoning Aiden Dai 2025-02-25 16:22:26 +08:00
  • b26ee3e9ea Added troubleshooting guide and made buttons cool (#96) Sean Smith 2025-02-10 20:40:27 -08:00
  • 1cb8a6a603 Update readme Aiden Dai 2025-02-10 15:48:34 +08:00
  • c39f6bc942 Use secrets manager for api key Aiden Dai 2025-02-10 15:25:12 +08:00
  • 74ca3b938e Update architecture diagram Aiden Dai 2025-02-10 10:02:43 +08:00
  • a6f3e1176b fix secret access issue Aiden Dai 2025-02-09 06:53:23 +08:00
  • 4d88731233 Use secrets manager for api key Aiden Dai 2025-02-08 21:36:59 +08:00
  • 48bf360456 Security Guide (#101) Sean Smith 2025-02-07 19:40:24 -08:00
  • 093c6fa586 add stop parameter (#86) yytdfc 2024-12-31 11:15:24 +08:00
  • b2c187c716 Increase connect timeout Aiden Dai 2024-12-19 16:45:18 +08:00
  • 581638b794 Update docs Aiden Dai 2024-12-17 17:38:21 +08:00
  • 51bc727b38 Use readme Aiden Dai 2024-12-16 17:11:54 +08:00
  • dc067affc0 Use yaml template Aiden Dai 2024-12-16 16:33:37 +08:00
  • 29621ae59c Automatically detect model list Aiden Dai 2024-12-16 16:15:09 +08:00
  • d4938a0af2 Automatically detect model list Aiden Dai 2024-12-16 16:01:19 +08:00
  • cb38d328aa Add environment variable for PORT (#47) Attila Szucs 2024-12-16 03:00:17 +01:00
  • 4fc0d3bc94 Image error fix (#80) Fabio Nonato 2024-12-10 19:26:51 -08:00
  • 241d5c0f3e feat: allow the use of an ENV variable to set the API key if the ParameterStore isn't used. (#40) Hans Knecht 2024-12-06 07:32:06 +01:00
  • 25b3cfb146 feat: add amazon nova inference profiles in us (#79) Fabian Fischer 2024-12-06 06:52:50 +01:00
  • 17503b032a Add cross-region inference profiles for Llama 3.2 models. (#75) mschfh 2024-12-05 04:22:11 +01:00
  • 6849ca828a Add cross-region inference profiles for Llama 3.1 models. (#72) bkocik 2024-11-19 20:57:35 -05:00
  • 11a31b5584 feat: add support for APAC claude 3 profiles (#69) KAEYL98 2024-11-07 16:43:15 +08:00
  • 5f7676608a suppot all Claude models Cross-Region Inference (#65) heisenbergye 2024-10-29 14:43:31 +08:00
  • 9cc3ea8253 chore: publish templates to s3 in release workflow (#64) Meng Xin Zhu 2024-10-28 17:36:35 +08:00
  • 8785c63ddf fix: remove the code review pipeline Aaron Yi 2024-10-25 13:12:59 +08:00
  • 0afd0463e1 fix: add debugging info onto workflow yike5460 2024-10-25 02:33:26 +00:00
  • 3a97677b97 Added "new Claude 3.5 Sonnet" v2 model to the list (#60) Sergei Mikhailov 2024-10-23 19:54:45 +13:00
  • 728ef6d8a6 fix: update workflow action to user var instead of secret yike5460 2024-10-10 06:24:04 +00:00
  • 46fb759137 chore: use correct Dockerfile for building lambda image Mengxin Zhu 2024-10-09 23:39:37 +08:00
  • 326e566105 chore: use arm64 architecture image for lambda Mengxin Zhu 2024-10-09 23:15:10 +08:00
  • c1ee1b4244 chore: add automation script to release images (#58) Meng Xin Zhu 2024-10-09 18:20:14 +08:00
  • 552578a0ee fix: fix action dep issue yike5460 2024-10-09 08:30:19 +00:00
  • d9590d6504 fix: place action file into the right folder yike5460 2024-10-09 08:22:14 +00:00
  • 5c5e370a81 test: folder with error file for code review and pr description dev yike5460 2024-10-09 08:20:13 +00:00
  • 29d333d367 feat: enable code review and pr description in workflow yike5460 2024-10-09 08:13:02 +00:00
  • c655f50616 feat: Handle multiple user messages in a single request (#26) Yuki Sekiya 2024-10-09 16:13:58 +09:00
  • 97d77ab0c5 Merge pull request #39 from diopres/patch-2 Aiden Dai 2024-08-14 14:42:25 +08:00
  • db0817392f feat: add support for Mistral Large 2 (24.07) diopres 2024-08-12 19:06:44 +05:30
  • 2950c15ecb Fix empty response bug Aiden Dai 2024-08-09 17:28:09 +08:00
  • f8faf32a76 Add Llama 3.1 without tool call Aiden Dai 2024-07-30 12:27:08 +08:00
  • f6b73152bc Update boto3 version Aiden Dai 2024-06-25 17:29:47 +08:00
  • 66bdfdf5c1 Support Claude 3.5 Sonnet Aiden Dai 2024-06-21 10:28:04 +08:00
  • 49dd6608a0 Support of tool choice Aiden Dai 2024-06-21 10:24:11 +08:00
  • b3509ee0f0 Support multiple tool calls Aiden Dai 2024-06-11 16:58:26 +08:00
  • 56786f9e32 Update api response Aiden Dai 2024-06-11 10:53:56 +08:00
  • 6ef7641a0d Update api response Aiden Dai 2024-06-07 10:58:44 +08:00
  • 5f84cef13a Refactor to use new Converse API Aiden Dai 2024-06-04 17:01:06 +08:00
  • f0ea117732 Refactor to use new Converse API Aiden Dai 2024-06-04 16:20:25 +08:00