Commit Graph

114 Commits

Author SHA1 Message Date
UniMa007
aed57307bc Add Titan Embeddings G2 (#94) 2025-05-27 21:52:15 +08:00
Aiden Dai
4e8a913e43 fix empty content issue 2025-04-20 09:21:47 +08:00
Aiden Dai
b27e83624f fix typo 2025-03-26 13:10:07 +08:00
Aiden Dai
c98e123c8f optimize error response in streaming 2025-03-26 11:32:39 +08:00
Aiden Dai
4f1a75b49f fix potential process stuck issue 2025-03-22 18:39:08 +08:00
Aiden Dai
0ead770069 performance improvement 2025-03-13 18:24:08 +08:00
Aiden Dai
fa14ae8c05 apply ruff linter 2025-03-13 14:24:41 +08:00
Aiden Dai
879b8e2ac7 apply ruff linter 2025-03-13 13:58:18 +08:00
Aiden Dai
f21b9a2e84 apply ruff linter 2025-03-13 13:50:57 +08:00
Aiden Dai
33e8fcfd3b fix potential bad request issue 2025-03-13 07:16:42 +08:00
Aiden Dai
5ff18c0acd Update usage guide for deepseek-r1 2025-03-11 10:25:50 +08:00
Aiden Dai
fcbfa9fe3d Update usage guide for deepseek-r1 2025-03-11 10:24:19 +08:00
Aiden Dai
1a9c0f461e Update usage guide for deepseek-r1 2025-03-11 10:14:06 +08:00
Aiden Dai
66b8967d30 Update usage guide for deepseek-r1 2025-03-11 10:10:58 +08:00
Zhongsheng Ji
fcfebf9d9d feat: Response 429 if ThrottlingException (#91) 2025-03-10 09:01:33 +08:00
Aiden Dai
283115000a Support of reasoning 2025-02-28 08:08:54 +08:00
Aiden Dai
4095c2e74e Support of reasoning 2025-02-26 13:28:23 +08:00
Aiden Dai
a46e329c97 Support of reasoning 2025-02-26 12:25:38 +08:00
Omri Shaiko
54f4a2b017 Fix issue with toolResult error with Cursor. Use default DEFAULT_MODEL in ChatRequest (#110) 2025-02-26 10:43:44 +08:00
Aiden Dai
3ce47ff278 Partial support of reasoning 2025-02-25 16:23:06 +08:00
Sean Smith
b26ee3e9ea Added troubleshooting guide and made buttons cool (#96) 2025-02-11 12:40:27 +08:00
    Signed-off-by: Sean Smith <sean.smith@contextual.ai>
Aiden Dai
1cb8a6a603 Update readme 2025-02-10 15:48:34 +08:00
Aiden Dai
c39f6bc942 Use secrets manager for api key 2025-02-10 15:25:12 +08:00
Aiden Dai
74ca3b938e Update architecture diagram 2025-02-10 10:02:43 +08:00
Aiden Dai
a6f3e1176b fix secret access issue 2025-02-09 06:53:23 +08:00
Aiden Dai
4d88731233 Use secrets manager for api key 2025-02-08 21:36:59 +08:00
Sean Smith
48bf360456 Security Guide (#101) 2025-02-08 11:40:24 +08:00
    Signed-off-by: Sean Smith <sean.smith@contextual.ai>
yytdfc
093c6fa586 add stop parameter (#86) 2024-12-31 11:15:24 +08:00
Aiden Dai
b2c187c716 Increase connect timeout 2024-12-19 16:45:18 +08:00
Aiden Dai
581638b794 Update docs 2024-12-17 17:38:21 +08:00
Aiden Dai
51bc727b38 Use readme 2024-12-16 17:11:54 +08:00
Aiden Dai
dc067affc0 Use yaml template 2024-12-16 16:33:37 +08:00
Aiden Dai
29621ae59c Automatically detect model list 2024-12-16 16:15:09 +08:00
Aiden Dai
d4938a0af2 Automatically detect model list 2024-12-16 16:01:59 +08:00
Attila Szucs
cb38d328aa Add environment variable for PORT (#47) 2024-12-16 10:00:17 +08:00
    * Customizable port
    * Fix CMD
Fabio Nonato
4fc0d3bc94 Image error fix (#80) 2024-12-11 11:26:51 +08:00
    Co-authored-by: Fabio Nonato <fnp@amazon.com>
Hans Knecht
241d5c0f3e feat: allow the use of an ENV variable to set the API key if the ParameterStore isn't used. (#40) 2024-12-06 14:32:06 +08:00
Fabian Fischer
25b3cfb146 feat: add amazon nova inference profiles in us (#79) 2024-12-06 13:52:50 +08:00
mschfh
17503b032a Add cross-region inference profiles for Llama 3.2 models. (#75) 2024-12-05 11:22:11 +08:00
bkocik
6849ca828a Add cross-region inference profiles for Llama 3.1 models. (#72) 2024-11-20 09:57:35 +08:00
KAEYL98
11a31b5584 feat: add support for APAC claude 3 profiles (#69) 2024-11-07 16:43:15 +08:00
heisenbergye
5f7676608a suppot all Claude models Cross-Region Inference (#65) 2024-10-29 14:43:31 +08:00
Meng Xin Zhu
9cc3ea8253 chore: publish templates to s3 in release workflow (#64) 2024-10-28 17:36:35 +08:00
Aaron Yi
8785c63ddf fix: remove the code review pipeline 2024-10-25 13:12:59 +08:00
    until the access right can be grant to pull request from fork
yike5460
0afd0463e1 fix: add debugging info onto workflow 2024-10-25 02:33:26 +00:00
Sergei Mikhailov
3a97677b97 Added "new Claude 3.5 Sonnet" v2 model to the list (#60) 2024-10-23 14:54:45 +08:00
yike5460
728ef6d8a6 fix: update workflow action to user var instead of secret 2024-10-10 06:24:04 +00:00
Mengxin Zhu
46fb759137 chore: use correct Dockerfile for building lambda image 2024-10-09 23:39:37 +08:00
Mengxin Zhu
326e566105 chore: use arm64 architecture image for lambda 2024-10-09 23:15:10 +08:00
Meng Xin Zhu
c1ee1b4244 chore: add automation script to release images (#58) 2024-10-09 18:20:14 +08:00