diff --git a/README.md b/README.md index c63505a..492025f 100644 --- a/README.md +++ b/README.md @@ -13,7 +13,7 @@ If you are facing any problems, please raise an issue. ## Overview -Amazon Bedrock offers a wide range of foundation models (such as Claude 3 Opus/Sonnet/Haiku, Llama 2/3, Mistral/Mixtral, +Amazon Bedrock offers a wide range of foundation models (such as Claude 3 Opus/Sonnet/Haiku, Llama 2/3, Mistral/Mixtral, etc.) and a broad set of capabilities for you to build generative AI applications. Check the [Amazon Bedrock](https://aws.amazon.com/bedrock) landing page for additional information. Sometimes, you might have applications developed using OpenAI APIs or SDKs, and you want to experiment with Amazon Bedrock without modifying your codebase. Or you may simply wish to evaluate the capabilities of these foundation models in tools like AutoGen etc. Well, this repository allows you to access Amazon Bedrock models seamlessly through OpenAI APIs and SDKs, enabling you to test these models without code changes. @@ -94,7 +94,7 @@ Please follow the steps below to deploy the Bedrock Proxy APIs into your AWS acc [![Launch Stack](assets/launch-stack.png)](https://console.aws.amazon.com/cloudformation/home#/stacks/create/template?stackName=BedrockProxyAPI&templateURL=https://aws-gcr-solutions.s3.amazonaws.com/bedrock-access-gateway/latest/BedrockProxy.template) - **ALB + Fargate** - + [![Launch Stack](assets/launch-stack.png)](https://console.aws.amazon.com/cloudformation/home#/stacks/create/template?stackName=BedrockProxyAPI&templateURL=https://aws-gcr-solutions.s3.amazonaws.com/bedrock-access-gateway/latest/BedrockProxyFargate.template) 3. Click "Next". 4. On the "Specify stack details" page, provide the following information: @@ -173,10 +173,12 @@ Currently, Bedrock Access Gateway only supports cross-region Inference for the f - Claude 3 Opus - Claude 3 Sonnet - Claude 3.5 Sonnet +- Meta Llama 3.1 8b Instruct +- Meta Llama 3.1 70b Instruct **Prerequisites:** - IAM policies must allow cross-region access,Callers need permissions to access models and inference profiles in both regions (added in cloudformation template) -- Model access must be enabled in both regions, which defined in inference profiles +- Model access must be enabled in both regions, which defined in inference profiles **Example API Usage:** - To use Bedrock cross-region inference, you include an inference profile when running model inference by specifying the ID of the inference profile as the modelId, such as `us.anthropic.claude-3-5-sonnet-20240620-v1:0` @@ -293,7 +295,7 @@ Fine-tuned models and models with Provisioned Throughput are currently not suppo ### How to upgrade? -To use the latest features, you don't need to redeploy the CloudFormation stack. You simply need to pull the latest image. +To use the latest features, you don't need to redeploy the CloudFormation stack. You simply need to pull the latest image. To do so, depends on which version you deployed: diff --git a/README_CN.md b/README_CN.md index c093a86..61f27a6 100644 --- a/README_CN.md +++ b/README_CN.md @@ -12,7 +12,7 @@ ## 概述 -Amazon Bedrock提供了广泛的基础模型(如Claude 3 Opus/Sonnet/Haiku、Llama 2/3、Mistral/Mixtral等),以及构建生成式AI应用程序的多种功能。更多详细信息,请查看[Amazon +Amazon Bedrock提供了广泛的基础模型(如Claude 3 Opus/Sonnet/Haiku、Llama 2/3、Mistral/Mixtral等),以及构建生成式AI应用程序的多种功能。更多详细信息,请查看[Amazon Bedrock](https://aws.amazon.com/bedrock)。 有时,您可能已经使用OpenAI的API或SDK构建了应用程序,并希望在不修改代码的情况下试用Amazon @@ -96,7 +96,7 @@ OpenAI 的 API 或 SDK 无缝集成并试用 Amazon Bedrock 的模型,而无需 [![Launch Stack](assets/launch-stack.png)](https://console.aws.amazon.com/cloudformation/home#/stacks/create/template?stackName=BedrockProxyAPI&templateURL=https://aws-gcr-solutions.s3.amazonaws.com/bedrock-access-gateway/latest/BedrockProxy.template) - **ALB + Fargate** - + [![Launch Stack](assets/launch-stack.png)](https://console.aws.amazon.com/cloudformation/home#/stacks/create/template?stackName=BedrockProxyAPI&templateURL=https://aws-gcr-solutions.s3.amazonaws.com/bedrock-access-gateway/latest/BedrockProxyFargate.template) 3. 单击"下一步"。 4. 在"指定堆栈详细信息"页面,提供以下信息: @@ -175,6 +175,8 @@ Cross-Region Inference 支持跨区域访问的基础模型,即允许用户在 - Claude 3 Opus - Claude 3 Sonnet - Claude 3.5 Sonnet +- Meta Llama 3.1 8b Instruct +- Meta Llama 3.1 70b Instruct **使用前提:** - IAM Policy 有 inference profiles 相关的权限和调用模型的权限 (cloudformation template 中已添加) diff --git a/src/api/models/bedrock.py b/src/api/models/bedrock.py index ce0a609..48e9c59 100644 --- a/src/api/models/bedrock.py +++ b/src/api/models/bedrock.py @@ -35,8 +35,8 @@ from api.schema import ( EmbeddingsResponse, EmbeddingsUsage, Embedding, - - + + ) from api.setting import DEBUG, AWS_REGION @@ -139,12 +139,26 @@ class BedrockModel(BaseChatModel): "tool_call": False, "stream_tool_call": False, }, + # Llama 3.1 8b cross-region inference profile + "us.meta.llama3-1-8b-instruct-v1:0": { + "system": True, + "multimodal": False, + "tool_call": False, + "stream_tool_call": False, + }, "meta.llama3-1-8b-instruct-v1:0": { "system": True, "multimodal": False, "tool_call": False, "stream_tool_call": False, }, + # Llama 3.1 70b cross-region inference profile + "us.meta.llama3-1-70b-instruct-v1:0": { + "system": True, + "multimodal": False, + "tool_call": False, + "stream_tool_call": False, + }, "meta.llama3-1-70b-instruct-v1:0": { "system": True, "multimodal": False, @@ -467,7 +481,7 @@ class BedrockModel(BaseChatModel): def _reframe_multi_payloard(self, messages: list) -> list: """ Receive messages and reformat them to comply with the Claude format - + With OpenAI format requests, it's not a problem to repeatedly receive messages from the same role, but with Claude format requests, you cannot repeatedly receive messages from the same role. @@ -493,12 +507,12 @@ bedrock_format_messages=[ reformatted_messages = [] current_role = None current_content = [] - + # Search through the list of messages and combine messages from the same role into one list for message in messages: next_role = message['role'] next_content = message['content'] - + # If the next role is different from the previous message, add the previous role's messages to the list if next_role != current_role: if current_content: @@ -509,20 +523,20 @@ bedrock_format_messages=[ # Switch to the new role current_role = next_role current_content = [] - + # Add the message content to current_content if isinstance(next_content, str): current_content.append({"text": next_content}) elif isinstance(next_content, list): current_content.extend(next_content) - + # Add the last role's messages to the list if current_content: reformatted_messages.append({ "role": current_role, "content": current_content }) - + return reformatted_messages