Use readme
README.md
@@ -6,10 +6,26 @@ OpenAI-compatible RESTful APIs for Amazon Bedrock
## Breaking Changes

This tool can now automatically detect new models supported in Amazon Bedrock.

So whenever new models are added to Amazon Bedrock, you can try them immediately, without waiting for code changes in this repo.

This is done via the `ListFoundationModels` and `ListInferenceProfiles` APIs of Amazon Bedrock. Because of this change, additional IAM permissions are required for your Lambda/Fargate role.

If you see the error "Unsupported model xxx, please use models API to get a list of supported models" even though the model ID is correct, please either update your existing stack with the new template in the deployment folder or manually add the permissions below to the related Lambda/Fargate role.

```json
{
  "Action": [
    "bedrock:ListFoundationModels",
    "bedrock:ListInferenceProfiles"
  ],
  "Resource": "*",
  "Effect": "Allow"
}
```

Please raise a GitHub issue if you still have problems.

## Overview

@@ -160,65 +176,10 @@ print(completion.choices[0].message.content)

Please check the [Usage Guide](./docs/Usage.md) for more details on how to use the embedding API, multimodal API and tool calls.

### Bedrock Cross-Region Inference

Cross-Region Inference supports accessing foundation models across regions, allowing users to invoke models hosted in different AWS regions for inference. Main advantages:

- **Improved Availability**: Provides regional redundancy and enhanced fault tolerance. When issues occur in the primary region, services can fail over to backup regions, ensuring continuous service availability and business continuity.
- **Reduced Latency**: Enables selection of regions geographically closest to users, optimizing network paths and reducing transmission time, resulting in better user experience and response times.
- **Better Performance and Capacity**: Implements load balancing to distribute request pressure, provides greater service capacity and throughput, and better handles traffic spikes.
- **Flexibility**: Allows selection of models from different regions based on requirements, meets specific regional compliance requirements, and enables more flexible resource allocation and management.
- **Cost Benefits**: Enables selection of more cost-effective regions, reducing overall operational costs through resource optimization and improving resource utilization efficiency.

Please check [Bedrock Cross-Region Inference](https://docs.aws.amazon.com/bedrock/latest/userguide/cross-region-inference.html) for details.

**Limitation:**

Currently, Bedrock Access Gateway only supports cross-region inference for the following models:

- Claude 3 Haiku
- Claude 3 Opus
- Claude 3 Sonnet
- Claude 3.5 Sonnet
- Meta Llama 3.1 8B Instruct
- Meta Llama 3.1 70B Instruct
- Meta Llama 3.2 1B Instruct
- Meta Llama 3.2 3B Instruct
- Meta Llama 3.2 11B Vision Instruct
- Meta Llama 3.2 90B Vision Instruct

**Prerequisites:**

- IAM policies must allow cross-region access: callers need permissions to access models and inference profiles in both regions (added in the CloudFormation template)
- Model access must be enabled in both regions, as defined in the inference profiles

**Example API Usage:**

To use Bedrock cross-region inference, include an inference profile when running model inference by specifying the ID of the inference profile as the `modelId`, such as `us.anthropic.claude-3-5-sonnet-20240620-v1:0`:

```bash
curl $OPENAI_BASE_URL/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{
    "model": "us.anthropic.claude-3-5-sonnet-20240620-v1:0",
    "max_tokens": 2048,
    "messages": [
      {
        "role": "user",
        "content": "Hello!"
      }
    ]
  }'
```
## Other Examples

### AutoGen

Below is an image of setting up the model in AutoGen Studio.

![autogen-model](./assets/autogen-model.png)

### LangChain

Make sure you use `ChatOpenAI(...)` instead of `OpenAI(...)`.

README_CN.md
@@ -6,9 +6,26 @@

## Breaking Changes

This tool can now automatically detect new models supported in Amazon Bedrock.

So whenever new models are added to Amazon Bedrock, you can try them immediately, without waiting for updates to this codebase.

This is done via the `ListFoundationModels` and `ListInferenceProfiles` APIs of Amazon Bedrock. Because of this change, you need to add extra IAM permissions to your Lambda/Fargate role.

If you see the error "Unsupported model xxx, please use models API to get a list of supported models" even though the model ID is correct, please update your existing stack with the new template in the deployment folder, or manually add the following permissions to the related Lambda/Fargate role.

```json
{
  "Action": [
    "bedrock:ListFoundationModels",
    "bedrock:ListInferenceProfiles"
  ],
  "Resource": "*",
  "Effect": "Allow"
}
```

If you still have problems, please raise a GitHub issue.

## Overview

@@ -158,62 +175,10 @@ print(completion.choices[0].message.content)

Please check the [Usage Guide](./docs/Usage_CN.md) for more details on how to use the embedding API, multimodal API and tool calls.

### Bedrock Cross-Region Inference

Cross-Region Inference supports accessing foundation models across regions, allowing users in one AWS region to invoke foundation models in other regions for inference. Main advantages:

- **Improved Availability**: Provides regional redundancy and enhanced fault tolerance. When the primary region has issues, traffic can switch to a backup region, ensuring continuous service availability and business continuity.
- **Reduced Latency**: Enables selection of the region geographically closest to users, optimizing network paths and reducing transmission time for a better user experience and faster responses.
- **Performance and Capacity Optimization**: Implements load balancing to spread request pressure, provides greater service capacity and throughput, and better handles traffic spikes.
- **Flexibility**: Allows choosing models in different regions as needed, meeting region-specific compliance requirements and enabling more flexible resource allocation and management.
- **Cost Benefits**: Enables selection of more cost-effective regions, reducing overall operational costs through optimized resource usage.

For details, please check [Bedrock Cross-Region Inference](https://docs.aws.amazon.com/bedrock/latest/userguide/cross-region-inference.html).

**Limitations:**

Currently, the Gateway only supports cross-region invocation for the following models:

- Claude 3 Haiku
- Claude 3 Opus
- Claude 3 Sonnet
- Claude 3.5 Sonnet
- Meta Llama 3.1 8B Instruct
- Meta Llama 3.1 70B Instruct
- Meta Llama 3.2 1B Instruct
- Meta Llama 3.2 3B Instruct
- Meta Llama 3.2 11B Vision Instruct
- Meta Llama 3.2 90B Vision Instruct

**Prerequisites:**

- The IAM policy has the permissions related to inference profiles and for invoking the models (already added in the CloudFormation template)
- Model access is enabled for the models and in the regions defined in the inference profiles

**Usage:**

When invoking a model, set the `modelId` to the inference profile ID, e.g. `us.anthropic.claude-3-5-sonnet-20240620-v1:0`:

```bash
curl $OPENAI_BASE_URL/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{
    "model": "us.anthropic.claude-3-5-sonnet-20240620-v1:0",
    "max_tokens": 2048,
    "messages": [
      {
        "role": "user",
        "content": "Hello!"
      }
    ]
  }'
```
## Other Examples

### AutoGen

For example, configuring and using a model in AutoGen Studio:

![autogen-model](./assets/autogen-model.png)

### LangChain

Make sure you use `ChatOpenAI(...)` instead of `OpenAI(...)`.

(Two binary image files changed; not shown.)
@@ -8,6 +8,6 @@ RUN pip install --no-cache-dir --upgrade -r /app/requirements.txt

COPY ./api /app/api

ENV PORT=80

CMD ["sh", "-c", "uvicorn api.app:app --host 0.0.0.0 --port ${PORT}"]