# Bedrock Access Gateway

[中文](./README_CN.md)

OpenAI-Compatible RESTful APIs for Amazon Bedrock

## Overview
Amazon Bedrock offers a wide range of foundation models (such as Claude 3 Sonnet/Haiku, Llama 2, and Mistral/Mixtral)
and a broad set of capabilities for building generative AI applications.
Check [Amazon Bedrock](https://aws.amazon.com/bedrock) for more details.

Sometimes, you might have applications developed using OpenAI APIs or SDKs and want to experiment with Amazon
Bedrock without modifying your codebase. Or you may simply wish to evaluate the capabilities of these foundation models
in tools such as AutoGen. This repository allows you to access Amazon Bedrock models seamlessly through OpenAI
APIs and SDKs, enabling you to test these models without code changes.

If you find this GitHub repository useful, please consider giving it a star to show your appreciation and support
for the project.
Features:

- [x] Support streaming responses via server-sent events (SSE)
- [x] Support Model APIs
- [x] Support Chat Completion APIs
- [ ] Support Function Call/Tool Call
- [ ] Support Embedding APIs
- [ ] Support Image APIs

> NOTE: 1. The legacy [text completion](https://platform.openai.com/docs/api-reference/completions) API is not
> supported; you should move to the Chat Completion API. 2. Other APIs, such as fine-tuning and the Assistants API,
> may be supported in the future.
Supported Amazon Bedrock models (Model IDs):

- anthropic.claude-instant-v1
- anthropic.claude-v2:1
- anthropic.claude-v2
- anthropic.claude-3-sonnet-20240229-v1:0
- anthropic.claude-3-haiku-20240307-v1:0
- meta.llama2-13b-chat-v1
- meta.llama2-70b-chat-v1
- mistral.mistral-7b-instruct-v0:2
- mistral.mixtral-8x7b-instruct-v0:1

> NOTE: The default model is set to `anthropic.claude-3-sonnet-20240229-v1:0`. You can change it via Lambda environment
> variables.
## Get Started

### Prerequisites

Please make sure you have met the following prerequisites:

- Access to Amazon Bedrock foundation models.

If you haven't been granted model access, please follow
the [Set Up](https://docs.aws.amazon.com/bedrock/latest/userguide/model-access.html) guide.
### Architecture

The following diagram illustrates the solution architecture. Note that it also includes a new **VPC** with two public
subnets used only by the Application Load Balancer (ALB).

![Architecture](assets/arch.svg)
### Deployment

Please follow the steps below to deploy the Bedrock Proxy APIs into your AWS account. Only regions where Amazon
Bedrock is available (such as us-west-2) are supported. The deployment takes approximately 3-5 minutes.

**Step 1: Create your own custom API key (Optional)**

> NOTE: In this step, you use any string (without spaces) you like to create a custom API key (credential) that will be
> used to access the proxy API later. This key does not have to match your actual OpenAI key, and you don't even need to
> have an OpenAI API key. It is recommended that you take this step and keep the key safe and private.
1. Open the AWS Management Console and navigate to the Systems Manager service.
2. In the left-hand navigation pane, click "Parameter Store".
3. Click the "Create parameter" button.
4. In the "Create parameter" window, select the following options:
   - Name: Enter a descriptive name for your parameter (e.g., "BedrockProxyAPIKey").
   - Description: Optionally, provide a description for the parameter.
   - Tier: Select **Standard**.
   - Type: Select **SecureString**.
   - Value: Any string (without spaces).
5. Click "Create parameter".
6. Make a note of the parameter name you used (e.g., "BedrockProxyAPIKey"). You'll need it in the next step.
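The console steps above can also be done with the AWS CLI. A minimal sketch, reusing the example parameter name "BedrockProxyAPIKey" and a placeholder key value:

```shell
# Create the SecureString parameter that holds your custom API key.
# The name and value below are placeholders from the example above.
aws ssm put-parameter \
    --name "BedrockProxyAPIKey" \
    --description "API key for the Bedrock proxy" \
    --type "SecureString" \
    --tier "Standard" \
    --value "your-api-key-without-spaces"
```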
**Step 2: Deploy the CloudFormation stack**

1. Sign in to the AWS Management Console and switch to the region in which you want to deploy the CloudFormation stack.
2. Click the following button to launch the CloudFormation stack in that region.

[![Launch Stack](assets/launch-stack.png)](https://console.aws.amazon.com/cloudformation/home#/stacks/create/template?stackName=BedrockProxyAPI&templateURL=https://aws-gcr-solutions.s3.amazonaws.com/bedrock-proxy-api/latest/BedrockProxy.template)

3. Click "Next".
4. On the "Specify stack details" page, provide the following information:
   - Stack name: Change the stack name if needed.
   - ApiKeyParam (if you set up an API key in Step 1): Enter the parameter name you used for storing the API key
     (e.g., "BedrockProxyAPIKey"). If you did not set up an API key, leave this field blank.

   Click "Next".
5. On the "Configure stack options" page, you can keep the default settings or customize them according to your needs.
6. Click "Next".
7. On the "Review" page, review the details of the stack you're about to create. Check the "I acknowledge that AWS
   CloudFormation might create IAM resources" checkbox at the bottom.
8. Click "Create stack".

That's it! Once deployed, open the CloudFormation stack and go to the **Outputs** tab; you can find the API base URL
under `APIBaseUrl`. The value should look like `http://xxxx.xxx.elb.amazonaws.com/api/v1`.
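The same deployment can be sketched with the AWS CLI, using the template URL behind the launch button. This is an assumption-laden sketch: if the template creates named IAM resources you may need `CAPABILITY_NAMED_IAM`, and add a `--parameters ParameterKey=ApiKeyParam,ParameterValue=...` flag if you set up an API key in Step 1.

```shell
# Launch the stack from the published template (same URL as the button above).
aws cloudformation create-stack \
    --stack-name BedrockProxyAPI \
    --template-url https://aws-gcr-solutions.s3.amazonaws.com/bedrock-proxy-api/latest/BedrockProxy.template \
    --capabilities CAPABILITY_IAM
# After creation completes, read the API base URL from the stack outputs.
aws cloudformation describe-stacks --stack-name BedrockProxyAPI \
    --query "Stacks[0].Outputs[?OutputKey=='APIBaseUrl'].OutputValue" --output text
```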
### SDK/API Usage

All you need are the API key and the API base URL. If you didn't
set up your own key, the default API key `bedrock` will be used.

Now you can try out the proxy APIs. Say you want to test the Claude 3 Sonnet model; then
use `anthropic.claude-3-sonnet-20240229-v1:0` as the Model ID.

- **Example API Usage**

```bash
curl <API base url>/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <API Key>" \
  -d '{
    "model": "anthropic.claude-3-sonnet-20240229-v1:0",
    "messages": [
      {
        "role": "user",
        "content": "Hello!"
      }
    ]
  }'
```

- **Example SDK Usage**

```bash
export OPENAI_API_KEY=<API key>
export OPENAI_BASE_URL=<API base url>
```

```python
from openai import OpenAI

client = OpenAI()
completion = client.chat.completions.create(
    model="anthropic.claude-3-sonnet-20240229-v1:0",
    messages=[{"role": "user", "content": "Hello!"}],
)

print(completion.choices[0].message.content)
```
## Other Examples

### AutoGen

Below is an image of setting up the model in AutoGen Studio.

![AutoGen Model](assets/autogen-model.png)
### LangChain

Make sure you use `ChatOpenAI(...)` instead of `OpenAI(...)`.

```python
# pip install langchain-openai
from langchain.chains import LLMChain
from langchain.prompts import PromptTemplate
from langchain_openai import ChatOpenAI

chat = ChatOpenAI(
    model="anthropic.claude-3-sonnet-20240229-v1:0",
    temperature=0,
    openai_api_key="xxxx",
    openai_api_base="http://xxx.elb.amazonaws.com/api/v1",
)

template = """Question: {question}

Answer: Let's think step by step."""

prompt = PromptTemplate.from_template(template)
llm_chain = LLMChain(prompt=prompt, llm=chat)

question = "What NFL team won the Super Bowl in the year Justin Bieber was born?"
response = llm_chain.invoke(question)
print(response)
```
## FAQs

### About Privacy

This application does not collect any of your data. Furthermore, it does not log any requests or responses by default.
### Why not use API Gateway instead of Application Load Balancer?

The short answer is that API Gateway does not support server-sent events (SSE) for streaming responses.
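As background, a streaming completion arrives as SSE `data:` lines that the client concatenates. Below is a minimal client-side sketch of that handling; the payloads are illustrative stand-ins in the OpenAI streaming-chunk shape, not actual gateway output:

```python
import json

# Illustrative SSE lines as a client might receive them.
sse_lines = [
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    'data: {"choices": [{"delta": {"content": "lo!"}}]}',
    "data: [DONE]",
]

def collect_content(lines):
    """Concatenate the delta contents carried by SSE data lines."""
    parts = []
    for line in lines:
        payload = line[len("data: "):]
        if payload == "[DONE]":  # sentinel that ends the stream
            break
        chunk = json.loads(payload)
        parts.append(chunk["choices"][0]["delta"].get("content", ""))
    return "".join(parts)

print(collect_content(sse_lines))  # → Hello!
```

In practice the OpenAI SDKs do this assembly for you when you pass `stream=True`.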
### Which regions are supported?

This solution only supports the regions where Amazon Bedrock is available, namely:

- US East (N. Virginia)
- US West (Oregon)
- Asia Pacific (Singapore)
- Asia Pacific (Tokyo)
- Europe (Frankfurt)

Note that not all models are available in all of those regions.
### Can I build and use my own ECR image?

Yes, you can clone the repo, build the container image yourself (src/Dockerfile), and push it to your own ECR repo.

Replace the repo URL in the CloudFormation template before you deploy.
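The build-and-push flow can be sketched with Docker and the AWS CLI. The account ID, region, and repository name below are placeholders, and the ECR repository must already exist:

```shell
# Placeholders: 123456789012 (account ID), us-west-2 (region), bedrock-proxy-api (repo name).
aws ecr get-login-password --region us-west-2 \
    | docker login --username AWS --password-stdin 123456789012.dkr.ecr.us-west-2.amazonaws.com
docker build -t bedrock-proxy-api -f src/Dockerfile src
docker tag bedrock-proxy-api:latest 123456789012.dkr.ecr.us-west-2.amazonaws.com/bedrock-proxy-api:latest
docker push 123456789012.dkr.ecr.us-west-2.amazonaws.com/bedrock-proxy-api:latest
```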
### Can I run this locally?

Yes, you can run this locally; the API base URL should then look like `http://localhost:8000/api/v1`.
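One way to run it locally is via the same container image. A minimal sketch, assuming you have already built the image from src/Dockerfile and that the container listens on port 80 (an assumption; check the Dockerfile):

```shell
# Port 80 inside the container is an assumption; adjust to match the Dockerfile.
# AWS credentials are required for the proxy to call Bedrock.
docker run -p 8000:80 \
    -e AWS_ACCESS_KEY_ID -e AWS_SECRET_ACCESS_KEY -e AWS_REGION=us-west-2 \
    bedrock-proxy-api
```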
### Any performance sacrifice or latency increase by using the proxy APIs?

This is yet to be tested; for now, you should use this solution for PoC purposes only.
### Any plan to support SageMaker models?

Currently, there is no plan to support SageMaker models. This depends on whether customers ask for it.
### Any plan to support Bedrock custom models?

Fine-tuned models and models with Provisioned Throughput are not supported. You can clone the repo and make the
customization yourself if needed.
### How to upgrade?

If there are no changes to the architecture, you can simply deploy the latest image to your Lambda function to use the
new features (manually) without redeploying the whole CloudFormation stack.
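Deploying a new image to the existing Lambda function can be sketched with the AWS CLI; the function name and image URI below are placeholders, not values defined by this stack:

```shell
# Placeholders: substitute the Lambda function name created by your stack
# and the ECR image URI of the new build.
aws lambda update-function-code \
    --function-name BedrockProxyAPI \
    --image-uri 123456789012.dkr.ecr.us-west-2.amazonaws.com/bedrock-proxy-api:latest
```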
## Security