feat(apigw): add API Gateway response streaming support (#207)
Replace ALB + Lambda architecture with API Gateway REST API + Lambda using response streaming for SSE support.

This provides:
- No VPC required, reducing complexity and cost
- Native streaming support via API Gateway response streaming
- Pay-per-request pricing model

Changes:
- Add Lambda Web Adapter to Dockerfile for streaming support
- Replace BedrockProxy.template with API Gateway configuration
- Update README with new deployment options and latest models
- Update architecture diagram for API Gateway flow
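For readers who want to see what the SSE support looks like from the client side, here is a minimal sketch of parsing a streamed response. It assumes the proxy emits OpenAI-style events, that is, `data: <json>` lines terminated by `data: [DONE]`; the helper name `iter_sse_json` and the sample payloads are hypothetical and not part of this commit.

```python
import json


def iter_sse_json(lines):
    """Yield decoded JSON payloads from SSE `data:` lines until `[DONE]`."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and other SSE fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # assumed OpenAI-style end-of-stream sentinel
        yield json.loads(payload)


# Hypothetical sample of what a streamed chat completion might look like.
sample = [
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    "",
    'data: {"choices": [{"delta": {"content": "lo"}}]}',
    "",
    "data: [DONE]",
]

text = "".join(
    chunk["choices"][0]["delta"].get("content", "")
    for chunk in iter_sse_json(sample)
)
print(text)
```

In practice the `lines` iterable would come from an HTTP client that reads the response incrementally, so each chunk can be shown as soon as it arrives.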
42
README.md
@@ -4,9 +4,16 @@ OpenAI-compatible RESTful APIs for Amazon Bedrock

 ## What's New 🔥

-This project now supports **Claude Sonnet 4.5**, Anthropic's most intelligent model with enhanced coding capabilities and complex agent support, available via global cross-region inference.
+**API Gateway Response Streaming Support** - You can now deploy with Amazon API Gateway REST API instead of ALB, enabling true response streaming for better latency and cost optimization. See [Deployment Options](#deployment-options) for details.

-It also supports reasoning for both **Claude 3.7 Sonnet** and **DeepSeek R1**. Check [How to Use](./docs/Usage.md#reasoning) for more details. You need to first run the Models API to refresh the model list.
+**Latest Models Supported:**
+- **Claude 4.5 Family**: Opus 4.5, Sonnet 4.5, Haiku 4.5 - Anthropic's most intelligent models with enhanced coding and agent capabilities
+- **Amazon Nova**: Nova Micro, Nova Lite, Nova Pro, Nova Premier - Amazon's native foundation models with multimodal support
+- **DeepSeek**: DeepSeek-R1 (reasoning), DeepSeek-V3.1 - Advanced reasoning and general-purpose models
+- **Qwen 3**: Qwen3-32B, Qwen3-235B, Qwen3-Coder-30B, Qwen3-Coder-480B - Alibaba's latest language and coding models
+- **OpenAI OSS**: gpt-oss-20b, gpt-oss-120b - Open-source GPT models available via Bedrock
+
+It also supports reasoning for **Claude 4/4.5** (extended thinking and interleaved thinking) and **DeepSeek R1**. Check [How to Use](./docs/Usage.md#reasoning) for more details. You need to first run the Models API to refresh the model list.

 ## Overview

@@ -46,13 +53,18 @@ Please make sure you have met below prerequisites:

 ### Architecture

-The following diagram illustrates the reference architecture. Note that it also includes a new **VPC** with two public subnets only for the Application Load Balancer (ALB).
+The following diagram illustrates the reference architecture. It uses [Amazon API Gateway response streaming](https://aws.amazon.com/blogs/compute/building-responsive-apis-with-amazon-api-gateway-response-streaming/) with Lambda for SSE support.

 

-You can also choose to use [AWS Fargate](https://aws.amazon.com/fargate/) behind the ALB instead of [AWS Lambda](https://aws.amazon.com/lambda/), the main difference is the latency of the first byte for streaming response (Fargate is lower).
+### Deployment Options

-Alternatively, you can use Lambda Function URL to replace ALB, see [example](https://github.com/awslabs/aws-lambda-web-adapter/tree/main/examples/fastapi-response-streaming)
+| Option | Pros | Cons | Best For |
+|--------|------|------|----------|
+| **API Gateway + Lambda** | No VPC required, pay-per-request, native streaming support, lower operational overhead | Potential cold starts | Most use cases, cost-sensitive deployments |
+| **ALB + Fargate** | Lowest streaming latency, no cold starts | Higher cost, requires VPC | High-throughput, latency-sensitive workloads |
+
+You can also use Lambda Function URL as an alternative, see [example](https://github.com/awslabs/aws-lambda-web-adapter/tree/main/examples/fastapi-response-streaming)

 ### Deployment

@@ -105,8 +117,8 @@ After creation, you'll see your secret in the Secrets Manager console. Make note
 **Step 3: Deploy the CloudFormation stack**

 1. Download the CloudFormation template you want to use:
-- For Lambda: [`deployment/BedrockProxy.template`](deployment/BedrockProxy.template)
-- For Fargate: [`deployment/BedrockProxyFargate.template`](deployment/BedrockProxyFargate.template)
+- For API Gateway + Lambda: [`deployment/BedrockProxy.template`](deployment/BedrockProxy.template)
+- For ALB + Fargate: [`deployment/BedrockProxyFargate.template`](deployment/BedrockProxyFargate.template)

 2. Sign in to AWS Management Console and navigate to the CloudFormation service in your target region.

@@ -227,7 +239,7 @@ For more information about creating and managing application inference profiles,
 This proxy now supports **Prompt Caching** for Claude and Nova models, which can reduce costs by up to 90% and latency by up to 85% for workloads with repeated prompts.

 **Supported Models:**
-- Claude 3+ models (Claude 3.5 Haiku, Claude 3.7 Sonnet, Claude 4, Claude 4.5, etc.)
+- Claude models (Claude 3.5 Haiku, Claude 4, Claude 4.5, etc.)
 - Nova models (Nova Micro, Nova Lite, Nova Pro, Nova Premier)

 **Enabling Prompt Caching:**
@@ -249,7 +261,7 @@ client = OpenAI()

 # Cache system prompts
 response = client.chat.completions.create(
-    model="us.anthropic.claude-3-7-sonnet-20250219-v1:0",
+    model="global.anthropic.claude-haiku-4-5-20251001-v1:0",
     messages=[
         {"role": "system", "content": "You are an expert assistant with knowledge of..."},
         {"role": "user", "content": "Help me with this task"}
@@ -271,7 +283,7 @@ curl $OPENAI_BASE_URL/chat/completions \
   -H "Content-Type: application/json" \
   -H "Authorization: Bearer $OPENAI_API_KEY" \
   -d '{
-    "model": "us.anthropic.claude-3-7-sonnet-20250219-v1:0",
+    "model": "global.anthropic.claude-haiku-4-5-20251001-v1:0",
     "messages": [
       {"role": "system", "content": "Long system prompt..."},
       {"role": "user", "content": "Question"}
@@ -334,9 +346,11 @@ print(response)

 This application does not collect any of your data. Furthermore, it does not log any requests or responses by default.

-### Why not used API Gateway instead of Application Load Balancer?
+### Why choose API Gateway vs ALB?

-Short answer is that API Gateway does not support server-sent events (SSE) for streaming response.
+**API Gateway + Lambda** uses [API Gateway response streaming](https://aws.amazon.com/blogs/compute/building-responsive-apis-with-amazon-api-gateway-response-streaming/) with [Lambda Web Adapter](https://github.com/awslabs/aws-lambda-web-adapter) to support SSE streaming without requiring a VPC. This is a cost-effective, serverless option with up to 10 minutes timeout.
+
+**ALB + Fargate** provides the lowest streaming latency with no cold starts, ideal for high-throughput workloads.

 ### Which regions are supported?

@@ -360,9 +374,9 @@ The API base url should look like `http://localhost:8000/api/v1`.

 ### Any performance sacrifice or latency increase by using the proxy APIs

-Comparing with the AWS SDK call, the referenced architecture will bring additional latency on response, you can try and test that on you own.
+Compared with direct AWS SDK calls, the proxy architecture will add some latency. The default API Gateway + Lambda deployment provides good streaming performance with Lambda response streaming.

-Also, you can use Lambda Web Adapter + Function URL (see [example](https://github.com/awslabs/aws-lambda-web-adapter/tree/main/examples/fastapi-response-streaming)) to replace ALB or AWS Fargate to replace Lambda to get better performance on streaming response.
+For lowest latency on streaming responses, consider the ALB + Fargate deployment option which eliminates cold starts and provides consistent performance.

 ### Any plan to support SageMaker models?

BIN
assets/arch.png
Binary file not shown.
Size before: 54 KiB. Size after: 50 KiB.
@@ -1,4 +1,4 @@
-Description: Bedrock Access Gateway - OpenAI-compatible RESTful APIs for Amazon Bedrock
+Description: Bedrock Access Gateway - OpenAI-compatible RESTful APIs for Amazon Bedrock (API Gateway + Lambda with Streaming)
 Parameters:
   ApiKeySecretArn:
     Type: String
@@ -19,116 +19,8 @@ Parameters:
       - "false"
     Description: Enable prompt caching for supported models (Claude, Nova). When enabled, adds cachePoint to system prompts and messages for cost savings.
 Resources:
-  VPCB9E5F0B4:
-    Type: AWS::EC2::VPC
-    Properties:
-      CidrBlock: 10.250.0.0/16
-      EnableDnsHostnames: true
-      EnableDnsSupport: true
-      InstanceTenancy: default
-      Tags:
-        - Key: Name
-          Value: BedrockProxy/VPC
-  VPCPublicSubnet1SubnetB4246D30:
-    Type: AWS::EC2::Subnet
-    Properties:
-      AvailabilityZone:
-        Fn::Select:
-          - 0
-          - Fn::GetAZs: ""
-      CidrBlock: 10.250.0.0/24
-      MapPublicIpOnLaunch: true
-      Tags:
-        - Key: aws-cdk:subnet-name
-          Value: Public
-        - Key: aws-cdk:subnet-type
-          Value: Public
-        - Key: Name
-          Value: BedrockProxy/VPC/PublicSubnet1
-      VpcId:
-        Ref: VPCB9E5F0B4
-  VPCPublicSubnet1RouteTableFEE4B781:
-    Type: AWS::EC2::RouteTable
-    Properties:
-      Tags:
-        - Key: Name
-          Value: BedrockProxy/VPC/PublicSubnet1
-      VpcId:
-        Ref: VPCB9E5F0B4
-  VPCPublicSubnet1RouteTableAssociation0B0896DC:
-    Type: AWS::EC2::SubnetRouteTableAssociation
-    Properties:
-      RouteTableId:
-        Ref: VPCPublicSubnet1RouteTableFEE4B781
-      SubnetId:
-        Ref: VPCPublicSubnet1SubnetB4246D30
-  VPCPublicSubnet1DefaultRoute91CEF279:
-    Type: AWS::EC2::Route
-    Properties:
-      DestinationCidrBlock: 0.0.0.0/0
-      GatewayId:
-        Ref: VPCIGWB7E252D3
-      RouteTableId:
-        Ref: VPCPublicSubnet1RouteTableFEE4B781
-    DependsOn:
-      - VPCVPCGW99B986DC
-  VPCPublicSubnet2Subnet74179F39:
-    Type: AWS::EC2::Subnet
-    Properties:
-      AvailabilityZone:
-        Fn::Select:
-          - 1
-          - Fn::GetAZs: ""
-      CidrBlock: 10.250.1.0/24
-      MapPublicIpOnLaunch: true
-      Tags:
-        - Key: aws-cdk:subnet-name
-          Value: Public
-        - Key: aws-cdk:subnet-type
-          Value: Public
-        - Key: Name
-          Value: BedrockProxy/VPC/PublicSubnet2
-      VpcId:
-        Ref: VPCB9E5F0B4
-  VPCPublicSubnet2RouteTable6F1A15F1:
-    Type: AWS::EC2::RouteTable
-    Properties:
-      Tags:
-        - Key: Name
-          Value: BedrockProxy/VPC/PublicSubnet2
-      VpcId:
-        Ref: VPCB9E5F0B4
-  VPCPublicSubnet2RouteTableAssociation5A808732:
-    Type: AWS::EC2::SubnetRouteTableAssociation
-    Properties:
-      RouteTableId:
-        Ref: VPCPublicSubnet2RouteTable6F1A15F1
-      SubnetId:
-        Ref: VPCPublicSubnet2Subnet74179F39
-  VPCPublicSubnet2DefaultRouteB7481BBA:
-    Type: AWS::EC2::Route
-    Properties:
-      DestinationCidrBlock: 0.0.0.0/0
-      GatewayId:
-        Ref: VPCIGWB7E252D3
-      RouteTableId:
-        Ref: VPCPublicSubnet2RouteTable6F1A15F1
-    DependsOn:
-      - VPCVPCGW99B986DC
-  VPCIGWB7E252D3:
-    Type: AWS::EC2::InternetGateway
-    Properties:
-      Tags:
-        - Key: Name
-          Value: BedrockProxy/VPC
-  VPCVPCGW99B986DC:
-    Type: AWS::EC2::VPCGatewayAttachment
-    Properties:
-      InternetGatewayId:
-        Ref: VPCIGWB7E252D3
-      VpcId:
-        Ref: VPCB9E5F0B4
-  ProxyApiHandlerServiceRoleBE71BFB1:
+  # IAM Role for Lambda
+  ProxyApiHandlerServiceRole:
     Type: AWS::IAM::Role
     Properties:
       AssumeRolePolicyDocument:
@@ -139,12 +31,9 @@ Resources:
               Service: lambda.amazonaws.com
         Version: "2012-10-17"
       ManagedPolicyArns:
-        - Fn::Join:
-            - ""
-            - - "arn:"
-              - Ref: AWS::Partition
-              - :iam::aws:policy/service-role/AWSLambdaBasicExecutionRole
-  ProxyApiHandlerServiceRoleDefaultPolicy86681202:
+        - !Sub "arn:${AWS::Partition}:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole"
+
+  ProxyApiHandlerServiceRoleDefaultPolicy:
     Type: AWS::IAM::Policy
     Properties:
       PolicyDocument:
@@ -166,122 +55,124 @@ Resources:
               - secretsmanager:GetSecretValue
               - secretsmanager:DescribeSecret
             Effect: Allow
-            Resource:
-              Ref: ApiKeySecretArn
+            Resource: !Ref ApiKeySecretArn
         Version: "2012-10-17"
-      PolicyName: ProxyApiHandlerServiceRoleDefaultPolicy86681202
+      PolicyName: ProxyApiHandlerServiceRoleDefaultPolicy
       Roles:
-        - Ref: ProxyApiHandlerServiceRoleBE71BFB1
-  ProxyApiHandlerEC15A492:
+        - !Ref ProxyApiHandlerServiceRole
+
+  # Lambda Function with Lambda Web Adapter for streaming
+  ProxyApiHandler:
     Type: AWS::Lambda::Function
     Properties:
       Architectures:
         - arm64
       Code:
-        ImageUri:
-          Ref: ContainerImageUri
-      Description: Bedrock Proxy API Handler
+        ImageUri: !Ref ContainerImageUri
+      Description: Bedrock Proxy API Handler with Response Streaming
       Environment:
         Variables:
+          # Lambda Web Adapter settings
+          AWS_LWA_INVOKE_MODE: RESPONSE_STREAM
+          AWS_LWA_READINESS_CHECK_PATH: /health
+          AWS_LWA_ASYNC_INIT: "true"
+          PORT: "8080"
+          # Application settings
           DEBUG: "false"
-          API_KEY_SECRET_ARN:
-            Ref: ApiKeySecretArn
-          DEFAULT_MODEL:
-            Ref: DefaultModelId
+          API_KEY_SECRET_ARN: !Ref ApiKeySecretArn
+          DEFAULT_MODEL: !Ref DefaultModelId
           DEFAULT_EMBEDDING_MODEL: cohere.embed-multilingual-v3
           ENABLE_CROSS_REGION_INFERENCE: "true"
           ENABLE_APPLICATION_INFERENCE_PROFILES: "true"
-          ENABLE_PROMPT_CACHING:
-            Ref: EnablePromptCaching
+          ENABLE_PROMPT_CACHING: !Ref EnablePromptCaching
+          API_ROUTE_PREFIX: /v1
       MemorySize: 1024
       PackageType: Image
-      Role:
-        Fn::GetAtt:
-          - ProxyApiHandlerServiceRoleBE71BFB1
-          - Arn
+      Role: !GetAtt ProxyApiHandlerServiceRole.Arn
       Timeout: 600
     DependsOn:
-      - ProxyApiHandlerServiceRoleDefaultPolicy86681202
-      - ProxyApiHandlerServiceRoleBE71BFB1
-  ProxyApiHandlerInvoke2UTWxhlfyqbT5FTn5jvgbLgjFfJwzswGk55DU1HYF6C33779:
+      - ProxyApiHandlerServiceRoleDefaultPolicy
+      - ProxyApiHandlerServiceRole
+
+  # API Gateway REST API (Regional)
+  RestApi:
+    Type: AWS::ApiGateway::RestApi
+    Properties:
+      Name: BedrockProxyApi
+      Description: Bedrock Access Gateway - OpenAI-compatible API with streaming support
+      EndpointConfiguration:
+        Types:
+          - REGIONAL
+      Body:
+        openapi: "3.0.1"
+        info:
+          title: BedrockProxyApi
+          version: "1.0"
+        paths:
+          /{proxy+}:
+            x-amazon-apigateway-any-method:
+              parameters:
+                - name: proxy
+                  in: path
+                  required: true
+                  schema:
+                    type: string
+              x-amazon-apigateway-integration:
+                type: aws_proxy
+                httpMethod: POST
+                uri: !Sub "arn:aws:apigateway:${AWS::Region}:lambda:path/2021-11-15/functions/${ProxyApiHandler.Arn}/response-streaming-invocations"
+                passthroughBehavior: when_no_match
+                timeoutInMillis: 600000
+                responseTransferMode: STREAM
+              responses:
+                default:
+                  description: Default response
+          /:
+            x-amazon-apigateway-any-method:
+              x-amazon-apigateway-integration:
+                type: aws_proxy
+                httpMethod: POST
+                uri: !Sub "arn:aws:apigateway:${AWS::Region}:lambda:path/2021-11-15/functions/${ProxyApiHandler.Arn}/response-streaming-invocations"
+                passthroughBehavior: when_no_match
+                timeoutInMillis: 600000
+                responseTransferMode: STREAM
+              responses:
+                default:
+                  description: Default response
+
+  # Lambda Permission for API Gateway
+  LambdaPermission:
     Type: AWS::Lambda::Permission
     Properties:
+      FunctionName: !Ref ProxyApiHandler
       Action: lambda:InvokeFunction
-      FunctionName:
-        Fn::GetAtt:
-          - ProxyApiHandlerEC15A492
-          - Arn
-      Principal: elasticloadbalancing.amazonaws.com
-  ProxyALB87756780:
-    Type: AWS::ElasticLoadBalancingV2::LoadBalancer
+      Principal: apigateway.amazonaws.com
+      SourceArn: !Sub "arn:aws:execute-api:${AWS::Region}:${AWS::AccountId}:${RestApi}/*"
+
+  # API Gateway Deployment
+  ApiDeployment:
+    Type: AWS::ApiGateway::Deployment
     Properties:
-      LoadBalancerAttributes:
-        - Key: deletion_protection.enabled
-          Value: "false"
-      Scheme: internet-facing
-      SecurityGroups:
-        - Fn::GetAtt:
-            - ProxyALBSecurityGroup0D6CA3DA
-            - GroupId
-      Subnets:
-        - Ref: VPCPublicSubnet1SubnetB4246D30
-        - Ref: VPCPublicSubnet2Subnet74179F39
-      Type: application
+      RestApiId: !Ref RestApi
     DependsOn:
-      - VPCPublicSubnet1DefaultRoute91CEF279
-      - VPCPublicSubnet1RouteTableAssociation0B0896DC
-      - VPCPublicSubnet2DefaultRouteB7481BBA
-      - VPCPublicSubnet2RouteTableAssociation5A808732
-  ProxyALBSecurityGroup0D6CA3DA:
-    Type: AWS::EC2::SecurityGroup
+      - RestApi
+
+  # API Gateway Stage
+  ApiStage:
+    Type: AWS::ApiGateway::Stage
     Properties:
-      GroupDescription: Automatically created Security Group for ELB BedrockProxyALB1CE4CAD1
-      SecurityGroupEgress:
-        - CidrIp: 255.255.255.255/32
-          Description: Disallow all traffic
-          FromPort: 252
-          IpProtocol: icmp
-          ToPort: 86
-      SecurityGroupIngress:
-        - CidrIp: 0.0.0.0/0
-          Description: Allow from anyone on port 80
-          FromPort: 80
-          IpProtocol: tcp
-          ToPort: 80
-      VpcId:
-        Ref: VPCB9E5F0B4
-  ProxyALBListener933E9515:
-    Type: AWS::ElasticLoadBalancingV2::Listener
-    Properties:
-      DefaultActions:
-        - TargetGroupArn:
-            Ref: ProxyALBListenerTargetsGroup187739FA
-          Type: forward
-      LoadBalancerArn:
-        Ref: ProxyALB87756780
-      Port: 80
-      Protocol: HTTP
-  ProxyALBListenerTargetsGroup187739FA:
-    Type: AWS::ElasticLoadBalancingV2::TargetGroup
-    Properties:
-      HealthCheckEnabled: false
-      TargetType: lambda
-      Targets:
-        - Id:
-            Fn::GetAtt:
-              - ProxyApiHandlerEC15A492
-              - Arn
-    DependsOn:
-      - ProxyApiHandlerInvoke2UTWxhlfyqbT5FTn5jvgbLgjFfJwzswGk55DU1HYF6C33779
+      RestApiId: !Ref RestApi
+      DeploymentId: !Ref ApiDeployment
+      StageName: api
+      Description: API Stage with streaming support
 Outputs:
   APIBaseUrl:
     Description: Proxy API Base URL (OPENAI_API_BASE)
-    Value:
-      Fn::Join:
-        - ""
-        - - http://
-          - Fn::GetAtt:
-              - ProxyALB87756780
-              - DNSName
-          - /api/v1
+    Value: !Sub "https://${RestApi}.execute-api.${AWS::Region}.amazonaws.com/api/v1"
+  RestApiId:
+    Description: API Gateway REST API ID
+    Value: !Ref RestApi
+  LambdaFunctionArn:
+    Description: Lambda Function ARN
+    Value: !GetAtt ProxyApiHandler.Arn

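The two `!Sub` strings in the template above (the integration `uri` and the `APIBaseUrl` output) can be sketched as plain string builders, which makes it easier to see how the stage name `api` and `API_ROUTE_PREFIX: /v1` combine into the `/api/v1` base path. The function names, the REST API ID `abc123`, and the example ARN below are hypothetical placeholders, not values from this commit.

```python
def api_base_url(rest_api_id, region, stage="api", route_prefix="/v1"):
    """Mirror the APIBaseUrl output: stage name followed by the route prefix."""
    return (
        f"https://{rest_api_id}.execute-api.{region}.amazonaws.com"
        f"/{stage}{route_prefix}"
    )


def streaming_integration_uri(region, function_arn):
    """Mirror the integration uri used for response streaming in the template."""
    return (
        f"arn:aws:apigateway:{region}:lambda:path/2021-11-15/functions/"
        f"{function_arn}/response-streaming-invocations"
    )


# Placeholder identifiers for illustration only.
base_url = api_base_url("abc123", "us-east-1")
uri = streaming_integration_uri(
    "us-east-1",
    "arn:aws:lambda:us-east-1:123456789012:function:ProxyApiHandler",
)
print(base_url)
print(uri)
```

The resulting base URL is what a client would set as `OPENAI_BASE_URL` (or `OPENAI_API_BASE`) after deploying the stack.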
@@ -1,9 +1,15 @@
 FROM public.ecr.aws/lambda/python:3.12

+# Add Lambda Web Adapter for API Gateway response streaming
+COPY --from=public.ecr.aws/awsguru/aws-lambda-adapter:0.9.1 /lambda-adapter /opt/extensions/lambda-adapter
+
 COPY ./api ./api

 COPY requirements.txt .

 RUN pip3 install -r requirements.txt -U --no-cache-dir

-CMD [ "api.app.handler" ]
+# Lambda Web Adapter requires overriding the Lambda base image entrypoint
+# to run the web app directly instead of the Lambda runtime handler
+ENTRYPOINT []
+CMD ["python", "-m", "uvicorn", "api.app:app", "--host", "0.0.0.0", "--port", "8080"]
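With this change the container runs an ASGI server on port 8080, and the template sets `AWS_LWA_READINESS_CHECK_PATH` to `/health`, so the web app is expected to answer that path and to stream SSE chunks for completions. The real application lives in `api.app:app` and is not shown in this diff; the following is only a dependency-free stand-in that illustrates the shape of such an ASGI app, with hypothetical payloads, driven in-process instead of through uvicorn.

```python
import asyncio
import json


async def app(scope, receive, send):
    """Tiny ASGI stand-in: /health for readiness, any other path streams SSE."""
    if scope["type"] != "http":
        return
    if scope["path"] == "/health":
        body = json.dumps({"status": "OK"}).encode()
        await send({"type": "http.response.start", "status": 200,
                    "headers": [(b"content-type", b"application/json")]})
        await send({"type": "http.response.body", "body": body})
        return
    await send({"type": "http.response.start", "status": 200,
                "headers": [(b"content-type", b"text/event-stream")]})
    for word in ("Hello", "world"):
        payload = json.dumps({"delta": word})  # hypothetical chunk format
        chunk = f"data: {payload}\n\n".encode()
        # more_body=True tells the server that further chunks will follow
        await send({"type": "http.response.body", "body": chunk, "more_body": True})
    await send({"type": "http.response.body", "body": b"data: [DONE]\n\n"})


async def call(path):
    """Drive the app without a server and collect the messages it sends."""
    sent = []

    async def receive():
        return {"type": "http.request", "body": b"", "more_body": False}

    async def send(message):
        sent.append(message)

    await app({"type": "http", "method": "GET", "path": path}, receive, send)
    return sent


health = asyncio.run(call("/health"))
stream = asyncio.run(call("/v1/chat/completions"))
print(health[0]["status"], len(stream))
```

Each `http.response.body` message with `more_body` set corresponds to one chunk that a streaming-capable front end can forward to the client before the response is complete.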