Amazon Bedrock Model Quotas

These are the AWS default quotas for the us-east-1 Region. An account's applied value can be higher if a limit increase has been approved. Adjustable quotas can be raised through Service Quotas: many per-model token limits are adjustable, but batch minimums are not. Values are abbreviated (K = thousand, M = million, B = billion) and may be rounded.
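
One way to see where an account's applied values differ from the defaults listed here is to query Service Quotas directly. The sketch below assumes the boto3 `service-quotas` client, its `list_service_quotas` and `list_aws_default_service_quotas` paginators, and the service code `bedrock`. The helper names are illustrative, and the client is passed in as an argument so the logic can be exercised without AWS credentials.

```python
def quotas_by_code(client, operation, service_code="bedrock"):
    """Return {QuotaCode: quota dict} for one paginated Service Quotas listing."""
    result = {}
    paginator = client.get_paginator(operation)
    for page in paginator.paginate(ServiceCode=service_code):
        for quota in page["Quotas"]:
            result[quota["QuotaCode"]] = quota
    return result


def raised_quotas(client, service_code="bedrock"):
    """List (name, default, applied) for quotas whose applied value exceeds the default."""
    defaults = quotas_by_code(client, "list_aws_default_service_quotas", service_code)
    applied = quotas_by_code(client, "list_service_quotas", service_code)
    raised = []
    for code, quota in applied.items():
        default = defaults.get(code)
        if default is not None and quota["Value"] > default["Value"]:
            raised.append((quota["QuotaName"], default["Value"], quota["Value"]))
    return sorted(raised)
```

With boto3 installed and credentials configured, `raised_quotas(boto3.client("service-quotas", region_name="us-east-1"))` should return the quotas that have been raised for the account in that Region.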

Anthropic Claude

39 models

Inference rate limits

Model	On-demand RPM	On-demand TPM	Cross-region RPM	Cross-region TPM	Global RPM	Global TPM	Global TPD	Tokens/day	Lat-opt RPM	Lat-opt TPM	Mantle ITPM	Mantle OTPM
Anthropic Claude 3 Haiku	1K	2M	2K	4M	—	—	—	2.88B	—	—	—	—
Anthropic Claude 3 Opus	50	400K	100	800K	—	—	—	—	—	—	—	—
Anthropic Claude 3 Sonnet	500	1M	1K	2M	—	—	—	—	—	—	—	—
Anthropic Claude 3.5 Haiku	1K	2M	2K	4M	—	—	—	2.88B	100	500K	—	—
Anthropic Claude 3.5 Sonnet V1	50	400K	100	800K	—	—	—	2.88B	—	—	—	—
Anthropic Claude 3.5 Sonnet V2	50	400K	100	800K	—	—	—	2.88B	—	—	—	—
Anthropic Claude 3.7 Sonnet V1	—	—	250	1M	—	—	—	720M	—	—	—	—
Anthropic Claude Fable 5	—	—	—	200K	—	500K	720M	144M	—	—	—	—
Anthropic Claude Haiku 4.5	—	—	10K	5M	10K	5M	7.2B	3.6B	—	—	—	—
Anthropic Claude Opus 4 V1	—	—	200	200K	—	—	—	144M	—	—	—	—
Anthropic Claude Opus 4.1	—	—	50	500K	—	—	—	360M	—	—	—	—
Anthropic Claude Opus 4.5	—	—	10K	2M	10K	2M	2.88B	1.44B	—	—	—	—
Anthropic Claude Opus 4.6 V1	—	—	10K	3M	10K	3M	4.32B	2.16B	—	—	—	—
Anthropic Claude Opus 4.7	—	—	—	30M	—	30M	43.2B	21.6B	—	—	20M	4M
Anthropic Claude Opus 4.8	—	—	—	30M	—	30M	43.2B	21.6B	—	—	20M	4M
Anthropic Claude Sonnet 4 V1	—	—	200	200K	200	200K	288M	144M	—	—	—	—
Anthropic Claude Sonnet 4 V1 1M Context Length	—	—	5	1M	—	—	—	720M	—	—	—	—
Anthropic Claude Sonnet 4.5 V1	—	—	10K	5M	10K	5M	7.2B	3.6B	—	—	—	—
Anthropic Claude Sonnet 4.5 V1 1M Context Length	—	—	1K	1M	1K	1M	1.44B	720M	—	—	—	—
Anthropic Claude Sonnet 4.6	—	—	10K	6M	10K	6M	8.64B	4.32B	—	—	—	—
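
The table abbreviates values with K, M, and B suffixes. A small parser makes the figures comparable, and it also lets one check an apparent pattern in the rows above: each Global TPD value matches the Global TPM value multiplied by the 1,440 minutes in a day. That relationship is an observation from the listed numbers, not a documented rule, and `parse_quota` is a hypothetical helper name.

```python
from decimal import Decimal

SUFFIXES = {"K": 10**3, "M": 10**6, "B": 10**9}
MINUTES_PER_DAY = 24 * 60
EMPTY_CELL = "\u2014"  # the dash the tables use when no quota is listed


def parse_quota(text):
    """Convert an abbreviated table value such as '2.88B' to a Decimal, or None if empty."""
    text = text.strip()
    if text in ("", EMPTY_CELL):
        return None
    multiplier = SUFFIXES.get(text[-1].upper())
    if multiplier is None:
        return Decimal(text)  # plain number without a suffix
    return Decimal(text[:-1]) * multiplier


# (model, Global TPM, Global TPD) copied from the table above.
ROWS = [
    ("Anthropic Claude Haiku 4.5", "5M", "7.2B"),
    ("Anthropic Claude Sonnet 4 V1", "200K", "288M"),
    ("Anthropic Claude Opus 4.7", "30M", "43.2B"),
]

for model, tpm, tpd in ROWS:
    matches = parse_quota(tpm) * MINUTES_PER_DAY == parse_quota(tpd)
    print(f"{model}: TPD equals TPM x 1440 -> {matches}")
```

Because displayed values can be rounded, an exact comparison like this may fail for some rows even when the underlying quotas follow the same pattern.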

Batch inference

Model	Max records/job	Min records/job	Records/input file	Input file GB	Job size GB	Concurrent jobs (base)
Anthropic Claude 3 Haiku	100K	100	100K	1	5	100
Anthropic Claude 3 Opus	100K	100	100K	1	5	100
Anthropic Claude 3 Sonnet	100K	100	100K	1	5	100
Anthropic Claude 3.5 Haiku	100K	100	100K	1	5	100
Anthropic Claude 3.5 Sonnet V1	100K	100	100K	1	5	100
Anthropic Claude 3.5 Sonnet V2	100K	100	100K	1	5	100
Anthropic Claude 3.7 Sonnet V1	100K	100	100K	1	5	100
Anthropic Claude Haiku 4.5	100K	100	100K	1	5	100
Anthropic Claude Opus 4.5	100K	100	100K	1	5	100
Anthropic Claude Opus 4.6 V1	100K	100	100K	1	5	100
Anthropic Claude Sonnet 4 V1	100K	100	100K	1	5	100
Anthropic Claude Sonnet 4.5 V1	100K	100	100K	1	5	100
Anthropic Claude Sonnet 4.6	100K	100	100K	1	5	100
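
Before submitting a batch job, it can help to check the input file against the limits above. The following is a minimal sketch, assuming one JSON record per line (JSONL), that 1 GB means 10^9 bytes, and that the job consists of this single file. `check_batch_input` is a hypothetical helper, not part of any AWS SDK.

```python
import os

MIN_RECORDS = 100           # Min records/job
MAX_RECORDS = 100_000       # Records/input file
MAX_FILE_BYTES = 1 * 10**9  # Input file GB (assumed to be decimal gigabytes)


def check_batch_input(path):
    """Return a list of problems found in one batch input file (empty if none)."""
    problems = []
    size = os.path.getsize(path)
    if size > MAX_FILE_BYTES:
        problems.append(f"file is {size} bytes, limit is {MAX_FILE_BYTES}")
    # Count non-blank lines; each one is assumed to hold a single record.
    with open(path, "rb") as handle:
        records = sum(1 for line in handle if line.strip())
    if records < MIN_RECORDS:
        problems.append(f"{records} records, minimum is {MIN_RECORDS}")
    if records > MAX_RECORDS:
        problems.append(f"{records} records, maximum is {MAX_RECORDS}")
    return problems
```

An empty list means the file passes these three checks. It does not validate the record contents or the total job size.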

Provisioned throughput

Model	MU/PT model
Anthropic Claude 3 Haiku 200K	0
Anthropic Claude 3 Haiku 48K	0
Anthropic Claude 3 Sonnet 200K	0
Anthropic Claude 3 Sonnet 28K	0
Anthropic Claude 3.5 Haiku 16K	0
Anthropic Claude 3.5 Haiku 200K	0
Anthropic Claude 3.5 Haiku 64K	0
Anthropic Claude 3.5 Sonnet 18K	0
Anthropic Claude 3.5 Sonnet 200K	0
Anthropic Claude 3.5 Sonnet 51K	0
Anthropic Claude 3.5 Sonnet V2 18K	0
Anthropic Claude 3.5 Sonnet V2 200K	0
Anthropic Claude 3.5 Sonnet V2 51K	0
Anthropic Claude Instant V1 100K	0
Anthropic Claude V2 100K	0
Anthropic Claude V2 18K	0
Anthropic Claude V2.1 18K	0
Anthropic Claude V2.1 200K	0

Model customization

Model	Train+val records
Anthropic Claude 3 Haiku	10K
Anthropic Claude 3.5 Haiku	10K

Amazon (Nova / Titan)

36 models

Inference rate limits

Model	On-demand RPM	On-demand TPM	Cross-region RPM	Cross-region TPM	Global RPM	Global TPM	Global TPD	Tokens/day	Lat-opt RPM	Lat-opt TPM	Lat-opt TPD	Concurrent reqs	Async concurrent
Amazon Nova 2 Lite	—	—	2K	8M	2K	8M	11.52B	5.76B	—	—	—	—	—
Amazon Nova 2 Multimodal Embeddings V1	2K	—	—	—	—	—	—	—	—	—	—	—	30
Amazon Nova 2 Omni	—	—	2K	8M	2K	8M	11.52B	5.76B	—	—	—	—	—
Amazon Nova 2 Pro Preview	—	—	100	1M	100	1M	1.44B	720M	—	—	—	—	—
Amazon Nova 2 Sonic	—	—	—	—	—	—	—	—	—	—	—	20	—
Amazon Nova Canvas	100	—	—	—	—	—	—	—	—	—	—	—	—
Amazon Nova Lite	2K	4M	4K	8M	—	—	—	5.76B	—	—	—	—	—
Amazon Nova Micro	2K	4M	4K	8M	—	—	—	5.76B	—	—	—	—	—
Amazon Nova Premier V1	—	—	500	2M	—	—	—	1.44B	—	—	—	—	—
Amazon Nova Pro V1	250	1M	500	2M	—	—	—	1.44B	10	40K	57.6M	—	—
Amazon Nova Reel 1.0	—	—	—	—	—	—	—	—	—	—	—	10	—
Amazon Nova Reel 1.1	—	—	—	—	—	—	—	—	—	—	—	3	—
Amazon Nova Sonic	—	—	—	—	—	—	—	—	—	—	—	20	—
Amazon Rerank 1.0	200	—	—	—	—	—	—	—	—	—	—	—	—
Amazon Titan Image Generator G1	60	—	—	—	—	—	—	—	—	—	—	—	—
Amazon Titan Image Generator G1 V2	60	2K	—	—	—	—	—	—	—	—	—	—	—
Amazon Titan Multimodal Embeddings G1	2K	300K	—	—	—	—	—	—	—	—	—	—	—
Amazon Titan Text Embeddings	2K	300K	—	—	—	—	—	—	—	—	—	—	—
Amazon Titan Text Embeddings V2	6K	300K	—	—	—	—	—	—	—	—	—	—	—
Amazon Titan Text Express	400	300K	—	—	—	—	—	—	—	—	—	—	—
Amazon Titan Text Premier	100	300K	—	—	—	—	—	—	—	—	—	—	—
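
When a model has both a requests-per-minute and a tokens-per-minute quota, whichever runs out first sets the sustainable request rate. The sketch below uses a deliberately simplified model: it treats every request as consuming a flat average number of tokens and ignores how the service actually meters input and output tokens. `effective_rpm` is an illustrative name.

```python
def effective_rpm(rpm_limit, tpm_limit, tokens_per_request):
    """Requests per minute sustainable under both the RPM and the TPM quota."""
    if tokens_per_request <= 0:
        raise ValueError("tokens_per_request must be positive")
    return min(rpm_limit, tpm_limit // tokens_per_request)


# Amazon Nova Pro V1 on-demand defaults from the table above: 250 RPM, 1M TPM.
print(effective_rpm(250, 1_000_000, 8_000))  # 125: the token quota binds first
print(effective_rpm(250, 1_000_000, 1_000))  # 250: the request quota binds first
```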

Batch inference

Model	Max records/job	Min records/job	Records/input file	Input file GB	Job size GB	Concurrent jobs (base)	Concurrent jobs (custom)
Amazon Nova 2 Lite	100K	100	100K	1	—	100	—
Amazon Nova 2 Multimodal Embeddings V1	100K	100	100K	1	100	100	—
Amazon Nova Lite	100K	100	100K	1	100	100	—
Amazon Nova Micro	100K	100	100K	1	5	100	—
Amazon Nova Premier V1	100K	100	100K	1	5	100	—
Amazon Nova Pro V1	100K	100	100K	1	100	100	—
Amazon Titan Multimodal Embeddings G1	100K	100	100K	1	5	100	3
Amazon Titan Text Embeddings V2	100K	100	100K	1	5	100	3

Provisioned throughput

Model	MU/PT model	MU/PT (24k ctx)	MU/PT (128k ctx)	MU/PT (300k ctx)	MU (no commit)
Amazon Nova 2 Lite V1.0 256K	0	—	—	—	—
Amazon Nova Canvas	0	—	—	—	—
Amazon Nova Lite	—	0	—	0	—
Amazon Nova Micro	—	0	0	—	—
Amazon Nova Pro V1	—	0	—	0	—
Amazon Titan Embeddings G1 - Text	0	—	—	—	—
Amazon Titan Image Generator G1	0	—	—	—	—
Amazon Titan Image Generator G2	0	—	—	—	—
Amazon Titan Lite V1 4K	0	—	—	—	—
Amazon Titan Multimodal Embeddings G1	0	—	—	—	—
Amazon Titan Text Embeddings V2	0	—	—	—	—
Amazon Titan Text G1 - Express 8K	0	—	—	—	—
Amazon Titan Text Premier V1 32K	0	—	—	—	—
base model Amazon Nova 2 Lite V1.0 256K	—	—	—	—	0
custom model Amazon Nova 2 Lite V1.0 256K	—	—	—	—	0

Model customization

Model	Train+val records	Custom deploy RPM	Custom deploy TPM	Custom deploy TPD	Max FT ctx length
Amazon Nova 2 Lite	20K	2K	4M	5.76B	—
Amazon Nova Lite	20K	2K	4M	5.76B	—
Amazon Nova Micro	20K	2K	4M	5.76B	—
Amazon Nova Micro V1 distillation customization jobs	—	—	—	—	32K
Amazon Nova Pro V1	20K	200	800K	1.15B	—
Amazon Nova V1 distillation customization jobs	—	—	—	—	32K
Amazon Titan Image Generator G1	10K	—	—	—	—
Amazon Titan Multimodal Embeddings G1	50K	—	—	—	—
Titan Text G1 - Express	10K	—	—	—	—
Titan Text G1 - Express v1 Continued Pre-Training job	100K	—	—	—	—
Titan Text G1 - Lite	10K	—	—	—	—
Titan Text G1 - Lite v1 Continued Pre-Training job	100K	—	—	—	—
Titan Text G1 - Premier	20K	—	—	—	—

Meta Llama

18 models

Inference rate limits

Model	On-demand RPM	On-demand TPM	Cross-region RPM	Cross-region TPM	Tokens/day	Lat-opt RPM	Lat-opt TPM
Meta Llama 3 70B Instruct	400	300K	—	—	—	—	—
Meta Llama 3 8B Instruct	800	300K	—	—	—	—	—
Meta Llama 3.1 70B Instruct	400	300K	800	600K	—	100	40K
Meta Llama 3.1 8B Instruct	800	300K	1.6K	600K	—	—	—
Meta Llama 3.2 11B Instruct	400	300K	—	—	432M	—	—
Meta Llama 3.2 1B Instruct	800	300K	1.6K	600K	432M	—	—
Meta Llama 3.2 3B Instruct	800	300K	1.6K	600K	432M	—	—
Meta Llama 3.2 90B Instruct	400	300K	—	—	432M	—	—
Meta Llama 3.3 70B Instruct	—	—	800	600K	—	—	—
Meta Llama 4 Maverick V1	—	—	800	600K	432M	—	—
Meta Llama 4 Scout V1	—	—	800	600K	432M	—	—

Batch inference

Model	Max records/job	Min records/job	Records/input file	Input file GB	Job size GB	Concurrent jobs (base)
Meta Llama 3.1 405B Instruct	100K	100	100K	1	5	100
Meta Llama 3.1 70B Instruct	100K	100	100K	1	5	100
Meta Llama 3.1 8B Instruct	100K	100	100K	1	5	100
Meta Llama 3.2 11B Instruct	100K	100	100K	1	5	100
Meta Llama 3.2 1B Instruct	100K	100	100K	1	5	100
Meta Llama 3.2 3B Instruct	100K	100	100K	1	5	100
Meta Llama 3.2 90B Instruct	100K	100	100K	1	5	100
Meta Llama 3.3 70B Instruct	100K	100	100K	1	5	100
Meta Llama 4 Maverick V1	100K	100	100K	1	5	100
Meta Llama 4 Scout V1	100K	100	100K	1	5	100

Provisioned throughput

Model	MU/PT model	MU (commitment)
Meta Llama 2 13B	0	—
Meta Llama 2 70B	0	—
Meta Llama 2 Chat 13B	0	—
Meta Llama 2 Chat 70B	0	—
Meta Llama 3 70B Instruct	0	—
Meta Llama 3 8B Instruct	0	—
Meta Llama 4 Scout 17B Instruct 10M	—	0
Meta Llama 4 Scout 17B Instruct 128K	—	0

Model customization

Model	Train+val records
Meta Llama 2 13B	10K
Meta Llama 2 70B	10K

Mistral AI

23 models

Inference rate limits

Model	On-demand RPM	On-demand TPM	Cross-region RPM	Cross-region TPM	Tokens/day
Magistral Small 1.2	10K	100M	—	—	144B
Ministral 14B 3.0	10K	100M	—	—	144B
Ministral 3B 3.0	10K	100M	—	—	144B
Ministral 8B 3.0	10K	100M	—	—	144B
Mistral AI Mistral 7B Instruct	800	300K	—	—	432M
Mistral AI Mistral Large	400	300K	—	—	432M
Mistral AI Mistral Small	400	300K	—	—	432M
Mistral AI Mixtral 8X7B Instruct	400	300K	—	—	432M
Mistral Devstral 2 123b	10K	100M	—	—	144B
Mistral Large 3	10K	100M	—	—	144B
Mistral Pixtral Large 25.02 V1	—	—	10	80K	57.6M
Voxtral Mini 1.0	10K	100M	—	—	144B
Voxtral Small 1.0	10K	100M	—	—	144B

Batch inference

Model	Max records/job	Min records/job	Records/input file	Input file GB	Job size GB	Concurrent jobs (base)
Devstral 2 123B	100K	100	100K	1	5	100
Magistral Small 2509	100K	100	100K	1	5	100
Ministral 3 14B	100K	100	100K	1	5	100
Ministral 3 8B	100K	100	100K	1	5	100
Ministral 3B	100K	100	100K	1	5	100
Mistral AI Mistral Small	100K	100	100K	1	5	100
Mistral Large 2 (24.07)	100K	100	100K	1	5	100
Mistral Large 3	100K	100	100K	1	5	100
Voxtral Mini 3B 2507	100K	100	100K	1	5	100
Voxtral Small 24B 2507	100K	100	100K	1	5	100

Provisioned throughput

Model	MU/PT model
Mistral AI Mistral Small	0

Cohere

6 models

Inference rate limits

Model	On-demand RPM	On-demand TPM	Cross-region RPM	Cross-region TPM	Global RPM	Global TPM	Global TPD	Tokens/day
Cohere Command R	400	300K	—	—	—	—	—	—
Cohere Command R Plus	400	300K	—	—	—	—	—	—
Cohere Embed English	2K	300K	—	—	—	—	—	—
Cohere Embed Multilingual	2K	300K	—	—	—	—	—	—
Cohere Embed V4	1K	150K	2K	300K	2K	300K	432M	216M
Cohere Rerank 3.5	250	—	—	—	—	—	—	—

Provisioned throughput

Model	MU/PT model
Cohere Command R	0
Cohere Command R Plus	0
Cohere Embed English	0
Cohere Embed Multilingual	0

AI21 Labs

4 models

Inference rate limits

Model	On-demand RPM	On-demand TPM	Tokens/day
AI21 Labs Jamba 1.5 Large	100	300K	432M
AI21 Labs Jamba 1.5 Mini	100	300K	432M

Provisioned throughput

Model	MU/PT model
AI21 Labs Jurassic-2 Mid	0
AI21 Labs Jurassic-2 Ultra	0

DeepSeek

2 models

Inference rate limits

Model	On-demand RPM	On-demand TPM	Cross-region RPM	Cross-region TPM	Tokens/day
DeepSeek R1 V1	—	—	200	200K	144M
DeepSeek V3.2	10K	100M	—	—	144B

Batch inference

Model	Max records/job	Min records/job	Records/input file	Input file GB	Job size GB	Concurrent jobs (base)
DeepSeek V3.2	100K	100	100K	1	5	100

OpenAI

6 models

Inference rate limits

Model	On-demand RPM	On-demand TPM	Tokens/day	Mantle ITPM	Mantle OTPM
GPT-5.4	—	—	—	20M	4M
GPT-5.5	—	—	—	5M	1M
OpenAI GPT OSS 120b	10K	100M	144B	—	—
OpenAI GPT OSS 20b	10K	100M	144B	—	—
OpenAI GPT OSS Safeguard 120b	10K	100M	144B	—	—
OpenAI GPT OSS Safeguard 20b	10K	100M	144B	—	—

Batch inference

Model	Max records/job	Min records/job	Records/input file	Input file GB	Job size GB	Concurrent jobs (base)
OpenAI GPT OSS 120b	100K	100	100K	1	5	100
OpenAI GPT OSS 20b	100K	100	100K	1	5	100
OpenAI GPT OSS Safeguard 120b	100K	100	100K	1	5	100
OpenAI GPT OSS Safeguard 20b	100K	100	100K	1	5	100

Qwen

8 models

Inference rate limits

Model	On-demand RPM	On-demand TPM	Tokens/day
Qwen3 32B V1	10K	100M	144B
Qwen3 Coder 30B a3b V1	10K	100M	144B
Qwen3 Coder Next	10K	100M	144B
Qwen3 Next 80B A3B	10K	100M	144B
Qwen3 VL 235B A22B	10K	100M	144B

Batch inference

Model	Max records/job	Min records/job	Records/input file	Input file GB	Job size GB	Concurrent jobs (base)
Qwen3 32B V1	100K	100	100K	1	5	100
Qwen3 Coder 30B	100K	100	100K	1	5	100
Qwen3 Coder Next	100K	100	100K	1	5	100
Qwen3 Next 80B	100K	100	100K	1	5	100
Qwen3 VL 235B	100K	100	100K	1	5	100

Z.ai (GLM)

5 models

Inference rate limits

Model	On-demand RPM	On-demand TPM	Tokens/day
Z.ai GLM 5	10K	100M	144B
Z.ai GLM-4.7	10K	100M	144B
Z.ai GLM-4.7 Flash	10K	100M	144B

Batch inference

Model	Max records/job	Min records/job	Records/input file	Input file GB	Job size GB	Concurrent jobs (base)
GLM 4.7	100K	100	100K	1	5	100
GLM 4.7 Flash	100K	100	100K	1	5	100
Z.ai GLM 5	100K	100	100K	1	5	100

Writer

3 models

Inference rate limits

Model	On-demand RPM	On-demand TPM	Cross-region RPM	Cross-region TPM	Tokens/day
Writer AI Palmyra X4 V1	—	—	10	150K	108M
Writer AI Palmyra X5 V1	—	—	10	150K	108M
Writer Palmyra Vision 7B	10K	100M	—	—	144B

Batch inference

Model	Max records/job	Min records/job	Records/input file	Input file GB	Job size GB	Concurrent jobs (base)
Writer Palmyra Vision 7B	100K	100	100K	1	5	100

TwelveLabs

3 models

Inference rate limits

Model	On-demand RPM	Cross-region RPM	Concurrent reqs	Async concurrent
Twelve Labs Marengo	100	200	30	—
Twelve Labs Pegasus	60	120	30	—
TwelveLabs Marengo Embed 3.0	500	1K	—	10

Stability AI

15 models

Inference rate limits

Model	On-demand RPM	Cross-region RPM
Stable Image Conservative Upscale	2	4
Stable Image Control Sketch	10	20
Stable Image Control Structure	10	20
Stable Image Creative Upscale	2	4
Stable Image Erase Object	10	20
Stable Image Fast Upscale	10	20
Stable Image Inpaint	10	20
Stable Image Outpaint	2	4
Stable Image Remove Background	10	20
Stable Image Search and Recolor	10	20
Stable Image Search and Replace	10	20
Stable Image Style Guide	10	20
Stable Image Style Transfer	10	20

Provisioned throughput

Model	MU/PT model
Stability.ai Stable Diffusion XL 0.8	0
Stability.ai Stable Diffusion XL 1.0	0

Google Gemma

3 models

Inference rate limits

Model	On-demand RPM	On-demand TPM	Tokens/day
Gemma 3 12B	10K	100M	144B
Gemma 3 27B	10K	100M	144B
Gemma 3 4B	10K	100M	144B

Batch inference

Model	Max records/job	Min records/job	Records/input file	Input file GB	Job size GB	Concurrent jobs (base)
Gemma 3 12B	100K	100	100K	1	5	100
Gemma 3 27B	100K	100	100K	1	5	100
Gemma 3 4B	100K	100	100K	1	5	100

NVIDIA

6 models

Inference rate limits

Model	On-demand RPM	On-demand TPM	Tokens/day
NVIDIA Nemotron 3 Super 120B A12B	10K	100M	144B
NVIDIA Nemotron Nano 2	10K	100M	144B
NVIDIA Nemotron Nano 2 VL	10K	100M	144B
NVIDIA Nemotron Nano 3 30B	10K	100M	144B

Batch inference

Model	Max records/job	Min records/job	Records/input file	Input file GB	Job size GB	Concurrent jobs (base)
NVIDIA Nemotron 3 Super 120B A12B	100K	100	100K	1	5	100
NVIDIA Nemotron Nano 12B	100K	100	100K	1	5	100
NVIDIA Nemotron Nano 3 30B	100K	100	100K	1	5	100
NVIDIA Nemotron Nano 9B	100K	100	100K	1	5	100

MiniMax

3 models

Inference rate limits

Model	On-demand RPM	On-demand TPM	Tokens/day
MiniMax M2	10K	100M	144B
MiniMax M2.1	10K	100M	144B
MiniMax M2.5	10K	100M	144B

Batch inference

Model	Max records/job	Min records/job	Records/input file	Input file GB	Job size GB	Concurrent jobs (base)
MiniMax M2	100K	100	100K	1	5	100
MiniMax M2.1	100K	100	100K	1	5	100
MiniMax M2.5	100K	100	100K	1	5	100

Other

5 models

Inference rate limits

Model	On-demand RPM	On-demand TPM	Tokens/day
Kimi K2 Thinking	10K	100M	144B
Moonshot AI Kimi K2.5	10K	100M	144B

Batch inference

Model	Max records/job	Min records/job	Records/input file	Input file GB	Job size GB	Concurrent jobs (base)
Kimi K2 Thinking	100K	100	100K	1	5	100
Kimi K2.5	100K	100	100K	1	5	100

Provisioned throughput

Model	MU (commitment)
Meta Maverick 4 Scout 17B Instruct 128K	0
Meta Maverick 4 Scout 17B Instruct 1M	0

Account & API quotas

288 quotas

Service-wide limits not tied to a specific model: feature capacities, control-plane API request rates, and customization account limits.

Model customization (9)

Quota	Default value
Custom models per account	100
In-progress custom model deployments	2
Maximum input file size for distillation customization jobs	2
Maximum line length for distillation customization jobs	16
Maximum number of prompts for distillation customization jobs	15K
Maximum number of training records for an Amazon Nova Canvas Fine-tuning job	10K
Minimum number of prompts for distillation customization jobs	100
Scheduled customization jobs	10
Total number of custom model deployments	10

Knowledge Bases (37)

Quota	Default value
Concurrent ingestion jobs per account	5
Concurrent ingestion jobs per data source	1
Concurrent ingestion jobs per knowledge base	1
Concurrent IngestKnowledgeBaseDocuments and DeleteKnowledgeBaseDocuments requests per account	10
CreateDataSource requests per second	2
CreateKnowledgeBase requests per second	2
Data sources per knowledge base	5
DeleteDataSource requests per second	2
DeleteKnowledgeBase requests per second	2
DeleteKnowledgeBaseDocuments requests per second	5
Files to add or update per ingestion job	5M
Files to delete per ingestion job	5M
Files to ingest per IngestKnowledgeBaseDocuments job	25
GenerateQuery requests per second	2
GetDataSource requests per second	10
GetIngestionJob requests per second	10
GetKnowledgeBase requests per second	10
GetKnowledgeBaseDocuments requests per second	5
Ingestion job file size with text content	50
Ingestion job size	100
IngestKnowledgeBaseDocuments requests per second	5
IngestKnowledgeBaseDocuments total payload size	6
Knowledge bases per account	100
ListDataSources requests per second	10
ListIngestionJobs requests per second	10
ListKnowledgeBaseDocuments requests per second	5
ListKnowledgeBases requests per second	10
Maximum number of files for BDA parser	1K
Maximum number of files for Foundation Models as a parser	1K
Rerank requests per second	10
Retrieve requests per second	20
RetrieveAndGenerate requests per second	20
RetrieveAndGenerateStream requests per second	20
StartIngestionJob requests per second	0.1
UpdateDataSource requests per second	2
UpdateKnowledgeBase requests per second	2
User query size	1K
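
The per-second request rates above can be respected on the client side with a token bucket. This is one common approach rather than anything AWS prescribes, and the class name and injectable clock are illustrative. A rate of 0.1 requests per second, as listed for StartIngestionJob, works out to one call every 10 seconds.

```python
import time


class TokenBucket:
    """Client-side limiter allowing `rate` calls per second with bursts up to `capacity`."""

    def __init__(self, rate, capacity=1.0, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.tokens = capacity
        self.updated = clock()

    def wait_time(self):
        """Return 0.0 and consume a token if a call may proceed, else seconds to wait."""
        now = self.clock()
        # Refill in proportion to elapsed time, never beyond the burst capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return 0.0
        return (1.0 - self.tokens) / self.rate
```

A caller would check `delay = bucket.wait_time()`, sleep for `delay` seconds when it is non-zero, and then ask again before issuing the request.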

Data Automation (39)

Quota	Default value
(Console) Maximum document file size (MB)	200
(Console) Maximum number of pages per document file	20
CreateBlueprint - Max number of blueprints per account	350
CreateBlueprintVersion - Max number of Blueprint versions per Blueprint	10
CreateDataAutomationLibrary - Max number of data automation libraries per account	10
Description length for fields (Characters)	300
InvokeBlueprintOptimizationAsync - Max number of blueprint optimization concurrent jobs	3
InvokeBlueprintOptimizationAsync - Max number of blueprint optimization jobs per day	30
InvokeDataAutomation(Sync) - Document - Max number of requests	60
InvokeDataAutomation(Sync) - Image - Max number of requests	200
InvokeDataAutomationAsync - Audio - Max number of concurrent jobs	20
InvokeDataAutomationAsync - Document - Max number of concurrent jobs	25
InvokeDataAutomationAsync - Image - Max number of concurrent jobs	20
InvokeDataAutomationAsync - Max number of open jobs	1.8K
InvokeDataAutomationAsync - Video - Max number of concurrent jobs	20
Max number of vocabulary phrases per library	500
Maximum audio file size (MB)	2K
Maximum audio length (Minutes)	240
Maximum Audio Sample Rate (Hz)	48K
Maximum Blueprints per Project (Audios)	1
Maximum Blueprints per Project (Documents)	40
Maximum Blueprints per Project (Images)	1
Maximum Blueprints per Project (Videos)	1
Maximum document file size (MB)	500
Maximum image file size (MB)	5
Maximum instruction field length for Audio Blueprint - (Characters)	500
Maximum JSON Blueprint Size (Characters)	100K
Maximum Levels of Field Hierarchy	1
Maximum number of Blueprints per Start Inference request (Audios)	1
Maximum number of Blueprints per Start Inference request (Documents)	10
Maximum number of Blueprints per Start Inference request (Images)	1
Maximum number of Blueprints per Start Inference request (Videos)	1
Maximum number of list fields per Blueprint	15
Maximum Number of pages per document	3K
Maximum Resolution	8K
Maximum video file size (MB)	10.2K
Maximum video length (Minutes)	240
Minimum audio length (Milliseconds)	500
Minimum Audio Sample Rate (Hz)	8K

Automated Reasoning (36)

Quota	Default value
Annotations in policy	10
CancelAutomatedReasoningPolicyBuildWorkflow requests per second	5
Concurrent builds per policy	2
Concurrent policy builds per account	5
CreateAutomatedReasoningPolicy requests per second	5
CreateAutomatedReasoningPolicyTestCase requests per second	5
CreateAutomatedReasoningPolicyVersion requests per second	5
DeleteAutomatedReasoningPolicy requests per second	5
DeleteAutomatedReasoningPolicyBuildWorkflow requests per second	5
DeleteAutomatedReasoningPolicyTestCase requests per second	5
ExportAutomatedReasoningPolicyVersion requests per second	5
GetAutomatedReasoningPolicy requests per second	10
GetAutomatedReasoningPolicyAnnotations requests per second	10
GetAutomatedReasoningPolicyBuildWorkflow requests per second	10
GetAutomatedReasoningPolicyBuildWorkflowResultAssets requests per second	10
GetAutomatedReasoningPolicyNextScenario requests per second	10
GetAutomatedReasoningPolicyTestCase requests per second	10
GetAutomatedReasoningPolicyTestResult requests per second	10
ListAutomatedReasoningPolicies requests per second	5
ListAutomatedReasoningPolicyBuildWorkflows requests per second	5
ListAutomatedReasoningPolicyTestCases requests per second	5
ListAutomatedReasoningPolicyTestResults requests per second	5
Policies per account	100
Rules in policy	500
Source document size (MB)	5
Source document tokens	122.9K
StartAutomatedReasoningPolicyBuildWorkflow requests per second	1
StartAutomatedReasoningPolicyTestWorkflow requests per second	1
Tests per policy	100
Types per policy	50
UpdateAutomatedReasoningPolicy requests per second	5
UpdateAutomatedReasoningPolicyAnnotations requests per second	5
UpdateAutomatedReasoningPolicyTestCase requests per second	5
Values per type in policy	50
Variables in policy	200
Versions per policy	1K

Evaluation (12)

Quota	Default value
Number of concurrent automatic model evaluation jobs	20
Number of concurrent model evaluation jobs that use human workers	10
Number of custom metrics	10
Number of custom prompt datasets in a human-based model evaluation job	1
Number of datasets per job	5
Number of evaluation jobs	5K
Number of metrics per dataset	3
Number of models in a model evaluation job that uses human workers	2
Number of models in automated model evaluation job	1
Number of prompts in a custom prompt dataset	1K
Size of prompt	4
Task time for workers	30

Advanced Prompt Optimization (2)

Quota	Default value
Active jobs per account	20
Inactive jobs per account	5K

Flows (35)

Quota	Default value
Agent nodes per flow	20
Collector nodes per flow	1
Condition nodes per flow	5
Conditions per condition node	5
CreateFlow requests per second	2
CreateFlowAlias requests per second	2
CreateFlowVersion requests per second	2
DeleteFlow requests per second	2
DeleteFlowAlias requests per second	2
DeleteFlowVersion requests per second	2
Flow aliases per flow	10
Flow executions per account	1K
Flow versions per flow	10
Flows per account	100
GetFlow requests per second	10
GetFlowAlias requests per second	10
GetFlowVersion requests per second	10
Inline code nodes per flow	5
Input nodes per flow	1
Iterator nodes per flow	1
Knowledge base nodes per flow	20
Lambda function nodes per flow	20
Lex nodes per flow	5
ListFlowAliases requests per second	10
ListFlows requests per second	10
ListFlowVersions requests per second	10
Output nodes per flow	20
PrepareFlow requests per second	2
Prompt nodes per flow	20
S3 retrieval nodes per flow	10
S3 storage nodes per flow	10
Total nodes per flow	40
UpdateFlow requests per second	2
UpdateFlowAlias requests per second	2
ValidateFlowDefinition requests per second	2
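
The per-flow node limits lend themselves to a pre-flight check before a flow is created. This sketch assumes a flow is described as a plain list of node type labels. The labels and the `LIMITS` mapping are names chosen here for illustration, with values copied from the table above.

```python
from collections import Counter

# Per-flow limits from the table above, keyed by an illustrative node type label.
LIMITS = {
    "agent": 20, "collector": 1, "condition": 5, "inline_code": 5,
    "input": 1, "iterator": 1, "knowledge_base": 20, "lambda": 20,
    "lex": 5, "output": 20, "prompt": 20, "s3_retrieval": 10, "s3_storage": 10,
}
TOTAL_NODES = 40  # Total nodes per flow


def flow_limit_violations(node_types):
    """Return human-readable messages for every exceeded per-flow node limit."""
    violations = []
    if len(node_types) > TOTAL_NODES:
        violations.append(f"total: {len(node_types)} nodes, limit {TOTAL_NODES}")
    for node_type, count in sorted(Counter(node_types).items()):
        limit = LIMITS.get(node_type)  # unknown labels are not checked
        if limit is not None and count > limit:
            violations.append(f"{node_type}: {count} nodes, limit {limit}")
    return violations
```

For example, `flow_limit_violations(["input", "input", "output"])` reports that the flow has two input nodes where only one is allowed.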

General (17)

Quota	Default value
Action groups per Agent	20
Agent Collaborators per Agent	1K
Agents per account	1K
APIs per Agent	11
Associated aliases per Agent	10
Associated knowledge bases per Agent	2
Characters in Agent instructions	20K
Concurrent model import jobs	1
Custom models with a creating status per account	2
Enabled action groups per agent	15
Endpoints per inference profile	5
Imported models per account	3
Inference profiles per account	1K
Model units no-commitment Provisioned Throughputs across base models	0
Model units no-commitment Provisioned Throughputs across custom models	0
Number of custom prompt routers per account	500
Parameters per function	5

API request rates (53)

Quota	Default value
AssociateAgentKnowledgeBase requests per second	6
CreateAgent requests per second	6
CreateAgentActionGroup requests per second	12
CreateAgentAlias requests per second	2
DeleteAgent requests per second	2
DeleteAgentActionGroup requests per second	2
DeleteAgentAlias requests per second	2
DeleteAgentVersion requests per second	2
DisassociateAgentKnowledgeBase requests per second	4
GetAgent requests per second	15
GetAgentActionGroup requests per second	20
GetAgentAlias requests per second	10
GetAgentKnowledgeBase requests per second	15
GetAgentVersion requests per second	10
ListAgentActionGroups requests per second	10
ListAgentAliases requests per second	10
ListAgentKnowledgeBases requests per second	10
ListAgents requests per second	10
ListAgentVersions requests per second	10
PrepareAgent requests per second	2
Throttle rate limit for Bedrock Data Automation Runtime: ListTagsForResource	25
Throttle rate limit for Bedrock Data Automation Runtime: TagResource	25
Throttle rate limit for Bedrock Data Automation Runtime: UntagResource	25
Throttle rate limit for Bedrock Data Automation: ListTagsForResource	25
Throttle rate limit for Bedrock Data Automation: TagResource	25
Throttle rate limit for Bedrock Data Automation: UntagResource	25
Throttle rate limit for CreateBlueprint	5
Throttle rate limit for CreateBlueprintVersion	5
Throttle rate limit for CreateDataAutomationLibrary	3
Throttle rate limit for CreateDataAutomationProject	5
Throttle rate limit for DeleteBlueprint	5
Throttle rate limit for DeleteDataAutomationLibrary	3
Throttle rate limit for DeleteDataAutomationProject	5
Throttle rate limit for GetBlueprint	5
Throttle rate limit for GetDataAutomationLibrary	5
Throttle rate limit for GetDataAutomationLibraryEntity	5
Throttle rate limit for GetDataAutomationLibraryIngestionJob	5
Throttle rate limit for GetDataAutomationProject	5
Throttle rate limit for GetDataAutomationStatus	10
Throttle rate limit for InvokeDataAutomationAsync	10
Throttle rate limit for InvokeDataAutomationLibraryIngestionJob	5
Throttle rate limit for ListBlueprints	5
Throttle rate limit for ListDataAutomationLibraries	5
Throttle rate limit for ListDataAutomationLibraryEntities	5
Throttle rate limit for ListDataAutomationLibraryIngestionJobs	5
Throttle rate limit for ListDataAutomationProjects	5
Throttle rate limit for UpdateBlueprint	5
Throttle rate limit for UpdateDataAutomationLibrary	5
Throttle rate limit for UpdateDataAutomationProject	5
UpdateAgent requests per second	4
UpdateAgentActionGroup requests per second	6
UpdateAgentAlias requests per second	2
UpdateAgentKnowledgeBase requests per second	4

Guardrails (21)

Quota	Default value
Automated Reasoning policies per guardrail	2
Contextual grounding query length in text units	1
Contextual grounding response length in text units	5
Contextual grounding source length in text units	100
Example phrases per Topic	5
Guardrails per account	100
On-demand ApplyGuardrail Content filter policy text units per second	200
On-demand ApplyGuardrail Content filter policy text units per second (standard)	200
On-demand ApplyGuardrail contextual grounding policy text units per second	106
On-demand ApplyGuardrail Denied topic policy text units per second	50
On-demand ApplyGuardrail Denied topic policy text units per second (standard)	200
On-demand ApplyGuardrail requests per second	100
On-demand ApplyGuardrail Sensitive information filter policy text units per second	500
On-demand ApplyGuardrail Word filter policy text units per second	500
On-demand InvokeGuardrailChecks requests per minute	1.5K
Regex entities in Sensitive Information Filter	30
Regex length in characters	500
Topics per guardrail	30
Versions per guardrail	20
Word length in characters	100
Words per word policy	10K
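
Several Guardrails quotas are expressed in text units. Assuming a text unit covers up to 1,000 characters (an assumption that should be verified against current AWS documentation), the contextual grounding length limits above can be checked as follows. The function names are hypothetical.

```python
import math

CHARS_PER_TEXT_UNIT = 1000  # assumption: one text unit holds up to 1,000 characters
GROUNDING_LIMITS = {"query": 1, "response": 5, "source": 100}  # text units, from the table


def text_units(text):
    """Number of text units needed to hold `text`, rounding up."""
    return math.ceil(len(text) / CHARS_PER_TEXT_UNIT)


def grounding_violations(query, response, source):
    """Return {field: text units} for each field that exceeds its limit."""
    fields = {"query": query, "response": response, "source": source}
    return {
        name: text_units(text)
        for name, text in fields.items()
        if text_units(text) > GROUNDING_LIMITS[name]
    }
```

An empty dictionary means all three inputs fit within the contextual grounding limits under the stated assumption.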

Managed Knowledge Bases (19)

Quota	Default value
AgenticRetrieveStream requests per second per account	1
AgenticRetrieveStream user query size	10K
Concurrent ingestion jobs per knowledge base	50
Data sources per knowledge base	200
DeleteKnowledgeBaseDocuments requests per second	10
DeleteResourcePolicy requests per second	5
Files to ingest per IngestKnowledgeBaseDocuments request	10
GetDocumentContent requests per second per account	100
GetDocumentContent requests per second per knowledge base	5
GetResourcePolicy requests per second	5
Individual file extracted text size (MB)	30
IngestKnowledgeBaseDocuments requests per second	20
Knowledge bases per account	1K
ListKnowledgeBaseDocuments requests per second	10
PutResourcePolicy requests per second	5
Retrieve requests per second per account	100
Retrieve requests per second per knowledge base	5
Retrieve user query size	10K
Total storage size per knowledge base (TB)	10

Prompt management (8)

Quota	Default value
CreatePrompt requests per second	2
CreatePromptVersion requests per second	2
DeletePrompt requests per second	2
GetPrompt requests per second	10
ListPrompts requests per second	10
Prompts per account	500
UpdatePrompt requests per second	2
Versions per prompt	10