AWS Startups · Bedrock Reference

Amazon Bedrock Model Quotas

Default service quotas for every Bedrock foundation model, grouped by family: requests and tokens per minute, daily token caps, batch inference limits, and provisioned throughput. The values are pulled live from the AWS Service Quotas API, so they reflect the current defaults.

Region: us-east-1 · US East (N. Virginia)
Models: 185
Families: 17
These are AWS default quotas for region us-east-1. An account's applied value can be higher if a limit increase was approved. Adjustable quotas can be raised via Service Quotas; many per-model token limits are adjustable, but batch minimums are not. Displayed values are abbreviated (K = thousand, M = million, B = billion) and may be rounded.
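As a rough sketch of how the default and applied values mentioned above could be compared, the Service Quotas API exposes both. The example assumes boto3 with configured credentials and the service code `bedrock`; the helper names `fetch_quotas`, `raised_quotas`, and `main` are illustrative and are not part of this page's tooling.

```python
from typing import Dict, Iterable, List, Tuple


def fetch_quotas(region: str = "us-east-1", applied: bool = False) -> List[dict]:
    """Return Bedrock quotas: AWS defaults, or this account's applied values."""
    import boto3  # imported lazily so the pure helper below works without boto3

    client = boto3.client("service-quotas", region_name=region)
    operation = "list_service_quotas" if applied else "list_aws_default_service_quotas"
    quotas: List[dict] = []
    for page in client.get_paginator(operation).paginate(ServiceCode="bedrock"):
        quotas.extend(page["Quotas"])
    return quotas


def raised_quotas(
    defaults: Iterable[dict], applied: Iterable[dict]
) -> Dict[str, Tuple[float, float]]:
    """Map quota name -> (default, applied) where the applied value is higher."""
    default_by_code = {q["QuotaCode"]: q for q in defaults}
    raised: Dict[str, Tuple[float, float]] = {}
    for q in applied:
        base = default_by_code.get(q["QuotaCode"])
        if base is not None and q["Value"] > base["Value"]:
            raised[q["QuotaName"]] = (base["Value"], q["Value"])
    return raised


def main() -> None:
    """Needs AWS credentials; call it manually, it is not run on import."""
    defaults = fetch_quotas()
    adjustable = [q for q in defaults if q["Adjustable"]]
    print(f"{len(defaults)} default quotas, {len(adjustable)} adjustable")
    for name, (base, now) in raised_quotas(defaults, fetch_quotas(applied=True)).items():
        print(f"{name}: default {base:g}, applied {now:g}")
```

Only `raised_quotas` is pure; `fetch_quotas` makes network calls, so it is kept separate to make the comparison logic easy to check offline.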

Anthropic Claude

39 models

Inference rate limits

Model | On-demand RPM | On-demand TPM | Cross-region RPM | Cross-region TPM | Global RPM | Global TPM | Global TPD | Tokens/day | Lat-opt RPM | Lat-opt TPM | Mantle ITPM | Mantle OTPM
Anthropic Claude 3 Haiku1K2M2K4M2.88B
Anthropic Claude 3 Opus50400K100800K
Anthropic Claude 3 Sonnet5001M1K2M
Anthropic Claude 3.5 Haiku1K2M2K4M2.88B100500K
Anthropic Claude 3.5 Sonnet V150400K100800K2.88B
Anthropic Claude 3.5 Sonnet V250400K100800K2.88B
Anthropic Claude 3.7 Sonnet V12501M720M
Anthropic Claude Fable 5200K500K720M144M
Anthropic Claude Haiku 4.510K5M10K5M7.2B3.6B
Anthropic Claude Opus 4 V1200200K144M
Anthropic Claude Opus 4.150500K360M
Anthropic Claude Opus 4.510K2M10K2M2.88B1.44B
Anthropic Claude Opus 4.6 V110K3M10K3M4.32B2.16B
Anthropic Claude Opus 4.730M30M43.2B21.6B20M4M
Anthropic Claude Opus 4.830M30M43.2B21.6B20M4M
Anthropic Claude Sonnet 4 V1200200K200200K288M144M
Anthropic Claude Sonnet 4 V1 1M Context Length51M720M
Anthropic Claude Sonnet 4.5 V110K5M10K5M7.2B3.6B
Anthropic Claude Sonnet 4.5 V1 1M Context Length1K1M1K1M1.44B720M
Anthropic Claude Sonnet 4.610K6M10K6M8.64B4.32B
Batch inference
Model | Max records/job | Min records/job | Records/input file | Input file GB | Job size GB | Concurrent jobs (base)
Anthropic Claude 3 Haiku | 100K | 100 | 100K | 1 | 5 | 100
Anthropic Claude 3 Opus | 100K | 100 | 100K | 1 | 5 | 100
Anthropic Claude 3 Sonnet | 100K | 100 | 100K | 1 | 5 | 100
Anthropic Claude 3.5 Haiku | 100K | 100 | 100K | 1 | 5 | 100
Anthropic Claude 3.5 Sonnet V1 | 100K | 100 | 100K | 1 | 5 | 100
Anthropic Claude 3.5 Sonnet V2 | 100K | 100 | 100K | 1 | 5 | 100
Anthropic Claude 3.7 Sonnet V1 | 100K | 100 | 100K | 1 | 5 | 100
Anthropic Claude Haiku 4.5 | 100K | 100 | 100K | 1 | 5 | 100
Anthropic Claude Opus 4.5 | 100K | 100 | 100K | 1 | 5 | 100
Anthropic Claude Opus 4.6 V1 | 100K | 100 | 100K | 1 | 5 | 100
Anthropic Claude Sonnet 4 V1 | 100K | 100 | 100K | 1 | 5 | 100
Anthropic Claude Sonnet 4.5 V1 | 100K | 100 | 100K | 1 | 5 | 100
Anthropic Claude Sonnet 4.6 | 100K | 100 | 100K | 1 | 5 | 100
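Reading the batch rows above as 100K max records per job, 100 min records per job, 100K records per input file, a 1 GB input file, and a 5 GB job, a client-side preflight check might look like the sketch below. The class and function names are hypothetical, and treating GB as 10^9 bytes is an assumption chosen to err on the strict side.

```python
from dataclasses import dataclass
from typing import List

GB = 10 ** 9  # assumption: decimal gigabytes, the stricter reading of "GB"


@dataclass(frozen=True)
class BatchLimits:
    """Default batch limits as read from the table above."""
    min_records_per_job: int = 100
    max_records_per_job: int = 100_000
    max_records_per_file: int = 100_000
    max_file_bytes: int = 1 * GB
    max_job_bytes: int = 5 * GB


@dataclass(frozen=True)
class InputFile:
    name: str
    records: int      # number of records (lines) in the input file
    size_bytes: int


def check_batch_job(files: List[InputFile], limits: BatchLimits = BatchLimits()) -> List[str]:
    """Return human-readable violations; an empty list means the job fits."""
    problems: List[str] = []
    for f in files:
        if f.records > limits.max_records_per_file:
            problems.append(f"{f.name}: {f.records} records exceeds the per-file limit")
        if f.size_bytes > limits.max_file_bytes:
            problems.append(f"{f.name}: {f.size_bytes} bytes exceeds the per-file size limit")
    total_records = sum(f.records for f in files)
    total_bytes = sum(f.size_bytes for f in files)
    if total_records < limits.min_records_per_job:
        problems.append(
            f"job has {total_records} records, below the minimum of {limits.min_records_per_job}"
        )
    if total_records > limits.max_records_per_job:
        problems.append(f"job has {total_records} records, above the per-job maximum")
    if total_bytes > limits.max_job_bytes:
        problems.append(f"job is {total_bytes} bytes, above the per-job size limit")
    return problems
```

Because the batch minimums are not adjustable, a job that trips the minimum-records check has to be padded or merged with another job rather than resolved through a quota increase.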
Provisioned throughput
Model | MU/PT model
Anthropic Claude 3 Haiku 200K | 0
Anthropic Claude 3 Haiku 48K | 0
Anthropic Claude 3 Sonnet 200K | 0
Anthropic Claude 3 Sonnet 28K | 0
Anthropic Claude 3.5 Haiku 16K | 0
Anthropic Claude 3.5 Haiku 200K | 0
Anthropic Claude 3.5 Haiku 64K | 0
Anthropic Claude 3.5 Sonnet 18K | 0
Anthropic Claude 3.5 Sonnet 200K | 0
Anthropic Claude 3.5 Sonnet 51K | 0
Anthropic Claude 3.5 Sonnet V2 18K | 0
Anthropic Claude 3.5 Sonnet V2 200K | 0
Anthropic Claude 3.5 Sonnet V2 51K | 0
Anthropic Claude Instant V1 100K | 0
Anthropic Claude V2 100K | 0
Anthropic Claude V2 18K | 0
Anthropic Claude V2.1 18K | 0
Anthropic Claude V2.1 200K | 0
Model customization
Model | Train+val records
Anthropic Claude 3 Haiku | 10K
Claude 3-5-Haiku | 10K

Amazon (Nova / Titan)

36 models

Inference rate limits

Model | On-demand RPM | On-demand TPM | Cross-region RPM | Cross-region TPM | Global RPM | Global TPM | Global TPD | Tokens/day | Lat-opt RPM | Lat-opt TPM | Lat-opt TPD | Concurrent reqs | Async concurrent
Amazon Nova 2 Lite2K8M2K8M11.52B5.76B
Amazon Nova 2 Multimodal Embeddings V12K30
Amazon Nova 2 Omni2K8M2K8M11.52B5.76B
Amazon Nova 2 Pro Preview1001M1001M1.44B720M
Amazon Nova 2 Sonic20
Amazon Nova Canvas100
Amazon Nova Lite2K4M4K8M5.76B
Amazon Nova Micro2K4M4K8M5.76B
Amazon Nova Premier V15002M1.44B
Amazon Nova Pro V12501M5002M1.44B1040K57.6M
Amazon Nova Reel1.010
Amazon Nova Reel1.13
Amazon Nova Sonic20
Amazon Rerank 1.0200
Amazon Titan Image Generator G160
Amazon Titan Image Generator G1 V2602K
Amazon Titan Multimodal Embeddings G12K300K
Amazon Titan Text Embeddings2K300K
Amazon Titan Text Embeddings V26K300K
Amazon Titan Text Express400300K
Amazon Titan Text Premier100300K
Batch inference
Model | Max records/job | Min records/job | Records/input file | Input file GB | Job size GB | Concurrent jobs (base) | Concurrent jobs (custom)
Amazon Nova 2 Lite100K100100K1100
Amazon Nova 2 Multimodal Embeddings V1100K100100K1100100
Amazon Nova Lite100K100100K1100100
Amazon Nova Micro100K100100K15100
Amazon Nova Premier V1100K100100K15100
Amazon Nova Pro V1100K100100K1100100
Amazon Titan Multimodal Embeddings G1100K100100K151003
Amazon Titan Text Embeddings V2100K100100K151003
Provisioned throughput
Model | MU/PT model | MU/PT (24k ctx) | MU/PT (128k ctx) | MU/PT (300k ctx) | MU (no commit)
Amazon Nova 2 Lite V1.0 256K0
Amazon Nova Canvas0
Amazon Nova Lite00
Amazon Nova Micro00
Amazon Nova Pro V100
Amazon Titan Embeddings G1 - Text0
Amazon Titan Image Generator G10
Amazon Titan Image Generator G20
Amazon Titan Lite V1 4K0
Amazon Titan Multimodal Embeddings G10
Amazon Titan Text Embeddings V20
Amazon Titan Text G1 - Express 8K0
Amazon Titan Text Premier V1 32K0
base model Amazon Nova 2 Lite V1.0 256K0
custom model Amazon Nova 2 Lite V1.0 256K0
Model customization
Model | Train+val records | Custom deploy RPM | Custom deploy TPM | Custom deploy TPD | Max FT ctx length
Amazon Nova 2 Lite20K2K4M5.76B
Amazon Nova Lite20K2K4M5.76B
Amazon Nova Micro20K2K4M5.76B
Amazon Nova Micro V1 distillation customization jobs32K
Amazon Nova Pro V120K200800K1.15B
Amazon Nova V1 distillation customization jobs32K
Amazon Titan Image Generator G110K
Amazon Titan Multimodal Embeddings G150K
Titan Text G1 - Express10K
Titan Text G1 - Express v1 Continued Pre-Training job100K
Titan Text G1 - Lite10K
Titan Text G1 - Lite v1 Continued Pre-Training job100K
Titan Text G1 - Premier20K

Meta Llama

18 models

Inference rate limits

Model | On-demand RPM | On-demand TPM | Cross-region RPM | Cross-region TPM | Tokens/day | Lat-opt RPM | Lat-opt TPM
Meta Llama 3 70B Instruct400300K
Meta Llama 3 8B Instruct800300K
Meta Llama 3.1 70B Instruct400300K800600K10040K
Meta Llama 3.1 8B Instruct800300K1.6K600K
Meta Llama 3.2 11B Instruct400300K432M
Meta Llama 3.2 1B Instruct800300K1.6K600K432M
Meta Llama 3.2 3B Instruct800300K1.6K600K432M
Meta Llama 3.2 90B Instruct400300K432M
Meta Llama 3.3 70B Instruct800600K
Meta Llama 4 Maverick V1800600K432M
Meta Llama 4 Scout V1800600K432M
Batch inference
Model | Max records/job | Min records/job | Records/input file | Input file GB | Job size GB | Concurrent jobs (base)
Llama 3.1 405B Instruct | 100K | 100 | 100K | 1 | 5 | 100
Meta Llama 3.1 70B Instruct | 100K | 100 | 100K | 1 | 5 | 100
Meta Llama 3.1 8B Instruct | 100K | 100 | 100K | 1 | 5 | 100
Meta Llama 3.2 11B Instruct | 100K | 100 | 100K | 1 | 5 | 100
Meta Llama 3.2 1B Instruct | 100K | 100 | 100K | 1 | 5 | 100
Meta Llama 3.2 3B Instruct | 100K | 100 | 100K | 1 | 5 | 100
Meta Llama 3.2 90B Instruct | 100K | 100 | 100K | 1 | 5 | 100
Meta Llama 3.3 70B Instruct | 100K | 100 | 100K | 1 | 5 | 100
Meta Llama 4 Maverick V1 | 100K | 100 | 100K | 1 | 5 | 100
Meta Llama 4 Scout V1 | 100K | 100 | 100K | 1 | 5 | 100
Provisioned throughput
Model | MU/PT model | MU (commitment)
Meta Llama 2 13B0
Meta Llama 2 70B0
Meta Llama 2 Chat 13B0
Meta Llama 2 Chat 70B0
Meta Llama 3 70B Instruct0
Meta Llama 3 8B Instruct0
Meta Llama 4 Scout 17B Instruct 10M0
Meta Llama 4 Scout 17B Instruct 128K0
Model customization
Model | Train+val records
Meta Llama 2 13B | 10K
Meta Llama 2 70B | 10K

Mistral AI

23 models

Inference rate limits

Model | On-demand RPM | On-demand TPM | Cross-region RPM | Cross-region TPM | Tokens/day
Magistral Small 1.210K100M144B
Ministral 14B 3.010K100M144B
Ministral 3B 3.010K100M144B
Ministral 8B 3.010K100M144B
Mistral AI Mistral 7B Instruct800300K432M
Mistral AI Mistral Large400300K432M
Mistral AI Mistral Small400300K432M
Mistral AI Mixtral 8X7B Instruct432M
Mistral AI Mixtral 8X7BB Instruct300K
Mistral Devstral 2 123b10K100M144B
Mistral Large 310K100M144B
Mistral Mixtral 8x7b Instruct400
Mistral Pixtral Large 25.02 V11080K57.6M
Voxtral Mini 1.010K100M144B
Voxtral Small 1.010K100M144B
Batch inference
Model | Max records/job | Min records/job | Records/input file | Input file GB | Job size GB | Concurrent jobs (base)
Devstral 2 123B | 100K | 100 | 100K | 1 | 5 | 100
Magistral Small 2509 | 100K | 100 | 100K | 1 | 5 | 100
Ministral 3 14B | 100K | 100 | 100K | 1 | 5 | 100
Ministral 3 8B | 100K | 100 | 100K | 1 | 5 | 100
Ministral 3B | 100K | 100 | 100K | 1 | 5 | 100
Mistral AI Mistral Small | 100K | 100 | 100K | 1 | 5 | 100
Mistral Large 2 (24.07) | 100K | 100 | 100K | 1 | 5 | 100
Mistral Large 3 | 100K | 100 | 100K | 1 | 5 | 100
Voxtral Mini 3B 2507 | 100K | 100 | 100K | 1 | 5 | 100
Voxtral Small 24B 2507 | 100K | 100 | 100K | 1 | 5 | 100
Provisioned throughput
Model | MU/PT model
Mistral AI Mistral Small | 0

Cohere

6 models

Inference rate limits

Model | On-demand RPM | On-demand TPM | Cross-region RPM | Cross-region TPM | Global RPM | Global TPM | Global TPD | Tokens/day
Cohere Command R400300K
Cohere Command R Plus400300K
Cohere Embed English2K300K
Cohere Embed Multilingual2K300K
Cohere Embed V41K150K2K300K2K300K432M216M
Cohere Rerank 3.5250
Provisioned throughput
Model | MU/PT model
Cohere Command R | 0
Cohere Command R Plus | 0
Cohere Embed English | 0
Cohere Embed Multilingual | 0

AI21 Labs

4 models

Inference rate limits

Model | On-demand RPM | On-demand TPM | Tokens/day
AI21 Labs Jamba 1.5 Large | 100 | 300K | 432M
AI21 Labs Jamba 1.5 Mini | 100 | 300K | 432M
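Because a model carries both a requests-per-minute and a tokens-per-minute quota, a client has to respect whichever limit it reaches first. The sketch below is one possible client-side guard, assuming a simple 60-second sliding window; the class name `MinuteBudget` is hypothetical, and the service's own accounting of requests and tokens may differ from this approximation.

```python
import time
from collections import deque
from typing import Callable, Deque, Tuple


class MinuteBudget:
    """Admit a request only if RPM and TPM both stay within limits over the last 60 s."""

    def __init__(self, rpm: int, tpm: int, clock: Callable[[], float] = time.monotonic):
        self.rpm = rpm
        self.tpm = tpm
        self.clock = clock
        self._events: Deque[Tuple[float, int]] = deque()  # (timestamp, tokens)
        self._tokens = 0  # tokens counted inside the current window

    def _expire(self, now: float) -> None:
        # Drop requests that have aged out of the 60-second window.
        while self._events and now - self._events[0][0] >= 60.0:
            _, tokens = self._events.popleft()
            self._tokens -= tokens

    def try_acquire(self, tokens: int) -> bool:
        """Record the request and return True if it fits, else return False."""
        now = self.clock()
        self._expire(now)
        if len(self._events) + 1 > self.rpm or self._tokens + tokens > self.tpm:
            return False
        self._events.append((now, tokens))
        self._tokens += tokens
        return True


# Example with the Jamba 1.5 defaults listed above: 100 RPM and 300K TPM.
jamba_budget = MinuteBudget(rpm=100, tpm=300_000)
```

At these defaults, requests averaging more than 3,000 tokens (300,000 / 100) reach the token limit before the request limit.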
Provisioned throughput
Model | MU/PT model
AI21 Labs Jurassic-2 Mid | 0
AI21 Labs Jurassic-2 Ultra | 0

DeepSeek

2 models

Inference rate limits

Model | On-demand RPM | On-demand TPM | Cross-region RPM | Cross-region TPM | Tokens/day
DeepSeek R1 V1200200K144M
DeepSeek V3.210K100M144B
Batch inference
Model | Max records/job | Min records/job | Records/input file | Input file GB | Job size GB | Concurrent jobs (base)
DeepSeek V3.2 | 100K | 100 | 100K | 1 | 5 | 100

OpenAI

6 models

Inference rate limits

Model | On-demand RPM | On-demand TPM | Tokens/day | Mantle ITPM | Mantle OTPM
GPT-5.420M4M
GPT-5.55M1M
OpenAI GPT OSS 120b10K100M144B
OpenAI GPT OSS 20b10K100M144B
OpenAI GPT OSS Safeguard 120b10K100M144B
OpenAI GPT OSS Safeguard 20b10K100M144B
Batch inference
Model | Max records/job | Min records/job | Records/input file | Input file GB | Job size GB | Concurrent jobs (base)
OpenAI GPT OSS 120b | 100K | 100 | 100K | 1 | 5 | 100
OpenAI GPT OSS 20b | 100K | 100 | 100K | 1 | 5 | 100
OpenAI GPT OSS Safeguard 120b | 100K | 100 | 100K | 1 | 5 | 100
OpenAI GPT OSS Safeguard 20b | 100K | 100 | 100K | 1 | 5 | 100

Qwen

8 models

Inference rate limits

Model | On-demand RPM | On-demand TPM | Tokens/day
Qwen3 32B V1 | 10K | 100M | 144B
Qwen3 Coder 30B a3b V1 | 10K | 100M | 144B
Qwen3 Coder Next | 10K | 100M | 144B
Qwen3 Next 80B A3B | 10K | 100M | 144B
Qwen3 VL 235B A22B | 10K | 100M | 144B
Batch inference
Model | Max records/job | Min records/job | Records/input file | Input file GB | Job size GB | Concurrent jobs (base)
Qwen3 32B V1 | 100K | 100 | 100K | 1 | 5 | 100
Qwen3 Coder 30B | 100K | 100 | 100K | 1 | 5 | 100
Qwen3 Coder Next | 100K | 100 | 100K | 1 | 5 | 100
Qwen3 Next 80B | 100K | 100 | 100K | 1 | 5 | 100
Qwen3 VL 235B | 100K | 100 | 100K | 1 | 5 | 100
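The Qwen inference rows above read as 10K requests per minute, 100M tokens per minute, and 144B tokens per day, and the daily figure equals the per-minute figure multiplied by the 1,440 minutes in a day. The sketch below expands the abbreviated values and checks that relationship. It is an observation about these rows rather than a documented AWS rule, and the function names are hypothetical.

```python
from decimal import Decimal

SUFFIXES = {"K": 10 ** 3, "M": 10 ** 6, "B": 10 ** 9}
MINUTES_PER_DAY = 24 * 60  # 1,440


def expand(value: str) -> Decimal:
    """Expand an abbreviated table value such as '10K', '2.88B', or '0.1'."""
    value = value.strip()
    suffix = value[-1].upper() if value else ""
    if suffix in SUFFIXES:
        return Decimal(value[:-1]) * SUFFIXES[suffix]
    return Decimal(value)


def implied_tokens_per_day(tpm: str) -> Decimal:
    """Daily token count if the per-minute limit were sustained for a full day."""
    return expand(tpm) * MINUTES_PER_DAY


# Qwen rows: 100M TPM sustained for a day gives 144B tokens.
print(implied_tokens_per_day("100M") == expand("144B"))
```

The same relationship holds for the AI21 Jamba rows (300K TPM and 432M tokens/day), but not every family follows it, so it should not be used to infer a missing daily cap.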

Z.ai (GLM)

5 models

Inference rate limits

Model | On-demand RPM | On-demand TPM | Tokens/day
Z.ai GLM 5 | 10K | 100M | 144B
Z.ai GLM-4.7 | 10K | 100M | 144B
Z.ai GLM-4.7 Flash | 10K | 100M | 144B
Batch inference
Model | Max records/job | Min records/job | Records/input file | Input file GB | Job size GB | Concurrent jobs (base)
GLM 4.7 | 100K | 100 | 100K | 1 | 5 | 100
GLM 4.7 Flash | 100K | 100 | 100K | 1 | 5 | 100
Z.ai GLM 5 | 100K | 100 | 100K | 1 | 5 | 100

Writer

3 models

Inference rate limits

Model | On-demand RPM | On-demand TPM | Cross-region RPM | Cross-region TPM | Tokens/day
Writer AI Palmyra X4 V110150K108M
Writer AI Palmyra X5 V110150K108M
Writer Palmyra Vision 7B10K100M144B
Batch inference
Model | Max records/job | Min records/job | Records/input file | Input file GB | Job size GB | Concurrent jobs (base)
Writer Palmyra Vision 7B | 100K | 100 | 100K | 1 | 5 | 100

TwelveLabs

3 models

Inference rate limits

Model | On-demand RPM | Cross-region RPM | Concurrent reqs | Async concurrent
Twelve Labs Marengo10020030
Twelve Labs Pegasus6012030
TwelveLabs Marengo Embed 3.05001K10

Stability AI

15 models

Inference rate limits

Model | On-demand RPM | Cross-region RPM
Stable Image Conservative Upscale | 2 | 4
Stable Image Control Sketch | 10 | 20
Stable Image Control Structure | 10 | 20
Stable Image Creative Upscale | 2 | 4
Stable Image Erase Object | 10 | 20
Stable Image Fast Upscale | 10 | 20
Stable Image Inpaint | 10 | 20
Stable Image Outpaint | 2 | 4
Stable Image Remove Background | 10 | 20
Stable Image Search and Recolor | 10 | 20
Stable Image Search and Replace | 10 | 20
Stable Image Style Guide | 10 | 20
Stable Image Style Transfer | 10 | 20
Provisioned throughput
Model | MU/PT model
Stability.ai Stable Diffusion XL 0.8 | 0
Stability.ai Stable Diffusion XL 1.0 | 0

Google Gemma

3 models

Inference rate limits

Model | On-demand RPM | On-demand TPM | Tokens/day
Gemma 3 12B | 10K | 100M | 144B
Gemma 3 27B | 10K | 100M | 144B
Gemma 3 4B | 10K | 100M | 144B
Batch inference
Model | Max records/job | Min records/job | Records/input file | Input file GB | Job size GB | Concurrent jobs (base)
Gemma 3 12B | 100K | 100 | 100K | 1 | 5 | 100
Gemma 3 27B | 100K | 100 | 100K | 1 | 5 | 100
Gemma 3 4B | 100K | 100 | 100K | 1 | 5 | 100

NVIDIA

6 models

Inference rate limits

Model | On-demand RPM | On-demand TPM | Tokens/day
NVIDIA Nemotron 3 Super 120B A12B | 10K | 100M | 144B
NVIDIA Nemotron Nano 2 | 10K | 100M | 144B
NVIDIA Nemotron Nano 2 VL | 10K | 100M | 144B
NVIDIA Nemotron Nano 3 30B | 10K | 100M | 144B
Batch inference
Model | Max records/job | Min records/job | Records/input file | Input file GB | Job size GB | Concurrent jobs (base)
NVIDIA Nemotron 3 Super 120B A12B | 100K | 100 | 100K | 1 | 5 | 100
NVIDIA Nemotron Nano 12B | 100K | 100 | 100K | 1 | 5 | 100
NVIDIA Nemotron Nano 3 30B | 100K | 100 | 100K | 1 | 5 | 100
NVIDIA Nemotron Nano 9B | 100K | 100 | 100K | 1 | 5 | 100

MiniMax

3 models

Inference rate limits

Model | On-demand RPM | On-demand TPM | Tokens/day
Minimax M2 | 10K | 100M | 144B
Minimax M2.1 | 10K | 100M | 144B
MiniMax M2.5 | 10K | 100M | 144B
Batch inference
Model | Max records/job | Min records/job | Records/input file | Input file GB | Job size GB | Concurrent jobs (base)
Minimax M2 | 100K | 100 | 100K | 1 | 5 | 100
Minimax M2.1 | 100K | 100 | 100K | 1 | 5 | 100
MiniMax M2.5 | 100K | 100 | 100K | 1 | 5 | 100

Other

5 models

Inference rate limits

Model | On-demand RPM | On-demand TPM | Tokens/day
Kimi K2 Thinking | 10K | 100M | 144B
Moonshot AI Kimi K2.5 | 10K | 100M | 144B
Batch inference
Model | Max records/job | Min records/job | Records/input file | Input file GB | Job size GB | Concurrent jobs (base)
Kimi K2 Thinking | 100K | 100 | 100K | 1 | 5 | 100
Kimi K2.5 | 100K | 100 | 100K | 1 | 5 | 100
Provisioned throughput
Model | MU (commitment)
Meta Maverick 4 Scout 17B Instruct 128K | 0
Meta Maverick 4 Scout 17B Instruct 1M | 0

Account & API quotas

288 quotas

Service-wide limits not tied to a specific model — feature capacities, control-plane API request rates, and customization account limits.

Model customization (9)
Quota | Default value
Custom models per account | 100
In-progress custom model deployments | 2
Maximum input file size for distillation customization jobs | 2
Maximum line length for distillation customization jobs | 16
Maximum number of prompts for distillation customization jobs | 15K
Maximum number of training records for an Amazon Nova Canvas Fine-tuning job | 10K
Minimum number of prompts for distillation customization jobs | 100
Scheduled customization jobs | 10
Total number of custom model deployments | 10
Knowledge Bases (37)
Quota | Default value
Concurrent ingestion jobs per account | 5
Concurrent ingestion jobs per data source | 1
Concurrent ingestion jobs per knowledge base | 1
Concurrent IngestKnowledgeBaseDocuments and DeleteKnowledgeBaseDocuments requests per account | 10
CreateDataSource requests per second | 2
CreateKnowledgeBase requests per second | 2
Data sources per knowledge base | 5
DeleteDataSource requests per second | 2
DeleteKnowledgeBase requests per second | 2
DeleteKnowledgeBaseDocuments requests per second | 5
Files to add or update per ingestion job | 5M
Files to delete per ingestion job | 5M
Files to ingest per IngestKnowledgeBaseDocuments job | 25
GenerateQuery requests per second | 2
GetDataSource requests per second | 10
GetIngestionJob requests per second | 10
GetKnowledgeBase requests per second | 10
GetKnowledgeBaseDocuments requests per second | 5
Ingestion job file size with text content | 50
Ingestion job size | 100
IngestKnowledgeBaseDocuments requests per second | 5
IngestKnowledgeBaseDocuments total payload size | 6
Knowledge bases per account | 100
ListDataSources requests per second | 10
ListIngestionJobs requests per second | 10
ListKnowledgeBaseDocuments requests per second | 5
ListKnowledgeBases requests per second | 10
Maximum number of files for BDA parser | 1K
Maximum number of files for Foundation Models as a parser | 1K
Rerank requests per second | 10
Retrieve requests per second | 20
RetrieveAndGenerate requests per second | 20
RetrieveAndGenerateStream requests per second | 20
StartIngestionJob requests per second | 0.1
UpdateDataSource requests per second | 2
UpdateKnowledgeBase requests per second | 2
User query size | 1K
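With a default of 25 files per IngestKnowledgeBaseDocuments job, a larger set of documents has to be split into several calls. A minimal sketch of that split follows; the function name `batches` and the constant name are illustrative, and the actual ingestion call is deliberately left out.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")

# Default from the Knowledge Bases table above.
MAX_FILES_PER_INGEST_JOB = 25


def batches(items: Sequence[T], size: int = MAX_FILES_PER_INGEST_JOB) -> Iterator[List[T]]:
    """Yield consecutive slices of `items`, each holding at most `size` entries."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


# 60 documents become three jobs of 25, 25, and 10 files.
job_sizes = [len(chunk) for chunk in batches([f"doc-{i}" for i in range(60)])]
print(job_sizes)
```

Each batch still counts against the per-second request rate for the same operation, so the calls may also need to be spaced out.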
Data Automation (39)
Quota | Default value
(Console) Maximum document file size (MB) | 200
(Console) Maximum number of pages per document file | 20
CreateBlueprint - Max number of blueprints per account | 350
CreateBlueprintVersion - Max number of Blueprint versions per Blueprint | 10
CreateDataAutomationLibrary - Max number of data automation libraries per account | 10
Description length for fields (Characters) | 300
InvokeBlueprintOptimizationAsync - Max number of blueprint optimization concurrent jobs | 3
InvokeBlueprintOptimizationAsync - Max number of blueprint optimization jobs per day | 30
InvokeDataAutomation(Sync) - Document - Max number of requests | 60
InvokeDataAutomation(Sync) - Image - Max number of requests | 200
InvokeDataAutomationAsync - Audio - Max number of concurrent jobs | 20
InvokeDataAutomationAsync - Document - Max number of concurrent jobs | 25
InvokeDataAutomationAsync - Image - Max number of concurrent jobs | 20
InvokeDataAutomationAsync - Max number of open jobs | 1.8K
InvokeDataAutomationAsync - Video - Max number of concurrent jobs | 20
Max number of vocabulary phrases per library | 500
Maximum audio file size (MB) | 2K
Maximum audio length (Minutes) | 240
Maximum Audio Sample Rate (Hz) | 48K
Maximum Blueprints per Project (Audios) | 1
Maximum Blueprints per Project (Documents) | 40
Maximum Blueprints per Project (Images) | 1
Maximum Blueprints per Project (Videos) | 1
Maximum document file size (MB) | 500
Maximum image file size (MB) | 5
Maximum instruction field length for Audio Blueprint - (Characters) | 500
Maximum JSON Blueprint Size (Characters) | 100K
Maximum Levels of Field Hierarchy | 1
Maximum number of Blueprints per Start Inference request (Audios) | 1
Maximum number of Blueprints per Start Inference request (Documents) | 10
Maximum number of Blueprints per Start Inference request (Images) | 1
Maximum number of Blueprints per Start Inference request (Videos) | 1
Maximum number of list fields per Blueprint | 15
Maximum Number of pages per document | 3K
Maximum Resolution | 8K
Maximum video file size (MB) | 10.2K
Maximum video length (Minutes) | 240
Minimum audio length (Milliseconds) | 500
Minimum Audio Sample Rate (Hz) | 8K
Automated Reasoning (36)
Quota | Default value
Annotations in policy | 10
CancelAutomatedReasoningPolicyBuildWorkflow requests per second | 5
Concurrent builds per policy | 2
Concurrent policy builds per account | 5
CreateAutomatedReasoningPolicy requests per second | 5
CreateAutomatedReasoningPolicyTestCase requests per second | 5
CreateAutomatedReasoningPolicyVersion requests per second | 5
DeleteAutomatedReasoningPolicy requests per second | 5
DeleteAutomatedReasoningPolicyBuildWorkflow requests per second | 5
DeleteAutomatedReasoningPolicyTestCase requests per second | 5
ExportAutomatedReasoningPolicyVersion requests per second | 5
GetAutomatedReasoningPolicy requests per second | 10
GetAutomatedReasoningPolicyAnnotations requests per second | 10
GetAutomatedReasoningPolicyBuildWorkflow requests per second | 10
GetAutomatedReasoningPolicyBuildWorkflowResultAssets requests per second | 10
GetAutomatedReasoningPolicyNextScenario requests per second | 10
GetAutomatedReasoningPolicyTestCase requests per second | 10
GetAutomatedReasoningPolicyTestResult requests per second | 10
ListAutomatedReasoningPolicies requests per second | 5
ListAutomatedReasoningPolicyBuildWorkflows requests per second | 5
ListAutomatedReasoningPolicyTestCases requests per second | 5
ListAutomatedReasoningPolicyTestResults requests per second | 5
Policies per account | 100
Rules in policy | 500
Source document size (MB) | 5
Source document tokens | 122.9K
StartAutomatedReasoningPolicyBuildWorkflow requests per second | 1
StartAutomatedReasoningPolicyTestWorkflow requests per second | 1
Tests per policy | 100
Types per policy | 50
UpdateAutomatedReasoningPolicy requests per second | 5
UpdateAutomatedReasoningPolicyAnnotations requests per second | 5
UpdateAutomatedReasoningPolicyTestCase requests per second | 5
Values per type in policy | 50
Variables in policy | 200
Versions per policy | 1K
Evaluation (12)
Quota | Default value
Number of concurrent automatic model evaluation jobs | 20
Number of concurrent model evaluation jobs that use human workers | 10
Number of custom metrics | 10
Number of custom prompt datasets in a human-based model evaluation job | 1
Number of datasets per job | 5
Number of evaluation jobs | 5K
Number of metrics per dataset | 3
Number of models in a model evaluation job that uses human workers | 2
Number of models in automated model evaluation job | 1
Number of prompts in a custom prompt dataset | 1K
Size of prompt | 4
Task time for workers | 30
Advanced Prompt Optimization (2)
Quota | Default value
Active jobs per account | 20
Inactive jobs per account | 5K
Flows (35)
Quota | Default value
Agent nodes per flow | 20
Collector nodes per flow | 1
Condition nodes per flow | 5
Conditions per condition node | 5
CreateFlow requests per second | 2
CreateFlowAlias requests per second | 2
CreateFlowVersion requests per second | 2
DeleteFlow requests per second | 2
DeleteFlowAlias requests per second | 2
DeleteFlowVersion requests per second | 2
Flow aliases per flow | 10
Flow executions per account | 1K
Flow versions per flow | 10
Flows per account | 100
GetFlow requests per second | 10
GetFlowAlias requests per second | 10
GetFlowVersion requests per second | 10
Inline code nodes per flow | 5
Input nodes per flow | 1
Iterator nodes per flow | 1
Knowledge base nodes per flow | 20
Lambda function nodes per flow | 20
Lex nodes per flow | 5
ListFlowAliases requests per second | 10
ListFlows requests per second | 10
ListFlowVersions requests per second | 10
Output nodes per flow | 20
PrepareFlow requests per second | 2
Prompt nodes per flow | 20
S3 retrieval nodes per flow | 10
S3 storage nodes per flow | 10
Total nodes per flow | 40
UpdateFlow requests per second | 2
UpdateFlowAlias requests per second | 2
ValidateFlowDefinition requests per second | 2
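The per-flow node limits above interact: a flow can stay under every per-type limit and still exceed the total of 40 nodes. The sketch below checks both. The node-type labels used as dictionary keys are hypothetical shorthand for the rows in the table and are not identifiers from the Flows API.

```python
from collections import Counter
from typing import Dict, Iterable, List

# Per-flow defaults copied from the Flows table; the keys are illustrative labels.
PER_TYPE_LIMITS: Dict[str, int] = {
    "agent": 20,
    "collector": 1,
    "condition": 5,
    "inline_code": 5,
    "input": 1,
    "iterator": 1,
    "knowledge_base": 20,
    "lambda": 20,
    "lex": 5,
    "output": 20,
    "prompt": 20,
    "s3_retrieval": 10,
    "s3_storage": 10,
}
TOTAL_NODE_LIMIT = 40


def flow_violations(node_types: Iterable[str]) -> List[str]:
    """Return a message for each default limit the given node list exceeds."""
    counts = Counter(node_types)
    problems = [
        f"{kind}: {count} nodes, limit {PER_TYPE_LIMITS[kind]}"
        for kind, count in sorted(counts.items())
        if kind in PER_TYPE_LIMITS and count > PER_TYPE_LIMITS[kind]
    ]
    total = sum(counts.values())
    if total > TOTAL_NODE_LIMIT:
        problems.append(f"total: {total} nodes, limit {TOTAL_NODE_LIMIT}")
    return problems


# 20 prompt + 20 Lambda + input + output = 42 nodes: each type fits, the total does not.
print(flow_violations(["input", "output"] + ["prompt"] * 20 + ["lambda"] * 20))
```

Node types missing from the dictionary are counted toward the total only, which keeps the check conservative for types this sketch does not list.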
General (17)
Quota | Default value
Action groups per Agent | 20
Agent Collaborators per Agent | 1K
Agents per account | 1K
APIs per Agent | 11
Associated aliases per Agent | 10
Associated knowledge bases per Agent | 2
Characters in Agent instructions | 20K
Concurrent model import jobs | 1
Custom models with a creating status per account | 2
Enabled action groups per agent | 15
Endpoints per inference profile | 5
Imported models per account | 3
Inference profiles per account | 1K
Model units no-commitment Provisioned Throughputs across base models | 0
Model units no-commitment Provisioned Throughputs across custom models | 0
Number of custom prompt routers per account | 500
Parameters per function | 5
API request rates (53)
Quota | Default value
AssociateAgentKnowledgeBase requests per second | 6
CreateAgent requests per second | 6
CreateAgentActionGroup requests per second | 12
CreateAgentAlias requests per second | 2
DeleteAgent requests per second | 2
DeleteAgentActionGroup requests per second | 2
DeleteAgentAlias requests per second | 2
DeleteAgentVersion requests per second | 2
DisassociateAgentKnowledgeBase requests per second | 4
GetAgent requests per second | 15
GetAgentActionGroup requests per second | 20
GetAgentAlias requests per second | 10
GetAgentKnowledgeBase requests per second | 15
GetAgentVersion requests per second | 10
ListAgentActionGroups requests per second | 10
ListAgentAliases requests per second | 10
ListAgentKnowledgeBases requests per second | 10
ListAgents requests per second | 10
ListAgentVersions requests per second | 10
PrepareAgent requests per second | 2
Throttle rate limit for Bedrock Data Automation Runtime: ListTagsForResource | 25
Throttle rate limit for Bedrock Data Automation Runtime: TagResource | 25
Throttle rate limit for Bedrock Data Automation Runtime: UntagResource | 25
Throttle rate limit for Bedrock Data Automation: ListTagsForResource | 25
Throttle rate limit for Bedrock Data Automation: TagResource | 25
Throttle rate limit for Bedrock Data Automation: UntagResource | 25
Throttle rate limit for CreateBlueprint | 5
Throttle rate limit for CreateBlueprintVersion | 5
Throttle rate limit for CreateDataAutomationLibrary | 3
Throttle rate limit for CreateDataAutomationProject | 5
Throttle rate limit for DeleteBlueprint | 5
Throttle rate limit for DeleteDataAutomationLibrary | 3
Throttle rate limit for DeleteDataAutomationProject | 5
Throttle rate limit for GetBlueprint | 5
Throttle rate limit for GetDataAutomationLibrary | 5
Throttle rate limit for GetDataAutomationLibraryEntity | 5
Throttle rate limit for GetDataAutomationLibraryIngestionJob | 5
Throttle rate limit for GetDataAutomationProject | 5
Throttle rate limit for GetDataAutomationStatus | 10
Throttle rate limit for InvokeDataAutomationAsync | 10
Throttle rate limit for InvokeDataAutomationLibraryIngestionJob | 5
Throttle rate limit for ListBlueprints | 5
Throttle rate limit for ListDataAutomationLibraries | 5
Throttle rate limit for ListDataAutomationLibraryEntities | 5
Throttle rate limit for ListDataAutomationLibraryIngestionJobs | 5
Throttle rate limit for ListDataAutomationProjects | 5
Throttle rate limit for UpdateBlueprint | 5
Throttle rate limit for UpdateDataAutomationLibrary | 5
Throttle rate limit for UpdateDataAutomationProject | 5
UpdateAgent requests per second | 4
UpdateAgentActionGroup requests per second | 6
UpdateAgentAlias requests per second | 2
UpdateAgentKnowledgeBase requests per second | 4
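Control-plane limits like those above are expressed in requests per second, so scripts that create or update many resources in a loop can be throttled quickly. One way to stay under a given rate is to space the calls evenly, as in the sketch below. The `Pacer` class is hypothetical, and fixed spacing is only one of several reasonable strategies (retrying with backoff on throttling errors is another).

```python
import time
from typing import Callable, Optional


class Pacer:
    """Space successive calls at least 1 / rate_per_second seconds apart."""

    def __init__(
        self,
        rate_per_second: float,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ):
        if rate_per_second <= 0:
            raise ValueError("rate_per_second must be positive")
        self.interval = 1.0 / rate_per_second
        self.clock = clock
        self.sleep = sleep
        self._next_allowed: Optional[float] = None  # earliest time for the next call

    def wait(self) -> float:
        """Block until the next call is allowed; return the seconds waited."""
        now = self.clock()
        delay = 0.0 if self._next_allowed is None else max(0.0, self._next_allowed - now)
        if delay > 0:
            self.sleep(delay)
        self._next_allowed = now + delay + self.interval
        return delay


# Example: CreateAgentAlias defaults to 2 requests per second, so calls are
# spaced 0.5 s apart. Call pacer.wait() immediately before each API request.
pacer = Pacer(rate_per_second=2)
```

The same class covers fractional rates: a limit of 0.1 requests per second corresponds to one call every 10 seconds.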
Guardrails (21)
Quota | Default value
Automated Reasoning policies per guardrail | 2
Contextual grounding query length in text units | 1
Contextual grounding response length in text units | 5
Contextual grounding source length in text units | 100
Example phrases per Topic | 5
Guardrails per account | 100
On-demand ApplyGuardrail Content filter policy text units per second | 200
On-demand ApplyGuardrail Content filter policy text units per second (standard) | 200
On-demand ApplyGuardrail contextual grounding policy text units per second | 106
On-demand ApplyGuardrail Denied topic policy text units per second | 50
On-demand ApplyGuardrail Denied topic policy text units per second (standard) | 200
On-demand ApplyGuardrail requests per second | 100
On-demand ApplyGuardrail Sensitive information filter policy text units per second | 500
On-demand ApplyGuardrail Word filter policy text units per second | 500
On-demand InvokeGuardrailChecks requests per minute | 1.5K
Regex entities in Sensitive Information Filter | 30
Regex length in characters | 500
Topics per guardrail | 30
Versions per guardrail | 20
Word length in characters | 100
Words per word policy | 10K
Managed Knowledge Bases (19)
Quota | Default value
AgenticRetrieveStream requests per second per account | 1
AgenticRetrieveStream user query size | 10K
Concurrent ingestion jobs per knowledge base | 50
Data sources per knowledge base | 200
DeleteKnowledgeBaseDocuments requests per second | 10
DeleteResourcePolicy requests per second | 5
Files to ingest per IngestKnowledgeBaseDocuments request | 10
GetDocumentContent requests per second per account | 100
GetDocumentContent requests per second per knowledge base | 5
GetResourcePolicy requests per second | 5
Individual file extracted text size (MB) | 30
IngestKnowledgeBaseDocuments requests per second | 20
Knowledge bases per account | 1K
ListKnowledgeBaseDocuments requests per second | 10
PutResourcePolicy requests per second | 5
Retrieve requests per second per account | 100
Retrieve requests per second per knowledge base | 5
Retrieve user query size | 10K
Total storage size per knowledge base (TB) | 10
Prompt management (8)
Quota | Default value
CreatePrompt requests per second | 2
CreatePromptVersion requests per second | 2
DeletePrompt requests per second | 2
GetPrompt requests per second | 10
ListPrompts requests per second | 10
Prompts per account | 500
UpdatePrompt requests per second | 2
Versions per prompt | 10