Bedrock Runtime: InvokeModelWithResponseStream, exceeded maximum number of attem...
🚨 Error message
InvokeModelWithResponseStream: operation error Bedrock Runtime: InvokeModelWithResponseStream, exceeded maximum number of attempts, 3, https response error StatusCode: 429, RequestID: 96f78182-b0e4-4e57-991d-79639de48165, ThrottlingException: Too many requests, please wait before trying again. (request id: 20251127215017529953871hSy7pGA2) (request id: 20251127215013419007644vXFKl4BB) (request id: 20251127215013410983617jXN9N6vx) (request id: 20251127215002807406249KbPEBQcW)
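🤖 AI solution
Error analysis
1. What the error says
Original: ThrottlingException: Too many requests, please wait before trying again.
Meaning: the request was throttled because too many requests were sent; wait before retrying.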
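2. Cause analysis
This is an API throttling error from the AWS Bedrock service. The root cause is:
- Too many requests were sent to the Bedrock API in a short period
- The server-side request rate limit (the per-minute or per-second request quota) was exceeded
- The SDK's automatic retries were exhausted: the request still failed after the maximum of 3 attempts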
3. Solution
import json
import time

import boto3
from botocore.exceptions import ClientError


def invoke_bedrock_with_retry(prompt, max_retries=5, base_delay=1):
    """
    Call Bedrock with exponential-backoff retries when the request is throttled.
    """
    bedrock_runtime = boto3.client('bedrock-runtime')

    for attempt in range(max_retries):
        try:
            response = bedrock_runtime.invoke_model_with_response_stream(
                modelId='anthropic.claude-3-sonnet-20240229-v1:0',
                body=json.dumps({
                    # Required by Anthropic models on Bedrock
                    "anthropic_version": "bedrock-2023-05-31",
                    "messages": [{"role": "user", "content": prompt}],
                    "max_tokens": 1000
                })
            )
            return response
        except ClientError as e:
            # Errors other than throttling are not retried
            if e.response['Error']['Code'] != 'ThrottlingException':
                raise
            if attempt == max_retries - 1:
                raise RuntimeError(
                    f"Still throttled after {max_retries} attempts"
                ) from e
            # Exponential backoff: the wait doubles on each attempt
            delay = base_delay * (2 ** attempt)
            print(f"Request throttled, retrying in {delay} seconds...")
            time.sleep(delay)


# Usage example
try:
    response = invoke_bedrock_with_retry("your prompt")
    # Process the response...
except Exception as e:
    print(f"Call failed: {e}")
4. Prevention
Request quota management:
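One way to stay under a request-rate quota is to pace calls on the client side before they reach Bedrock. The following is a minimal token-bucket sketch, not an AWS-provided mechanism; the class name `RateLimiter` and the `rate` and `capacity` values are assumptions that would need to be matched to the account's actual quota.

```python
import threading
import time


class RateLimiter:
    """Hypothetical token bucket: about `rate` calls per second, bursts up to `capacity`."""

    def __init__(self, rate, capacity, clock=time.monotonic, sleep=time.sleep):
        self.rate = float(rate)
        self.capacity = float(capacity)
        self._tokens = float(capacity)   # start with a full bucket
        self._clock = clock
        self._sleep = sleep
        self._last = clock()
        self._lock = threading.Lock()

    def acquire(self):
        """Block until one token is available, then consume it."""
        while True:
            with self._lock:
                now = self._clock()
                # Refill in proportion to the time elapsed since the last check
                elapsed = now - self._last
                self._tokens = min(self.capacity, self._tokens + elapsed * self.rate)
                self._last = now
                if self._tokens >= 1:
                    self._tokens -= 1
                    return
                # Time needed until one whole token has accumulated
                wait = (1 - self._tokens) / self.rate
            self._sleep(wait)
```

Calling `limiter.acquire()` immediately before each Bedrock request would hold the process to roughly the configured rate. It does not coordinate across multiple processes or hosts that share the same quota.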
⚡ Retry strategy tuning:
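Two common adjustments are adding random jitter to the backoff delay, so that concurrent clients do not retry in lockstep, and raising the SDK's own retry limit. The sketch below shows both under stated assumptions: `backoff_delay` and `make_client_with_sdk_retries` are hypothetical helper names, and the default values are illustrative rather than recommended settings.

```python
import random


def backoff_delay(attempt, base_delay=1.0, max_delay=30.0):
    """Full-jitter delay: uniform between 0 and the capped exponential bound."""
    bound = min(max_delay, base_delay * (2 ** attempt))
    return random.uniform(0, bound)


def make_client_with_sdk_retries(max_attempts=8):
    """Build a Bedrock Runtime client that lets botocore retry throttled calls.

    Imports are local so the jitter helper above works without boto3 installed.
    """
    import boto3
    from botocore.config import Config

    # 'adaptive' mode adds client-side rate limiting on top of standard retries
    config = Config(retries={"max_attempts": max_attempts, "mode": "adaptive"})
    return boto3.client("bedrock-runtime", config=config)
```

In a manual retry loop, `time.sleep(backoff_delay(attempt))` would replace a fixed `base_delay * (2 ** attempt)` wait.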
Recommended monitoring tool: AWS CloudWatch can monitor throttling metrics and quota usage for the Bedrock service.
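As a rough illustration of checking throttling in CloudWatch, the sketch below sums a throttle metric over a recent window. It assumes the `AWS/Bedrock` namespace, the `InvocationThrottles` metric, and the `ModelId` dimension; verify these against the metrics actually visible in your account. The function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def build_throttle_query(model_id, minutes=60, now=None):
    """Build keyword arguments for CloudWatch get_metric_statistics."""
    now = now or datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/Bedrock",            # assumed namespace
        "MetricName": "InvocationThrottles",   # assumed metric name
        "Dimensions": [{"Name": "ModelId", "Value": model_id}],
        "StartTime": now - timedelta(minutes=minutes),
        "EndTime": now,
        "Period": 60,                          # one datapoint per minute
        "Statistics": ["Sum"],
    }


def count_recent_throttles(model_id, minutes=60):
    """Return the total number of throttled invocations in the window."""
    import boto3  # local import: only needed when actually querying AWS

    cloudwatch = boto3.client("cloudwatch")
    result = cloudwatch.get_metric_statistics(
        **build_throttle_query(model_id, minutes)
    )
    return sum(point["Sum"] for point in result["Datapoints"])
```

A non-zero result for a model ID such as the one used in the solution above would suggest that request pacing or a quota increase is worth looking into.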