Bedrock Runtime: InvokeModelWithResponseStream, exceeded maximum number of attem...
🚨 Error Message
InvokeModelWithResponseStream: operation error Bedrock Runtime: InvokeModelWithResponseStream, exceeded maximum number of attempts, 3, https response error StatusCode: 429, RequestID: de56674f-fc7f-4e75-996a-5cf01db44ae4, ThrottlingException: Too many requests, please wait before trying again. (request id: 20251019204945998035773Y4oFrZC8) (request id: 20251019204940607720357uDdSy9XA) (request id: 20251019204939909851834dCtvqAyp) (request id: 20251019204830975898712T1afmgeN)
🤖 AI Solution
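Error in plain terms:
The InvokeModelWithResponseStream operation on Bedrock Runtime failed after the maximum of 3 attempts. The HTTPS response had status code 429 (request ID: de56674f-fc7f-4e75-996a-5cf01db44ae4) and carried a ThrottlingException: too many requests, please wait before trying again.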
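Cause:
The request rate exceeded the Amazon Bedrock API rate limit. The SDK retried automatically, but all 3 attempts were throttled (HTTP 429), so it gave up and surfaced the ThrottlingException.
Solution:
Introduce a delay between consecutive API calls. If you are using an AWS SDK, you can configure its retry policy, but it matters more to control the request rate in your own code.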
import time

import boto3
from botocore.exceptions import ClientError

# Example: add a delay between successive InvokeModelWithResponseStream calls
def call_bedrock_with_delay(model_id, body, region_name="us-east-1"):
    # Initialize the bedrock-runtime client
    bedrock_runtime = boto3.client(
        service_name="bedrock-runtime",
        region_name=region_name,
        # ...
    )
    try:
        response = bedrock_runtime.invoke_model_with_response_stream(
            modelId=model_id,
            body=body,
        )
        # ... process the response stream
        for chunk in response.get("body"):
            # ...
            pass
    except ClientError as e:
        print(f"An error occurred: {e}")
        # Only throttling is handled here; re-raise every other error
        if e.response["Error"]["Code"] != "ThrottlingException":
            raise
        # Real code should use exponential backoff for ThrottlingException;
        # this is only a simple fixed-delay example
        print("Throttling detected, waiting for 2 seconds before next attempt...")
        time.sleep(2)
        # Either retry here, or log the failure and give up

# When calling in a loop, add a delay after each call
# for request_params in list_of_requests:
#     call_bedrock_with_delay(request_params['model_id'], request_params['body'])
#     time.sleep(1)  # wait 1 second between calls
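A fixed sleep after every call ignores how long the call itself took. As a minimal sketch of controlling the request rate in code, the pacer below enforces a minimum interval between call starts. The class name `MinIntervalPacer` and its parameters are hypothetical, it is not part of boto3 or any AWS SDK, and it is not thread-safe.

```python
import time


class MinIntervalPacer:
    """Blocks so that consecutive wait() calls are at least `interval` seconds apart."""

    def __init__(self, interval, clock=time.monotonic, sleep=time.sleep):
        self.interval = interval
        self._clock = clock          # injectable for testing
        self._sleep = sleep          # injectable for testing
        self._next_allowed = None    # earliest time the next call may start

    def wait(self):
        now = self._clock()
        if self._next_allowed is not None and now < self._next_allowed:
            # Too early: sleep only for the remaining part of the interval.
            self._sleep(self._next_allowed - now)
            now = self._next_allowed
        self._next_allowed = now + self.interval
```

In a request loop you would call `pacer.wait()` immediately before each Bedrock call. The right interval depends on the quota of your account and model, so treat any value as something to tune rather than a recommendation.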
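Prevention:
- Implement exponential backoff: when you receive a 429 error, do not retry immediately. Increase the delay with each retry so the service has time to recover.
- Monitor your API call rate: track how often your application calls the API, compare it with your Amazon Bedrock service quotas, and plan ahead.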
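The exponential backoff described above can be sketched as follows. This is an illustrative helper under stated assumptions, not an AWS SDK API: the names `backoff_delay` and `retry_with_backoff`, the defaults (1 second base, 30 second cap, 5 attempts), and the use of random "full jitter" are all choices made for this example.

```python
import random
import time


def backoff_delay(attempt, base=1.0, cap=30.0):
    """Full-jitter delay: a random value in [0, min(cap, base * 2**attempt)]."""
    return random.uniform(0.0, min(cap, base * (2 ** attempt)))


def retry_with_backoff(call, is_throttled, max_attempts=5,
                       base=1.0, cap=30.0, sleep=time.sleep):
    """Run call(); on a throttling error, wait and retry up to max_attempts times."""
    for attempt in range(max_attempts):
        try:
            return call()
        except Exception as exc:
            # Re-raise anything that is not throttling, and the final failure.
            if not is_throttled(exc) or attempt == max_attempts - 1:
                raise
            sleep(backoff_delay(attempt, base, cap))
```

With boto3, `is_throttled` could check that the exception is a `botocore.exceptions.ClientError` whose `response["Error"]["Code"]` equals `"ThrottlingException"`. The jitter spreads retries from concurrent clients so they do not all hit the service at the same moment.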