Bedrock Runtime: InvokeModelWithResponseStream, exceeded maximum number of attem...

October 19, 2025, 20:50 Response time: 4.97 s Status: success

🚨 Error message

InvokeModelWithResponseStream: operation error Bedrock Runtime: InvokeModelWithResponseStream, exceeded maximum number of attempts, 3, https response error StatusCode: 429, RequestID: de56674f-fc7f-4e75-996a-5cf01db44ae4, ThrottlingException: Too many requests, please wait before trying again. (request id: 20251019204945998035773Y4oFrZC8) (request id: 20251019204940607720357uDdSy9XA) (request id: 20251019204939909851834dCtvqAyp) (request id: 20251019204830975898712T1afmgeN)

🤖 AI solution

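Error explained:

The InvokeModelWithResponseStream operation on Bedrock Runtime failed after exhausting the maximum of 3 attempts. The HTTPS response had status code 429 (request ID: de56674f-fc7f-4e75-996a-5cf01db44ae4) and carried a ThrottlingException: too many requests, please wait before trying again.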

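Cause:
The request rate exceeded the rate limit of the Amazon Bedrock API.

Solution:
Introduce a delay between successive API calls. If you are using an AWS SDK, you can configure its retry policy, but it matters more to control the request rate in your own code logic.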

import time

import boto3
from botocore.exceptions import ClientError

# Example: add a delay between successive InvokeModelWithResponseStream calls
def call_bedrock_with_delay(model_id, body, region_name="us-east-1"):
    # Initialize the bedrock-runtime client
    bedrock_runtime = boto3.client(
        service_name="bedrock-runtime",
        region_name=region_name,
        # ...
    )
    try:
        response = bedrock_runtime.invoke_model_with_response_stream(
            modelId=model_id,
            body=body
        )
        # ... process the response stream
        for chunk in response.get("body"):
            # ...
            pass
    except ClientError as e:
        print(f"An error occurred: {e}")
        # Only treat throttling as recoverable; re-raise everything else
        if e.response["Error"]["Code"] != "ThrottlingException":
            raise
        # For throttling errors, exponential backoff is the better strategy;
        # this example only uses a simple fixed delay
        print("Throttling detected, waiting for 2 seconds before next attempt...")
        time.sleep(2)
        # You can retry here, or log the error and exit

# When calling in a loop, add a delay after each call
# for request_params in list_of_requests:
#     call_bedrock_with_delay(request_params['model_id'], request_params['body'])
#     time.sleep(1)  # wait 1 second between calls
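
A fixed one-second sleep after every call always pays the full delay, even when the call itself already took longer than that. As an alternative, here is a minimal sketch of a client-side limiter that only sleeps for the remainder of the interval. The class name MinIntervalLimiter and the interval value are assumptions made for illustration; they are not part of boto3 or Bedrock, and the right interval depends on the quota that applies to your account and model.

```python
import time


class MinIntervalLimiter:
    """Hypothetical helper: keeps consecutive wait() calls at least `interval` seconds apart."""

    def __init__(self, interval, clock=time.monotonic, sleep=time.sleep):
        self.interval = interval
        self._clock = clock          # injectable so the limiter can be tested without real waiting
        self._sleep = sleep
        self._next_allowed = None    # None until the first call has been made

    def wait(self):
        now = self._clock()
        if self._next_allowed is not None and now < self._next_allowed:
            # Too early: sleep only for the time still missing from the interval.
            self._sleep(self._next_allowed - now)
            now = self._next_allowed
        self._next_allowed = now + self.interval
```

In the loop shown earlier, this would mean creating one limiter up front (for example MinIntervalLimiter(1.0)) and calling limiter.wait() before each request instead of calling time.sleep(1) after it.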

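Preventive measures:

  • Implement exponential backoff: when you hit a 429 error, do not retry immediately. Increase the delay with each retry so the service has time to recover.

  • Monitor your API call rate: keep a close eye on how often your application calls the API, compare it against the Amazon Bedrock service quotas, and plan ahead.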
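
The exponential backoff mentioned above can be sketched roughly as follows. This is an illustration under stated assumptions, not the SDK's own retry logic: ThrottledError is a hypothetical stand-in for whatever exception your client raises on a 429, and the "full jitter" randomization, the base delay, and the cap are choices made for this example.

```python
import random
import time


class ThrottledError(Exception):
    """Hypothetical stand-in for a 429 / ThrottlingException from the service."""


def backoff_delays(max_retries, base=1.0, cap=30.0, rng=random.random):
    """Yield one delay per retry, drawn from [0, min(cap, base * 2**attempt))."""
    for attempt in range(max_retries):
        yield rng() * min(cap, base * (2 ** attempt))


def call_with_backoff(func, max_retries=5, base=1.0, cap=30.0, sleep=time.sleep):
    """Call func(); on ThrottledError, wait and retry up to max_retries times."""
    for delay in backoff_delays(max_retries, base, cap):
        try:
            return func()
        except ThrottledError:
            # The upper bound of the wait doubles with every failed attempt.
            sleep(delay)
    # Final attempt: any error now propagates to the caller.
    return func()
```

With real Bedrock calls, func would wrap the request and translate a throttling error into ThrottledError (or the except clause would check the error code directly). The random jitter is there so that many clients throttled at the same moment do not all retry at the same moment.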