Bedrock Runtime: InvokeModelWithResponseStream, exceeded maximum number of attem...

October 19, 2025 20:50 Response: 4.97 s Status: success

🚨 Error message

InvokeModelWithResponseStream: operation error Bedrock Runtime: InvokeModelWithResponseStream, exceeded maximum number of attempts, 3, https response error StatusCode: 429, RequestID: de56674f-fc7f-4e75-996a-5cf01db44ae4, ThrottlingException: Too many requests, please wait before trying again. (request id: 20251019204945998035773Y4oFrZC8) (request id: 20251019204940607720357uDdSy9XA) (request id: 20251019204939909851834dCtvqAyp) (request id: 20251019204830975898712T1afmgeN)

🤖 AI solution

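Error in plain terms:

The InvokeModelWithResponseStream operation on Bedrock Runtime failed after the SDK used up its maximum of 3 attempts. The final HTTP response had status code 429 (request ID: de56674f-fc7f-4e75-996a-5cf01db44ae4) and carried a ThrottlingException: too many requests, please wait before trying again.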

Cause:
The request rate exceeded the Amazon Bedrock API rate limit.

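Solution:
Introduce a delay between consecutive API calls. If you are using the AWS SDK, you can configure its retry policy, but it matters more to control the request rate in your own application logic.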

import time

import boto3
from botocore.exceptions import ClientError

# Example: add a delay between calls to InvokeModelWithResponseStream.
# call_bedrock_with_delay stands in for your existing call function.
def call_bedrock_with_delay(model_id, body, region_name="us-east-1"):
    # Initialize the bedrock_runtime client
    bedrock_runtime = boto3.client(
        service_name="bedrock-runtime",
        region_name=region_name,
        # ...
    )
    try:
        response = bedrock_runtime.invoke_model_with_response_stream(
            modelId=model_id,
            body=body
        )
        # ... process the response stream
        for chunk in response["body"]:
            # ...
            pass
    except ClientError as e:
        print(f"An error occurred: {e}")
        # For throttling errors, exponential backoff is the better approach;
        # this is only a simple fixed-delay example
        if e.response["Error"]["Code"] == "ThrottlingException":
            print("Throttling detected, waiting for 2 seconds before next attempt...")
            time.sleep(2)
        # You can retry here, or log the error and exit

# When calling in a loop, add a delay after each call
# for request_params in list_of_requests:
#     call_bedrock_with_delay(request_params['model_id'], request_params['body'])
#     time.sleep(1)  # 1-second delay between calls
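The fixed 2-second wait above could be replaced with exponential backoff. The sketch below is one possible version, using randomized ("full jitter") delays. The names `call_with_backoff` and `is_bedrock_throttle` and the default values are assumptions made for illustration; the throttle check relies only on the `response` dictionary that botocore's `ClientError` carries.

```python
import random
import time


def is_bedrock_throttle(exc):
    """Return True if exc looks like a botocore ClientError for throttling."""
    response = getattr(exc, "response", None) or {}
    return response.get("Error", {}).get("Code") == "ThrottlingException"


def call_with_backoff(call, is_throttled=is_bedrock_throttle,
                      max_attempts=5, base=1.0, cap=30.0, sleep=time.sleep):
    """Run call(); on throttling, wait with exponential backoff and retry.

    The delay before retry n is drawn uniformly from [0, min(cap, base * 2**n)].
    Non-throttling errors and the final failed attempt are re-raised.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return call()
        except Exception as exc:
            if not is_throttled(exc) or attempt == max_attempts - 1:
                raise
            sleep(random.uniform(0, min(cap, base * 2 ** attempt)))
```

Usage would be along the lines of `call_with_backoff(lambda: call_bedrock(model_id, body))`, where `call_bedrock` is a hypothetical function that lets the `ClientError` propagate instead of catching it.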

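Prevention:

Implement exponential backoff: when you receive a 429 error, do not retry immediately. Increase the delay with each retry attempt so the service has time to recover.

Monitor your API call rate: track how often your application calls the API, compare that rate against your Amazon Bedrock service quotas, and plan capacity ahead of time.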
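One way to watch the call rate from inside the application is a sliding-window counter, sketched below. `CallRateMonitor` and its `limit` argument are hypothetical: the real per-minute quota depends on your account, Region, and model, so look it up rather than relying on any number used here.

```python
import time
from collections import deque


class CallRateMonitor:
    """Count calls in a sliding time window and compare them with a limit."""

    def __init__(self, limit, window_seconds=60.0, clock=time.monotonic):
        self.limit = limit            # assumed quota: calls allowed per window
        self.window = window_seconds
        self.clock = clock
        self._calls = deque()         # timestamps of recent calls, oldest first

    def _prune(self, now):
        # Drop timestamps that have fallen out of the window.
        while self._calls and now - self._calls[0] >= self.window:
            self._calls.popleft()

    def record(self):
        """Register one API call made now."""
        now = self.clock()
        self._prune(now)
        self._calls.append(now)

    def usage(self):
        """Fraction of the limit consumed in the current window."""
        self._prune(self.clock())
        return len(self._calls) / self.limit

    def wait_time(self):
        """Seconds until another call fits under the limit (0.0 if it fits now)."""
        now = self.clock()
        self._prune(now)
        if len(self._calls) < self.limit:
            return 0.0
        return self._calls[0] + self.window - now
```

Before each request, the caller could sleep for `monitor.wait_time()` and then call `monitor.record()`. Logging `monitor.usage()` over time gives an early signal that the application is approaching its quota.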
