Error Handling

Handle errors gracefully and implement retries so your application stays resilient.

Error response format

All errors return a JSON body with a detail field:

JSON
{"detail": "insufficient wallet balance"}
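
As a minimal sketch, the detail field can be read out of an error body with the standard library. The helper name `error_detail` is hypothetical and not part of any SDK. It assumes the body has the shape shown above and falls back to the raw text otherwise.

```python
import json

def error_detail(body: str) -> str:
    """Return the `detail` message from an error body, or the raw body as a fallback."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return body  # not JSON, e.g. an HTML error page from a proxy
    if isinstance(payload, dict) and "detail" in payload:
        return str(payload["detail"])
    return body  # JSON, but not the expected {"detail": ...} shape

print(error_detail('{"detail": "insufficient wallet balance"}'))
```

Falling back to the raw body keeps the original message visible in logs even when the response is not the expected JSON.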

Which errors to retry

Status  Retry  Meaning and action
429     Yes    Rate limited. Wait and retry with exponential backoff.
502     Yes    Provider error. The upstream LLM returned an error. Retry after a short wait.
503     Yes    Gateway temporarily unavailable. Retry with backoff.
402     No     Wallet empty. Top up at the dashboard before retrying.
401     No     Invalid API key. Check the key and do not retry.
422     No     Bad request body. Fix the payload before retrying.
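
The retry decision in the table above can be expressed as a small lookup. This is an illustrative sketch: `RETRYABLE`, `NON_RETRYABLE`, and `should_retry` are hypothetical names, and any status code not listed in the table is treated as non-retryable here, which is an assumption rather than documented behavior.

```python
# Status codes taken from the table above.
RETRYABLE = {429, 502, 503}       # transient: retry with backoff
NON_RETRYABLE = {401, 402, 422}   # need a fix (key, balance, payload) first

def should_retry(status_code: int) -> bool:
    """Return True only for the status codes the table marks as retryable."""
    return status_code in RETRYABLE

for code in sorted(RETRYABLE | NON_RETRYABLE):
    print(code, "retry" if should_retry(code) else "do not retry")
```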

Retry with exponential backoff

retry.py
from openai import OpenAI, APIStatusError, APIConnectionError
import time

client = OpenAI(
    api_key="sk-live-your-key",
    base_url="https://inferexapi.cloudvoice.in/v1",
)

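def chat_with_retry(messages, max_retries=3):
    last_error = None
    for attempt in range(max_retries):
        try:
            return client.chat.completions.create(
                model="default",
                messages=messages,
            )
        except APIConnectionError as e:
            last_error = e                   # network failure or timeout: retry
        except APIStatusError as e:
            if e.status_code == 402:
                raise RuntimeError("Wallet empty: top up at the dashboard") from e
            if e.status_code != 429 and e.status_code < 500:
                raise                        # 401, 422: don't retry
            last_error = e                   # 429 or 5xx: retry
        if attempt < max_retries - 1:        # no point sleeping after the final attempt
            wait = 2 ** attempt              # 1s, 2s, 4s, ...
            print(f"Retry {attempt + 1} after {wait}s...")
            time.sleep(wait)
    # Chain the last failure so the root cause shows up in the traceback.
    raise RuntimeError("Max retries exceeded") from last_error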
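
The doubling schedule above can also be capped and randomized so that many clients do not retry at the same instant. The following is a sketch of one common variation (often called "full jitter"), not something the gateway is known to require. `backoff_delay`, `base`, and `cap` are hypothetical names chosen for illustration.

```python
import random

def backoff_delay(attempt: int, base: float = 1.0, cap: float = 30.0) -> float:
    """Seconds to wait before the retry that follows attempt number `attempt` (0-based)."""
    ceiling = min(cap, base * 2 ** attempt)   # 1, 2, 4, ... capped at `cap`
    return random.uniform(0, ceiling)         # full jitter: a random point in [0, ceiling]

for attempt in range(4):
    print(f"attempt {attempt}: wait {backoff_delay(attempt):.2f}s")
```

In `chat_with_retry`, `wait = backoff_delay(attempt)` could take the place of `wait = 2 ** attempt`.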