Troubleshoot
Common issues and solutions when integrating OLLM with the OpenAI SDK.
If your requests fail, return unexpected responses, or behave inconsistently, use the sections below to isolate the problem.
Client Configuration Issues
Incorrect base_url
If requests fail immediately or appear to route to OpenAI instead of OLLM, verify that your client is initialized correctly.
It must be:

```python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.ollm.com/v1",
    api_key="your-api-key"
)
```

Common mistakes include:

- Omitting `base_url`
- Adding a trailing endpoint such as `/chat/completions`
- Using `https://api.ollm.com` without `/v1`

The correct base URL is:

```
https://api.ollm.com/v1
```

API Key Not Loaded
If you receive authentication errors or requests fail silently, verify that your API key is being passed correctly.
If using environment variables:

```shell
echo $OLLM_API_KEY
```

If nothing prints, the variable is not set.

Ensure your client uses:

```python
import os

from openai import OpenAI

client = OpenAI(
    base_url="https://api.ollm.com/v1",
    api_key=os.environ["OLLM_API_KEY"]
)
```

If the environment variable is misconfigured, the SDK will send an empty or invalid key.
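To fail fast instead of sending an empty key, you can add a small guard at startup. This is a sketch; `require_api_key` is a hypothetical helper, not part of the SDK:

```python
import os


def require_api_key(var_name: str = "OLLM_API_KEY") -> str:
    """Return the API key, or raise a clear error instead of sending an empty key."""
    key = os.environ.get(var_name, "").strip()
    if not key:
        raise RuntimeError(
            f"{var_name} is not set; export it before starting the application."
        )
    return key
```

Pass the result to `OpenAI(api_key=...)` so a missing variable fails at startup rather than as a confusing 401 later.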
Authentication Errors (401)
If you receive:

```
401 Unauthorized
```
The request reached OLLM but was rejected.
Common causes:

- Invalid API key
- Revoked or rotated key
- Incorrect `base_url`
- Passing the wrong environment variable
Resolution Steps
- Regenerate the API key in the OLLM dashboard.
- Confirm the `base_url` is correct.
- Restart your application after updating environment variables.
Do not retry repeatedly without correcting credentials.
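When the suspect is a wrong environment variable, logging a masked fingerprint of the loaded key confirms which credential is in use without leaking it. This is a sketch; `mask_key` is a hypothetical helper:

```python
def mask_key(key: str) -> str:
    """Show only the first and last few characters of a key for safe logging."""
    if len(key) <= 8:
        # Too short to partially reveal; hide it entirely.
        return "*" * len(key)
    return f"{key[:4]}...{key[-4:]}"
```

Compare the masked output against the key shown in the OLLM dashboard before retrying.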
Model Errors
Model Not Found
If the SDK returns a model-related error:
- Verify the model ID is correct (e.g., `near/GLM-4.6`)
- Ensure the model is available in your OLLM account
- Check for typos or case mismatches
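The checks above can be automated with a pre-flight validation against the model IDs your account exposes. This is a sketch; the `AVAILABLE_MODELS` list is illustrative, not the real catalog:

```python
import difflib

# Illustrative list; replace with the model IDs available in your OLLM account.
AVAILABLE_MODELS = ["near/GLM-4.6"]


def check_model_id(model: str) -> str:
    """Return the model ID if valid, otherwise raise with a close-match hint."""
    if model in AVAILABLE_MODELS:
        return model
    hint = difflib.get_close_matches(model, AVAILABLE_MODELS, n=1)
    suffix = f" Did you mean {hint[0]!r}?" if hint else ""
    raise ValueError(f"Unknown model {model!r}.{suffix}")
```

Running this before the request surfaces typos and case mismatches locally instead of as a failed API call.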
Example of correct usage:
```python
response = client.chat.completions.create(
    model="near/GLM-4.6",
    messages=[{"role": "user", "content": "Test"}]
)
```

If the model ID is invalid, the request will fail even if authentication succeeds.
Response Handling Issues
Attribute Errors When Accessing Response
If your code crashes at:

```python
response.choices[0].message.content
```

the likely causes are:

- The response contains an error envelope
- `choices` is empty
- The request failed but you are not checking the status
Always guard response access:
```python
if hasattr(response, "choices") and response.choices:
    print(response.choices[0].message.content)
```

Do not assume the response always contains a valid completion.
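The guard can be packaged as a small helper that returns `None` instead of raising. This is a sketch; `safe_content` is not an SDK function:

```python
def safe_content(response):
    """Return the first completion's text, or None if the response has no choices."""
    choices = getattr(response, "choices", None)
    if not choices:
        return None
    message = getattr(choices[0], "message", None)
    return getattr(message, "content", None)
```

Calling `safe_content(response)` lets you handle the missing-content case explicitly instead of crashing on an attribute error.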
Empty or Unexpected Output
If the response returns successfully but content is empty:
- Confirm the model received meaningful input
- Log the full response object for inspection
- Check token usage
```python
print(response)
print(response.usage.total_tokens)
```

If token usage is unusually low, the prompt may be malformed.
Streaming Issues
If streaming responses fail or produce no output:
```python
stream = client.chat.completions.create(..., stream=True)
```

Check:
- The model supports streaming
- You are iterating correctly over chunks
- You are checking for `delta` content before accessing it
Example guard:
```python
for chunk in stream:
    # delta.content may be None on some chunks, so check it before printing.
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
```

If streaming hangs, test the same request without `stream=True` to isolate whether the issue is streaming-specific.
Token and Context Errors
If you encounter context length or token limit errors:
- Your prompt may exceed the model’s context window
- Large inputs may need truncation or chunking
Example safeguard:
```python
MAX_CHARS = 20000
prompt = prompt[:MAX_CHARS]
```

Monitor:

```python
response.usage.total_tokens
```

Excessive token usage may also increase latency.
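When truncation would discard needed input, chunking is the alternative. A minimal splitter might look like this (a sketch; the character limit is an assumption, not a documented model limit):

```python
def chunk_text(text: str, max_chars: int = 20000) -> list[str]:
    """Split text into consecutive pieces of at most max_chars characters."""
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]
```

Each chunk can then be sent as a separate request, or summarized and recombined, depending on your workload.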
Network or Timeout Errors
If requests time out or fail intermittently:
- Check network connectivity
- Add timeout controls
- Retry only on transient failures
Example with basic timeout:
```python
response = client.chat.completions.create(
    model="near/GLM-4.6",
    messages=[{"role": "user", "content": "Test"}],
    timeout=30
)
```

Avoid retrying 401 or model-not-found errors.
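Putting these rules together, a minimal loop might retry only transient status codes with exponential backoff. This is a sketch: `send_request` stands in for your actual API call, and the retryable status set is an assumption, not an official list:

```python
import time

# Transient server-side or rate-limit failures worth retrying.
RETRYABLE_STATUS = {429, 500, 502, 503, 504}


def call_with_retries(send_request, max_attempts=3, base_delay=1.0):
    """Retry transient failures with exponential backoff; re-raise everything else.

    send_request() should return a response or raise an exception carrying a
    status_code attribute.
    """
    for attempt in range(max_attempts):
        try:
            return send_request()
        except Exception as exc:
            status = getattr(exc, "status_code", None)
            if status not in RETRYABLE_STATUS or attempt == max_attempts - 1:
                raise  # 401, model-not-found, or retries exhausted
            time.sleep(base_delay * 2 ** attempt)
```

Because 401 and model-not-found are not in the retryable set, they surface immediately for a configuration fix instead of burning retries.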
Verification & Dashboard Cross-Check
If you are unsure whether the request reached OLLM:
- Check the OLLM dashboard
- Confirm the request appears in logs
- Verify status (Success / Failed / Verified)
If no request appears in the dashboard, the issue is likely local configuration.
Debugging Checklist
Before escalating issues, confirm:
- `base_url` is exactly `https://api.ollm.com/v1`
- The API key is valid and loaded
- The model ID is correct
- The request is reaching OLLM (visible in dashboard)
- You are guarding response parsing
- You are not exceeding token limits
Working through these checks isolates most integration problems quickly.