# Use OLLM with the OpenAI SDK

How to configure the OpenAI SDK to use OLLM as a secure, verifiable model provider.
This guide explains how to use OLLM with the official OpenAI SDK.
Because OLLM exposes an OpenAI-compatible API, you can integrate it with the same SDK you would use for OpenAI, changing only the `base_url` and `api_key`.
## Prerequisites
- An OLLM account
- An OLLM API key
- Python 3.8+ (for the examples below)
Install the official OpenAI SDK:

```bash
pip install openai
```

## Basic Configuration
To use OLLM, initialize the OpenAI client with:

- `base_url="https://api.ollm.com/v1"`
- `api_key="your-ollm-api-key"`
```python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.ollm.com/v1",
    api_key="your-api-key"
)
```

No additional configuration is required.
## Make a Chat Completion Request
You must explicitly specify the model in each request.
```python
response = client.chat.completions.create(
    model="near/GLM-4.6",
    messages=[
        {"role": "user", "content": "Why is the sky blue?"}
    ]
)

print(response.choices[0].message.content)
```

The response format follows the OpenAI-compatible schema.
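For reference, a simplified response body has the shape below. The field values here are invented for illustration; the structure follows the standard OpenAI chat completion schema:

```json
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "model": "near/GLM-4.6",
  "choices": [
    {
      "index": 0,
      "message": { "role": "assistant", "content": "Sunlight scatters..." },
      "finish_reason": "stop"
    }
  ],
  "usage": { "prompt_tokens": 13, "completion_tokens": 42, "total_tokens": 55 }
}
```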
## Using System Messages
You can provide system instructions in the same way as standard OpenAI requests.
```python
response = client.chat.completions.create(
    model="near/GLM-4.6",
    messages=[
        {"role": "system", "content": "You are a concise technical assistant."},
        {"role": "user", "content": "Explain TLS in two sentences."}
    ]
)

print(response.choices[0].message.content)
```

## Handling the Response Safely
In production systems, always validate the response structure before rendering output.
```python
if response and response.choices:
    content = response.choices[0].message.content
    print(content)
```

You can also access usage metadata for cost tracking:

```python
print(response.usage.total_tokens)
```

## Streaming Responses
If you want to stream partial results:
```python
stream = client.chat.completions.create(
    model="near/GLM-4.6",
    messages=[{"role": "user", "content": "Write a short paragraph about secure AI."}],
    stream=True
)

for chunk in stream:
    # In OpenAI SDK v1+, delta is an object with a .content attribute, not a dict
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
```

Streaming works the same way as with OpenAI’s API.
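If you need the complete text after streaming (for logging or caching, say), you can accumulate the deltas as they arrive. The helper below is a small sketch of the accumulation logic, not an OLLM-specific API:

```python
def collect_stream(stream) -> str:
    """Join streamed content deltas into the complete completion text.

    Accepts any iterable of chat-completion chunks; chunks without
    content (e.g. role-only or empty deltas) are skipped.
    """
    pieces = []
    for chunk in stream:
        if chunk.choices and chunk.choices[0].delta.content:
            pieces.append(chunk.choices[0].delta.content)
    return "".join(pieces)
```

Collecting pieces in a list and joining once at the end avoids quadratic string concatenation on long outputs.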
## Switching Models
To use a different model, change the `model` parameter:
```python
response = client.chat.completions.create(
    model="near/GLM-4.7",
    messages=[{"role": "user", "content": "Summarize the concept of confidential computing."}]
)
```

Ensure that the model ID is available in your OLLM account.
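To fail fast on a typo or an unavailable model, you can check the requested ID against a list of available IDs before sending requests. The helper below is a hypothetical sketch; the list could come from the OpenAI-compatible models endpoint (`client.models.list()`), assuming OLLM exposes it:

```python
def validate_model(requested: str, available_ids: list[str]) -> str:
    """Return the requested model ID if available, else raise a clear error."""
    if requested not in available_ids:
        raise ValueError(
            f"Model {requested!r} is not available; "
            f"choose one of: {sorted(available_ids)}"
        )
    return requested

# For example, available_ids could be built with:
#   available_ids = [m.id for m in client.models.list().data]
```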
## Environment Variable Configuration (Recommended)
Instead of hardcoding your API key, export it as an environment variable:

```bash
export OLLM_API_KEY="your-api-key"
```

Then initialize the client:
```python
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://api.ollm.com/v1",
    api_key=os.environ["OLLM_API_KEY"]
)
```

This prevents accidental key exposure in source code.
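To get a clear error message when the variable is missing (rather than an opaque `KeyError`), you can wrap the lookup in a small helper. `load_api_key` here is a hypothetical name, not part of either SDK:

```python
import os

def load_api_key(var: str = "OLLM_API_KEY") -> str:
    """Read the API key from the environment, failing fast if unset or empty."""
    key = os.environ.get(var)
    if not key:
        raise RuntimeError(f"{var} is not set; export it before running.")
    return key
```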
## Common Errors
### 401 Unauthorized
If you receive a 401 response:
- Verify your API key is correct
- Confirm `base_url` is set to `https://api.ollm.com/v1`
- Ensure the key has not been revoked
### Model Not Found
If the request fails due to model errors:
- Verify the model ID is correct
- Ensure the model is available in your account