Integrate OLLM with the OpenAI SDK
Configure the OpenAI SDK to use OLLM as a drop-in secure model provider. Set up the base URL and API key to route inference through TEE-backed execution environments.
This guide explains how to use OLLM with the official OpenAI SDK.
Because OLLM exposes an OpenAI-compatible API, you can integrate it with the same SDK you would use for OpenAI. Only the base_url and the API key change.
Prerequisites
- An OLLM account
- An OLLM API key
- Python 3.8+ (for the examples below)
Install the official OpenAI SDK:
pip install openai
Basic Configuration
To use OLLM, initialize the OpenAI client with:
- base_url="https://api.ollm.com/v1"
- api_key="your-ollm-api-key"
from openai import OpenAI
client = OpenAI(
    base_url="https://api.ollm.com/v1",
    api_key="your-api-key"
)
No additional configuration is required.
Make a Chat Completion Request
You must explicitly specify the model in each request.
response = client.chat.completions.create(
    model="near/GLM-4.6",
    messages=[
        {"role": "user", "content": "Why is the sky blue?"}
    ]
)
print(response.choices[0].message.content)
The response format follows the OpenAI-compatible schema.
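As a rough illustration of that schema, the sketch below collects the fields most applications read from a chat completion. The helper name `summarize_completion` and the stand-in `fake_response` object are hypothetical and exist only for this example. The attribute names follow the OpenAI chat completion schema, and it is an assumption that OLLM populates all of them for every model.

```python
from types import SimpleNamespace
from typing import Any, Dict, Optional


def summarize_completion(response: Any) -> Dict[str, Optional[Any]]:
    """Collect commonly used fields from a chat-completion-shaped object.

    Hypothetical helper: missing fields come back as None instead of raising.
    """
    choices = getattr(response, "choices", None) or []
    first = choices[0] if choices else None
    message = getattr(first, "message", None)
    usage = getattr(response, "usage", None)
    return {
        "model": getattr(response, "model", None),
        "content": getattr(message, "content", None),
        "finish_reason": getattr(first, "finish_reason", None),
        "total_tokens": getattr(usage, "total_tokens", None),
    }


# Stand-in object shaped like a chat completion, used instead of a live call.
fake_response = SimpleNamespace(
    model="near/GLM-4.6",
    choices=[
        SimpleNamespace(
            finish_reason="stop",
            message=SimpleNamespace(role="assistant", content="Rayleigh scattering."),
        )
    ],
    usage=SimpleNamespace(total_tokens=42),
)
summary = summarize_completion(fake_response)
print(summary["content"])
```

With a real client, the object returned by client.chat.completions.create(...) can be passed to the helper in place of the stand-in.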
Using System Messages
You can provide system instructions in the same way as standard OpenAI requests.
response = client.chat.completions.create(
    model="near/GLM-4.6",
    messages=[
        {"role": "system", "content": "You are a concise technical assistant."},
        {"role": "user", "content": "Explain TLS in two sentences."}
    ]
)
print(response.choices[0].message.content)
Handling the Response Safely
In production systems, always validate the response structure before rendering output.
if response and response.choices:
    content = response.choices[0].message.content
    print(content)
You can also access usage metadata for cost tracking:
print(response.usage.total_tokens)
Streaming Responses
If you want to stream partial results:
stream = client.chat.completions.create(
    model="near/GLM-4.6",
    messages=[{"role": "user", "content": "Write a short paragraph about secure AI."}],
    stream=True
)
for chunk in stream:
    # delta is an object, not a dict, and its content can be None
    if chunk.choices and chunk.choices[0].delta:
        print(chunk.choices[0].delta.content or "", end="")
Streaming works the same way as with OpenAI’s API.
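If you need the full text after streaming finishes, one option is to accumulate the fragments as they arrive. The sketch below assumes the 1.x OpenAI Python SDK, where each chunk's delta is an object whose content attribute may be None. The helper name `collect_stream` and the fake chunks are hypothetical stand-ins so the example runs without a network call.

```python
from types import SimpleNamespace
from typing import Any, Iterable


def collect_stream(stream: Iterable[Any]) -> str:
    """Join the text fragments of a streamed chat completion into one string."""
    parts = []
    for chunk in stream:
        choices = getattr(chunk, "choices", None)
        if not choices:
            continue  # skip chunks that carry no choices
        delta = getattr(choices[0], "delta", None)
        text = getattr(delta, "content", None)
        if text:  # content is None on role-only and final chunks
            parts.append(text)
    return "".join(parts)


def _fake_chunk(text):
    # Hypothetical stand-in for one streamed chunk.
    return SimpleNamespace(choices=[SimpleNamespace(delta=SimpleNamespace(content=text))])


fake_stream = [
    _fake_chunk(None),
    _fake_chunk("Secure "),
    _fake_chunk("AI."),
    SimpleNamespace(choices=[]),
]
full_text = collect_stream(fake_stream)
print(full_text)
```

With a real client, pass the object returned by client.chat.completions.create(..., stream=True) instead of the fake list.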
Switching Models
To use a different model, change the model parameter:
response = client.chat.completions.create(
    model="near/GLM-4.7",
    messages=[{"role": "user", "content": "Summarize the concept of confidential computing."}]
)
Ensure that the model ID is available in your OLLM account.
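One possible way to check availability from code is to list models through the SDK before sending a request. The sketch below relies on the SDK's client.models.list() call. It is an assumption, not something this guide confirms, that OLLM implements the corresponding models endpoint and returns only the models enabled for your account. The helper name `is_model_available` and the fake client are hypothetical.

```python
from types import SimpleNamespace
from typing import Any


def is_model_available(client: Any, model_id: str) -> bool:
    """Return True if model_id appears in the provider's model list.

    Assumption: the provider supports the OpenAI-style models listing.
    """
    return any(model.id == model_id for model in client.models.list())


# Hypothetical stand-in for an OpenAI client, so the example runs offline.
fake_client = SimpleNamespace(
    models=SimpleNamespace(
        list=lambda: [
            SimpleNamespace(id="near/GLM-4.6"),
            SimpleNamespace(id="near/GLM-4.7"),
        ]
    )
)
print(is_model_available(fake_client, "near/GLM-4.7"))
```

If the listing call is not supported, the request itself remains the authoritative check: an unavailable model ID causes the completion call to fail.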
Environment Variable Configuration (Recommended)
Instead of hardcoding your API key:
export OLLM_API_KEY="your-api-key"
Then initialize the client:
import os
from openai import OpenAI
client = OpenAI(
    base_url="https://api.ollm.com/v1",
    api_key=os.environ["OLLM_API_KEY"]
)
This prevents accidental key exposure in source code.
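Reading os.environ["OLLM_API_KEY"] directly raises a bare KeyError when the variable is missing. A small wrapper can fail with a clearer message instead. The sketch below is one way to do that. The helper name `ollm_client_kwargs` is hypothetical, and the returned dictionary is meant to be unpacked into the OpenAI constructor.

```python
import os
from typing import Dict, Mapping

OLLM_BASE_URL = "https://api.ollm.com/v1"


def ollm_client_kwargs(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Build OpenAI client arguments from the environment.

    Hypothetical helper: raises a descriptive error if the key is missing or blank.
    """
    api_key = env.get("OLLM_API_KEY", "").strip()
    if not api_key:
        raise RuntimeError("OLLM_API_KEY is not set; export it before creating the client.")
    return {"base_url": OLLM_BASE_URL, "api_key": api_key}


# Example with an explicit mapping instead of the real environment.
kwargs = ollm_client_kwargs({"OLLM_API_KEY": "your-api-key"})
print(kwargs["base_url"])
```

The client can then be created with OpenAI(**ollm_client_kwargs()).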
Common Errors
401 Unauthorized
If you receive a 401 response:
- Verify your API key
- Confirm base_url is set to https://api.ollm.com/v1
- Ensure the key has not been revoked
Model Not Found
If the request fails due to model errors:
- Verify the model ID is correct
- Ensure the model is available in your account
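The checks above can be turned into a small error-to-hint mapping. The sketch below inspects the status_code attribute that the OpenAI SDK attaches to its HTTP status errors. Mapping 404 to a missing model is an assumption based on OpenAI's own behavior and may not match the status OLLM returns. The helper name `describe_api_error` and the fake exception class are hypothetical.

```python
from typing import Optional


def describe_api_error(exc: Exception) -> str:
    """Translate an API error into a troubleshooting hint (hypothetical helper)."""
    status: Optional[int] = getattr(exc, "status_code", None)
    hints = {
        401: "Unauthorized: verify the API key, the base_url, and that the key is not revoked.",
        # Assumption: an unknown or unavailable model ID is reported as 404.
        404: "Model not found: verify the model ID and that it is available in your account.",
    }
    return hints.get(status, f"Unexpected error: {exc}")


class FakeStatusError(Exception):
    """Stand-in for an SDK status error, so the example runs offline."""

    def __init__(self, message: str, status_code: int):
        super().__init__(message)
        self.status_code = status_code


print(describe_api_error(FakeStatusError("invalid key", 401)))
```

In real code, the helper would be called from an except block wrapped around client.chat.completions.create(...).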