Migrate to OLLM from OpenAI Apps
Migrate your existing OpenAI application to OLLM with minimal code changes. Configure the OpenAI SDK to route requests through OLLM for hardware-attested, confidential inference.
This guide explains how to use OLLM with the official OpenAI SDK.
Because OLLM exposes an OpenAI-compatible API, you can integrate it using the same SDK you would use for OpenAI, by changing only the base_url and api_key.
Prerequisites
- An OLLM account
- An OLLM API key
- Python 3.8+ (for the examples below)
Install the official OpenAI SDK:
pip install openai
Basic Configuration
To use OLLM, initialize the OpenAI client with:
- base_url="https://api.ollm.com/v1"
- api_key="your-ollm-api-key"
from openai import OpenAI

client = OpenAI(
    base_url="https://api.ollm.com/v1",
    api_key="your-api-key",
)
No additional configuration is required.
Make a Chat Completion Request
You must explicitly specify the model in each request.
response = client.chat.completions.create(
    model="near/GLM-4.6",
    messages=[
        {"role": "user", "content": "Why is the sky blue?"}
    ],
)
print(response.choices[0].message.content)
The response format follows the OpenAI-compatible schema.
Using System Messages
You can provide system instructions in the same way as standard OpenAI requests.
response = client.chat.completions.create(
    model="near/GLM-4.6",
    messages=[
        {"role": "system", "content": "You are a concise technical assistant."},
        {"role": "user", "content": "Explain TLS in two sentences."}
    ],
)
print(response.choices[0].message.content)
Handling the Response Safely
In production systems, always validate the response structure before rendering output.
if response and response.choices:
    content = response.choices[0].message.content
    print(content)
You can also access usage metadata for cost tracking:
print(response.usage.total_tokens)
Streaming Responses
If you want to stream partial results:
stream = client.chat.completions.create(
    model="near/GLM-4.6",
    messages=[{"role": "user", "content": "Write a short paragraph about secure AI."}],
    stream=True,
)
for chunk in stream:
    if chunk.choices and chunk.choices[0].delta:
        # delta is an object, not a dict; content can be None on some chunks
        print(chunk.choices[0].delta.content or "", end="")
Streaming works the same way as with OpenAI’s API.
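If the complete text is needed once streaming finishes, the pieces can be accumulated as they arrive. The following is a minimal sketch that assumes chunks shaped like the OpenAI SDK's streaming objects (text at chunk.choices[0].delta.content); collect_stream is a hypothetical helper name, not part of the SDK or of OLLM.

```python
from typing import Any, Iterable


def collect_stream(stream: Iterable[Any], echo: bool = True) -> str:
    """Optionally print streamed text as it arrives and return the full message.

    Hypothetical helper; not part of the OpenAI SDK.
    """
    parts = []
    for chunk in stream:
        # Some chunks may carry no choices at all; skip them.
        if not chunk.choices:
            continue
        delta = chunk.choices[0].delta
        # content can be None (for example on role-only or final chunks).
        text = (delta.content if delta else None) or ""
        if echo:
            print(text, end="", flush=True)
        parts.append(text)
    return "".join(parts)
```

With the stream object created above, this could be called as full_text = collect_stream(stream).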
Switching Models
To use a different model, change the model parameter:
response = client.chat.completions.create(
    model="near/GLM-4.7",
    messages=[{"role": "user", "content": "Summarize the concept of confidential computing."}],
)
Ensure that the model ID is available in your OLLM account.
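One way to check availability before sending a request is to list the models visible to your key. This sketch uses the SDK's client.models.list() call and assumes OLLM serves the matching OpenAI-compatible models endpoint, which this guide does not confirm; is_model_available is a hypothetical helper.

```python
def is_model_available(client, model_id: str) -> bool:
    """Return True if model_id appears among the models listed for this client.

    Assumption: the provider implements the OpenAI-compatible models
    endpoint that backs client.models.list().
    """
    # Iterating the result of models.list() yields objects with an `id` field.
    return any(model.id == model_id for model in client.models.list())
```

For example, is_model_available(client, "near/GLM-4.7") could be called once at startup, so that a mistyped model ID is caught before the first request.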
Environment Variable Configuration (Recommended)
Instead of hardcoding your API key:
export OLLM_API_KEY="your-api-key"
Then initialize the client:
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://api.ollm.com/v1",
    api_key=os.environ["OLLM_API_KEY"],
)
This prevents accidental key exposure in source code.
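Note that os.environ["OLLM_API_KEY"] raises a bare KeyError when the variable is unset. A minimal sketch of a friendlier check follows; ollm_client_kwargs is a hypothetical helper, and only the base URL and the variable name come from this guide.

```python
import os
from typing import Mapping, Optional

OLLM_BASE_URL = "https://api.ollm.com/v1"


def ollm_client_kwargs(env: Optional[Mapping[str, str]] = None) -> dict:
    """Build keyword arguments for OpenAI(...) from environment variables.

    Hypothetical helper; pass a mapping as `env` to override os.environ.
    """
    env = os.environ if env is None else env
    api_key = env.get("OLLM_API_KEY", "").strip()
    if not api_key:
        # Fail early with a clear message instead of a bare KeyError.
        raise RuntimeError("OLLM_API_KEY is not set; export it before starting the app.")
    return {"base_url": OLLM_BASE_URL, "api_key": api_key}
```

The client could then be created with client = OpenAI(**ollm_client_kwargs()).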
Common Errors
401 Unauthorized
If you receive a 401 response:
- Verify your API key
- Confirm base_url is set to https://api.ollm.com/v1
- Ensure the key has not been revoked
Model Not Found
If a request fails because the model cannot be found:
- Verify the model ID is correct
- Ensure the model is available in your account
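In application code, these two cases can be told apart by HTTP status. The sketch below relies on the status_code attribute carried by the OpenAI SDK's APIStatusError exceptions; mapping "model not found" to HTTP 404 is an assumption about OLLM's behavior, and troubleshooting_hint is a hypothetical helper.

```python
from typing import Optional


def troubleshooting_hint(exc: Exception) -> Optional[str]:
    """Map an API error to a hint from the checklists above, if one applies."""
    # openai.APIStatusError subclasses expose the HTTP status as status_code.
    status = getattr(exc, "status_code", None)
    if status == 401:
        return "Check the API key, the base_url, and whether the key was revoked."
    if status == 404:
        # Assumption: an unknown model ID is reported as HTTP 404.
        return "Check the model ID and that the model is available in your account."
    return None
```

A request could be wrapped in try/except openai.APIStatusError, logging troubleshooting_hint(exc) when it returns a hint and re-raising otherwise.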