Language Models

Text generation and chat models available through OLLM, covering both TEE and ZDR execution environments and their use with the AI SDK chatModel() method.

Language models handle text generation and chat: single-prompt completions, multi-turn conversations, reasoning, and tool use. They are the most common model type and back most OLLM applications.

When to Use

Use a language model when you need to generate or transform text:

  • Chat assistants and multi-turn conversations
  • Single-prompt generation, summarization, and rewriting
  • Reasoning tasks (use a reasoning-capable model with reasoningEffort)
  • Code generation and analysis
  • Tool calling and structured output

For image input, see Vision. For turning speech into text, see Audio.

AI SDK Method

Language models are accessed with chatModel() and used with the AI SDK's generateText and streamText:

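```typescript
// language-model.ts
import { createOLLM } from '@orgn/gateway';
import { generateText } from 'ai';

// Create an OLLM provider instance; the API key is read from the environment.
const ollm = createOLLM({ apiKey: process.env.OLLM_API_KEY });

// Request a single text response from a chat model.
const { text } = await generateText({
  model: ollm.chatModel('near_glm_5_1'),
  prompt: 'What is OLLM?',
});
```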

For streaming, system messages, multi-turn conversations, and reasoning options, see the Vercel AI SDK integration.
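A multi-turn conversation is usually carried as a growing list of messages. The sketch below shows one way to keep that history in plain TypeScript. The `ChatMessage` type and the `addTurn` helper are hypothetical names introduced here for illustration; they are not part of OLLM or the AI SDK, and the assistant reply is a placeholder rather than real model output.

```typescript
type Role = 'system' | 'user' | 'assistant';
type ChatMessage = { role: Role; content: string };

// Hypothetical helper: returns a new history with one more message appended,
// leaving the previous array untouched.
function addTurn(history: ChatMessage[], role: Role, content: string): ChatMessage[] {
  return [...history, { role, content }];
}

// Start with a system message, then alternate user and assistant turns.
let conversation: ChatMessage[] = [
  { role: 'system', content: 'You are a concise assistant.' },
];
conversation = addTurn(conversation, 'user', 'What is a TEE?');
// Placeholder: in a real app this text would come from the model's response.
conversation = addTurn(conversation, 'assistant', '(model reply goes here)');
conversation = addTurn(conversation, 'user', 'How is that different from ZDR?');

console.log(conversation.map((m) => m.role).join(' -> '));
```

The resulting array can be passed as the `messages` option of generateText or streamText in place of `prompt`, with the model still selected through chatModel().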

The legacy /v1/completions endpoint is not supported. Every completion task can be expressed as a chat call with chatModel().
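As a rough illustration of that migration, a completion-style prompt can be wrapped in a chat message list. This is a minimal sketch: `completionToChat` and the `ChatMessage` type are hypothetical names, and the system instruction text is only an example of how raw-continuation behavior might be requested.

```typescript
type ChatMessage = { role: 'system' | 'user' | 'assistant'; content: string };

// Hypothetical helper (not part of OLLM or the AI SDK): wraps a legacy
// completion-style prompt in a chat message list. Optional instructions
// become a leading system message.
function completionToChat(prompt: string, instructions?: string): ChatMessage[] {
  const messages: ChatMessage[] = [];
  if (instructions) {
    messages.push({ role: 'system', content: instructions });
  }
  messages.push({ role: 'user', content: prompt });
  return messages;
}

const chatMessages = completionToChat(
  'Once upon a time',
  'Continue the text you are given. Reply with the continuation only.',
);
console.log(chatMessages);
```

The returned list can be supplied as `messages` to generateText together with a model from chatModel(). For a prompt that needs no extra instructions, passing it directly as `prompt` is simpler.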

TEE Catalog

The models below run in Trusted Execution Environments (TEEs) on NEAR and Phala infrastructure, using Intel TDX and NVIDIA H100 confidential compute. Every request produces a cryptographic attestation receipt.

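| Model | Provider | Infrastructure | Context |
| --- | --- | --- | --- |
| DeepSeek V3.1 | DeepSeek | near | 128K |
| DeepSeek V3.1 | DeepSeek | phala | 164K |
| GLM 4.7 | ZAI | near | 205K |
| GLM 4.7 | ZAI | phala | 203K |
| GLM 4.7 Flash | ZAI | phala | 203K |
| GLM 5 | ZAI | near | 203K |
| GLM 5.1 | ZAI | near | 203K |
| Kimi K2.5 | Moonshot | phala | 262K |
| GPT-OSS 120B | OpenAI | near | 131K |
| GPT-OSS 120B | OpenAI | phala | 131K |
| GPT-OSS 20B | OpenAI | phala | 131K |
| Qwen3 30B | Alibaba | near | 262K |
| Qwen3 30B | Alibaba | phala | 262K |
| Qwen 2.5 7B | Alibaba | phala | 32K |
| Qwen2.5 7B Instruct | Alibaba | phala | 33K |
| Qwen3.5 122B | Alibaba | near | 131K |
| Qwen3.5 27B | Alibaba | phala | 262K |
| Venice Uncensored 24B | Venice | phala | 33K |
| Gemma 3 27B | Google | phala | 53K |
| Llama 3.3 70B | Meta | phala | 131K |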
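When a prompt is long, the context column is the first thing to check. The sketch below filters a few rows from the TEE catalog by context size. The `TeeModel` type and both helper functions are hypothetical, and the conversion assumes "K" means 1,000 tokens and "M" means 1,000,000 tokens, which this page does not state.

```typescript
type TeeModel = { model: string; infrastructure: 'near' | 'phala'; context: string };

// A few rows copied from the TEE catalog; not the full list.
const teeSample: TeeModel[] = [
  { model: 'DeepSeek V3.1', infrastructure: 'near', context: '128K' },
  { model: 'GLM 5.1', infrastructure: 'near', context: '203K' },
  { model: 'Kimi K2.5', infrastructure: 'phala', context: '262K' },
  { model: 'Qwen 2.5 7B', infrastructure: 'phala', context: '32K' },
];

// ASSUMPTION: "K" = 1,000 tokens and "M" = 1,000,000 tokens.
function contextTokens(label: string): number {
  const match = /^(\d+(?:\.\d+)?)([KM])$/.exec(label);
  if (!match) throw new Error(`Unrecognized context label: ${label}`);
  const scale = match[2] === 'K' ? 1000 : 1000000;
  // Round to avoid floating-point noise on labels such as "1.1M".
  return Math.round(parseFloat(match[1]) * scale);
}

// Returns the names of models whose context window is at least `neededTokens`.
function modelsFitting(models: TeeModel[], neededTokens: number): string[] {
  return models
    .filter((m) => contextTokens(m.context) >= neededTokens)
    .map((m) => m.model);
}

console.log(modelsFitting(teeSample, 200000)); // [ 'GLM 5.1', 'Kimi K2.5' ]
```

The token budget should cover both the input and the expected output, so leaving headroom below the listed figure is a sensible default.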

ZDR Catalog

The models below run on Vercel's AI infrastructure under zero data retention (ZDR) provider agreements. No attestation receipts are generated.

Anthropic

| Model | Context |
| --- | --- |
| Claude 3 Haiku | 200K |
| Claude 3.5 Haiku | 200K |
| Claude 3.7 Sonnet | 200K |
| Claude Haiku 4.5 | 200K |
| Claude Sonnet 4 | 1M |
| Claude Sonnet 4.5 | 1M |
| Claude Sonnet 4.6 | 1M |
| Claude Opus 4 | 200K |
| Claude Opus 4.1 | 200K |
| Claude Opus 4.5 | 200K |
| Claude Opus 4.6 | 1M |
| Claude Opus 4.7 | 1M |

OpenAI

| Model | Context |
| --- | --- |
| GPT-4o | 8K |
| GPT-4o mini | 8K |
| GPT-4.1 | 8K |
| GPT-4.1 mini | 8K |
| GPT-4.1 nano | 1M |
| GPT-5 | 400K |
| GPT-5 mini | 400K |
| GPT-5 nano | 400K |
| GPT-5 Codex | 400K |
| GPT-5.1 Instant | 128K |
| GPT-5.1-Codex | 400K |
| GPT 5 Chat | 128K |
| GPT 5.1 Codex Max | 400K |
| GPT 5.1 Codex Mini | 400K |
| GPT 5.1 Thinking | 400K |
| GPT 5.2 | 400K |
| GPT 5.2 Chat | 128K |
| GPT 5.2 Codex | 400K |
| GPT 5.3 Codex | 400K |
| GPT 5.4 | 1.1M |
| GPT 5.4 Mini | 400K |
| GPT 5.4 Nano | 400K |
| GPT 5.4 Pro | 1.1M |
| GPT-OSS 20B | 131K |
| GPT-OSS 120B | 131K |
| GPT OSS Safeguard 20B | 131K |
| o1 | 200K |
| o3-mini | |
| o4-mini | |

Google

| Model | Context |
| --- | --- |
| Gemini 2.0 Flash | 1M |
| Gemini 2.0 Flash-Lite | 1M |
| Gemini 2.5 Flash-Lite | 1M |
| Gemini 2.5 Flash | 1M |
| Gemini 2.5 Pro | 1M |
| Gemini 3 Flash | 1M |
| Gemini 3 Pro Preview | 1M |
| Gemini 3.1 Flash Lite Preview | 1M |
| Gemini 3.1 Pro Preview | 1M |
| Gemma 4 26B A4B IT | 262K |
| Gemma 4 31B IT | 262K |

Meta

| Model | Context |
| --- | --- |
| Llama 3.1 8B | 131K |
| Llama 3.1 70B | 131K |
| Llama 3.2 1B | 128K |
| Llama 3.2 3B | 128K |
| Llama 3.3 70B | 128K |
| Llama 4 Scout | 131K |
| Llama 4 Maverick | 524K |

Mistral

| Model | Context |
| --- | --- |
| Mistral Small | 32K |
| Mistral Medium | 128K |
| Mistral Large 3 | 256K |
| Mistral Nemo | 131K |
| Ministral 3B | 128K |
| Ministral 8B | 128K |
| Ministral 14B | 256K |
| Mixtral MoE 8x22B Instruct | 66K |
| Magistral Small | 128K |
| Magistral Medium | 128K |
| Codestral | 128K |
| Devstral 2 | 256K |
| Devstral Small | 128K |
| Devstral Small 2 | 256K |

Alibaba (Qwen)

| Model | Context |
| --- | --- |
| Qwen 3 14B | 41K |
| Qwen 3 30B | 41K |
| Qwen 3 32B | 131K |
| Qwen 3 235B | 131K |
| Qwen3 235B Thinking | 262K |
| Qwen3 Coder | 262K |
| Qwen3 Coder 30B | 262K |
| Qwen3 Coder Next | 256K |
| Qwen3 Next 80B | 262K |
| Qwen 3.6 Plus | 1M |

DeepSeek

| Model | Context |
| --- | --- |
| DeepSeek R1 | 164K |
| DeepSeek V3 | 164K |
| DeepSeek V3.1 | 164K |
| DeepSeek V3.2 | 164K |

Moonshot

| Model | Context |
| --- | --- |
| Kimi K2 | 131K |
| Kimi K2 Turbo | 256K |
| Kimi K2 0905 | 256K |
| Kimi K2 Thinking | 262K |
| Kimi K2 Thinking Turbo | 262K |
| Kimi K2.5 | 262K |

ZAI

| Model | Context |
| --- | --- |
| GLM 4.6 | 205K |
| GLM 4.7 | 205K |
| GLM 4.7 Flash | 200K |
| GLM 5 | 203K |
| GLM 5.1 | 203K |

Other Language Models

| Model | Provider | Context |
| --- | --- | --- |
| MiniMax M2.1 | MiniMax | 205K |
| MiniMax M2.5 | MiniMax | 205K |
| Minimax M2.7 | MiniMax | 205K |
| Morph V3 Fast | Morph | 82K |
| Morph V3 Large | Morph | 82K |
| INTELLECT 3 | PrimeIntellect | 131K |
| Nemotron 3 Nano 30B | NVIDIA | 262K |
| Nemotron Nano 9B v2 | NVIDIA | 131K |
| NVIDIA Nemotron 3 Super 120B A12B | NVIDIA | 256K |
| Nova 2 Lite | Amazon | 1M |
| Nova Lite | Amazon | 300K |
| Nova Micro | Amazon | 128K |
| Nova Pro | Amazon | 300K |

Several models in this catalog also accept image input. Models with vision capability are listed on the Vision page.
