Image & Video Models

Image and video generation models on OLLM, reachable through the OpenAI-compatible API but not wired into the AI SDK provider interface.

These models produce visual media from text prompts and, for some models, from reference images or video.

When to Use

Image generation: creating, editing, or inpainting images from a text prompt
Video generation: text-to-video, image-to-video, and motion-controlled clips

To understand an existing image rather than generate one, use a Vision model instead.

How to Access

Image and video models are not available through the AI SDK provider. Calling ollm.imageModel() throws a NoSuchModelError, and there is no AI SDK helper for video generation.

Image-output and video-output models are reachable through the OpenAI-compatible OLLM API over raw HTTP. They appear in ollm.listModels() results (for example, with 'image' in output_modalities), so you can discover their IDs at runtime. The request itself must be made directly against the gateway endpoint rather than through generateText or streamText.
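The discovery and raw-HTTP steps described above can be sketched roughly as follows. This is a minimal illustration, not the gateway's documented contract: the ModelEntry shape is a guess based only on the output_modalities field mentioned above, modelsWithOutput and buildImageRequest are hypothetical helper names, and the /images/generations path is assumed from the OpenAI images API convention. The base URL, API key, and model ID are placeholders you would need to supply from your own OLLM configuration.

```typescript
// Hypothetical shape of one entry in ollm.listModels() results. Only
// `output_modalities` is mentioned in the text; `id` is assumed.
interface ModelEntry {
  id: string;
  output_modalities: string[];
}

// Keep only the models that list the given modality as an output,
// e.g. "image" to find image generation models at runtime.
function modelsWithOutput(models: ModelEntry[], modality: string): ModelEntry[] {
  return models.filter((m) => m.output_modalities.includes(modality));
}

interface RawRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// Build (but do not send) an OpenAI-style image generation request.
// ASSUMPTION: the gateway accepts POST {baseUrl}/images/generations with
// a JSON body containing `model` and `prompt`, as the OpenAI images API does.
function buildImageRequest(
  baseUrl: string,
  apiKey: string,
  model: string,
  prompt: string,
): RawRequest {
  return {
    // Strip trailing slashes so the path is joined with exactly one "/".
    url: `${baseUrl.replace(/\/+$/, "")}/images/generations`,
    init: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiKey}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ model, prompt }),
    },
  };
}
```

To send the request, pass the two parts to fetch, for example `const { url, init } = buildImageRequest(baseUrl, apiKey, modelId, "a red fox at dusk"); const res = await fetch(url, init);`. Keeping request construction separate from the network call makes the sketch easy to inspect and adapt if the gateway's actual path or body fields differ.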

TEE Catalog

Image generation models running in Trusted Execution Environments (TEEs) on NEAR infrastructure, with Intel TDX and NVIDIA H100 confidential compute.

Model	Provider	Infrastructure
Flux.2 Klein 4B	BFL	NEAR

There are currently no video generation models in the TEE catalog.

ZDR Catalog

Image and video generation models running on Vercel's AI infrastructure under zero data retention (ZDR) provider agreements.

Image Generation

Model	Provider
Flux Schnell	BFL
FLUX.1 Fill [pro]	BFL
FLUX.1 Kontext Max	BFL
FLUX.1 Kontext Pro	BFL
FLUX.2 [flex]	BFL
FLUX.2 [klein] 4B	BFL
FLUX.2 [klein] 9B	BFL
FLUX.2 [max]	BFL
FLUX.2 [pro]	BFL
FLUX1.1 [pro]	BFL
FLUX1.1 [pro] Ultra	BFL
GPT Image 1	OpenAI
GPT Image 1 Mini	OpenAI
GPT Image 1.5	OpenAI
GPT Image 2	OpenAI
Imagen 4	Google
Imagen 4 Fast	Google
Imagen 4 Ultra	Google
Grok Imagine	xAI
Grok Imagine Image	xAI
Grok Imagine Image Pro	xAI
Recraft V2	Recraft
Recraft V3	Recraft
Recraft V4	Recraft
Recraft V4 Pro	Recraft
Seedream 4.0	ByteDance
Seedream 4.5	ByteDance
Seedream 5.0 Lite	ByteDance

Several Google Gemini models also produce image output (for example Gemini 3 Pro Image, Gemini 3.1 Flash Image Preview, and Nano Banana).

Video Generation

Model	Provider
Veo 3.0	Google
Veo 3.0 Fast Generate	Google
Veo 3.1	Google
Veo 3.1 Fast Generate	Google
Kling v2.5 Turbo Image-to-Video	Kuaishou
Kling v2.5 Turbo Text-to-Video	Kuaishou
Kling v2.6 Image-to-Video	Kuaishou
Kling v2.6 Motion Control	Kuaishou
Kling v2.6 Text-to-Video	Kuaishou
Kling v3.0 Image-to-Video	Kuaishou
Kling v3.0 Text-to-Video	Kuaishou
Seedance 2.0	ByteDance
Seedance 2.0 Fast	ByteDance
Seedance v1.0 Lite Image-to-Video	ByteDance
Seedance v1.0 Lite Text-to-Video	ByteDance
Seedance v1.0 Pro	ByteDance
Seedance v1.0 Pro Fast	ByteDance
Seedance v1.5 Pro	ByteDance
Wan v2.5 Text-to-Video Preview	Alibaba
Wan v2.6 Image-to-Video	Alibaba
Wan v2.6 Image-to-Video Flash	Alibaba
Wan v2.6 Reference-to-Video Flash	Alibaba
Wan v2.6 Text-to-Video	Alibaba