OLLM Security Model and Threat Boundaries

Learn how OLLM enforces privacy through two execution models: TEE (hardware-isolated, cryptographically verifiable) and ZDR (policy-enforced zero data retention via Vercel). This page covers the threat model, key management, and data handling policies for both.

The security model of the OLLM confidential gateway is designed around explicit trust boundaries and minimized assumptions. Rather than relying on policies or access controls alone, OLLM's TEE execution path uses hardware-enforced isolation and cryptographic verification to reduce reliance on intermediaries.

This section defines what OLLM is designed to protect against and what it intentionally does not.

Security by Execution Type

OLLM exposes two types of model execution environments, each with a distinct security model.

Trusted Execution Environment (TEE)

TEE models run on NEAR and Phala infrastructure using Intel TDX confidential virtual machines and NVIDIA H100 GPU attestation. The security guarantee is hardware-enforced:

Prompts and responses are encrypted in memory, invisible to the host OS, hypervisor, cloud provider, and OLLM
Every request produces a cryptographic attestation receipt independently verifiable against Intel and NVIDIA public PKI
Zero data retention: no prompts or outputs stored or logged

TEE is the appropriate choice when you need verifiable, hardware-backed proof that inference ran inside an isolated environment and was not accessible to any third party.

Zero Data Retention (ZDR)

ZDR models run on Vercel's AI infrastructure. The security guarantee is policy-enforced:

Vercel's AI gateway enforces that model providers do not store, log, or use your inference data
No training on your prompts or responses
No attestation receipt or cryptographic proof of execution

ZDR provides strong contractual data handling guarantees and access to the broadest frontier model catalog, but without hardware-level isolation or independent verification capability.

Feature	TEE	ZDR
Hardware isolation	Yes	No
Attestation receipt	Yes	No
Data retention	None	None
Verification	Cryptographic, independent	Policy and contractual
Infrastructure	NEAR, Phala	Vercel
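
The comparison above can be encoded as data so that a client or policy check can reason about it. This is a minimal sketch, not an OLLM API: the names `ExecutionType`, `Guarantees`, and `eligible_types` are hypothetical, and the values are transcribed from the table.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Tuple


class ExecutionType(Enum):
    TEE = "tee"
    ZDR = "zdr"


@dataclass(frozen=True)
class Guarantees:
    hardware_isolation: bool
    attestation_receipt: bool
    retains_data: bool
    verification: str
    infrastructure: Tuple[str, ...]


# Values transcribed from the comparison table; the structure itself is illustrative.
GUARANTEES = {
    ExecutionType.TEE: Guarantees(True, True, False, "cryptographic, independent", ("NEAR", "Phala")),
    ExecutionType.ZDR: Guarantees(False, False, False, "policy and contractual", ("Vercel",)),
}


def eligible_types(require_attestation: bool, require_hardware_isolation: bool) -> List[ExecutionType]:
    """Return the execution types that satisfy the stated requirements."""
    return [
        kind
        for kind, g in GUARANTEES.items()
        if (g.attestation_receipt or not require_attestation)
        and (g.hardware_isolation or not require_hardware_isolation)
    ]
```

With these definitions, a workload that requires an attestation receipt is limited to TEE, while a workload with no such requirement can use either type.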

Threats OLLM is designed to mitigate

OLLM is built to mitigate the following classes of risk:

Unauthorized access to prompts and outputs

For TEE models: prompts and responses are processed exclusively inside hardware-isolated Trusted Execution Environments. They are never exposed to OLLM services, operators, or underlying infrastructure outside the enclave.

For ZDR models: zero data retention agreements with Vercel and the underlying model providers prevent storage or logging of any inference data.

Infrastructure-level compromise

For TEE models, host operating systems, hypervisors, and cloud administrators cannot inspect inference data because memory isolation is enforced in hardware.

Malicious or curious insiders

OLLM operators and provider personnel do not have access to plaintext inference data, which confines insider risk to the control plane.

Unverifiable provider behavior

For TEE models, hardware attestation provides cryptographic proof that inference was executed in a trusted environment, removing blind reliance on provider claims.

Post-hoc trust failures

Each TEE request produces verifiable artifacts, enabling after-the-fact audits and independent validation.
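
One small building block of an after-the-fact audit is a digest comparison. The sketch below assumes, hypothetically, that an artifact records a hex-encoded SHA-256 digest of a byte string the auditor also holds; the actual receipt format is not specified here, and full attestation verification against the Intel and NVIDIA PKI is not shown. The function names are illustrative.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Hex-encoded SHA-256 digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()


def matches_recorded_hash(local_copy: bytes, recorded_hash_hex: str) -> bool:
    """Check a locally held byte string against a recorded digest.

    Assumption: the recorded value is a hex SHA-256 digest. The comparison
    is constant-time to avoid leaking how many characters matched.
    """
    return hmac.compare_digest(sha256_hex(local_copy), recorded_hash_hex.lower())
```

A mismatch means the local copy and the recorded artifact do not describe the same bytes. A match alone proves nothing about where the computation ran, which is what the signed attestation is for.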

Threats outside OLLM’s scope

OLLM does not attempt to solve or guarantee protection against:

Compromised client environments or API key leakage
Malicious prompts or unsafe model behavior
Data exposure before requests are sent or after responses are received
Vulnerabilities inside model weights or training data
Non-attested execution paths outside supported TEEs

These exclusions are intentional and allow security teams to reason clearly about shared responsibility.

Insider, provider, and infrastructure risks

Risk surface	OLLM mitigation
OLLM operators	Confidential execution prevents access to prompts and outputs
Model providers	TEE execution with verifiable attestation; contractual zero data retention for ZDR models
Cloud infrastructure	Hardware-enforced isolation and encryption-in-use
External attackers	Encrypted transport, authenticated access, and verified execution

Encryption & Key Management

OLLM enforces specific, verifiable encryption protections across the request lifecycle.

Encryption in transit

All communication between clients, OLLM, and model providers uses industry-standard encrypted transport (TLS).
Requests, responses, and attestation artifacts are protected from network-level interception.
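
On the client side, encrypted transport can be enforced explicitly. The sketch below uses Python's standard `ssl` module to build a context that verifies certificates and hostnames and refuses protocol versions below TLS 1.2. The TLS 1.2 floor is an assumption for illustration; the minimum version OLLM accepts is not stated here.

```python
import ssl


def strict_tls_context() -> ssl.SSLContext:
    """Build a client TLS context with certificate and hostname verification."""
    # create_default_context() enables CERT_REQUIRED and hostname checking.
    ctx = ssl.create_default_context()
    # Assumed floor: reject anything older than TLS 1.2.
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx
```

The returned context can be passed to standard-library clients, for example `urllib.request.urlopen(url, context=strict_tls_context())`.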

Encryption in use (TEE)

Prompts and responses are encrypted in memory while being processed inside TEEs.
Data remains inaccessible to the host OS, hypervisor, and control plane services.
Decryption occurs only inside the trusted execution boundary.

Control plane protections

The OLLM control plane handles:

Authentication and authorization
Request routing to the user-selected model
Metadata and verification coordination

It does not process or store plaintext prompts or outputs. Control-plane access is restricted and auditable.

Key ownership and handling

Encryption and attestation keys are hardware-backed and managed by the underlying TEE platforms.
Signing keys used for attestation are rooted in vendor-specific hardware trust anchors.
OLLM does not manage or rotate keys that can decrypt customer inference data.

This design removes OLLM as a key custody risk.

Data Handling & Retention

OLLM follows a strict zero-retention model for inference data.

Prompts and responses

Prompts are not stored
Responses are not stored
No prompt or output content is logged
No replayable inference data is retained

Plaintext inference data exists only transiently inside the TEE while a request is being processed and is discarded once the request completes.

Logging behavior

OLLM logs non-sensitive operational metadata only, such as:

Request identifiers
Model name
Provider
Execution status
Latency and token counts
Attestation and verification status

This metadata is sufficient for observability and auditing without exposing content.
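
A content-free log record of this kind might look like the following sketch. The field names and the `RequestMetadata` type are hypothetical, chosen to mirror the list above; the point is that the record type has no field that could carry prompt or response text.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class RequestMetadata:
    """Operational metadata only; deliberately has no prompt or response field."""
    request_id: str
    model: str
    provider: str
    status: str
    latency_ms: int
    input_tokens: int
    output_tokens: int
    attestation_status: str


def to_log_line(meta: RequestMetadata) -> str:
    """Serialize the record as one JSON line with stable key order."""
    return json.dumps(asdict(meta), sort_keys=True)
```

Because the schema is fixed, a reviewer can confirm what is logged by reading the type definition rather than auditing every logging call.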

Public dashboards and visibility

OLLM dashboards may display:

Aggregate request counts
Model and provider usage
Latency and cost metrics
Attestation status (e.g., Verified, Pending, Failed)
Cryptographic hashes and signatures for verification

They never display prompts or responses. Anyone viewing the dashboard can see verification outcomes and attestation artifacts, but cannot reconstruct or infer the underlying data.

Retention defaults and controls

Inference data: not retained
Attestation artifacts: retained for verification and audit purposes
Operational metadata: retained for monitoring and billing

This separation enables transparency and auditability without compromising confidentiality.
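
The retention defaults can be summarized as a small lookup. This is an illustrative sketch with hypothetical names (`DataClass`, `RETENTION`, `is_retained`); it restates the defaults listed above and does not represent an OLLM configuration format.

```python
from enum import Enum
from typing import Dict, Optional


class DataClass(Enum):
    INFERENCE_DATA = "inference_data"
    ATTESTATION_ARTIFACTS = "attestation_artifacts"
    OPERATIONAL_METADATA = "operational_metadata"


# Retention purpose per data class; None means the data is not retained.
RETENTION: Dict[DataClass, Optional[str]] = {
    DataClass.INFERENCE_DATA: None,
    DataClass.ATTESTATION_ARTIFACTS: "verification and audit",
    DataClass.OPERATIONAL_METADATA: "monitoring and billing",
}


def is_retained(kind: DataClass) -> bool:
    """True if this class of data is kept after the request completes."""
    return RETENTION[kind] is not None
```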

By combining zero data retention, hardware-backed encryption, and verifiable execution, OLLM provides a security model that is auditable, defensible, and suitable for regulated or high-risk environments without requiring customers to trust the platform blindly.
