
OLLM Security Model and Threat Boundaries

Learn how OLLM enforces privacy through two execution models: TEE (hardware-isolated, cryptographically verifiable) and ZDR (policy-enforced zero data retention via Vercel). This page covers the threat model, key management, and data handling policies for both.

The security model of the OLLM confidential gateway is designed around explicit trust boundaries and minimized assumptions. Rather than relying on policies or access controls alone, OLLM's TEE execution path uses hardware-enforced isolation and cryptographic verification to reduce reliance on intermediaries.

This section defines what OLLM is designed to protect against and what it intentionally does not.

Security by Execution Type

OLLM exposes two types of model execution environments, each with a distinct security model.

Trusted Execution Environment (TEE)

TEE models run on NEAR and Phala infrastructure using Intel TDX confidential virtual machines and NVIDIA H100 GPU attestation. The security guarantee is hardware-enforced:

  • Prompts and responses are encrypted in memory and invisible to the host OS, hypervisor, cloud provider, and OLLM
  • Every request produces a cryptographic attestation receipt independently verifiable against Intel and NVIDIA public PKI
  • Zero data retention: no prompts or outputs stored or logged

TEE is the appropriate choice when you need verifiable, hardware-backed proof that inference ran inside an isolated environment and was not accessible to any third party.

Zero Data Retention (ZDR)

ZDR models run on Vercel's AI infrastructure. The security guarantee is policy-enforced:

  • Vercel's AI gateway enforces that model providers do not store, log, or use your inference data
  • No training on your prompts or responses
  • No attestation receipt or cryptographic proof of execution

ZDR provides strong contractual data handling guarantees and access to the broadest frontier model catalog, but without hardware-level isolation or independent verification capability.

Feature              | TEE                        | ZDR
Hardware isolation   | Yes                        | No
Attestation receipt  | Yes                        | No
Data retention       | None                       | None
Verification         | Cryptographic, independent | Policy and contractual
Infrastructure       | NEAR, Phala                | Vercel
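
The comparison above can be encoded as data so that an application picks an execution type from its own requirements. The following is a minimal sketch, not part of any OLLM SDK: `ExecutionType` and `pick_execution_type` are hypothetical names, and the field values simply mirror the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExecutionType:
    """Properties of one execution type (hypothetical structure)."""
    name: str
    hardware_isolation: bool
    attestation_receipt: bool
    data_retention: str
    verification: str
    infrastructure: tuple


# Values copied from the comparison table.
TEE = ExecutionType("TEE", True, True, "None",
                    "Cryptographic, independent", ("NEAR", "Phala"))
ZDR = ExecutionType("ZDR", False, False, "None",
                    "Policy and contractual", ("Vercel",))


def pick_execution_type(needs_verifiable_proof: bool) -> ExecutionType:
    """Return TEE when hardware-backed, verifiable proof is required.

    Otherwise return ZDR, which relies on contractual guarantees and
    offers the broadest frontier model catalog.
    """
    return TEE if needs_verifiable_proof else ZDR
```

A workload that must show auditors an attestation receipt would call `pick_execution_type(True)`. A workload that only needs zero retention could pass `False`.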

Threats OLLM is designed to mitigate

OLLM is built to mitigate the following classes of risk:

  • Unauthorized access to prompts and outputs

    For TEE models: prompts and responses are processed exclusively inside hardware-isolated Trusted Execution Environments. They are never exposed to OLLM services, operators, or underlying infrastructure outside the enclave.

    For ZDR models: zero data retention agreements with Vercel and the underlying model providers prevent storage or logging of any inference data.

  • Infrastructure-level compromise

    For TEE models, host operating systems, hypervisors, and cloud administrators cannot inspect inference data because memory is isolated and encrypted by hardware.

  • Malicious or curious insiders

    OLLM operators and provider personnel do not have access to plaintext inference data, reducing insider risk to the control plane only.

  • Unverifiable provider behavior

    Hardware attestation provides cryptographic proof that inference was executed in a trusted environment, removing blind reliance on provider claims.

  • Post-hoc trust failures

    Each TEE request produces verifiable artifacts, enabling after-the-fact audits and independent validation.
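
An after-the-fact audit of a single request could look like the sketch below. It assumes, hypothetically, that a receipt carries SHA-256 digests of the request and response bodies plus a status field. The real receipt format and field names are not specified on this page, and validating the hardware quote against the Intel and NVIDIA certificate chains is a separate step that this sketch does not perform.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Hex-encoded SHA-256 digest of a byte string."""
    return hashlib.sha256(data).hexdigest()


def receipt_matches(receipt: dict, request_body: bytes, response_body: bytes) -> bool:
    """Check that a receipt refers to exactly these request/response bytes.

    `receipt` uses hypothetical keys: "status", "request_sha256",
    "response_sha256". A receipt that is not marked "Verified" is rejected.
    """
    if receipt.get("status") != "Verified":
        return False
    # compare_digest avoids leaking where two digests first differ.
    req_ok = hmac.compare_digest(
        receipt.get("request_sha256", ""), sha256_hex(request_body))
    resp_ok = hmac.compare_digest(
        receipt.get("response_sha256", ""), sha256_hex(response_body))
    return req_ok and resp_ok
```

Only a party that already holds the original request and response can run this check, so the digests alone do not hand the content to an auditor.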

Threats outside OLLM’s scope

OLLM does not attempt to solve or guarantee protection against:

  • Compromised client environments or API key leakage
  • Malicious prompts or unsafe model behavior
  • Data exposure before requests are sent or after responses are received
  • Vulnerabilities inside model weights or training data
  • Non-attested execution paths outside supported TEEs

These exclusions are intentional and allow security teams to reason clearly about shared responsibility.

Insider, provider, and infrastructure risks

Risk surface         | OLLM mitigation
OLLM operators       | Confidential execution prevents access to prompts and outputs
Model providers      | TEE-only execution with verifiable attestation
Cloud infrastructure | Hardware-enforced isolation and encryption-in-use
External attackers   | Encrypted transport, authenticated access, and verified execution

Encryption & Key Management

OLLM enforces specific, verifiable encryption protections across the request lifecycle.

Encryption in transit

  • All communication between clients, OLLM, and model providers uses industry-standard encrypted transport (TLS).
  • Requests, responses, and attestation artifacts are protected from network-level interception.
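
On the client side, encrypted transport mostly means refusing unverified or outdated connections. A minimal sketch using Python's standard `ssl` module is shown below. The TLS 1.2 floor is an assumption made for illustration, since this page does not state a minimum version, and `make_client_tls_context` is a hypothetical helper name.

```python
import ssl


def make_client_tls_context() -> ssl.SSLContext:
    """Build a client-side TLS context with certificate verification enabled."""
    # create_default_context() turns on certificate and hostname checks.
    ctx = ssl.create_default_context()
    # Assumption: TLS 1.2 as the floor; the source does not state a minimum.
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx
```

The returned context can be passed to any standard-library client that accepts one, for example `http.client.HTTPSConnection(host, context=ctx)`.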

Encryption in use (TEE)

  • Prompts and responses are encrypted in memory while being processed inside TEEs.
  • Data remains inaccessible to the host OS, hypervisor, and control plane services.
  • Decryption occurs only inside the trusted execution boundary.

Control plane protections

The OLLM control plane handles:

  • Authentication and authorization
  • Request routing to the user-selected model
  • Metadata and verification coordination

It does not process or store plaintext prompts or outputs. Control-plane access is restricted and auditable.
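
One way to picture this separation is a router that reads only metadata and forwards the payload as opaque bytes. The sketch below is illustrative only: the `Envelope` type, the `ROUTES` table, and the model and backend names are all hypothetical and do not describe OLLM's actual internals.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Envelope:
    """Hypothetical request envelope as seen by a control plane."""
    request_id: str
    model: str      # user-selected model name (metadata)
    payload: bytes  # opaque ciphertext; never decrypted in this layer


# Hypothetical routing table: model name -> backend identifier.
ROUTES = {
    "example-tee-model": "tee-backend",
    "example-zdr-model": "zdr-backend",
}


def route(envelope: Envelope) -> tuple:
    """Choose a backend from metadata only and pass the payload through untouched."""
    backend = ROUTES[envelope.model]  # raises KeyError for unknown models
    return backend, envelope.payload
```

Because `route` never inspects `payload`, nothing in this layer depends on being able to read prompt or response content.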

Key ownership and handling

  • Encryption and attestation keys are hardware-backed and managed by the underlying TEE platforms.
  • Signing keys used for attestation are rooted in vendor-specific hardware trust anchors.
  • OLLM does not manage or rotate keys that can decrypt customer inference data.

This design removes OLLM as a key custody risk.

Data Handling & Retention

OLLM follows a strict zero-retention model for inference data.

Prompts and responses

  • Prompts are not stored
  • Responses are not stored
  • No prompt or output content is logged
  • No replayable inference data is retained

For TEE models, plaintext inference data exists only transiently inside the TEE while a request is being processed and is discarded once the request completes.

Logging behavior

OLLM logs non-sensitive operational metadata only, such as:

  • Request identifiers
  • Model name
  • Provider
  • Execution status
  • Latency and token counts
  • Attestation and verification status

This metadata is sufficient for observability and auditing without exposing content.
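
A common way to enforce metadata-only logging is an allow-list: only named fields are ever written, so content cannot leak in through an unexpected key. The sketch below assumes hypothetical field names derived from the list above. It is not OLLM's actual log schema.

```python
# Hypothetical field names mirroring the metadata categories listed above.
ALLOWED_LOG_FIELDS = (
    "request_id",
    "model",
    "provider",
    "status",
    "latency_ms",
    "token_count",
    "attestation_status",
)


def build_log_entry(event: dict) -> dict:
    """Keep only allow-listed metadata; any other key (e.g. content) is dropped."""
    return {key: event[key] for key in ALLOWED_LOG_FIELDS if key in event}
```

With this approach, an event that accidentally includes a `prompt` key still produces a log entry without it.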

Public dashboards and visibility

OLLM dashboards may display:

  • Aggregate request counts
  • Model and provider usage
  • Latency and cost metrics
  • Attestation status (e.g., Verified, Pending, Failed)
  • Cryptographic hashes and signatures for verification

They never display prompts or responses. Anyone viewing the dashboard can see verification outcomes and attestation artifacts, but cannot reconstruct or infer the underlying data.

Retention defaults and controls

  • Inference data: not retained
  • Attestation artifacts: retained for verification and audit purposes
  • Operational metadata: retained for monitoring and billing

This separation enables transparency and auditability without compromising confidentiality.
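
The three retention classes can be written down as a small policy table. This is a sketch for illustration only: the key names and the `is_retained` helper are hypothetical, and treating unknown data classes as not retained is an added assumption, not something this page states.

```python
# Retention defaults as described above (key names are hypothetical).
RETENTION_POLICY = {
    "inference_data": {"retained": False, "purpose": None},
    "attestation_artifacts": {"retained": True, "purpose": "verification and audit"},
    "operational_metadata": {"retained": True, "purpose": "monitoring and billing"},
}


def is_retained(data_class: str) -> bool:
    """Return whether a data class is retained.

    Assumption: unknown data classes default to not retained.
    """
    return RETENTION_POLICY.get(data_class, {}).get("retained", False)
```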

By combining zero data retention, hardware-backed encryption, and verifiable execution, OLLM provides a security model that is auditable, defensible, and suitable for regulated or high-risk environments without requiring customers to trust the platform blindly.
