OLLM Security Model and Threat Boundaries
Learn how OLLM enforces privacy through two execution models, TEE (hardware-isolated, cryptographically verifiable) and ZDR (policy-enforced zero data retention via Vercel). Covers threat model, key management, and data handling policies for both.
The security model of the OLLM confidential gateway is designed around explicit trust boundaries and minimized assumptions. Rather than relying on policies or access controls alone, OLLM's TEE execution path uses hardware-enforced isolation and cryptographic verification to limit how much you must trust intermediaries.
This section defines what OLLM is designed to protect against and what it intentionally does not.
Security by Execution Type
OLLM exposes two types of model execution environments, each with a distinct security model.
Trusted Execution Environment (TEE)
TEE models run on NEAR and Phala infrastructure using Intel TDX confidential virtual machines and NVIDIA H100 GPU attestation. The security guarantee is hardware-enforced:
- Prompts and responses are encrypted in memory and are invisible to the host OS, hypervisor, cloud provider, and OLLM
- Every request produces a cryptographic attestation receipt independently verifiable against Intel and NVIDIA public PKI
- Zero data retention: no prompts or outputs stored or logged
TEE is the appropriate choice when you need verifiable, hardware-backed proof that inference ran inside an isolated environment and was not accessible to any third party.
Zero Data Retention (ZDR)
ZDR models run on Vercel's AI infrastructure. The security guarantee is policy-enforced:
- Vercel's AI gateway enforces that model providers do not store, log, or use your inference data
- No training on your prompts or responses
- No attestation receipt or cryptographic proof of execution
ZDR provides strong contractual data handling guarantees and access to the broadest frontier model catalog, but without hardware-level isolation or independent verification capability.
| Feature | TEE | ZDR |
|---|---|---|
| Hardware isolation | Yes | No |
| Attestation receipt | Yes | No |
| Data retention | None | None |
| Verification | Cryptographic, independent | Policy and contractual |
| Infrastructure | NEAR, Phala | Vercel |
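The comparison table can also be encoded as data, so that an application can check its own requirements against each execution type before routing a request. The following is a minimal sketch, not part of any OLLM SDK: the names `ExecutionType`, `Guarantees`, and `allowed_execution_types` are hypothetical, and the values are transcribed from the table.

```python
from dataclasses import dataclass
from enum import Enum


class ExecutionType(Enum):
    TEE = "tee"
    ZDR = "zdr"


@dataclass(frozen=True)
class Guarantees:
    hardware_isolation: bool
    attestation_receipt: bool
    data_retained: bool
    verification: str
    infrastructure: tuple


# Values transcribed from the comparison table; the structure itself is illustrative.
GUARANTEES = {
    ExecutionType.TEE: Guarantees(True, True, False, "cryptographic, independent", ("NEAR", "Phala")),
    ExecutionType.ZDR: Guarantees(False, False, False, "policy and contractual", ("Vercel",)),
}


def allowed_execution_types(require_attestation: bool, require_hardware_isolation: bool) -> list:
    """Return the execution types whose guarantees satisfy the stated requirements."""
    return [
        exec_type
        for exec_type, g in GUARANTEES.items()
        if (g.attestation_receipt or not require_attestation)
        and (g.hardware_isolation or not require_hardware_isolation)
    ]
```

A workload that needs an attestation receipt would call `allowed_execution_types(True, False)` and be left with TEE only, while a workload with no such requirement could use either type.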
Threats OLLM is designed to mitigate
OLLM is built to mitigate the following classes of risk:
- Unauthorized access to prompts and outputs
  For TEE models: prompts and responses are processed exclusively inside hardware-isolated Trusted Execution Environments. They are never exposed to OLLM services, operators, or underlying infrastructure outside the enclave.
  For ZDR models: zero data retention agreements with Vercel and the underlying model providers prevent storage or logging of any inference data.
- Infrastructure-level compromise
  For TEE models, host operating systems, hypervisors, and cloud administrators cannot inspect inference data because memory isolation is enforced in hardware.
- Malicious or curious insiders
  For TEE models, OLLM operators and provider personnel cannot access plaintext inference data, which confines insider risk to the control plane.
- Unverifiable provider behavior
  For TEE models, hardware attestation provides cryptographic proof that inference was executed in a trusted environment, removing blind reliance on provider claims.
- Post-hoc trust failures
  Each TEE request produces verifiable artifacts (attestation receipts), enabling after-the-fact audits and independent validation.
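As an illustration of what an after-the-fact audit could look like, the sketch below checks that a stored receipt refers to the exact request and response a client recorded. It is a sketch under stated assumptions: the receipt layout and the field names `request_sha256` and `response_sha256` are hypothetical, not the documented OLLM receipt format. It also covers only the hash comparison; validating the receipt's signature chain against Intel and NVIDIA PKI is a separate step that is not shown.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Hex-encoded SHA-256 digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()


def receipt_matches(receipt: dict, request_body: bytes, response_body: bytes) -> bool:
    """Check that the digests in a receipt match locally recorded bodies.

    The field names below are hypothetical placeholders for illustration.
    """
    expected = (sha256_hex(request_body), sha256_hex(response_body))
    claimed = (
        receipt.get("request_sha256", ""),
        receipt.get("response_sha256", ""),
    )
    # compare_digest avoids leaking how many leading characters matched.
    return all(hmac.compare_digest(e, c) for e, c in zip(expected, claimed))
```

An auditor would recompute the digests from their own records and reject any receipt for which `receipt_matches` returns `False`.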
Threats outside OLLM’s scope
OLLM does not attempt to solve or guarantee protection against:
- Compromised client environments or API key leakage
- Malicious prompts or unsafe model behavior
- Data exposure before requests are sent or after responses are received
- Vulnerabilities inside model weights or training data
- Non-attested execution paths outside supported TEEs
These exclusions are intentional and allow security teams to reason clearly about shared responsibility.
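Because API key leakage falls on the client side of this boundary, key handling is something your own code has to get right. A minimal sketch of one common practice, reading the key from the environment and never printing it in full, is shown below. The variable name `OLLM_API_KEY` and both helper functions are assumptions for illustration, not names defined by OLLM.

```python
import os


def load_api_key(env_var: str = "OLLM_API_KEY") -> str:
    """Read the API key from the environment instead of hardcoding it.

    The default variable name is a hypothetical choice for this example.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"{env_var} is not set")
    return key


def redact(key: str) -> str:
    """Return a form of the key that is safe to write to logs."""
    return key[:4] + "****" if len(key) >= 8 else "****"
```

Logging `redact(load_api_key())` rather than the key itself keeps the secret out of log files and crash reports.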
Insider, provider, and infrastructure risks
| Risk surface | OLLM mitigation |
|---|---|
| OLLM operators | Confidential execution prevents access to prompts and outputs |
| Model providers | TEE execution with verifiable attestation; ZDR models rely on contractual zero data retention |
| Cloud infrastructure | Hardware-enforced isolation and encryption-in-use |
| External attackers | Encrypted transport, authenticated access, and verified execution |
Encryption & Key Management
OLLM enforces specific, verifiable encryption protections across the request lifecycle.
Encryption in transit
- All communication between clients, OLLM, and model providers uses industry-standard encrypted transport (TLS).
- Requests, responses, and attestation artifacts are protected from network-level interception.
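On the client side, encrypted transport only helps if certificate and hostname checks stay enabled. The sketch below uses Python's standard `ssl` module to build a context that verifies the server and refuses protocol versions older than TLS 1.2. The TLS 1.2 floor is an assumption made for this example; the text above does not state a minimum version.

```python
import ssl


def strict_tls_context() -> ssl.SSLContext:
    """Build a client TLS context with verification on and a version floor."""
    # create_default_context() enables certificate and hostname verification.
    ctx = ssl.create_default_context()
    # Assumed floor for this example: reject anything older than TLS 1.2.
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx
```

The resulting context can be passed to standard-library clients, for example `urllib.request.urlopen(url, context=strict_tls_context())`.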
Encryption in use (TEE)
- Prompts and responses are encrypted in memory while being processed inside TEEs.
- Data remains inaccessible to the host OS, hypervisor, and control plane services.
- Decryption occurs only inside the trusted execution boundary.
Control plane protections
The OLLM control plane handles:
- Authentication and authorization
- Request routing to the user-selected model
- Metadata and verification coordination
It does not process or store plaintext prompts or outputs. Control-plane access is restricted and auditable.
Key ownership and handling
- Encryption and attestation keys are hardware-backed and managed by the underlying TEE platforms.
- Signing keys used for attestation are rooted in vendor-specific hardware trust anchors.
- OLLM does not manage or rotate keys that can decrypt customer inference data.
This design removes OLLM as a key custody risk.
Data Handling & Retention
OLLM follows a strict zero-retention model for inference data.
Prompts and responses
- Prompts are not stored
- Responses are not stored
- No prompt or output content is logged
- No replayable inference data is retained
While a request is being processed, plaintext inference data exists only transiently inside the TEE. It is discarded once the request completes.
Logging behavior
OLLM logs non-sensitive operational metadata only, such as:
- Request identifiers
- Model name
- Provider
- Execution status
- Latency and token counts
- Attestation and verification status
This metadata is sufficient for observability and auditing without exposing content.
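One way to apply the same principle in your own services is to give log records a fixed schema that has no field for content at all. The sketch below mirrors the metadata listed above. The class name, field names, and units are hypothetical and are not OLLM's actual log format.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class RequestLogRecord:
    """Operational metadata only; there is deliberately no content field."""
    request_id: str
    model: str
    provider: str
    status: str
    latency_ms: float
    prompt_tokens: int
    completion_tokens: int
    attestation_status: str


def to_log_line(record: RequestLogRecord) -> str:
    """Serialize a record as one JSON line with stable key order."""
    return json.dumps(asdict(record), sort_keys=True)
```

Because the dataclass has no prompt or response field, passing content to the constructor raises a `TypeError` instead of silently writing it to the log.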
Public dashboards and visibility
OLLM dashboards may display:
- Aggregate request counts
- Model and provider usage
- Latency and cost metrics
- Attestation status (e.g., Verified, Pending, Failed)
- Cryptographic hashes and signatures for verification
They never display prompts or responses. Anyone viewing the dashboard can see verification outcomes and attestation artifacts, but cannot reconstruct or infer the underlying data.
Retention defaults and controls
- Inference data: not retained
- Attestation artifacts: retained for verification and audit purposes
- Operational metadata: retained for monitoring and billing
This separation enables transparency and auditability without compromising confidentiality.
By combining zero data retention, hardware-backed encryption, and verifiable execution, OLLM provides a security model that is auditable, defensible, and suitable for regulated or high-risk environments without requiring customers to trust the platform blindly.