Security

Verifiable Privacy & Attestation

Explore how Ollm provides cryptographic proof of privacy and execution integrity for every request.

Verifiable Inference Explained

Verifiable inference is the core security guarantee provided by Ollm. It ensures that privacy is not based on policy, contractual assurances, or provider claims, but on cryptographic proof tied to each individual request.

What “verifiable privacy per request” means

For every inference request processed through Ollm:

  • The model executes only inside a hardware-backed Trusted Execution Environment (TEE)
  • The execution environment produces cryptographic evidence of how and where the request was processed
  • That evidence can be independently verified by the customer

This means privacy guarantees are request-scoped, not platform-scoped. Each response stands on its own, with its own proof.

What is proven cryptographically

Using hardware attestation, Ollm enables customers to verify that:

  • The inference ran inside a genuine TEE
  • The execution environment was not tampered with
  • The environment matched expected security measurements
  • The response was generated within that trusted boundary

These proofs are anchored in hardware root-of-trust mechanisms, not software assertions.
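One of the checks above, matching the environment against expected security measurements, can be sketched as follows. The measurement names (`mrtd`, `rtmr0`) and golden values are illustrative placeholders, not Ollm's actual attestation fields; a real verifier would also validate the hardware signature chain before trusting any reported value.

```python
import hmac

# Hypothetical "golden" measurements published for the expected environment.
# Real verification also checks the hardware-rooted signature over these
# values; this sketch covers only the comparison step.
EXPECTED_MEASUREMENTS = {
    "mrtd": "a3f1" * 24,   # placeholder TD build-time measurement (96 hex chars)
    "rtmr0": "b2c4" * 24,  # placeholder runtime measurement register
}

def measurements_match(reported: dict) -> bool:
    """Compare reported measurements to expected values in constant time."""
    for name, expected in EXPECTED_MEASUREMENTS.items():
        value = reported.get(name, "")
        if not hmac.compare_digest(expected, value):
            return False
    return True
```

Constant-time comparison (`hmac.compare_digest`) is used so the check itself does not leak which byte of a measurement diverged.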

What is guaranteed

Ollm guarantees that:

  • Prompts and responses are processed inside TEEs
  • Data is encrypted while in use
  • Inference execution integrity can be verified per request
  • Ollm does not access plaintext inference data outside secure enclaves

What is out of scope

Ollm's guarantees do not cover:

  • Compromised client environments or API key leakage
  • Malicious prompts or unsafe model behavior
  • Data exposure before requests are sent or after responses are received
  • Vulnerabilities inside model weights or training data
  • Non-attested execution paths outside supported TEEs

These boundaries are intentional and explicit, allowing security teams to reason precisely about shared responsibility. For the full threat model, see the Security Model.

Attestation Flow

Attestation is the mechanism by which Ollm converts hardware trust guarantees into verifiable evidence that customers can inspect.

Request lifecycle and attestation timing

Request submission

The client submits an inference request to Ollm, explicitly specifying the model.

Secure execution

The request is forwarded to the selected model’s TEE-backed execution environment.

Measurement and attestation

During execution, the hardware records cryptographic measurements of the environment, including code and configuration state.

Signature generation

These measurements are signed using hardware-backed keys rooted in the platform’s trust anchor.

Response delivery

The model output is returned along with attestation artifacts tied to that specific request.
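The five lifecycle steps above can be modeled in a short sketch. The types and field names here (`AttestationArtifacts`, `AttestedResponse`, `submit_inference`) are illustrative assumptions, not Ollm's actual client API; steps two through four happen inside the TEE and are represented only as placeholders.

```python
from dataclasses import dataclass

@dataclass
class AttestationArtifacts:
    environment_hash: str  # hash of the measured execution environment
    signature: str         # hardware-backed signature over the measurements
    request_id: str        # binds the evidence to this specific request

@dataclass
class AttestedResponse:
    output: str
    attestation: AttestationArtifacts

def submit_inference(model: str, prompt: str) -> AttestedResponse:
    """Simulate the request lifecycle: submit, execute, measure, sign, deliver."""
    request_id = f"req-{abs(hash((model, prompt))) % 10**8:08d}"
    # Steps 2-4 (secure execution, measurement, signing) occur inside the TEE;
    # the placeholder strings stand in for hardware-produced evidence.
    artifacts = AttestationArtifacts(
        environment_hash="<tee-measurement>",
        signature="<hardware-signed>",
        request_id=request_id,
    )
    # Step 5: the output is delivered together with its attestation evidence.
    return AttestedResponse(output=f"<{model} output>", attestation=artifacts)
```

The key property the sketch illustrates is that the attestation evidence travels with the response and carries an identifier tying it to exactly one request.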

Hashes, measurements, and signatures

Attestation artifacts typically include:

  • Cryptographic hashes of the execution environment
  • Signed measurements attesting to environment integrity
  • Metadata linking the attestation to the specific inference request

These artifacts are tamper-evident and can be validated independently.

For a detailed breakdown of the attestation receipt structure, including Intel TDX quote fields, NVIDIA GPU evidence, message signatures, and external trust anchors, see the Attestation Data Reference.
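A minimal sketch of how such artifacts can be made tamper-evident: a digest binds the request identifier, response body, and environment hash together, so modifying any one of them invalidates the receipt. The receipt layout here is an illustrative assumption, not the format documented in the Attestation Data Reference.

```python
import hashlib

def receipt_digest(request_id: str, response_body: bytes, env_hash: str) -> str:
    """Bind a receipt to a specific request, response, and environment."""
    h = hashlib.sha256()
    for part in (request_id.encode(), response_body, env_hash.encode()):
        # Length-prefix each field so concatenations cannot be ambiguous.
        h.update(len(part).to_bytes(4, "big"))
        h.update(part)
    return h.hexdigest()

def is_untampered(receipt: dict, response_body: bytes) -> bool:
    """Recompute the binding digest and compare it to the receipt's digest."""
    expected = receipt_digest(receipt["request_id"], response_body,
                              receipt["env_hash"])
    return expected == receipt["digest"]
```

In a real deployment the digest would additionally be covered by the hardware-backed signature, so the verifier trusts it only after the signature chain checks out.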

What customers can independently verify

Using the attestation data, customers can verify that:

  • The inference ran inside a genuine TEE
  • The hardware platform is authentic
  • The execution environment matches expected security properties
  • The response was produced within that verified environment

This enables auditable, defensible trust without relying on Ollm or model providers as intermediaries.
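The four verification steps above can be expressed as a simple checklist pipeline. The evidence keys and check names below are hypothetical stand-ins; a production verifier would validate hardware quotes and certificate chains rather than read precomputed booleans.

```python
def verify_response(evidence: dict) -> dict:
    """Run the four customer-side checks and report an overall verdict.

    Each entry of `evidence` is assumed to be the (boolean) result of a
    lower-level cryptographic check performed elsewhere.
    """
    checks = {
        "genuine_tee": evidence.get("quote_valid", False),
        "authentic_platform": evidence.get("cert_chain_valid", False),
        "expected_measurements": evidence.get("measurements_match", False),
        "response_bound": evidence.get("response_hash_bound", False),
    }
    # The response is trusted only if every individual check passes.
    checks["verified"] = all(checks.values())
    return checks
```

Because each check defaults to `False` when evidence is missing, an incomplete attestation can never verify by omission.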

Supported Attestation Technologies

Ollm supports multiple hardware attestation technologies to enable verifiable inference across different execution environments.

Unified TEE Architecture: Intel TDX + NVIDIA GPU

Every inference request runs inside a single Trusted Execution Environment that combines both CPU and GPU isolation. These are not separate execution paths; they work together as one unified architecture.

Intel TDX (Trust Domain Extensions) provides the secure virtual machine:

  • Hardware-enforced isolation from the host OS and hypervisor
  • Memory encryption and integrity for all VM contents
  • Verifiable measurements of VM state (code identity, runtime configuration)

NVIDIA GPU Attestation extends the trust boundary to GPU compute within that VM:

  • Verified GPU firmware and execution environment
  • Cryptographic proof of trusted GPU execution
  • Protection against unauthorized access during inference

How they bind together

The TDX quote and GPU evidence are cryptographically linked by a shared session nonce. The nonce appears in both the TDX REPORT_DATA and each GPU's SPDM evidence header, proving that all attestations were generated in the same session. This prevents an attacker from mixing attestations from different environments.
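The nonce-binding check described above can be sketched in a few lines. The exact byte layout of the TDX REPORT_DATA field and the SPDM evidence headers is not reproduced here; this sketch only assumes the session nonce appears verbatim in each piece of evidence.

```python
def nonce_bound(session_nonce: bytes, tdx_report_data: bytes,
                gpu_evidence_headers: list[bytes]) -> bool:
    """Accept the evidence set only if every artifact carries the same nonce.

    A mismatch means the CPU and GPU attestations may come from different
    sessions (or different machines) and must be rejected together.
    """
    if session_nonce not in tdx_report_data:
        return False
    return all(session_nonce in header for header in gpu_evidence_headers)
```

Rejecting the whole evidence set on any mismatch is what stops an attacker from pairing a valid TDX quote with GPU evidence harvested from another environment.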

By combining hardware-backed execution, cryptographic attestation, and per-request verification, Ollm enables a security posture where privacy and integrity are provable, inspectable, and auditable, even in highly regulated or adversarial environments.