
Commit 6d34abe

Merge pull request #2 from nearai/feat/enhance-private-inference

feat: enhance private inference

2 parents 575f234 + e03de62 commit 6d34abe

File tree: 8 files changed, +1508 −104 lines


docs/assets/tee.png

−327 KB (binary file not shown)

docs/cloud/private-inference.mdx

Lines changed: 113 additions & 53 deletions
@@ -6,63 +6,105 @@ slug: /cloud/private-inference
 description: "Learn how NEAR AI Cloud's private inference leverages TEEs to provide cryptographically verifiable AI computations with complete data privacy and security."
 ---
 
+import PrivateInferenceIcon from '@site/static/img/icons/private-inference.svg';
 import VerificationIcon from '@site/static/img/icons/verification.svg';
+import PerformanceIcon from '@site/static/img/icons/performance.svg';
 import { FeatureCard, FeatureCardGrid } from '@site/src/components/FeatureCard';
 
 # Private Inference
 
-NEAR AI Cloud's private inference capabilities leverage Trusted Execution Environments (TEEs) to provide cryptographically verifiable AI computations while ensuring complete data privacy. This guide explains how private inference works, its architecture, and security guarantees.
+When you use traditional AI services, your data passes through systems controlled by cloud providers and AI companies. Your prompts, the AI's responses, and even the processing of your requests are all visible to these third parties. This creates serious privacy concerns for sensitive applications.
 
-Private inference ensures that your AI model interactions are completely private and verifiable, with cryptographic proofs that guarantee the integrity of every computation.
+**Private inference solves this problem.** It ensures that AI computations happen in a completely isolated environment where no one—not the cloud provider, not the model provider, not even NEAR—can access your data. At the same time, you can independently verify that your requests were actually processed in this secure environment through cryptographic attestation.
+
+This guide explains how NEAR AI Cloud implements private inference using Trusted Execution Environments (TEEs), the architecture that protects your data, and the security guarantees you can rely on.
 
 ---
 
 ## What is Private Inference?
 
-Private inference in NEAR AI Cloud ensures that your AI model interactions are:
-
-- **🔒 Private**: Your prompts, model weights, and outputs are never visible to the infrastructure provider
-- **🛡️ Verifiable**: Every computation is cryptographically signed and can be verified to have occurred in a secure environment
-- **⚡ Fast**: Optimized for high-throughput inference with minimal latency overhead
+Private inference is a method of running AI models where both your input data and the model's outputs remain completely hidden from everyone except you—even while the computation happens on remote servers you don't control.
+
+Traditional cloud AI services require you to trust that providers won't access your data. Private inference eliminates this need for trust by using hardware-based security that makes it technically impossible for anyone to see your data, even with physical access to the servers.
+
+NEAR AI Cloud's private inference provides three core guarantees:
+
+<div className="feature-highlights">
+  <div className="feature-highlight">
+    <div className="feature-highlight-icon">
+      <PrivateInferenceIcon />
+    </div>
+    <div className="feature-highlight-content">
+      <h3>Complete Privacy</h3>
+      <p>Your prompts, model weights, and outputs are encrypted and isolated in hardware-secured environments. Infrastructure providers, model providers, and NEAR cannot access your data at any point in the process.</p>
+    </div>
+  </div>
+
+  <div className="feature-highlight">
+    <div className="feature-highlight-icon">
+      <VerificationIcon />
+    </div>
+    <div className="feature-highlight-content">
+      <h3>Cryptographic Verification</h3>
+      <p>Every computation generates cryptographic proof that it occurred inside a genuine, secure TEE. You can independently verify that your AI requests were processed in a protected environment without trusting any third party.</p>
+    </div>
+  </div>
+
+  <div className="feature-highlight">
+    <div className="feature-highlight-icon">
+      <PerformanceIcon />
+    </div>
+    <div className="feature-highlight-content">
+      <h3>Production Performance</h3>
+      <p>Hardware-accelerated TEEs with NVIDIA H200 GPUs deliver high-throughput inference with minimal latency overhead, making private inference practical for real-world applications.</p>
+    </div>
+  </div>
+</div>
 
 ---
 
 ## How Private Inference Works
 
 ### Trusted Execution Environment (TEE)
 
-NEAR AI Cloud uses a combination of Intel TDX and NVIDIA TEE technologies to create isolated, secure environments for AI computation:
+NEAR AI Cloud combines Intel TDX and NVIDIA TEE technologies to create isolated, secure environments for AI computation:
 
-![TEE Architecture](../assets/tee.png)
+- **Intel TDX (Trust Domain Extensions)**
+  Creates confidential virtual machines (CVMs) that isolate your AI workloads from the host system, preventing unauthorized access to data in memory.
 
-1. **Intel TDX (Trust Domain Extensions)**: Creates confidential virtual machines (CVMs) that isolate your AI workloads from the host system
-2. **NVIDIA TEE**: Provides GPU-level isolation for model inference, ensuring model weights and computations remain private
-3. **Cryptographic Attestation**: Each TEE environment generates cryptographic proofs of its integrity and configuration
+- **NVIDIA TEE**
+  Provides GPU-level isolation for model inference, ensuring model weights and computations remain completely private during processing.
 
-### The Private Inference Process
+- **Cryptographic Attestation**
+  Each TEE environment generates cryptographic proofs of its integrity and configuration, enabling independent verification of the secure execution environment.
 
-```mermaid
-graph TD
-    A[User Request] --> B[Secure Request Routing]
-    B --> C[Secure Inference]
-    C --> D[Cryptographic Signature]
-    D --> E[Verifiable Response]
-
-    C --> F[Attestation Report]
-```
+### The Inference Process
+
+When you make a request to NEAR AI Cloud, your data flows through a secure pipeline designed to maintain privacy at every step:
+
+1. **Request Initiation:**
+   You send chat completion requests to the LLM Gateway, which operates within a secure TEE environment and manages authentication.
+
+2. **Secure Request Routing:**
+   The LLM Gateway routes your request to the appropriate Private LLM Node based on the requested model, availability, and load balancing requirements.
+
+3. **Secure Inference:**
+   AI inference computations execute inside the Private LLM Node's TEE, where all data and model weights are protected by hardware-enforced isolation.
+
+4. **Attestation Generation:**
+   The TEE generates CPU and GPU attestation reports that provide cryptographic proof of the environment's integrity and configuration.
+
+5. **Cryptographic Signing:**
+   The TEE cryptographically signs both your original request and the inference results to ensure authenticity and prevent tampering.
 
-1. **Request Initiation**: Users send chat completion requests to the LLM Gateway, which operates within a secure TEE environment
-2. **Secure Request Routing**: The LLM Gateway securely routes requests to appropriate Private LLM Nodes based on model availability and load balancing
-3. **Secure Inference**: AI inference computations are performed inside the Private LLM Nodes, with all data and model weights protected by TEE isolation
-4. **Attestation Generation**: CPU and GPU attestation reports are generated, providing cryptographic proof of the TEE environment's integrity
-5. **Cryptographic Signing**: The TEE cryptographically signs both the original request and the inference results to ensure authenticity and prevent tampering
-6. **Verifiable Response**: Users receive the AI response along with cryptographic signatures for complete verification
+6. **Verifiable Response:**
+   You receive the AI response along with cryptographic signatures and attestation data for independent verification.
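Because step 5 signs both your original request and the result, a client can hash its request deterministically before sending and later match that digest against the one covered by the TEE signature. The canonicalization scheme and field names below are illustrative assumptions, not NEAR AI Cloud's documented wire format.

```python
import hashlib
import json

def canonical_hash(payload: dict) -> str:
    """Hash a chat completion payload deterministically (sorted keys,
    compact separators) so client and verifier derive the same digest.
    The exact canonicalization used by the service is an assumption here."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

request = {
    "model": "example-model",  # hypothetical model id
    "messages": [{"role": "user", "content": "Hello"}],
}

# Record the digest before sending; later compare it against the request
# hash covered by the TEE's signature on the response.
digest = canonical_hash(request)
print(digest == canonical_hash(json.loads(json.dumps(request))))  # True: stable across serialization
```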

 ---
 
 ## Architecture Overview
 
-NEAR AI Cloud operates LLM Gateway and a network of Private LLM Nodes:
+NEAR AI Cloud operates through a distributed architecture consisting of an LLM Gateway and a network of Private LLM Nodes:
 
 ```
 ┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
@@ -71,7 +113,7 @@ NEAR AI Cloud operates LLM Gateway and a network of Private LLM Nodes:
 ├─────────────────┤    ├─────────────────┤    ├─────────────────┤
 │ Intel TDX CVM   │    │ Intel TDX CVM   │    │ Intel TDX CVM   │
 │ NVIDIA TEE      │    │ NVIDIA TEE      │    │ NVIDIA TEE      │
-│ Private-ML-SDK  │    │ Private-ML-SDK  │    │ Private-ML-SDK  │
+│ Private-ML-SDK  │    │ Private-ML-SDK  │    │ Private-ML-SDK  │
 └─────────────────┘    └─────────────────┘    └─────────────────┘
         │                       │                       │
         └───────────────────────┼───────────────────────┘
@@ -83,41 +125,59 @@ NEAR AI Cloud operates LLM Gateway and a network of Private LLM Nodes:
 └─────────────────┘
 ```
 
-### Key Components
+### Private LLM Nodes
 
-#### 1. **Private LLM Nodes**
-- Standardized hardware: 8x NVIDIA H200 GPUs per machine
-- Intel TDX-enabled CPUs for secure virtualization
-- Private-ML-SDK for secure model execution and attestation
-- Automated liveness monitoring and health checks
+Each Private LLM Node provides secure, isolated AI inference capabilities:
 
-#### 2. **LLM Gateway**
-- Model registration and provider management
-- Request routing and load balancing across Private LLM Nodes
-- Attestation verification and storage
-- API key management and usage tracking
+- **Standardized Hardware**: 8x NVIDIA H200 GPUs per node, optimized for high-performance inference
+- **Intel TDX-enabled CPUs**: Enable secure virtualization with hardware-enforced isolation
+- **Private-ML-SDK**: Manages secure model execution, attestation generation, and cryptographic signing
+- **Health Monitoring**: Automated liveness checks and monitoring ensure continuous availability
+
+### LLM Gateway
+
+The LLM Gateway serves as the central orchestration layer:
+
+- **Model Management**: Registers and manages available models across the Private LLM Node network
+- **Request Routing**: Intelligently routes requests to appropriate nodes based on model availability and load
+- **Attestation Verification**: Validates and stores TEE attestation reports for audit and verification
+- **Access Control**: Manages API keys, authentication, and usage tracking for billing and monitoring
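The routing behavior described in the gateway bullets above can be illustrated with a toy selection policy: among healthy nodes that serve the requested model, pick the least loaded. The `Node` structure and the policy itself are assumptions for illustration, not the gateway's actual implementation.

```python
from dataclasses import dataclass

@dataclass
class Node:
    name: str
    models: set
    healthy: bool
    active_requests: int

def route(nodes: list, model: str) -> Node:
    """Pick a healthy node serving `model` with the fewest active requests."""
    candidates = [n for n in nodes if model in n.models and n.healthy]
    if not candidates:
        raise LookupError(f"no healthy node serves {model!r}")
    return min(candidates, key=lambda n: n.active_requests)

nodes = [
    Node("node-a", {"llama-70b"}, True, 12),
    Node("node-b", {"llama-70b"}, True, 3),
    Node("node-c", {"llama-70b"}, False, 0),  # failing health checks
]
print(route(nodes, "llama-70b").name)  # node-b: healthy and least loaded
```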
 
 ---
 
 ## Security Guarantees
 
### Cryptographic Isolation
150+
### Defense in Depth
151+
152+
NEAR AI Cloud's private inference implements multiple layers of security to protect your data:
153+
154+
- **Hardware-Level Isolation**
155+
TEEs create isolated execution environments enforced at the hardware level, preventing unauthorized access to memory and computation even from privileged system administrators or cloud providers.
156+
157+
- **Secure Communication**
158+
All communication between your applications and the LLM infrastructure uses end-to-end encryption, protecting data in transit from network-level attacks.
159+
160+
- **Cryptographic Attestation**
161+
Every TEE environment generates cryptographic proofs that verify the integrity of the execution environment, allowing you to independently confirm your computations occurred in a genuine, unmodified TEE.
162+
163+
- **Result Authentication**
164+
All AI outputs are cryptographically signed inside the TEE before leaving the secure environment, ensuring the authenticity and integrity of responses.
165+
166+
### Threat Protection
105167

106-
Private inference provides multiple layers of security:
168+
NEAR AI Cloud's architecture protects against common attack vectors:
107169

108-
1. **Hardware-Level Isolation**: TEEs create isolated execution environments at the hardware level
109-
2. **Secure Communication**: End-to-end encryption between users and LLM
110-
3. **Attestation Verification**: Cryptographic proofs verify the integrity of the execution environment
111-
4. **Result Signing**: All AI outputs are cryptographically signed inside TEE
170+
- **Malicious Infrastructure Providers**
171+
Hardware-enforced TEE isolation prevents cloud infrastructure providers from accessing your prompts, model weights, or inference results, even with physical access to servers.
112172

113-
### Threat Model
173+
- **Network-Based Attacks**
174+
End-to-end encryption protects your data during transmission, preventing man-in-the-middle attacks and network eavesdropping.
114175

115-
NEAR AI Cloud's private inference protects against:
176+
- **Model Extraction Attempts**
177+
Model weights remain encrypted and isolated within the TEE, making extraction computationally infeasible even for attackers with privileged system access.
116178

117-
- **Malicious Infrastructure Providers**: TEEs prevent providers from accessing user data
118-
- **Network Attacks**: End-to-end encryption protects data in transit
119-
- **Model Extraction**: Model weights remain encrypted and inaccessible
120-
- **Result Tampering**: Cryptographic signatures ensure result integrity
179+
**Result Tampering**
180+
Cryptographic signatures generated inside the TEE ensure that responses cannot be modified in transit without detection, maintaining the integrity of AI outputs.
121181

122182
---
123183

docusaurus.config.js

Lines changed: 7 additions & 1 deletion
@@ -22,7 +22,7 @@ const config = {
   url: 'https://nearai.github.io',
   // Set the /<baseUrl>/ pathname under which your site is served
   // For GitHub pages deployment, it is often '/<projectName>/'
-  baseUrl: '/docs/',
+  baseUrl: '/',
 
   // GitHub pages deployment config.
   // If you aren't using GitHub pages, you don't need these.
@@ -59,6 +59,12 @@ const config = {
     ],
   ],
 
+  themes: ['@docusaurus/theme-mermaid'],
+
+  markdown: {
+    mermaid: true,
+  },
+
   themeConfig:
     /** @type {import('@docusaurus/preset-classic').ThemeConfig} */
     ({
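With `@docusaurus/theme-mermaid` added to `themes` and `markdown.mermaid: true` set, any doc page can render a fenced `mermaid` block as a diagram. For example, a flow like the one the docs in this commit describe (node names here are illustrative):

```mermaid
graph TD
  A[User Request] --> B[LLM Gateway]
  B --> C[Private LLM Node]
  C --> D[Verifiable Response]
```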
