NEAR AI Cloud combines Intel TDX and NVIDIA TEE technologies to create isolated, secure environments for AI computation:

- **Intel TDX (Trust Domain Extensions)**: Creates confidential virtual machines (CVMs) that isolate your AI workloads from the host system, preventing unauthorized access to data in memory.
- **NVIDIA TEE**: Provides GPU-level isolation for model inference, ensuring model weights and computations remain completely private during processing.
- **Cryptographic Attestation**: Each TEE environment generates cryptographic proofs of its integrity and configuration, enabling independent verification of the secure execution environment.

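The attestation guarantee above boils down to a measurement check: the TEE reports a digest of the code and configuration it loaded, and a verifier compares that digest against a known-good value. The following is a minimal sketch of that comparison only; the field names are hypothetical, and a real Intel TDX quote carries measurement registers (e.g. MRTD/RTMRs) plus a hardware-rooted signature chain that must also be validated.

```python
import hashlib

def expected_measurement(image_bytes: bytes) -> str:
    # The measurement is a digest of the code/config loaded into the TEE.
    # SHA-384 is used here because TDX measurement registers are 384-bit.
    return hashlib.sha384(image_bytes).hexdigest()

def verify_measurement(report: dict, known_good_image: bytes) -> bool:
    # Compare the reported measurement against the one we compute from a
    # known-good image. "measurement" is an illustrative field name.
    return report.get("measurement") == expected_measurement(known_good_image)

image = b"example CVM image contents"
report = {"measurement": hashlib.sha384(image).hexdigest()}
assert verify_measurement(report, image)
assert not verify_measurement(report, b"tampered image")
```

In practice this check only means something after the quote's signature has been verified against the hardware vendor's attestation service, which proves the measurement really came from a genuine TEE.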
### The Inference Process
When you make a request to NEAR AI Cloud, your data flows through a secure pipeline designed to maintain privacy at every step:

1. **Request Initiation:** You send chat completion requests to the LLM Gateway, which operates within a secure TEE environment and manages authentication.
2. **Secure Request Routing:** The LLM Gateway routes your request to the appropriate Private LLM Node based on the requested model, availability, and load balancing requirements.
3. **Secure Inference:** AI inference computations execute inside the Private LLM Node's TEE, where all data and model weights are protected by hardware-enforced isolation.
4. **Attestation Generation:** The TEE generates CPU and GPU attestation reports that provide cryptographic proof of the environment's integrity and configuration.
5. **Cryptographic Signing:** The TEE cryptographically signs both your original request and the inference results to ensure authenticity and prevent tampering.
6. **Verifiable Response:** You receive the AI response along with cryptographic signatures and attestation data for independent verification.

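Steps 5 and 6 can be sketched as signing the request and response together, then recomputing that signature on the client. This is an illustrative model only, not the actual NEAR AI Cloud wire format: a real deployment signs with an asymmetric key held inside the TEE, whereas this self-contained sketch uses an HMAC with a demo key so both sides can be shown in one snippet.

```python
import hashlib
import hmac
import json

# Demo key standing in for a private key that never leaves the TEE.
TEE_SIGNING_KEY = b"demo-key-held-inside-the-tee"

def sign_exchange(request: dict, response_text: str) -> str:
    # Step 5: the TEE signs the original request and the result together,
    # binding them so neither can be swapped or tampered with afterwards.
    payload = json.dumps(
        {"request": request, "response": response_text}, sort_keys=True
    ).encode()
    return hmac.new(TEE_SIGNING_KEY, payload, hashlib.sha256).hexdigest()

def verify_exchange(request: dict, response_text: str, signature: str) -> bool:
    # Step 6: the client independently recomputes the signature and
    # compares in constant time.
    return hmac.compare_digest(sign_exchange(request, response_text), signature)

req = {"model": "example-model", "messages": [{"role": "user", "content": "hi"}]}
sig = sign_exchange(req, "hello!")
assert verify_exchange(req, "hello!", sig)
assert not verify_exchange(req, "tampered", sig)
```

With an asymmetric scheme the client would verify using the TEE's public key, which is itself bound to the attestation report, so a valid signature implies the response came from the attested environment.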
---

### Private LLM Nodes
Each Private LLM Node provides secure, isolated AI inference capabilities:

- **Private-ML-SDK**: Manages secure model execution, attestation generation, and cryptographic signing
- **Health Monitoring**: Automated liveness checks and monitoring ensure continuous availability

### LLM Gateway

The LLM Gateway serves as the central orchestration layer:
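The routing behavior described in step 2 of the inference process (requested model, availability, load) can be sketched as a simple selection policy. The node fields and the lowest-load rule here are hypothetical illustrations, not the actual LLM Gateway implementation.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    url: str
    models: set
    healthy: bool = True        # maintained by automated liveness checks
    active_requests: int = 0    # current load on this node

def route(nodes: list, model: str) -> Node:
    # Keep only healthy nodes that serve the requested model,
    # then pick the least-loaded one.
    candidates = [n for n in nodes if n.healthy and model in n.models]
    if not candidates:
        raise RuntimeError(f"no healthy node serves {model}")
    return min(candidates, key=lambda n: n.active_requests)

nodes = [
    Node("https://node-a.example", {"llama-3"}, True, 7),
    Node("https://node-b.example", {"llama-3", "qwen"}, True, 2),
    Node("https://node-c.example", {"qwen"}, False, 0),
]
assert route(nodes, "llama-3").url == "https://node-b.example"
```

A production gateway would weigh more signals (queue depth, token throughput, attestation freshness), but the shape of the decision is the same: filter by model and health, then balance load.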
NEAR AI Cloud's private inference implements multiple layers of security to protect your data:

- **Hardware-Level Isolation**: TEEs create isolated execution environments enforced at the hardware level, preventing unauthorized access to memory and computation even from privileged system administrators or cloud providers.
- **Secure Communication**: All communication between your applications and the LLM infrastructure uses end-to-end encryption, protecting data in transit from network-level attacks.
- **Cryptographic Attestation**: Every TEE environment generates cryptographic proofs that verify the integrity of the execution environment, allowing you to independently confirm your computations occurred in a genuine, unmodified TEE.
- **Result Authentication**: All AI outputs are cryptographically signed inside the TEE before leaving the secure environment, ensuring the authenticity and integrity of responses.

### Threat Protection
NEAR AI Cloud's architecture protects against common attack vectors:

- **Malicious Infrastructure Providers**: Hardware-enforced TEE isolation prevents cloud infrastructure providers from accessing your prompts, model weights, or inference results, even with physical access to servers.
- **Network-Based Attacks**: End-to-end encryption protects your data during transmission, preventing man-in-the-middle attacks and network eavesdropping.
- **Model Extraction Attempts**: Model weights remain encrypted and isolated within the TEE, making extraction computationally infeasible even for attackers with privileged system access.