Commit 85c679d

authored
Update draft-xia-ipsecme-eesp-stateless-encryption.md
-2 version initial update for internal review.
1 parent 577198a commit 85c679d

1 file changed

+95
-19
lines changed

draft-xia-ipsecme-eesp-stateless-encryption.md

Lines changed: 95 additions & 19 deletions
Original file line number | Diff line number | Diff line change
@@ -5,7 +5,7 @@ category: std
55
submissionType: IETF
66
ipr: trust200902
77

8-
docname: draft-xia-ipsecme-eesp-stateless-encryption-00
8+
docname: draft-xia-ipsecme-eesp-stateless-encryption-02
99
submissiontype: IETF # also: "independent", "editorial", "IAB", or "IRTF"
1010
number:
1111
date:
@@ -14,9 +14,9 @@ consensus: true
1414
# area: AREA
1515
workgroup: IPSECME Working Group
1616
keyword:
17-
-
18-
-
19-
-
17+
-
18+
-
19+
-
2020
venue:
2121
# group: WG
2222
# type: Working Group
@@ -49,37 +49,37 @@ normative:
4949
informative:
5050
PSP:
5151
title: PSP Architecture Specification
52-
author:
52+
author:
5353
org: Google
54-
date:
54+
date:
5555
target: https://github.com/google/psp/blob/main/doc/PSP_Arch_Spec.pdf
5656
5757
UEC TSS:
5858
title: Ultra Ethernet Specification v1.0
59-
author:
59+
author:
6060
org: Ultra Ethernet Consortium
61-
date:
61+
date:
6262
target: https://ultraethernet.org/wp-content/uploads/sites/20/2025/06/UE-Specification-6.11.25.pdf
6363

6464
--- abstract
6565

66-
This draft first introduces several use cases for stateless encryption, analyzes and compares some existing stateless encryption schemes in the industry, and then attempts to propose a general and flexible stateless encryption scheme based on the summarized requirements.
66+
This draft first introduces several use cases for stateless encryption, analyzes and compares some existing stateless encryption schemes in the industry, and then attempts to propose a general and flexible stateless encryption scheme based on the summarized requirements.
6767

6868
--- middle
6969

7070
# Introduction {#intro}
7171

7272
Recently, with the emergence of more new scenarios such as high-performance cloud services, AI large model computing, and 5G mobile backhaul networks, higher requirements have been put forward for the hardware friendliness, performance, and flexibility of the IPsec ESP protocol. A new protocol design, EESP {{?I-D.ietf-ipsecme-eesp}} {{?I-D.ietf-ipsecme-eesp-ikev2}}, is being discussed and formulated. EESP focuses on solving issues such as introducing more fine-grained sub-child-SAs, adapting the ESP header and trailer format, and allowing parts of the transport layer header to be unencrypted, and implementing flexible expansion of EESP new features through options.
7373

74-
In addition to the issues listed above that are being addressed, stateless encryption is also a very important point. Its basic idea is to dynamically calculate data keys based on a small number of master keys (for AES-GCM, the encryption key and authentication key are combined), which helps optimize hardware resource limitations, performance optimization, and key negotiation complexity in large-scale IPSec session scenarios. This draft first introduces several use cases for stateless encryption, analyzes and compares some existing stateless encryption schemes in the industry, and then attempts to propose a general and flexible stateless encryption scheme based on the summarized requirements.
74+
In addition to the issues listed above, stateless encryption is also an important topic. Its basic idea is to dynamically calculate data keys from a small number of master keys (for AES-GCM, the encryption key and authentication key are combined), which helps ease hardware resource limitations, improve performance, and reduce key negotiation complexity in large-scale IPsec session scenarios. This draft first introduces several use cases for stateless encryption, analyzes and compares some existing stateless encryption schemes in the industry, and then attempts to propose a general and flexible stateless encryption scheme based on the summarized requirements.
7575

7676

7777
# Use Cases
7878

7979

8080
## General Computing of Cloud Service
8181

82-
Public cloud services provide IPSec VPN access for massive users, and the servers in their infrastructure need to support massive IPSec session access. If hardware supports IPSec, the hardware should support session-based encryption and decryption, and the data keys of different sessions are isolated. The server needs to maintain the security connection context between the server and a large number of clients, and the hardware with limited memory cannot store the huge context. Note that the client and server do not belong to the same trusted domain in this case.
82+
Public cloud services provide IPsec VPN access for a massive number of users, and the servers in their infrastructure need to support a massive number of IPsec sessions. If hardware supports IPsec, the hardware should support session-based encryption and decryption, with the data keys of different sessions isolated from each other. The server needs to maintain the security connection context between itself and a large number of clients, and hardware with limited memory cannot store such a huge context. Note that the client and server do not belong to the same trusted domain in this case.
8383

8484
The stateless encryption scheme in the {{PSP}} solution proposed by Google is used to address the above hardware memory overhead problem. Its main principle is to derive a data key based on the master key on the server side, and the client side obtains the data key through an out-of-band method. It has:
8585

@@ -133,8 +133,8 @@ As shown in the below figure, encrypted communication is required between differ
133133
The stateless encryption scheme defined by {{UEC TSS}} can be used to solve the above problem. The main principle is that all communication instances of a HPC job belong to the same trust domain and share the same master key for both receiving and sending directions. It has:
134134

135135
- Pros:
136-
- Better than Google PSP,it saves all security session contexts;
137-
- The communication parties do not need to store data keys, and the increase of the number of instances and connections of the HPC job does not affect the number of security contexts;
136+
- Better than Google PSP, it saves all security session contexts;
137+
- The communication parties do not need to store data keys, and the increase of the number of instances and connections of the HPC job does not affect the number of security contexts;
138138
- Without out of band slow path data key negotiation, the first packet delay is small;
139139
- Data keys can be updated through the TSC.epoch.
140140
- Cons:
@@ -184,7 +184,7 @@ Similarly, the NIC resource pool can also be used for east-west traffic access b
184184
## AI Computing
185185

186186

187-
187+
As shown in the figure below, in an AI computing network, a computing task is collaboratively executed by a group of CPUs & XPUs located in the same trust domain or across trust domains (in the cross-trust-domain case, they are interconnected through DPUs acting as proxies). For CPUs & XPUs within the same trust domain, stateless encryption sharing the same master key can eliminate the complexity and latency of key negotiation between chips. For interconnection across trust domains, the DPU needs to perform encryption connection proxy functions between two trust domains (local trust domain and global trust domain). In this case, the DPU simultaneously possesses the master keys of the two trust domains, calculates the data key for intra-domain communication in each domain based on its context, and then uses the two calculated data keys to complete the secure connection proxy across trust domains.
188188

189189
~~~
190190

@@ -228,22 +228,98 @@ Similarly, the NIC resource pool can also be used for east-west traffic access b
228228

229229
Based on the above use cases, the requirements for a general and flexible stateless encryption scheme are as follows:
230230

231-
- Support nodes within a trusted trust domain to share the same master key;
232-
- Master key supports multi-level combination design. In a trust domain, the master key is composed of multiple root keys of different types and levels, such as trust domain root key, tenant root key, task group root key, etc. This enhances the overall security of the master key and supports fine-grained encryption traffic isolation (e.g., all nodes in a trust domain, nodes of the same tenant in a trust domain, nodes of the same computing task in a trust domain, etc.);
231+
- Support entities within a trust group to share the same master key;
232+
- Master key supports multi-level combination design. In a trust group, the master key is composed of multiple root keys of different types and levels, such as trust region root key, user group root key, task group root key, etc. This enhances the overall security of the master key and supports fine-grained encryption traffic isolation (e.g., all entities in a trust region, entities of the same user group in a trust region, entities of the same task group in a trust region, etc.);
233233
- Different types of root keys have different security levels and lifecycles, and corresponding key rotation mechanisms need to be defined. The master key update will trigger the data key update;
234234
- The key rotation of each type of root key should support multiple key rotations, such as pre_key, current_key, and next_key, to support rapid rotation while ensuring that real-time encryption and decryption are not affected;
235235
- The key derivation of the data key is based on the master key, context, and KDF. KDF must support packet-by-packet data key calculation in most cases (except when the data key is cached in memory), which requires extremely high performance and must support cryptographically secure, hardware-concurrent high-performance algorithms;
236236
- To support real-time derivation of the Data Key, context information and IV information need to be carried with the message. To support different scenarios and different granularities of data key calculation and encryption traffic isolation (based on stream, based on source IP, based on source ID, etc.), multiple combinations of context and IV need to be supported, and different combination algorithms need to be distinguished through specific fields in the message;
237237
- Context information enables dynamic updates of the data key, such as carrying an epoch in the context. When the epoch changes, the data key is also refreshed accordingly;
238-
- It is necessary to support encryption proxy capabilities across trust domains. At the edge nodes across trust domains (such as DPU, Switch, etc.), support for master keys and stateless encryption of two trust domains (local trust domain and global trust domain) is required, and proxy conversion of message encryption and decryption between the two trust domains must be completed.
238+
- It is necessary to support encryption proxy capabilities across trust regions. At the edge nodes across trust regions (such as DPU, Switch, etc.), support for master keys and stateless encryption of two trust groups (one is in local trust region and the other is in global trust region) is required, and proxy conversion of message encryption and decryption between the two trust groups must be completed.
239239

240240
# EESP Stateless Encryption Scheme
241-
TBD.
241+
Stateless Encryption is designed for large-scale general-purpose computing, AI computing, and pooled networks. It addresses the challenges of storing and managing security contexts by using computation to replace storage (key derivation) and flexible encryption and decryption, thereby enabling secure communication between nodes within and across domains. Therefore, to ensure that the endpoint can perform correct encryption and decryption without the need to store and manage security contexts, the stateless encryption extension must include the fields required for calculating the data key and performing the subsequent encryption and decryption:
242+
- Key Derivation Fields: Used to calculate the data key for data packets;
243+
- Initialization Vector Fields: Since AES-GCM is the primary data encryption algorithm, the per-packet initialization vector (IV) should never be repeated for the same encryption key. A single duplicate IV can undermine the encryption of the entire stream;
244+
- Confidentiality and integrity protection range Fields: Provide flexibility in the range of message confidentiality and integrity protection.
245+
246+
## Master Key Management
247+
Each trust group shares a master key. The master key can be composed of multiple root keys, including the trust region root key, the user group root key, and the task group root key. This mechanism enhances the overall security of the master key and supports fine-grained encryption traffic isolation. The multiple root keys that make up the master key are securely distributed by different administrators (infrastructure providers, user group administrators, task group administrators) through different controllers/KMS. An example of the data structure definition for the root key is as follows:
248+
249+
~~~
250+
251+
RootKeyStruct ::= SEQUENCE {
252+
root_key_id OCTET STRING,
253+
root_keys_index SEQUENCE (SIZE(3)) OF INTEGER,
254+
root_keys_value SEQUENCE (SIZE(3)) OF OCTET STRING
255+
}
256+
257+
~~~
258+
259+
Based on the trust region, user group, and task group under the trust group, the corresponding root_key_id can be found for each. Then, within the structure corresponding to this ID, the root_keys_index and root_keys_value arrays together form three sets of root key information (pre_key, current_key, and next_key) used for key rotation. This three-key rotation ensures the timely update of the root key (after a rotation, the root key in use is the latest current_key) and guarantees that real-time encryption and decryption are not affected.
260+
The specific method for key rotation is as follows: a new next_key is generated, the original next_key becomes the new current_key, and the original current_key becomes the new pre_key.
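The rotation step described above can be sketched in Python as follows. This is a minimal sketch, not part of the draft: the `RootKeySlots` class and `rotate` function are hypothetical names, and generating the new next_key locally with `secrets` is an assumption (in practice the controller/KMS would more likely distribute it).

```python
import secrets
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RootKeySlots:
    """Hypothetical holder for the three rotation slots of one root key."""
    pre_key: bytes
    current_key: bytes
    next_key: bytes


def rotate(slots: RootKeySlots, new_next_key: Optional[bytes] = None) -> RootKeySlots:
    """Shift the slots by one position, as described in the text.

    original next_key    -> new current_key
    original current_key -> new pre_key
    freshly generated    -> new next_key
    """
    if new_next_key is None:
        # Assumption: the fresh key is random and has the same length as the old one.
        new_next_key = secrets.token_bytes(len(slots.next_key))
    return RootKeySlots(
        pre_key=slots.current_key,
        current_key=slots.next_key,
        next_key=new_next_key,
    )
```

Keeping pre_key available after a rotation lets a receiver still decrypt packets that were protected just before the switch, which is how the three-slot design avoids disturbing real-time traffic.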
261+
262+
## Data Key Derivation at Both Ends of the Communication
263+
When secure communication is required within a trust group, the source point performs the following processing:
264+
- data key derivation:
265+
- Obtain the master key: Based on the trust group information, combine the relevant root keys (e.g., through XOR calculation) to derive it;
266+
- Calculate the context information: Based on the source point IP/ID, or connection ID, etc., along with Epoch, the context is calculated using a specific algorithm. Using the source point IP/ID to calculate the context ensures that different secure sessions at the destination point have different data keys, thereby preventing the compromise of encryption security that could occur if different sessions had the same data key and the IV was also the same;
267+
- Execute KDF to derive the data key: use the aforementioned master key and context as inputs to the KDF;
268+
- IV Calculation: Based on the source point IP/ID or connection ID, along with Epoch, random numbers, and counters, the IV is computed using a specific algorithm;
269+
- Determine the scope of confidentiality and integrity protection: COffset and IOffset respectively;
270+
- Encrypt the message using the data key and IV, and construct the security header: The security header field contains all the information mentioned above. The example diagram is as follows:
271+
272+
~~~
273+
274+
0 1 2 3
275+
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
276+
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
277+
|Version| HL | V | Reserve | COffset |IOffset|
278+
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
279+
| DeviceID/ConnectionID (4B-8B) |
280+
| |
281+
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
282+
| Master Key Options (variable, optional) |
283+
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
284+
| Epoch | Counter |
285+
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
286+
| |
287+
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
288+
289+
~~~
290+
{: #fig-ipsecme-eesp-stateless-security-header title="Example of the Security Header Format for Stateless Encryption"}
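As an illustration of how the first 32-bit word of the example header might be encoded, here is a minimal Python sketch. The bit widths are assumptions read off the example figure (Version 4, HL 4, V 4, Reserve 8, COffset 8, IOffset 4 bits); the draft does not state them in text, and the function names are hypothetical.

```python
import struct

# (field name, width in bits), most significant field first.
# Assumption: widths are inferred from the example figure, not defined by the draft.
FIELDS = (
    ("version", 4),
    ("hl", 4),
    ("v", 4),
    ("reserve", 8),
    ("coffset", 8),
    ("ioffset", 4),
)


def pack_first_word(**values: int) -> bytes:
    """Pack the named fields into one network-order 32-bit word."""
    word = 0
    for name, width in FIELDS:
        value = values.get(name, 0)  # unspecified fields default to zero
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
        word = (word << width) | value
    return struct.pack("!I", word)


def unpack_first_word(data: bytes) -> dict:
    """Split the first four bytes of a header back into its fields."""
    (word,) = struct.unpack("!I", data[:4])
    fields = {}
    shift = 32
    for name, width in FIELDS:
        shift -= width
        fields[name] = (word >> shift) & ((1 << width) - 1)
    return fields
```

For example, `pack_first_word(version=1, hl=5, v=2, coffset=20, ioffset=3)` yields the bytes `15 20 01 43` under these assumed widths.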
242291

292+
~~~
293+
294+
0 1 2 3
295+
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
296+
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
297+
| Option Type | Option Length |Root Key Index | Padding |
298+
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-|
299+
| Root Key ID (16B-32B) |
300+
| |
301+
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
302+
~~~
303+
{: #fig-ipsecme-eesp-stateless-security-header-option title="Example of the Master Key Option of Security Header Format for Stateless Encryption"}
304+
305+
306+
Correspondingly, the destination node performs the following processing:
307+
- Read the security header: Obtain all parameters required for key derivation;
308+
- Data key derivation:
309+
- Obtain the master key: Based on the master key option in the security header, combine the relevant root keys (e.g., through XOR calculation) to obtain it;
310+
- Calculate the context information: Based on the source point IP/ID or connection ID in the security header, along with Epoch, compute the context using a specific algorithm;
311+
- Execute KDF to derive the data key: use the aforementioned master key and context as inputs to the KDF;
312+
- IV Calculation: Based on the source point IP/ID in the security header, or connection ID, etc., along with Epoch, random numbers, and counters, the IV is calculated according to a specific algorithm;
313+
- Determine the scope of confidentiality and integrity protection: COffset and IOffset respectively;
314+
- Decrypt the message using the data key and IV.
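The derivation steps performed at both endpoints can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the scheme's defined algorithm: the draft leaves the KDF, the context encoding, and the IV layout as "a specific algorithm", so HKDF-SHA256 (RFC 5869), the `source_id || epoch` context, and the 12-byte IV layout below are illustrative choices, and all function names are hypothetical. The AES-GCM step itself is omitted.

```python
import hashlib
import hmac
from typing import Sequence


def combine_root_keys(root_keys: Sequence[bytes]) -> bytes:
    """Combine equal-length root keys into a master key by XOR (the draft's example)."""
    length = len(root_keys[0])
    if any(len(key) != length for key in root_keys):
        raise ValueError("root keys must have equal length")
    master = bytes(length)
    for key in root_keys:
        master = bytes(a ^ b for a, b in zip(master, key))
    return master


def build_context(source_id: bytes, epoch: int) -> bytes:
    """Assumed context encoding: source/connection ID followed by a 16-bit epoch."""
    return source_id + epoch.to_bytes(2, "big")


def derive_data_key(master_key: bytes, context: bytes, length: int = 32) -> bytes:
    """HKDF-SHA256 with an all-zero salt; one expand block covers length <= 32."""
    if not 0 < length <= 32:
        raise ValueError("this sketch supports at most 32 output bytes")
    prk = hmac.new(b"\x00" * 32, master_key, hashlib.sha256).digest()   # extract
    okm = hmac.new(prk, context + b"\x01", hashlib.sha256).digest()     # expand, T(1)
    return okm[:length]


def build_iv(source_id: bytes, epoch: int, counter: int) -> bytes:
    """Assumed 96-bit IV: 4-byte source ID, 16-bit epoch, 48-bit counter."""
    if len(source_id) != 4:
        raise ValueError("this sketch assumes a 4-byte source ID")
    return source_id + epoch.to_bytes(2, "big") + counter.to_bytes(6, "big")
```

Because both endpoints run the same functions on fields carried in the security header, the receiver recomputes the sender's data key and IV without any stored per-session state, and a change of epoch yields a different data key.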
243315

244316
# Security Considerations
245317

246-
TBD.
318+
- A highly secure control plane is required to ensure that the master keys managed by users/systems are not leaked or lost;
319+
- The control channel establishment phase requires two-way authentication and authorization to ensure the integrity and confidentiality of the master key during the master key distribution phase. At the same time, it ensures that the group master key is only distributed to the corresponding group members;
321+
- The endpoint requires secure storage of the master key and data key locally.
322+
247323

248324

249325
# IANA Considerations
