Skip to content

[KeyManager] Add key claims retrieval to KPS and WSD#681

Open
NilanjanDaw wants to merge 15 commits intogoogle:mainfrom
NilanjanDaw:add-key-claims-channel
Open

[KeyManager] Add key claims retrieval to KPS and WSD#681
NilanjanDaw wants to merge 15 commits intogoogle:mainfrom
NilanjanDaw:add-key-claims-channel

Conversation

@NilanjanDaw
Copy link
Collaborator

@NilanjanDaw NilanjanDaw commented Feb 27, 2026

This commit introduces new functionality for retrieving key claims and provides
more comprehensive key management operations across both the Key Protection
Service (KPS) and Workload Service Daemon (WSD).

Key changes include:

  • KPS (key_custody_core):
    • Added FFI definitions and implementations for key_manager_get_kem_key.
    • Implemented GetKemKey in kps_key_custody_core_cgo.go for Go-FFI bridging.
    • Added integration tests for KEM key retrieval.
  • WSD (workload_service):
    • Added FFI definitions and implementations for key_manager_get_binding_key.
    • Implemented GetBindingKey in ws_key_custody_core_cgo.go for Go-FFI bridging.
    • Added integration tests for binding key retrieval.
    • Introduced key claims processing mechanism:
      • Added KeyClaimsProvider interface.
      • Server now includes a claimsChan and a processClaims` goroutine.
      • Implemented GetBindingKeyClaims and GetKemKeyClaims for generating protobuf claims.
    • Updated server constructors and integration tests to support the new claims and interfaces.
  • Added comprehensive integration tests and unit tests for the new key management
    and claims functionalities in both KPS and WSD.

This PR depends on #671. Please review it before this one.

atulpatildbz and others added 13 commits February 27, 2026 12:07
adds the `KeyClaims` proto definition in `keymanager/km_common/proto/`
which includes `VmAttestation`, `KeyAttestation`, and related messages
for key protection and attestation.

It also adds `proto/gen_keymanager.sh` to generate Go code for
keymanager protos.
…for algorithms.proto

Removed VM attestation related messages from key_claims.proto, they will
be added in github.com/GoogleCloudPlatform/confidential-space separately

Since 2 proto files are in the same directory, i changed the package
name in algorithms.proto to keymanager
…ce image build

This commit integrates the KeyManager Workload Service into the Confidential Space launcher and ensures it is properly built and included in the resulting images.

Key changes:
- Corrected the argument order in `NewServer` calls within `keymanager/workload_service/server_test.go` and `integration_test.go` to match the expected interface definitions.
- Updated `launcher/cloudbuild.yaml` to install Rust, CMake, and the C++ toolchain in the `golang:1.24` build environment, enabling the `CGO_ENABLED=1` build of the launcher to successfully link the KeyManager Rust libraries via FFI.
- Hooked the KeyManager Server into `launcher/container_runner.go`, configured to start listening on `kmaserver.sock` when the `EnableKeyManager` experiment is active.
- Added the `EnableKeyManager` flag to the `Experiments` struct and updated relevant tests and configurations to support the new feature toggle.
1. Restricts the CGO build of KeyManager exclusively to linux/amd64 to avoid compilation errors on macOS (missing SYS_MEMFD_SECRET).

2. Updates CI workflow to only build Rust libraries on Linux x64.

3. Updates releaser workflow to install the Rust toolchain and build the KeyManager.

4. Sets CGO_ENABLED=1 in .goreleaser.yaml to ensure CGO components are included in the release.
… WSD

This commit introduces new functionality for retrieving key claims and provides
more comprehensive key management operations across both the Key Protection
Service (KPS) and Workload Service Daemon (WSD).

Key changes include:
- **KPS (key_custody_core):**
  - Added FFI definitions and implementations for `key_manager_get_kem_key`.
  - Implemented `GetKemKey` in `kps_key_custody_core_cgo.go` for Go-FFI bridging.
  - Added integration tests for KEM key retrieval.
- **WSD (workload_service):**
  - Added FFI definitions and implementations for `key_manager_get_binding_key`.
  - Implemented `GetBindingKey` in `ws_key_custody_core_cgo.go` for Go-FFI bridging.
  - Added integration tests for binding key retrieval.
  - Introduced key claims processing mechanism:
    - Added `KeyClaimsProvider` interface.
    - `Server` now includes a `claimsChan` and a processClaims` goroutine.
    - Implemented `GetBindingKeyClaims` and `GetKemKeyClaims` for generating protobuf claims.
  - Updated server constructors and integration tests to support the new claims and interfaces.
- Added comprehensive integration tests and unit tests for the new key management
  and claims functionalities in both KPS and WSD.
Copy link
Collaborator

@atulpatildbz atulpatildbz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for taking this up.
Added some initial comments

Comment on lines +470 to +471
let out_kem_pubkey_slice =
unsafe { std::slice::from_raw_parts_mut(out_kem_pubkey, out_kem_pubkey_len) };
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we're checking the out_kem_pubkey_len value much later after we're already using it to create a slice.

from_raw_parts_mut asserts to the compiler that valid len bytes exists. And this could result in undefined behaviour.

check the length first and create the slice only after we know the length is correct.

// Call Safe Internal Function
match get_kem_key_internal(uuid) {
Ok((kem_pubkey, binding_pubkey, delete_after)) => {
if out_kem_pubkey_len != kem_pubkey.as_bytes().len()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we've followed != size enforcement at many places which is fine for now since we own the cgo layer too.

But would < be a more generalized check? that way caller can pass larger buffer too if they don't know the exact size.

)
};
assert_eq!(result, -1);
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we add a test test_get_kem_key_invalid_buffer_len that passes retrieved_kem_pubkey_bytes.len() + 1 or - 1 to ensure the C-API correctly rejects it with a -2 status code

t.Fatalf("expected KEM pubkey length %d, got %d", len(pubKey), len(retrievedKemPK))
}
for i := range pubKey {
if pubKey[i] != retrievedKemPK[i] {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

instead of iterating and comparing, we can use bytes.Equal

import "bytes"

// ...

if !bytes.Equal(pubKey, retrievedKemPK) {
	t.Fatalf("KEM pubkey mismatch: expected %x, got %x", pubKey, retrievedKemPK)
}

if !bytes.Equal(bindingPK, retrievedBindingPK) {
	t.Fatalf("binding pubkey mismatch: expected %x, got %x", bindingPK, retrievedBindingPK)
}

_, _, _, err := GetKemKey(uuid.New())
if err == nil {
t.Fatal("expected error for non-existent UUID")
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we also check the expected error string:

expectedErrMsg := "key_manager_get_kem_key failed with code -1"
if !strings.Contains(err.Error(), expectedErrMsg) {
    t.Fatalf("expected error containing %q, got: %v", expectedErrMsg, err)
}

if you decide to export typed sentinel errors ErrKeyNotFound then you could also use errors.Is but comparing string is also fine.


// GetKemKey retrieves KEM and binding public keys and delete_after timestamp via Rust FFI.
func GetKemKey(id uuid.UUID) ([]byte, []byte, uint64, error) {
uuidBytes, err := id.MarshalBinary()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we know the size of uuid, should we create a fixed size array?

var uuidBytes [16]byte
copy(uuidBytes[:], id[:])

Comment on lines +88 to +93
if len(kemPubKey) != 32 {
t.Fatalf("expected 32-byte KEM public key, got %d", len(kemPubKey))
}
if len(bindingPubKey) != 32 {
t.Fatalf("expected 32-byte binding public key, got %d", len(bindingPubKey))
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should have just checked the length of the keys, we should compare it against expectedKemPubKey and expectedBindingPubKey, right?

Also it would be good to have non-zero values in expectedKemPubKey and expectedBindingPubKey

something like:

for i := range expectedKemPubKey {
    expectedKemPubKey[i] = byte(i + 10)
}
for i := range expectedBindingPubKey {
    expectedBindingPubKey[i] = byte(i + 20)
}

// Check equality of the actual sequence of bytes
if !bytes.Equal(kemPubKey, expectedKemPubKey) {
    t.Fatalf("KEM public key mismatch")
}
if !bytes.Equal(bindingPubKey, expectedBindingPubKey) {
    t.Fatalf("binding public key mismatch")
}

/// ## Returns
/// * `0` on success.
/// * `-1` if arguments are invalid or key is not found.
/// * `-2` if either public key buffer is too small.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we're checking != so it fails with -2 even if buffer is too large


let uuid = match Uuid::from_slice(uuid_slice) {
Ok(u) => u,
Err(_) => return -1,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this dead code? from_raw_parts guarantees slice is exactly 16 bytes. so Uuid::from_slice will never fail

@NilanjanDaw NilanjanDaw changed the title Add key claims retrieval and enhanced key management to KPS and WSD [KeyManager] Add key claims retrieval to KPS and WSD Feb 27, 2026
@NilanjanDaw NilanjanDaw self-assigned this Feb 27, 2026
Copy link
Collaborator

@atulpatildbz atulpatildbz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added some more comments

};

assert_eq!(result, 0);
assert_eq!(generated_pubkey_bytes, retrieved_pubkey_bytes);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we also destroy for cleanup?

if we don't cleanup, enumerate tests may become flaky test_key_manager_enumerate_binding_keys_success since KEY_REGISTRY sits in a global LazyLock

}

func TestIntegrationGetBindingKey(t *testing.T) {
id, pubKey, err := GenerateBindingKeypair(defaultAlgo, 3600)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

similarly here, destroy the keys when the test completes

	defer func() {
		_ = DestroyBindingKey(id)
	}()

Comment on lines +61 to +69
if len(retrievedPubKey) != len(pubKey) {
t.Fatalf("expected pubkey length %d, got %d", len(pubKey), len(retrievedPubKey))
}

for i := range pubKey {
if pubKey[i] != retrievedPubKey[i] {
t.Fatalf("mismatch at index %d: expected %d, got %d", i, pubKey[i], retrievedPubKey[i])
}
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can use bytes.Equal here

	if !bytes.Equal(pubKey, retrievedPubKey) {
		t.Fatalf("retrieved public key does not match generated public key.\nExpected: %x\nGot:      %x", pubKey, retrievedPubKey)
	}

if err != nil {
t.Fatalf("failed to create server: %v", err)
}

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
t.Cleanup(func() {
close(srv.claimsChan)
})

Comment on lines +320 to +324
Algorithm: &keymanager.HpkeAlgorithm{
Kem: keymanager.KemAlgorithm_KEM_ALGORITHM_DHKEM_X25519_HKDF_SHA256,
Kdf: keymanager.KdfAlgorithm_KDF_ALGORITHM_HKDF_SHA256,
Aead: keymanager.AeadAlgorithm_AEAD_ALGORITHM_AES_256_GCM,
},
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what are your plans when we support multiple algorithms?

Claims: &keymanager.KeyClaims_VmBindingClaims{
VmBindingClaims: &keymanager.KeyClaims_VmProtectionBindingClaims{
BindingPubKey: &keymanager.HpkePublicKey{
Algorithm: &keymanager.HpkeAlgorithm{
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: can we define a package-level variable for this

var defaultHpkeAlgo = &keymanager.HpkeAlgorithm{
	Kem:  keymanager.KemAlgorithm_KEM_ALGORITHM_DHKEM_X25519_HKDF_SHA256,
	Kdf:  keymanager.KdfAlgorithm_KDF_ALGORITHM_HKDF_SHA256,
	Aead: keymanager.AeadAlgorithm_AEAD_ALGORITHM_AES_256_GCM,
}

deleteAfter: uint64(time.Now().Add(1 * time.Hour).Unix()),
}

srv, err := NewServer(kps, ws, filepath.Join(t.TempDir(), "test.sock"))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add cleanup?
t.Cleanup(func() { close(srv.claimsChan) })

}

t.Run("BindingClaims", func(t *testing.T) {
respChan := make(chan *ClaimsResult)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's use buffered channel.

the test fails for an unrelated reason before reaching the select block, it could deadlock the test suite.

Suggested change
respChan := make(chan *ClaimsResult)
respChan := make(chan *ClaimsResult, 1)

if err != nil {
t.Fatalf("failed to create test server: %v", err)
}
return srv
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually this might be a good place to add t.Cleanup

	t.Cleanup(func() {
		srv.listener.Close() // Or srv.Shutdown(context.Background())
		close(srv.claimsChan) // Stop the background processClaims goroutine
	})

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants