Skip to content

Conversation

@gladjohn
Copy link
Contributor

@gladjohn gladjohn commented Nov 3, 2025

Fixes - Adds MSI v2 cert to the store (persistent cache)

Changes proposed in this request
The IMDSv2 mTLS PoP flow currently relies only on a process‑local cache. That means new processes (or app restarts) must re‑mint the binding certificate even when a valid one already exists on the machine. This PR layers a best‑effort, cross‑process persisted cache on top of the existing in‑memory cache to reduce reminting, while keeping token acquisition robust (never blocked by persistence).

Testing
unit testing

Performance impact
none

Documentation

  • All relevant documentation is updated.

@gladjohn gladjohn requested a review from a team as a code owner November 3, 2025 02:54
/// Open the cert store and look at FriendlyName to see examples.
/// Wish we could paste a screenshot here... Maybe I can show it in code walkthroughs.
/// </summary>
internal static class FriendlyNameCodec
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

naming: MsiCertificateFriendlyNameEncoder ?

/// Encodes/decodes the persisted X.509 FriendlyName for MSAL mTLS certs.
/// Format: "MSAL|alias=cacheKey|ep=endpointBase"
/// Open the cert store and look at FriendlyName to see examples.
/// Wish we could paste a screenshot here... Maybe I can show it in code walkthroughs.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

?

{
/// <summary>
/// Encodes/decodes the persisted X.509 FriendlyName for MSAL mTLS certs.
/// Format: "MSAL|alias=cacheKey|ep=endpointBase"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you give more examples here? This format is not clear.

/// <param name="endpointBase"></param>
/// <param name="friendlyName"></param>
/// <returns></returns>
public static bool TryEncode(string alias, string endpointBase, out string friendlyName)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Avoid using these TryXYZ methods for cases where exceptions should be thrown.

{
/// <summary>
/// Encodes/decodes the persisted X.509 FriendlyName for MSAL mTLS certs.
/// Format: "MSAL|alias=cacheKey|ep=endpointBase"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is alias, cachekey, ep and endpoint base?


/// <summary>
/// Checks for illegal characters in alias/endpointBase.
/// Endpoint itself comes from IMDS and is well-formed, but we still validate.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not throw?

string cachedClientId;

// 1) Only lookup by cacheKey
// 1) In-memory cache first
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This 1) 2) 3) logic should be encapsualted in a different object, some sort of Cache abstraction.

namespace Microsoft.Identity.Client.ManagedIdentity.V2
{
/// <summary>
/// Cross-process lock based on a per-alias named mutex.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is an alias?

Action<string> logVerbose = null)
{
var nameGlobal = GetMutexNameForAlias(alias, preferGlobal: true);
var nameLocal = GetMutexNameForAlias(alias, preferGlobal: false);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Explain in comment why you need 2 mutexes


public static string GetMutexNameForAlias(string alias, bool preferGlobal = true)
{
string suffix = HashAlias(Canonicalize(alias));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you need to hash? How long are the aliases? Maybe you can avoid hashing here.

/// - No throws; persistence must not block token acquisition.
/// - Windows-only; FriendlyName semantics are undefined elsewhere.
/// </summary>
internal static class PersistentCertificateStore
Copy link
Member

@bgavrilMS bgavrilMS Nov 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you overusing static? Everything seems static in this design. Does this not need mocking? And a logger?

/// </summary>
internal static class PersistentCertificateStore
{
public static bool TryFind(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not a good interface for a cache - use Read / Write / Delete for caching.

{
value = default;

if (!DesktopOsHelper.IsWindows())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why? The X509 stores exist on Linux as well.

string bestEndpoint = null;
DateTime bestNotAfter = DateTime.MinValue;

foreach (var c in items)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please use more meaningful variable names. Not c

items = new X509Certificate2[store.Certificates.Count];
store.Certificates.CopyTo(items, 0);
}
catch
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All of these try / catch should log

// CN (GUID) → canonical client_id
string cn = null;
try
{ cn = best.GetNameInfo(X509NameType.SimpleName, false); }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please format all docs and use proper indentation.

string cn = null;
try
{ cn = best.GetNameInfo(X509NameType.SimpleName, false); }
catch { }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

log all failures.

/// - No throws; persistence must not block token acquisition.
/// - Windows-only; FriendlyName semantics are undefined elsewhere.
/// </summary>
internal static class PersistentCertificateStore
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO, a better design would have been:

  • a local in-memory cache logic
  • a cert store logic
  • an orchestrator

string aliasCacheKey,
X509Certificate2 cert,
string endpointBase,
string clientId, // unused; CN is the source of truth on read
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it's unused, why do you add it?

// We only persist certs that can actually be used for mTLS later.
if (!cert.HasPrivateKey)
{
logger?.Verbose(() => "[PersistentCert] Not persisting: certificate has no private key.");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like a warning to me? Or even an exception? Why would you persist certs without private key?

/// <summary>
/// Removes expired entries (NotAfter less than now) for an alias from an open X509 store.
/// </summary>
internal static class StorePruner
Copy link
Member

@bgavrilMS bgavrilMS Nov 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't seem worth having another (static!) class for 1 mehtod. Might as well move it to PersistentCertificaetStore? It's a tpyical cache op.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants