articles/service-fabric/cluster-security-certificate-management.md
+13 -11 (13 additions, 11 deletions)
@@ -2,8 +2,11 @@
2
2
title: Manage certificates in a Service Fabric cluster
3
3
description: Learn about managing certificates in a Service Fabric cluster that's secured with X.509 certificates.
4
4
ms.topic: conceptual
5
-
ms.date: 04/10/2020
6
-
ms.custom: sfrev, devx-track-azurepowershell
5
+
ms.author: tomcassidy
6
+
author: tomvcassidy
7
+
ms.service: service-fabric
8
+
services: service-fabric
9
+
ms.date: 07/11/2022
7
10
---
8
11
# Manage certificates in Service Fabric clusters
9
12
@@ -28,7 +31,7 @@ We describe *certificate management* as the processes and procedures that are us
28
31
29
32
Some management operations, such as enrollment, policy setting, and authorization controls, are beyond the scope of this article. Other operations, such as provisioning, renewal, re-keying, or revocation, are related only incidentally to Service Fabric. Nonetheless, the article addresses them somewhat, because understanding these operations can help you secure your cluster properly.
30
33
31
-
Your immediate goal is likely to be to automate certificate management as much as possible to ensure uninterrupted availability of the cluster. Because the process is user-touch-free, you'll also want to offer security assurances. With Service Fabric clusters, this goal is attainable.
34
+
Your immediate goal is likely to be to automate certificate management as much as possible to ensure uninterrupted availability of the cluster. Because the process is user-touch-free, you'll also want to offer security assurances. With Service Fabric clusters, this goal is attainable.
32
35
33
36
The rest of the article first deconstructs certificate management, and later focuses on enabling autorollover.
34
37
@@ -87,7 +90,7 @@ Let's quickly outline the progression of a certificate from issuance to consumpt
87
90
88
91
For the purposes of this article, the first two steps in the preceding sequence are mostly unrelated. Their only connection is that the subject common name of the certificate is the DNS name that's declared in the cluster definition.
89
92
90
-
Certificate issuance and provisioning flow is illustrated in the following diagrams:
93
+
Certificate issuance and provisioning flows are illustrated in the following diagrams:
91
94
92
95
**For certificates that are declared by thumbprint**
93
96
@@ -113,14 +116,13 @@ Continuing with Azure as the context, and using Key Vault as the secret-manageme
113
116
- Under `{vaultUri}/secrets/{name}`: The certificate, including its private key, available for downloading as an unprotected PFX or PEM file.
114
117
115
118
Recall that a certificate in the key vault contains a chronological list of certificate instances that share a policy. Certificate versions will be created according to the lifetime and renewal attributes of this policy. We highly recommend that vault certificates not share subjects or domains or DNS names, because it can be disruptive in a cluster to provision certificate instances from different vault certificates, with identical subjects but substantially different other attributes, such as issuer, key usages, and so on.
116
-
117
119
At this point, a certificate exists in the key vault, ready for consumption. Now let's explore the rest of the process.
118
120
119
121
### Certificate provisioning
120
122
121
-
We mentioned a *provisioning agent*, which is the entity that retrieves the certificate, including its private key, from the key vault and installs it on each of the hosts of the cluster. (Recall that Service Fabric doesn't provision certificates.)
123
+
We mentioned a *provisioning agent*, which is the entity that retrieves the certificate, including its private key, from the key vault and installs it on each of the hosts of the cluster. (Recall that Service Fabric doesn't provision certificates.)
122
124
123
-
In the context of this article, the cluster will be hosted on a collection of Azure virtual machines (VMs) or virtual machine scale sets (VMSS). In Azure, you can provision a certificate from a vault to a VM/VMSS by using the following mechanisms. This assumes, as before, that the provisioning agent was previously granted *secret get* permissions on the key vault by the key vault owner.
125
+
In the context of this article, the cluster will be hosted on a collection of Azure virtual machines (VMs) or virtual machine scale sets. In Azure, you can provision a certificate from a vault to a VM/VMSS by using the following mechanisms. This assumes, as before, that the provisioning agent was previously granted *secret get* permissions on the key vault by the key vault owner.
124
126
125
127
- Ad-hoc: An operator retrieves the certificate from the key vault (as PFX/PKCS #12 or PEM) and installs it on each node.
126
128
@@ -133,9 +135,9 @@ In the context of this article, the cluster will be hosted on a collection of Az
133
135
134
136
- By using the [Key Vault VM extension](../virtual-machines/extensions/key-vault-windows.md). This lets you provision certificates by using version-less declarations, with periodic refreshing of observed certificates. In this case, the VM/VMSS is expected to have a [managed identity](../virtual-machines/security-policy.md#managed-identities-for-azure-resources), an identity that has been granted access to the key vaults that contain the observed certificates.
135
137
136
-
VMSS/compute-based provisioning presents security and availability advantages, but it also presents restrictions. It requires, by design, that you declare certificates as versioned secrets, which makes it suitable only for clusters secured with certificates declared by thumbprint.
138
+
VMSS/compute-based provisioning presents security and availability advantages, but it also presents restrictions. It requires, by design, that you declare certificates as versioned secrets. This requirement makes VMSS/compute-based provisioning suitable only for clusters secured with certificates declared by thumbprint.
137
139
138
-
In contrast, Key Vault VM extension-based provisioning always installs the latest version of each observed certificate, which makes it suitable only for clusters secured with certificates declared by subject common name. To emphasize, do not use an autorefresh provisioning mechanism (such as the Key Vault VM extension) for certificates that are declared by instance (that is, by thumbprint). The risk of losing availability is considerable.
140
+
In contrast, Key Vault VM extension-based provisioning always installs the latest version of each observed certificate, which makes it suitable only for clusters secured with certificates declared by subject common name. To emphasize, do not use an autorefresh provisioning mechanism (such as the Key Vault VM extension) for certificates that are declared by instance (that is, by thumbprint). The risk of losing availability is considerable.
139
141
140
142
Other provisioning mechanisms exist, but the approaches mentioned here are the currently accepted options for Azure Service Fabric clusters.
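
The pairing rule described above (versioned secrets for thumbprint declarations, the auto-refreshing Key Vault VM extension for common-name declarations) can be sketched as a small lookup. This is only an illustration: the enum and function names are hypothetical, it is not part of any Service Fabric or Azure SDK, and it deliberately leaves out ad-hoc provisioning.

```python
from enum import Enum


class Declaration(Enum):
    """How the cluster definition declares its certificate (hypothetical names)."""
    THUMBPRINT = "thumbprint"
    COMMON_NAME = "common_name"


class Mechanism(Enum):
    """Provisioning mechanisms compared in the text (hypothetical names)."""
    VMSS_SECRET = "vmss_secret"                      # versioned secret, fixed instance
    KEYVAULT_VM_EXTENSION = "keyvault_vm_extension"  # version-less, auto-refreshing


# Pairings the article describes as suitable.
_SUITABLE = {
    Declaration.THUMBPRINT: Mechanism.VMSS_SECRET,
    Declaration.COMMON_NAME: Mechanism.KEYVAULT_VM_EXTENSION,
}


def is_suitable(declaration: Declaration, mechanism: Mechanism) -> bool:
    """Return True when the mechanism fits the declaration type."""
    return _SUITABLE[declaration] is mechanism


if __name__ == "__main__":
    # An auto-refreshing mechanism with a thumbprint declaration is the risky combination.
    print(is_suitable(Declaration.THUMBPRINT, Mechanism.KEYVAULT_VM_EXTENSION))
```

A check of this kind could run in a deployment pipeline before a template is submitted, so that the risky combination is rejected before it reaches the cluster.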
141
143
@@ -276,7 +278,7 @@ Next, let's set up the additional resources that are needed to ensure the autoro
276
278
277
279
### Set up the prerequisite resources
278
280
279
-
As mentioned earlier, a certificate that's provisioned as a virtual machine scale set secret is retrieved from the key vault by the Microsoft.Compute Resource Provider service. It does so by using its first-party identity on behalf of the deployment operator. For autorollover, that will change. You'll switch to using a managed identity that's assigned to the virtual machine scale set and that has been granted GET permissions on the secrets in that vault.
281
+
As mentioned earlier, a certificate that's provisioned as a virtual machine scale set secret is retrieved from the key vault by the Microsoft.Compute Resource Provider service. It does so by using its first-party identity on behalf of the deployment operator. That process will change for autorollover. You'll switch to using a managed identity that's assigned to the virtual machine scale set and that has been granted GET permissions on the secrets in that vault.
280
282
281
283
You should deploy the next excerpts at the same time. They're listed individually only for play-by-play analysis and explanation.
282
284
@@ -500,7 +502,7 @@ This indicates to the Key Vault VM extension that, on the first run (after deplo
500
502
501
503
#### Certificate linking, explained
502
504
503
-
You might have noticed the Key Vault VM extension `linkOnRenewal` flag, and the fact that it is set to false. This setting addresses in depth the behavior controlled by this flag and its implications on the functioning of a cluster. This behavior is specific to Windows.
505
+
You might have noticed the Key Vault VM extension `linkOnRenewal` flag, and the fact that it is set to false. This section addresses the behavior controlled by this flag and its implications on the functioning of a cluster. This behavior is specific to Windows.
504
506
505
507
According to its [definition](../virtual-machines/extensions/key-vault-windows.md#extension-schema):
articles/service-fabric/cluster-security-certificates.md
+11 -6 (11 additions, 6 deletions)
@@ -2,8 +2,13 @@
2
2
title: X.509 Certificate-based Authentication in a Service Fabric Cluster
3
3
description: Learn about certificate-based authentication in Service Fabric clusters, and how to detect, mitigate and fix certificate-related problems.
4
4
ms.topic: conceptual
5
-
ms.date: 03/16/2020
5
+
ms.author: tomcassidy
6
+
author: tomvcassidy
7
+
ms.service: service-fabric
8
+
services: service-fabric
9
+
ms.date: 07/11/2022
6
10
---
11
+
7
12
# X.509 Certificate-based authentication in Service Fabric clusters
8
13
9
14
This article complements the introduction to [Service Fabric cluster security](service-fabric-cluster-security.md), and goes into the details of certificate-based authentication in Service Fabric clusters. We assume the reader is familiar with fundamental security concepts, and also with the controls that Service Fabric exposes to control the security of a cluster.
@@ -43,7 +48,7 @@ As alluded to above, the Service Fabric runtime defines two levels of privilege
43
48
The security settings of a Service Fabric cluster describe, in principle, the following aspects:
44
49
- the authentication type; this is a creation-time, immutable characteristic of the cluster. Examples of such settings are 'ClusterCredentialType', 'ServerCredentialType', and allowed values are 'none', 'x509' or 'windows'. This article focuses on the x509-type authentication.
45
50
- the (authentication) validation rules; these settings are set by the cluster owner and describe which credentials shall be accepted for a given role. Examples will be examined in depth immediately below.
46
-
- settings used to tweak or subtly alter the result of authentication; examples here include flags (de-)restricting enforcement of certificate revocation lists etc.
51
+
- settings used to tweak or subtly alter the result of authentication; examples here include flags that restrict or relax enforcement of certificate revocation lists, and so on.
47
52
48
53
> [!NOTE]
49
54
> Cluster configuration examples provided below are excerpts from the cluster manifest in XML format, as the most-digested format which directly supports the Service Fabric functionality described in this article. The same settings can be expressed directly in the JSON representations of a cluster definition, whether a standalone JSON cluster manifest or an Azure Resource Manager template.
@@ -133,7 +138,7 @@ The declarations above correspond to the admin and user identities, respectively
133
138
Tying it all together, upon receiving a request for a connection in a cluster secured with X.509 certificates, the Service Fabric runtime will use the cluster's security settings to validate the credentials of the remote party as described above; if successful, the caller/remote party is considered to be authenticated. If the credential matches multiple validation rules, the runtime will grant the caller the highest-privileged role of any of the matched rules.
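
The role-resolution step described above can be sketched roughly as follows. This is a simplified illustration, not the Service Fabric runtime's implementation: the `Rule`, `Credential`, and `authenticate` names are hypothetical, chain validation and issuer pinning are left out, and the string comparisons are assumptions.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Iterable, Optional


class Role(IntEnum):
    """Privilege levels; a higher value means more privilege."""
    USER = 1
    ADMIN = 2


@dataclass(frozen=True)
class Rule:
    """A validation rule declared either by thumbprint or by common name."""
    role: Role
    thumbprint: Optional[str] = None
    common_name: Optional[str] = None


@dataclass(frozen=True)
class Credential:
    """The fields of the presented certificate that this sketch inspects."""
    thumbprint: str
    common_name: str


def matches(rule: Rule, cred: Credential) -> bool:
    """Check one rule against the credential (assumed comparison semantics)."""
    if rule.thumbprint is not None:
        return rule.thumbprint.lower() == cred.thumbprint.lower()
    if rule.common_name is not None:
        return rule.common_name == cred.common_name
    return False


def authenticate(cred: Credential, rules: Iterable[Rule]) -> Optional[Role]:
    """Return the highest-privileged role among matched rules, or None."""
    matched = [rule.role for rule in rules if matches(rule, cred)]
    return max(matched) if matched else None


if __name__ == "__main__":
    rules = [
        Rule(Role.USER, common_name="client.contoso.example"),
        Rule(Role.ADMIN, thumbprint="AB12CD34"),
    ]
    cred = Credential(thumbprint="ab12cd34", common_name="client.contoso.example")
    print(authenticate(cred, rules))  # both rules match; the admin role wins
```

Modelling roles as an ordered enum keeps the "highest-privileged role" selection to a single `max` call; a credential that matches no rule yields `None`, which corresponds to a failed authentication.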
134
139
135
140
### Presentation rules
136
-
The previous section described how authentication works in a certificate-secured cluster; this section will explain how the Service Fabric runtime itself discovers and loads the certificates it uses for in-cluster communication; we call these the "presentation" rules.
141
+
The previous section described how authentication works in a certificate-secured cluster; this section will explain how the Service Fabric runtime itself discovers and loads the certificates it uses for in-cluster communication; we call these "presentation" rules.
137
142
138
143
As with the validation rules, the presentation rules specify a role and the associated credential declaration, expressed either by thumbprint or common name. Unlike the validation rules, common name-based declarations do not have provisions for issuer pinning; this allows for greater flexibility as well as improved performance. The presentation rules are declared in the 'NodeType' section(s) of the cluster manifest, for each distinct node type; the settings are split from the Security sections of the cluster to allow each node type to have its full configuration in a single section. In Azure Service Fabric clusters, the node type certificate declarations default to their corresponding settings in the Security section of the definition of the cluster.
139
144
@@ -177,7 +182,7 @@ Note that, for common-name based presentation declarations, a certificate is con
It was mentioned previously that the security settings of a Service Fabric cluster also allow for subtly changing the behavior of the authentication code. While the article on [Service Fabric cluster settings](service-fabric-cluster-fabric-settings.md) represents the comprehensive and most up-to-date list of settings, we'll expand on the meaning of a select few of the security settings here, to complete the full exposé on certificate-based authentication. For each setting, we'll explain the intent, default value/behavior, how it affects authentication and which values are acceptable.
179
184
180
-
As mentioned, certificate validation always implies building and evaluating the certificate's chain. For CA-issued certificates, this apparently-simple OS API call typically entails several outbound calls to various endpoints of the issuing PKI, caching of responses and so on. Given the prevalence of certificate validation calls in a Service Fabric cluster, any issues in the PKI's endpoints can result in reduced availability of the cluster, or outright breakdown. While the outbound calls cannot be suppressed (see below in the FAQ section for more on this), the following settings can be used to mask out validation errors caused by failing CRL calls.
185
+
As mentioned, certificate validation always implies building and evaluating the certificate's chain. For CA-issued certificates, this apparently simple OS API call typically entails several outbound calls to various endpoints of the issuing PKI, caching of responses and so on. Given the prevalence of certificate validation calls in a Service Fabric cluster, any issues in the PKI's endpoints can result in reduced availability of the cluster, or outright breakdown. While the outbound calls cannot be suppressed (see below in the FAQ section for more on this), the following settings can be used to mask out validation errors caused by failing CRL calls.
181
186
182
187
* CrlCheckingFlag - under the "Security" section, string converted to UINT. The value of this setting is used by Service Fabric to mask out certificate chain status errors by changing the behavior of chain building; it is passed in to the Win32 CryptoAPI [CertGetCertificateChain](/windows/win32/api/wincrypt/nf-wincrypt-certgetcertificatechain) call as the 'dwFlags' parameter, and can be set to any valid combination of flags accepted by the function. A value of 0 forces the Service Fabric runtime to ignore any trust status errors - this is not recommended, as its use would constitute a significant security exposure. The default value is 0x40000000 (CERT_CHAIN_REVOCATION_CHECK_CHAIN_EXCLUDE_ROOT).
183
188
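
To make the `CrlCheckingFlag` value easier to read, the following sketch decodes it into the revocation-related `CertGetCertificateChain` flag names. The four constants are the values from the Win32 `wincrypt.h` header; the helper names are hypothetical, and the string parsing (accepting decimal or `0x`-prefixed hex) is an assumption about input format rather than a description of how Service Fabric converts the setting.

```python
# Revocation-related dwFlags values accepted by CertGetCertificateChain (wincrypt.h).
CHAIN_FLAGS = {
    0x10000000: "CERT_CHAIN_REVOCATION_CHECK_END_CERT",
    0x20000000: "CERT_CHAIN_REVOCATION_CHECK_CHAIN",
    0x40000000: "CERT_CHAIN_REVOCATION_CHECK_CHAIN_EXCLUDE_ROOT",
    0x80000000: "CERT_CHAIN_REVOCATION_CHECK_CACHE_ONLY",
}


def parse_crl_checking_flag(text: str) -> int:
    """Convert the setting's string form to an unsigned 32-bit value.

    Assumption: the string is decimal or 0x-prefixed hexadecimal.
    """
    value = int(text, 0)
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError(f"not a 32-bit unsigned value: {text!r}")
    return value


def describe(value: int) -> list:
    """List the known revocation flag names that are set in the value."""
    return [name for bit, name in CHAIN_FLAGS.items() if value & bit]


if __name__ == "__main__":
    default = parse_crl_checking_flag("0x40000000")
    print(describe(default))
    print(describe(0))  # no revocation flags set: the discouraged configuration
```

Only the revocation-check bits are decoded here; `dwFlags` accepts other bits as well, which this sketch ignores.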
@@ -263,9 +268,9 @@ Typical symptoms that manifest themselves in a cluster experiencing authenticati
263
268
- connection attempts are rejected
264
269
- connection attempts are timing out
265
270
266
-
Each of the symptoms may be caused by different problems, and the same root cause may show different manifestations; as such, we'll just list a small sample of typical problems, with recommendations for fixing them.
271
+
Each of the symptoms may be caused by different problems, and the same root cause may show different manifestations; as such, we'll just list a small sample of typical problems, with recommendations for fixing them.
267
272
268
-
* Nodes can exchange messages but cannot connect. A possible cause for connection attempts to be terminated is the 'certificate not matched' error - one of the parties in a Service Fabric-to-Service Fabric connections is presenting a certificate which fails the recipient's validation rules. May be accompanied by either of the following errors:
273
+
* Nodes can exchange messages but cannot connect. A possible cause for connection attempts to be terminated is the 'certificate not matched' error - one of the parties in a Service Fabric-to-Service Fabric connection is presenting a certificate which fails the recipient's validation rules. May be accompanied by either of the following errors: