Commit a22e761

Update safe-upgrade-practices.md
updates for blockers
1 parent 4bc3acd commit a22e761

1 file changed

+35
-49
lines changed

articles/operator-service-manager/safe-upgrade-practices.md

Lines changed: 35 additions & 49 deletions
@@ -8,23 +8,23 @@ ms.topic: upgrade-and-migration-article
88
ms.service: azure-operator-service-manager
99
---
1010

11-
# Getting Started with Safe Upgrade Practices
11+
# Get started with safe upgrade practices
1212

13-
This article introduces Azure Operator Service Manager (AOSM) safe upgrade practices (SUP). This feature-set enables an end-user to safely execute complex upgrades of CNF workloads hosted on Azure Operator Nexus, in compliance with partner ISSU requirements, where applicable. Look for future articles in these services to expand on SUP features and capabilities.
13+
This article introduces Azure Operator Service Manager (AOSM) safe upgrade practices (SUP). This feature set enables an end user to safely execute complex upgrades of CNF workloads hosted on Azure Operator Nexus, in compliance with partner ISSU requirements, where applicable. Look for future articles in these services to expand on SUP features and capabilities.
1414

15-
## What Are Safe Upgrade Practices
15+
## Introduction
1616

17-
A given network service supported by Azure Operator Service Manager will be composed of one-to-many container-based network functions (CNFs) which, over time, will require software updates. For each update, it is necessary to run one-to-many helm operations, upgrading dependent network function applications (NfApps), in a particular order, in a manner which least impacts the network service. At Azure Operator Service Manager, Safe Upgrade Practices represents a set of features, which can automate the CNF operations required to update a network service on Azure Operator Nexus.
17+
A network service supported by Azure Operator Service Manager is composed of one or more container-based network functions (CNFs) that, over time, require software updates. Each update requires one or more helm operations that upgrade dependent network function applications (NfApps) in a particular order, in the manner that least impacts the network service. In Azure Operator Service Manager, safe upgrade practices are a set of features that automate the CNF operations required to update a network service on Azure Operator Nexus.
1818

1919
* SNS Reput update - Execute helm upgrade operation across all NfApps in NFDV.
2020
* Nexus Platform - Support SNS reput operations on Nexus platform targets.
2121
* Operation Timeouts - Ability to set operational timeouts for each NfApp operation.
2222
* Synchronous Operations - Ability to run one serial NfApp operation at a time.
23-
* Pause-On-Failure - Based on flag, set failure behavior to rollback only last NfApp operation.
23+
* Pause On Failure - Based on flag, set failure behavior to roll back only the last NfApp operation.
2424
* Single Chart Test Validation - Running a helm test operation after a create or update.
25-
* Refactored SNS Re-put - Improved methods, adds update order and cleanup check.
25+
* Refactored SNS Reput - Improved methods, adds update order and cleanup check.
2626

27-
## Safe Upgrade Practices Overview
27+
## Overview
2828

2929
To update an existing Azure Operator Service Manager site network service (SNS), the Operator executes a reput update request against the deployed SNS resource. Where the SNS contains CNFs with multiple NfApps, the request is fanned out across all NfApps defined in the network function definition version (NFDV). By default, the NfApps are updated in the order in which they appear, or optionally in the order defined by the UpdateDependsOn parameter.
3030

@@ -36,14 +36,15 @@ For each NfApp, the reput update request supports increasing a helm chart versio
3636

3737
To ensure outcomes, NfApp testing is supported using helm, either helm upgrade pre/post tests, or standalone helm tests. For pre/post test failures, the atomic parameter is honored. With atomic/true, the failed chart is rolled back. With atomic/false, no rollback is executed. For standalone helm tests, the rollbackOnTestFailure parameter is honored. With rollbackOnTestFailure/true, the failed chart is rolled back. With rollbackOnTestFailure/false, no rollback is executed.
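The rollback rules above can be summarized as a small decision function. This is only an illustrative sketch: the function name and the `test_kind` labels are hypothetical and are not part of Azure Operator Service Manager or helm. Only the `atomic` and `rollbackOnTestFailure` parameter semantics come from the text.

```python
def should_roll_back(test_kind, test_passed, atomic=False,
                     rollback_on_test_failure=False):
    """Return True if a failed chart would be rolled back.

    test_kind is a hypothetical label used only in this sketch:
      "upgrade_hook" - helm upgrade pre/post tests (atomic is honored)
      "standalone"   - standalone helm tests (rollbackOnTestFailure is honored)
    """
    if test_passed:
        # Nothing failed, so there is nothing to roll back.
        return False
    if test_kind == "upgrade_hook":
        return atomic
    if test_kind == "standalone":
        return rollback_on_test_failure
    raise ValueError(f"unknown test kind: {test_kind!r}")
```

For example, a failed standalone test with `rollback_on_test_failure=False` leaves the chart in place, regardless of the `atomic` setting.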
3838

39-
## Safe Upgrade Practices - Prerequisites
39+
## Prerequisites
40+
4041
When planning for an upgrade using Azure Operator Service Manager, address the following requirements in advance of upgrade execution to optimize the time spent attempting the upgrade.
4142

4243
- Onboard updated artifacts using publisher and/or designer workflows.
4344
- Publisher, store, network service design (NSDg), and network function design group (NFDg) are static and do not need to change.
4445
- A new artifact manifest is needed to store the new charts and images. For more information, see onboarding documentation for details on uploading new charts and images.
4546
- New NFDV and network service design version (NSDV) are needed, under existing NFDg and NSDg.
46-
- We cover basic changes to the NFDV in the step-by-step section.
47+
- We cover basic changes to the NFDV in the step by step section.
4748
- New NSDV is only required if a new configuration group schema (CGS) version is being introduced.
4849
- If necessary, new CGS.
4950
- Required if an upgrade introduces new exposed configuration parameters.
@@ -55,51 +56,52 @@ When planning for an upgrade using Azure Operator Service Manager, address the f
5556
- Update templates to ensure that upgrade parameters are set based on confidence in the upgrade and desired failure behavior.
5657
  - Settings used for production may suppress failure details, while settings used for debugging, or testing, may choose to expose these details.
5758

58-
### Safe Upgrade Practices - Step-by-Step Upgrade Procedure
59+
### Step by step upgrade procedure
5960
Use the following process to trigger an upgrade with Azure Operator Service Manager.
6061

61-
#### Create new NFDV template with higher version.
62-
63-
For new NFDV versions, it must be a valid SemVer format, where only incrementing values of patch and minor versions updates are allowed. A lower NFDV version is not allowed. Given a CNF deployed using NFDV 2.0.0, the new NFDV can be of version 2.0.1, or 2.1.0, but not 1.0.0, or 3.0.0.
64-
65-
#### Update new NFDV Helm parameters to desired target version.
62+
#### Create new NFDV template
63+
The new NFDV version must be in a valid SemVer format, and only a higher patch or minor version is allowed. A lower NFDV version is not allowed. Given a CNF deployed using NFDV 2.0.0, the new NFDV can be version 2.0.1 or 2.1.0, but not 1.0.0 or 3.0.0.
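The version rule can be checked before onboarding with a few lines of Python. This is a sketch of the rule as stated here, not the service's own validation: the function names are hypothetical, only plain `MAJOR.MINOR.PATCH` versions are handled (no pre-release or build metadata), and treating an unchanged version as not allowed is an assumption.

```python
import re

# Plain MAJOR.MINOR.PATCH only; pre-release and build metadata are not handled.
_CORE = re.compile(r"^(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)$")


def parse_version(text):
    """Parse 'MAJOR.MINOR.PATCH' into a tuple of three integers."""
    match = _CORE.match(text)
    if match is None:
        raise ValueError(f"not a plain SemVer version: {text!r}")
    return tuple(int(part) for part in match.groups())


def is_allowed_nfdv_upgrade(deployed, candidate):
    """True if candidate keeps the major version and raises minor or patch."""
    d_major, *d_rest = parse_version(deployed)
    c_major, *c_rest = parse_version(candidate)
    # Lists compare element by element, so [1, 0] > [0, 5].
    return c_major == d_major and c_rest > d_rest
```

With a deployed version of 2.0.0, this accepts 2.0.1 and 2.1.0 and rejects 1.0.0 and 3.0.0, matching the example above.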
6664

65+
#### Update new NFDV parameters
6766
Helm chart versions can be updated, or Helm values can be updated or parameterized as necessary. New NfApps can also be added where they did not exist in the deployed version.
6867

69-
#### Update NFDV for desired NfApp order using UpdateDependsOn
70-
68+
#### Update NFDV for desired NfApp order
7169
UpdateDependsOn is an NFDV parameter used to specify the ordering of NfApps during update operations. If UpdateDependsOn is not provided, the serial ordering of CNF applications, as they appear in the NFDV, is used.
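One way to reason about the resulting order is as a dependency sort that falls back to the declared order. The sketch below is an illustration only: the function name and the dictionary shape (NfApp name mapped to the names that must be updated first) are assumptions, not the NFDV schema or the service's actual algorithm.

```python
def update_order(declared, update_depends_on=None):
    """Return NfApp names in an order that satisfies the dependencies.

    declared: NfApp names in the order they appear in the NFDV.
    update_depends_on: hypothetical mapping of an NfApp name to the names
        that must be updated before it. Missing entries mean no dependency.
    """
    depends = update_depends_on or {}
    remaining = list(declared)
    ordered = []
    while remaining:
        for name in remaining:
            # Pick the first declared NfApp whose dependencies are all done.
            if all(dep in ordered for dep in depends.get(name, [])):
                ordered.append(name)
                remaining.remove(name)
                break
        else:
            # No NfApp could be scheduled: a cycle or an unknown dependency.
            raise ValueError(f"cannot order NfApps: {remaining}")
    return ordered
```

With no dependencies the declared order is returned unchanged, which mirrors the default behavior described above.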
7270

73-
#### Update NFDV roleOverrideValues for desired upgrade behavior.
74-
71+
#### Update NFDV for desired upgrade behavior
7572
Make sure to set any desired CNF application timeouts, the atomic parameter, and rollbackOnTestFailure parameter. It may be useful to change these parameters over time as more confidence is gained in the upgrade.
7673

77-
#### Issue SNS Re-Put
78-
79-
With onboarding complete, the Re-Put operation is submitted. Depending on the number, size and complexity of the NfApps, the reput operation could take some time to complete (multiple hours).
80-
81-
#### Examine Re-Put Results
74+
#### Issue SNS reput
75+
With onboarding complete, the reput operation is submitted. Depending on the number, size and complexity of the NfApps, the reput operation could take some time to complete (multiple hours).
8276

77+
#### Examine reput results
8378
If the reput is reporting a successful result, the upgrade is complete and the user should validate the state and availability of the service. If the reput is reporting a failure, follow the steps in the upgrade failure recovery section to continue.
8479

85-
### Safe Upgrade Practices - Step-by-Step Retry Procedure
86-
80+
### Step by step retry procedure
8781
In cases where a reput update fails, the following process can be followed to retry the operation.
8882

89-
#### In there is failure, diagnose failed NfApp.
90-
83+
#### Diagnose failed NfApp
9184
Resolve the root cause for NfApp failure by analyzing logs and other debugging information.
9285

93-
#### Manually skip completed charts using applicationEnablement parameter.
86+
#### Manually skip completed charts
87+
After fixing the failed NfApp, but before attempting an upgrade retry, consider changing the applicationEnablement parameter to accelerate retry behavior. This parameter can be set to false for an NfApp that should be skipped, which can be useful where an NfApp does not require an upgrade.
9488

95-
After fixing the failed NfApp, but before attempting an upgrade retry, consider changing the applicationEnablement parameter to accelerate retry behavior. This parameter can be set false, where an NfApp should be skipped. This parameter can be useful where an NfApp does not require an upgraded. See the appendix for more information on manipulating the applicationEnablement flag.
89+
#### Issue SNS reput retry (repeat until success)
90+
By default, the reput retries NfApps in the declared update order, unless they are skipped using the applicationEnablement flag.
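The effect of the flag on a retry can be pictured as a filter over the ordered NfApp list. This is a hedged sketch: the function name and the mapping shape are hypothetical, and treating an NfApp with no entry as enabled is an assumption rather than documented service behavior.

```python
def apps_to_retry(ordered_apps, application_enablement=None):
    """Return the NfApps a retry would operate on, in update order.

    ordered_apps: NfApp names in the declared update order.
    application_enablement: hypothetical mapping of NfApp name to
        "Enabled" or "Disabled". Missing entries are assumed enabled.
    """
    enablement = application_enablement or {}
    return [
        name for name in ordered_apps
        if enablement.get(name, "Enabled") != "Disabled"
    ]
```

For example, if the first two of three NfApps already upgraded successfully, marking them as disabled leaves only the third for the retry.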
91+
92+
## How to use applicationEnablement
93+
In the NFDV resource, under deployParametersMappingRuleProfile, there is the property applicationEnablement of type enum, which takes the values Unknown, Enabled, or Disabled. It can be used to exclude NfApp operations during NF deployment.
9694

97-
#### Issue SNS Re-Put retry (repeat until success)
95+
### Publisher changes
96+
For the applicationEnablement property, the publisher has two options: either provide a default value or parameterize it.
9897

99-
By default, the Re-Put retries NfApps in the declared update order, unless they are skipped using applicationEnablement flag.
98+
### Operator changes
99+
Operators specify applicationEnablement as defined by the NFDV. If applicationEnablement for a specific application is parameterized, then it must be passed through the deploymentValues property at runtime.
100100

101-
## Safe Upgrade Practices - Next Steps
101+
## Support for in-service upgrades (ISSU)
102+
Azure Operator Service Manager, where possible, supports in-service upgrades, an upgrade method which advances a deployment version without interrupting the service. However, the ability for a given service to be upgraded without interruption is a feature of the service itself. Consult further with the service publisher to understand the in-service upgrade capabilities.
102103

104+
## Forward looking objectives
103105
Azure Operator Service Manager continues to grow the Safe Upgrade Practice feature set and drive improvements into offered update services. The following features are presently under consideration for future availability:
104106

105107
* Improve Upgrade Options Control - Expose parameters more effectively.
@@ -108,19 +110,3 @@ Azure Operator Service Manager continues to grow the Safe Upgrade Practice featu
108110
* Operate Asynchronously - Ability to run multiple NfApp operations at a time.
109111
* Download Images - Ability to preload images to edge repository.
110112
* Target Charts for Validation - Ability to run a helm test only on a specific NfApp.
111-
112-
## Appendix A - Using applicationEnablement
113-
114-
In the NFDV resource, under deployParametersMappingRuleProfile there is the property applicationEnablement of type enum, which takes values: Unknown, Enabled, or disabled. It can be used to exclude NfApp operations during NF deployment.
115-
116-
### Publisher
117-
118-
For the applicationEnablement property, the publisher has two options: either provide a default value or parameterize it.
119-
120-
### Operator
121-
122-
Operators specify applicationEnablement as defined by the NFDV. If applicationEnablement for specific application is parameterized, then it must be passed through the deploymentValues property at run-time.
123-
124-
## Appendix B - Support for In-Service upgrades (ISSU)
125-
126-
Azure Operator Service Manager, where possible, supports in-service upgrades, an upgrade method which advances a deployment version without interrupting the service. However, the ability for a given service to be upgraded without interruption is a feature of the service itself. Consult further with the service publisher to understand the in-service upgrade capabilities.
