Commit aa37ffe

committed
Content update
1 parent 4ce63ee commit aa37ffe

1 file changed: +6 -7 lines changed

articles/storage/common/tape-migration-guide.md

Lines changed: 6 additions & 7 deletions
@@ -57,8 +57,8 @@ Resources are the most critical part of the tape migration process, and we divid
 Hardware is usually the most challenging part. If we're migrating existing tape generations, hardware is available, but used as part of the existing production. But for older tape generations, hardware is often end-of-life, and it's harder to acquire. With older tape generation, using a tape migration partner is a preferred, and simpler option.
 When production hardware is used for migrations, careful planning is needed to make sure migration doesn't interfere with the production workloads. Here we can apply three different models:
 
-1. Use dedicated hardware for migration: simplest migration model, it is easy to schedule, and plan with no impact to production. It includes additional cost for acquiring the hardware (if not available already), and causes a low hardware utilization post-migration.
-1. Run migration off-hours on production hardware: migration model with no impact to production. Requires complex scheduling, execution, and people working off-hours. Possible only if production hardware is not utilized 24x7.
+1. Use dedicated hardware for migration: simplest migration model, it's easy to schedule, and plan with no impact to production. It adds cost for acquiring the hardware (if not available already), and causes a low hardware utilization post-migration.
+1. Run migration off-hours on production hardware: migration model with no impact to production. Requires complex scheduling, execution, and people working off-hours. Possible only if production hardware isn't utilized 24x7.
 1. Run production, and migration together: least-preferred migration model as it can easily impact production. This model reduces hardware available for production, requires complex scheduling, and planning. If this model is used, processes around reducing impact to production are critical to keep the migration timeline under control. This model is recommended only when production hardware has low utilization.
 
 ### Data transfer options
@@ -80,7 +80,7 @@ Second option is easier, and more commonly used. Tape migration partners have fa
 
 Several partners can perform tape migrations to Azure. The full list of partners can be found on [offline media import](https://azure.microsoft.com/products/databox/offline-media-import/).
 
-Here is a simple flowchart to ease the selection process.
+Here's a simple flowchart to ease the selection process.
 ![Chart showing tape migration selection process](./media/tape-migration-guide/tape-migration-chart.png)
 
 ### Data format
@@ -96,7 +96,7 @@ Main criteria for deciding the format is how do we plan to use the migrated data
 
 ## Migration process
 
-Once we made decisions on migration execution, and prefeered file format, we can start with the migration. Migration goes through several phases.
+Once we made decisions on migration execution, and preferred file format, we can start with the migration. Migration goes through several phases.
 ![Picture showing tape migration phases](./media/tape-migration-guide/tape-migration-steps.png)
 
 ### Information phase
@@ -122,7 +122,7 @@ Information phase is critical for gathering key requirements. Gathered informati
 
 After we gathered basic information, we can prepare for the migration. Preparation phase can include many different steps, but there are some common steps most migrations go through:
 
-1. **Data analysis** provides information on the data that needs to be migrated. Information is critical to estimate how fast data can be read from tapes, and how much parallelism we need to achieve to successfully finish the migration before the deadline. It impacts estimates on the required hardware (libraries, robots, drives). Data analysis is done by sampling multiple tapes that represent the data set to be migrated. Typical information we are looking for is:
+1. **Data analysis** provides information on the data that needs to be migrated. Information is critical to estimate how fast data can be read from tapes, and how much parallelism we need to achieve to successfully finish the migration before the deadline. It impacts estimates on the required hardware (libraries, robots, drives). Data analysis is done by sampling multiple tapes that represent the data set to be migrated. Typical information we're looking for is:
 - file sizes,
 - amount of data stored per tape,
 - number of files per tape,
@@ -141,8 +141,7 @@ After we gathered basic information, we can prepare for the migration. Preparati
 
 ### Migration phase
 
-Once the migration design is final, we start the migration process. Before ramping up to full migration pace, we always perform a test with a smaller sample. Goal for the test is to make sure that end-to-end process works. It allows us to make tweaks, and improve the process. Once the test is successful, we ramp up fully till the migration is done.
-For each file we migrate, we need to perform data validation to make sure that data wasn't corrupted during the migration process. In ideal situation, source data already contains hash values that can be easily compared to hash values post-migration. If hashes don't exist, they must be calculated before the file is migrated. If hashes match, file is marked as migrated. If not, file is discarded, and migrated again. Sometimes the data is corrupted on the source tapes. Having the original hash values helps with catching those rare cases. If they happen, we can read the data from secondary copy if it exists. Data validation process is a critical component for a migration design. Process for handling failed validation must be defined. Migration phase is also constantly monitored to make sure we can react to unpredictable situation, and adapt to it. Regular reporting to main stakeholders is important to keep the migration on track.
+Once the migration design is final, we start the migration process. Before ramping up to full migration pace, we always perform a test with a smaller sample. Goal for the test is to make sure that end-to-end process works. It allows us to make tweaks, and improve the process. Once the test is successful, and we are happy with the results, we execute the migration. For each file we migrate, we need to perform data validation to make sure that data wasn't corrupted during the migration process. In ideal situation, source data already contains hash values that can be easily compared to hash values post-migration. If hashes don't exist, they must be calculated before the file is migrated. If hashes match, file is marked as migrated. If not, file is discarded, and migrated again. Sometimes the data is corrupted on the source tapes. Having the original hash values helps with catching those rare cases. If they happen, we can read the data from secondary copy if it exists. Data validation process is a critical component for a migration design. Process for handling failed validation must be defined. Migration phase is also constantly monitored to make sure we can react to unpredictable situation, and adapt to it. Regular reporting to main stakeholders is important to keep the migration on track.
 
 ### Post-migration phase
 
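The per-file validation described in the migration phase text of this change (use or compute a source hash, compare after migration, mark as migrated on a match, otherwise discard and migrate again) could be sketched roughly as follows. This is only an illustration and not part of the commit: the function names, the choice of SHA-256, the retry limit, and the use of a local file copy as a stand-in for the real tape read and upload are all assumptions.

```python
import hashlib
import shutil
from pathlib import Path
from typing import Optional


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large files do not have to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def migrate_with_validation(
    source_path: str,
    target_path: str,
    source_hash: Optional[str] = None,
    max_attempts: int = 3,  # hypothetical retry limit, not from the guide
) -> bool:
    """Copy a file, then compare hashes; discard and retry on mismatch.

    Returns True if the file can be marked as migrated, False if every
    attempt failed validation and the failure-handling process must take over.
    """
    # If the source has no stored hash, calculate it before migrating.
    if source_hash is None:
        source_hash = sha256_of(source_path)

    for _ in range(max_attempts):
        # Stand-in for the real transfer (tape read plus upload).
        shutil.copyfile(source_path, target_path)
        if sha256_of(target_path) == source_hash:
            return True  # hashes match: mark file as migrated
        # Hashes differ: discard the copy and migrate again.
        Path(target_path).unlink(missing_ok=True)
    return False
```

A stored source hash that never matches, for example because the data is corrupted on the source tape, makes the function return `False` after the last attempt. That is the point where a defined failure-handling process, such as reading from a secondary copy, would take over.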