articles/storage/common/tape-migration-guide.md
Lines changed: 6 additions & 7 deletions
@@ -57,8 +57,8 @@ Resources are the most critical part of the tape migration process, and we divid
Hardware is usually the most challenging part. When we migrate current tape generations, the hardware is available, but it's already in use for production. For older tape generations, hardware is often end-of-life and harder to acquire, so using a tape migration partner is the preferred, simpler option.
When production hardware is used for migrations, careful planning is needed to make sure migration doesn't interfere with the production workloads. Here we can apply three different models:
-
1. Use dedicated hardware for migration: simplest migration model, it is easy to schedule, and plan with no impact to production. It includes additional cost for acquiring the hardware (if not available already), and causes a low hardware utilization post-migration.
-
1. Run migration off-hours on production hardware: migration model with no impact to production. Requires complex scheduling, execution, and people working off-hours. Possible only if production hardware is not utilized 24x7.
+
1. Use dedicated hardware for migration: the simplest migration model. It's easy to schedule and plan, with no impact on production. It adds the cost of acquiring the hardware (if not already available), and leaves that hardware underutilized post-migration.
+
1. Run migration off-hours on production hardware: a migration model with no impact on production. It requires complex scheduling and execution, and people working off-hours. It's possible only if production hardware isn't utilized 24x7.
1. Run production and migration together: the least-preferred migration model, because it can easily impact production. This model reduces the hardware available for production, and requires complex scheduling and planning. If this model is used, processes for reducing the impact on production are critical to keep the migration timeline under control. This model is recommended only when production hardware has low utilization.
### Data transfer options
@@ -80,7 +80,7 @@ Second option is easier, and more commonly used. Tape migration partners have fa
Several partners can perform tape migrations to Azure. The full list of partners can be found on [offline media import](https://azure.microsoft.com/products/databox/offline-media-import/).
-
Here is a simple flowchart to ease the selection process.
+
Here's a simple flowchart to ease the selection process.
@@ -122,7 +122,7 @@ Information phase is critical for gathering key requirements. Gathered informati
After we gather the basic information, we can prepare for the migration. The preparation phase can include many different steps, but most migrations go through some common ones:
-
1.**Data analysis** provides information on the data that needs to be migrated. Information is critical to estimate how fast data can be read from tapes, and how much parallelism we need to achieve to successfully finish the migration before the deadline. It impacts estimates on the required hardware (libraries, robots, drives). Data analysis is done by sampling multiple tapes that represent the data set to be migrated. Typical information we are looking for is:
+
1. **Data analysis** provides information on the data that needs to be migrated. This information is critical for estimating how fast data can be read from tapes, and how much parallelism we need to finish the migration before the deadline. It affects estimates of the required hardware (libraries, robots, drives). Data analysis is done by sampling multiple tapes that represent the data set to be migrated. Typical information we're looking for is:
- file sizes,
- amount of data stored per tape,
- number of files per tape,
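
As a rough illustration of how the data analysis feeds the hardware estimate, the sketch below computes how many tape drives would need to read in parallel to finish before a deadline. It's a minimal sketch under stated assumptions: the function name `drives_needed`, its parameters, and the `efficiency` factor (a stand-in for mount, seek, and small-file overhead) are hypothetical and don't come from the guide. Real throughput figures should come from the sampled tapes.

```python
import math


def drives_needed(total_tb, drive_mb_per_s, deadline_days,
                  hours_per_day=24.0, efficiency=0.5):
    """Estimate how many drives must read in parallel to meet the deadline.

    All parameters are illustrative assumptions:
      total_tb       -- data to migrate, in decimal terabytes
      drive_mb_per_s -- nominal sustained read rate of one drive, in MB/s
      deadline_days  -- days available for the migration
      hours_per_day  -- hours per day the drives can be used for migration
      efficiency     -- fraction of the nominal rate actually achieved
    """
    if min(total_tb, drive_mb_per_s, deadline_days, hours_per_day, efficiency) <= 0:
        raise ValueError("all inputs must be positive")

    total_mb = total_tb * 1_000_000                # 1 TB = 1,000,000 MB (decimal)
    available_seconds = deadline_days * hours_per_day * 3600
    effective_rate = drive_mb_per_s * efficiency   # MB/s per drive after overhead

    # Data one drive can read before the deadline, then round the drive count up.
    per_drive_mb = effective_rate * available_seconds
    return math.ceil(total_mb / per_drive_mb)
```

Under these assumptions, 1,000 TB read at a nominal 100 MB/s per drive with 50% efficiency over 30 days of continuous operation works out to eight drives. Restricting migration to off-hours (a smaller `hours_per_day`) raises the number accordingly.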
@@ -141,8 +141,7 @@ After we gathered basic information, we can prepare for the migration. Preparati
### Migration phase
-
Once the migration design is final, we start the migration process. Before ramping up to full migration pace, we always perform a test with a smaller sample. Goal for the test is to make sure that end-to-end process works. It allows us to make tweaks, and improve the process. Once the test is successful, we ramp up fully till the migration is done.
-
For each file we migrate, we need to perform data validation to make sure that data wasn't corrupted during the migration process. In ideal situation, source data already contains hash values that can be easily compared to hash values post-migration. If hashes don't exist, they must be calculated before the file is migrated. If hashes match, file is marked as migrated. If not, file is discarded, and migrated again. Sometimes the data is corrupted on the source tapes. Having the original hash values helps with catching those rare cases. If they happen, we can read the data from secondary copy if it exists. Data validation process is a critical component for a migration design. Process for handling failed validation must be defined. Migration phase is also constantly monitored to make sure we can react to unpredictable situation, and adapt to it. Regular reporting to main stakeholders is important to keep the migration on track.
+
Once the migration design is final, we start the migration process. Before ramping up to full migration pace, we always perform a test with a smaller sample. The goal of the test is to make sure that the end-to-end process works, and to let us tweak and improve the process. Once the test is successful and we're happy with the results, we execute the migration. For each file we migrate, we need to perform data validation to make sure that the data wasn't corrupted during the migration process. Ideally, the source data already contains hash values that can be compared to hash values calculated post-migration. If hashes don't exist, they must be calculated before the file is migrated. If the hashes match, the file is marked as migrated. If not, the file is discarded and migrated again. Sometimes the data is already corrupted on the source tapes. Having the original hash values helps catch those rare cases. When they happen, we can read the data from a secondary copy if one exists. The data validation process is a critical component of a migration design, and the process for handling failed validation must be defined. The migration phase is also constantly monitored so that we can react and adapt to unpredictable situations. Regular reporting to the main stakeholders is important to keep the migration on track.
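
The per-file validation step described above can be sketched as follows. This is a minimal illustration, not the guide's prescribed method: the guide doesn't name a hash algorithm, so SHA-256 is an assumption, and the function names `file_sha256` and `validate_migrated_file` and the `"migrated"` / `"retry"` status strings are hypothetical.

```python
import hashlib


def file_sha256(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in chunks to bound memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def validate_migrated_file(source_hash, migrated_path):
    """Compare the source hash with the hash of the migrated copy.

    Returns "migrated" when the hashes match, or "retry" when the copy
    should be discarded and migrated again (hypothetical status names).
    """
    migrated_hash = file_sha256(migrated_path)
    return "migrated" if migrated_hash == source_hash.lower() else "retry"
```

In this sketch, a `"retry"` result only signals a mismatch. Deciding whether to reread the tape or fall back to a secondary copy belongs to the failed-validation process that the migration design defines.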