Skip to content

Commit 0134784

Browse files
committed
Acrolins updates
1 parent b2862d6 commit 0134784

File tree

1 file changed

+3
-3
lines changed

1 file changed

+3
-3
lines changed

articles/storage/common/tape-migration-guide.md

Lines changed: 3 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -15,11 +15,11 @@ This article focuses on tape migrations. It aims to simplify, provide guidance,
1515

1616
## Overview
1717

18-
Tape stores a large portion of worlds data, and remains one of the dominant types of storage media. Tape media has existed for decades, and is still heavily used with hundreds of exabytes of new tapes shipped every year.
18+
Tape stores a large portion of worlds data, and remains one of the dominant types of storage media. Tape media exists for decades, and is still heavily used with hundreds of exabytes of new tapes shipped every year.
1919

2020
Tapes are a great medium for storing cold data. They're fast in sequential reading, but stages requiring mechanical movements (like loading, and unloading of tapes, tape seeks, etc.) are slower. That makes tapes unusable for traditional, random based access, and is the main reason that even today data stored on tapes is rarely used. In addition, tapes are a magnetic medium that require special handling. They're sensitive to environment, particularly temperature, and humidity. If kept within their operating environmental range, they can achieve high durability, and good restore success rate. However, when kept in unfriendly environment, deterioration happens often, and renders the tape unreadable.
2121

22-
Large portion of tapes store dark data (data that is created, and stored, but not used for any purpose). Dark data brings no value to the data owner. With the increase in AI capability, and accessibility, the trend is changing. Customers are looking into how dark data can help them to increase efficiency, open new revenue streams, or increase their competitive advantage. To take advantage of dark data, many organizations are considering migrating the data from tapes to cloud storage. Cloud storage provides an easy way to analyze the data, extract business value, and consume services like AI, Machine Learning, Azure Search, etc.
22+
Large portions of tapes store dark data (data that is created, and stored, but not used for any purpose). Dark data brings no value to the data owner. With the increase in AI capability, and accessibility, the trend is changing. Customers are looking into how dark data can help them to increase efficiency, open new revenue streams, or increase their competitive advantage. To take advantage of dark data, many organizations are considering migrating the data from tapes to cloud storage. Cloud storage provides an easy way to analyze the data, extract business value, and consume services like AI, Machine Learning, Azure Search, etc.
2323

2424
Some of the major reasons we're seeing increase in tape to cloud migrations are:
2525

@@ -146,7 +146,7 @@ Once the migration design is final, we start the migration process. Before rampi
146146
![Flowchart that shows migration phase](./media/tape-migration-guide/tape-migration-phase.png)
147147

148148
#### Data validation
149-
For each file we migrate, we need to perform data validation to make sure that data wasn't corrupted during the migration process. Data validation is done by comparing hash values before the migration, and after the migration. There are many types of hashing algorithms that can be used. A common approach is to use MD5 since Azure Storage contains a pre-defined metadata field Content-MD5 that can be filled during the migration. This allows referring to that MD5 value when ever we interact with the data to validate the data hasn't been changed, or corrupted.In ideal situation, source data already contains hash values that can be easily compared to hash values post-migration. If hashes don't exist, they must be calculated before the file is migrated. If hashes match, file is marked as migrated. If not, file is discarded, and migrated again. Sometimes the data is corrupted on the source tapes. Having the original hash values helps with catching those rare cases. If they happen, we can read the data from secondary copy if it exists. Data validation process is a critical component for a migration design. Process for handling failed validation must be defined. Migration phase is also constantly monitored to make sure we can react to unpredictable situation, and adapt to it. Regular reporting to main stakeholders is important to keep the migration on track.
149+
For each file we migrate, we need to perform data validation to make sure that data wasn't corrupted during the migration process. Data validation is done by comparing hash values before the migration, and after the migration. There are many types of hashing algorithms that can be used. A common approach is to use MD5 since Azure Storage contains a pre-defined metadata field Content-MD5 that can be filled during the migration. This approach allows checking the same MD5 value when we access the data to validate the data is not changed, or corrupted. In ideal situation, source data already contains hash values that can be easily compared to hash values post-migration. If hashes don't exist, they must be calculated before the file is migrated. If hashes match, file is marked as migrated. If not, file is discarded, and migrated again. Sometimes the data is corrupted on the source tapes. Having the original hash values helps with catching those rare cases. If they happen, we can read the data from secondary copy if it exists. Data validation process is a critical component for a migration design. Process for handling failed validation must be defined. Migration phase is also constantly monitored to make sure we can react to unpredictable situation, and adapt to it. Regular reporting to main stakeholders is important to keep the migration on track.
150150

151151
### Post-migration phase
152152

0 commit comments

Comments
 (0)