Commit 5bef029

Merge pull request #95984 from xyh1/patch-84
Update storage-blob-rehydration.md
2 parents cd26139 + 509744d commit 5bef029

3 files changed: +36 -27 lines changed

articles/storage/blobs/storage-blob-change-feed.md

Lines changed: 20 additions & 11 deletions
Original file line numberDiff line numberDiff line change
@@ -14,24 +14,21 @@ ms.reviewer: sadodd
1414

1515
The purpose of the change feed is to provide transaction logs of all the changes that occur to the blobs and the blob metadata in your storage account. The change feed provides an **ordered**, **guaranteed**, **durable**, **immutable**, **read-only** log of these changes. Client applications can read these logs at any time, either in streaming or in batch mode. The change feed enables you to build efficient and scalable solutions that process change events that occur in your Blob Storage account at a low cost.
1616

17-
> [!NOTE]
18-
> The change feed is in public preview, and is available in the **westcentralus** and **westus2** regions. See the [conditions](#conditions) section of this article. To enroll in the preview, see the [Register your subscription](#register) section of this article.
19-
2017
The change feed is stored as [blobs](https://docs.microsoft.com/rest/api/storageservices/understanding-block-blobs--append-blobs--and-page-blobs) in a special container in your storage account at standard [blob pricing](https://azure.microsoft.com/pricing/details/storage/blobs/) cost. You can control the retention period of these files based on your requirements (See the [conditions](#conditions) of the current release). Change events are appended to the change feed as records in the [Apache Avro](https://avro.apache.org/docs/1.8.2/spec.html) format specification: a compact, fast, binary format that provides rich data structures with inline schema. This format is widely used in the Hadoop ecosystem, Stream Analytics, and Azure Data Factory.
2118

22-
You can process these logs asynchronously, incrementally or in-full. Any number of client applications can independently read the change feed, in parallel, and at their own pace. Analytics applications such as [Apache Drill](https://drill.apache.org/docs/querying-avro-files/) or [Apache Spark](https://spark.apache.org/docs/latest/sql-data-sources-avro.html) can consume logs directly as Avro files which lets you process them at a low-cost, with high-bandwidth, and without having to write a custom application.
19+
You can process these logs asynchronously, incrementally or in-full. Any number of client applications can independently read the change feed, in parallel, and at their own pace. Analytics applications such as [Apache Drill](https://drill.apache.org/docs/querying-avro-files/) or [Apache Spark](https://spark.apache.org/docs/latest/sql-data-sources-avro.html) can consume logs directly as Avro files, which let you process them at a low cost, with high bandwidth, and without having to write a custom application.
2320

2421
Change feed support is well-suited for scenarios that process data based on objects that have changed. For example, applications can:
2522

26-
- Update a secondary index, synchronize with a cache, search-engine or any other content-management scenarios.
23+
- Update a secondary index, synchronize with a cache, search-engine, or any other content-management scenarios.
2724

2825
- Extract business analytics insights and metrics, based on changes that occur to your objects, either in a streaming manner or batched mode.
2926

30-
- Store, audit and analyze changes to your objects, over any period of time, for security, compliance or intelligence for enterprise data management.
27+
- Store, audit, and analyze changes to your objects, over any period of time, for security, compliance or intelligence for enterprise data management.
3128

32-
- Build solutions to backup, mirror or replicate object state in your account for disaster management or compliance.
29+
- Build solutions to back up, mirror, or replicate object state in your account for disaster management or compliance.
3330

34-
- Build connected application pipelines which react to change events or schedule executions based on created or changed object.
31+
- Build connected application pipelines that react to change events or schedule executions based on created or changed objects.
3532

3633
> [!NOTE]
3734
> [Blob Storage Events](storage-blob-event-overview.md) provides real-time one-time events that enable your Azure Functions or applications to react to changes that occur to a blob. The change feed provides a durable, ordered log model of the changes. Changes are made available in your change feed within a few minutes of the change. If your application has to react to events much more quickly than this, consider using [Blob Storage events](storage-blob-event-overview.md) instead. Blob Storage events enable your Azure Functions or applications to react to individual events in real time.
@@ -50,6 +47,9 @@ Here's a few things to keep in mind when you enable the change feed.
5047

5148
- Only GPv2 and Blob storage accounts can enable Change feed. GPv1 storage accounts, Premium BlockBlobStorage accounts, and hierarchical namespace enabled accounts are not currently supported.
5249

50+
> [!IMPORTANT]
51+
> The change feed is in public preview, and is available in the **westcentralus** and **westus2** regions. See the [conditions](#conditions) section of this article. To enroll in the preview, see the [Register your subscription](#register) section of this article. You must register your subscription before you can enable change feed on your storage accounts.
52+
5353
### [Portal](#tab/azure-portal)
5454

5555
To deploy the template by using Azure portal:
@@ -239,11 +239,11 @@ See [Process change feed logs in Azure Blob Storage](storage-blob-change-feed-ho
239239

240240
- Values in the `storageDiagnostics` property bag are for internal use only and not designed for use by your application. Your applications shouldn't have a contractual dependency on that data. You can safely ignore those properties.
241241

242-
- The time represented by the segment is **approximate** with bounds of 15 minutes. So to ensure consumption of all records within an specified time, consume the consecutive previous and next hour segment.
242+
- The time represented by the segment is **approximate** with bounds of 15 minutes. So to ensure consumption of all records within a specified time, consume the consecutive previous and next hour segment.
243243

244-
- Each segment can have a different number of `chunkFilePaths`. This is due to internal partitioning of the log stream to manage publishing throughput. The log files in each `chunkFilePath` are guaranteed to contain mutually-exclusive blobs, and can be consumed and processed in parallel without violating the ordering of modifications per blob during the iteration.
244+
- Each segment can have a different number of `chunkFilePaths`. This is due to internal partitioning of the log stream to manage publishing throughput. The log files in each `chunkFilePath` are guaranteed to contain mutually exclusive blobs, and can be consumed and processed in parallel without violating the ordering of modifications per blob during the iteration.
245245

246-
- The Segments start out in `Publishing` status. Once the appending of the records to the segment are complete, it will be `Finalized`. Log files in any segment that is dated after the date of the `LastConsumable` property in the `$blobchangefeed/meta/Segments.json` file, should not be consumed by your application. Here's an example of the `LastConsumable`property in a `$blobchangefeed/meta/Segments.json` file:
246+
- The Segments start out in `Publishing` status. Once the appending of the records to the segment is complete, it will be `Finalized`. Log files in any segment that is dated after the date of the `LastConsumable` property in the `$blobchangefeed/meta/Segments.json` file should not be consumed by your application. Here's an example of the `LastConsumable` property in a `$blobchangefeed/meta/Segments.json` file:
247247

248248
```json
249249
{
@@ -297,7 +297,16 @@ This section describes known issues and conditions in the current public preview
297297
- The `url` property of the log file is always empty.
298298
- The `LastConsumable` property of the segments.json file does not list the very first segment that the change feed finalizes. This issue occurs only after the first segment is finalized. All subsequent segments after the first hour are accurately captured in the `LastConsumable` property.
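The segment guidance in this file (hourly segments whose time is approximate, plus the rule that segments dated after `LastConsumable` should not be consumed) can be sketched in Python as follows. This is a minimal illustration, not part of the commit: the function names `parse_utc` and `segments_to_read` are hypothetical, and the sketch assumes that segments are identified by their hourly start time and that `LastConsumable` is an ISO 8601 UTC timestamp.

```python
from datetime import datetime, timedelta

def parse_utc(ts: str) -> datetime:
    """Parse an ISO 8601 timestamp, treating a trailing 'Z' as UTC."""
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))

def segments_to_read(start: datetime, end: datetime,
                     last_consumable: datetime) -> list:
    """Return hourly segment start times to consume for the window [start, end].

    The window is padded by one hour on each side because segment times are
    approximate, and any segment dated after last_consumable is skipped.
    """
    first = start.replace(minute=0, second=0, microsecond=0) - timedelta(hours=1)
    last = end.replace(minute=0, second=0, microsecond=0) + timedelta(hours=1)
    hours = []
    current = first
    while current <= last and current <= last_consumable:
        hours.append(current)
        current += timedelta(hours=1)
    return hours
```

A caller would read `LastConsumable` from `$blobchangefeed/meta/Segments.json`, pass it through `parse_utc`, and then fetch only the segments that `segments_to_read` returns.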
299299

300+
## FAQ
301+
302+
### What is the difference between Change feed and Storage Analytics logging?
303+
Change feed is optimized for application development as only successful blob creation, modification, and deletion events are recorded in the change feed log. Analytics logging records all successful and failed requests across all operations, including read and list operations. By leveraging change feed, you do not have to worry about filtering out the log noise on a transaction-heavy account and can focus only on the blob change events.
304+
305+
### Should I use Change feed or Storage events?
306+
You can leverage both features as change feed and [Blob storage events](storage-blob-event-overview.md) are similar in nature, with the main difference being the latency, ordering, and storage of event records. Change feed writes records to the change feed log in bulk every few minutes while guaranteeing the order of blob change operations. Storage events are pushed in real time and might not be ordered. Change feed events are durably stored inside your storage account while storage events are transient and consumed by the event handler unless you explicitly store them.
307+
300308
## Next steps
301309

302310
- See an example of how to read the change feed by using a .NET client application. See [Process change feed logs in Azure Blob Storage](storage-blob-change-feed-how-to.md).
303311
- Learn about how to react to events in real time. See [Reacting to Blob Storage events](storage-blob-event-overview.md).
312+
- Learn more about detailed logging information for both successful and failed operations for all requests. See [Azure Storage analytics logging](../common/storage-analytics-logging.md).

articles/storage/blobs/storage-blob-rehydration.md

Lines changed: 8 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -5,7 +5,7 @@ services: storage
55
author: mhopkins-msft
66

77
ms.author: mhopkins
8-
ms.date: 08/07/2019
8+
ms.date: 11/14/2019
99
ms.service: storage
1010
ms.subservice: blobs
1111
ms.topic: conceptual
@@ -16,8 +16,8 @@ ms.reviewer: hux
1616

1717
While a blob is in the archive access tier, it's considered offline and can't be read or modified. The blob metadata remains online and available, allowing you to list the blob and its properties. Reading and modifying blob data is only available with online tiers such as hot or cool. There are two options to retrieve and access data stored in the archive access tier.
1818

19-
1. [Rehydrate an archived blob to an online tier](#rehydrate-an-archived-blob-to-an-online-tier) - Rehydrate an archived blob to hot or cool by changing its tier using the [Set Blob Tier](https://docs.microsoft.com/rest/api/storageservices/set-blob-tier) operation.
20-
2. [Copy an archived blob to an online tier](#copy-an-archived-blob-to-an-online-tier) - Create a new copy of an archived blob by using the [Copy Blob](https://docs.microsoft.com/rest/api/storageservices/copy-blob) operation. Specify a different blob name and a destination tier of hot or cool.
19+
1. [Rehydrate an archived blob to an online tier](#rehydrate-an-archived-blob-to-an-online-tier) - Rehydrate an archive blob to hot or cool by changing its tier using the [Set Blob Tier](https://docs.microsoft.com/rest/api/storageservices/set-blob-tier) operation.
20+
2. [Copy an archived blob to an online tier](#copy-an-archived-blob-to-an-online-tier) - Create a new copy of an archive blob by using the [Copy Blob](https://docs.microsoft.com/rest/api/storageservices/copy-blob) operation. Specify a different blob name and a destination tier of hot or cool.
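As a rough illustration of the first option, the following Python sketch assembles the method, URL, and tier-related headers of a Set Blob Tier request. It is not part of the commit. The function name `build_set_blob_tier_request` is hypothetical, and the sketch deliberately omits the authorization, date, and `x-ms-version` headers that a real request needs.

```python
from urllib.parse import quote

ONLINE_TIERS = {"Hot", "Cool"}
REHYDRATE_PRIORITIES = {"Standard", "High"}

def build_set_blob_tier_request(account, container, blob, tier,
                                priority="Standard"):
    """Return (method, url, headers) for rehydrating an archive blob.

    Only the tier-related parts of the request are built here; signing
    and versioning headers are left to the caller.
    """
    if tier not in ONLINE_TIERS:
        raise ValueError("rehydration target must be Hot or Cool")
    if priority not in REHYDRATE_PRIORITIES:
        raise ValueError("priority must be Standard or High")
    url = (f"https://{account}.blob.core.windows.net/"
           f"{quote(container)}/{quote(blob)}?comp=tier")
    headers = {
        "x-ms-access-tier": tier,
        "x-ms-rehydrate-priority": priority,
    }
    return "PUT", url, headers
```

In practice a client library would normally issue this call; the sketch only shows which values the operation carries.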
2121

2222
For more information on tiers, see [Azure Blob storage: hot, cool, and archive access tiers](storage-blob-storage-tiers.md).
2323

@@ -27,17 +27,17 @@ While a blob is in the archive access tier, it's considered offline and can't be
2727

2828
## Copy an archived blob to an online tier
2929

30-
If you don't want to rehydrate a blob, you can choose a [Copy Blob](https://docs.microsoft.com/rest/api/storageservices/copy-blob) operation. Your original blob will remain unmodified in archive while you work on the new blob in the hot or cool tier. You can set the optional *x-ms-rehydrate-priority* property to Standard or High (preview) when using the copy process.
30+
If you don't want to rehydrate your archive blob, you can choose to do a [Copy Blob](https://docs.microsoft.com/rest/api/storageservices/copy-blob) operation. Your original blob will remain unmodified in archive while a new blob is created in the online hot or cool tier for you to work on. In the Copy Blob operation, you may also set the optional *x-ms-rehydrate-priority* property to Standard or High (preview) to specify the priority at which you want your blob copy created.
3131

32-
Archive blobs can only be copied to online destination tiers. Copying an archive blob to another archive blob isn't supported.
32+
Archive blobs can only be copied to online destination tiers within the same storage account. Copying an archive blob to another archive blob is not supported.
3333

34-
Copying a blob from Archive takes time. Behind the scenes, the **Copy Blob** operation temporarily rehydrates your archive source blob to create a new online blob in the destination tier. This new blob is not available until the temporary rehydration from archive is complete and the data is written to the new blob.
34+
Copying a blob from archive can take hours to complete depending on the rehydrate priority selected. Behind the scenes, the **Copy Blob** operation reads your archive source blob to create a new online blob in the selected destination tier. The new blob may be visible when you list blobs, but the data is not available until the read from the source archive blob is complete and data is written to the new online destination blob. The new blob is an independent copy, and any modification or deletion to it does not affect the source archive blob.
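The constraints on copying from archive (an online destination tier, the same storage account, and an optional rehydrate priority) can be sketched as a small header builder. This is an illustrative assumption rather than the service's own validation logic: the function name `build_copy_from_archive_headers` is hypothetical, and authorization and version headers are omitted.

```python
from urllib.parse import urlparse

def build_copy_from_archive_headers(source_url, dest_account, dest_tier,
                                    priority="Standard"):
    """Return the Copy Blob headers for copying an archive blob online.

    Raises ValueError when the request breaks a rule described above:
    the destination must be Hot or Cool and in the source's account.
    """
    dest_host = f"{dest_account.lower()}.blob.core.windows.net"
    if urlparse(source_url).hostname != dest_host:
        raise ValueError("source and destination must share a storage account")
    if dest_tier not in ("Hot", "Cool"):
        raise ValueError("destination tier must be Hot or Cool")
    if priority not in ("Standard", "High"):
        raise ValueError("priority must be Standard or High")
    return {
        "x-ms-copy-source": source_url,
        "x-ms-access-tier": dest_tier,
        "x-ms-rehydrate-priority": priority,
    }
```

The returned headers would accompany a `PUT` to the destination blob URL, which must use a different blob name than the source.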
3535

3636
## Pricing and billing
3737

38-
Rehydrating blobs out of archive into hot or cool tiers are charged as read operations and data retrieval. Using High priority (preview) has higher operation and data retrieval costs compared to standard priority. High-priority rehydration shows up as a separate line item on your bill. If a high-priority request to return an archive blob of a few gigabytes takes over 5 hours, you won't be charged the high-priority retrieval rate. However, standard retrieval rates still apply.
38+
Rehydrating blobs out of archive into hot or cool tiers is charged as read operations and data retrieval. Using High priority (preview) has higher operation and data retrieval costs compared to standard priority. High priority rehydration shows up as a separate line item on your bill. If a high priority request to return an archive blob of a few gigabytes takes over 5 hours, you won't be charged the high priority retrieval rate. However, standard retrieval rates still apply as the rehydration was prioritized over other requests.
3939

40-
Copying blobs from archive into hot or cool tiers are charged as read operations and data retrieval. A write operation is charged for the creation of the new copy. Early deletion fees don't apply when you copy to an online blob because the source blob remains unmodified in the archive tier. High priority charges do apply.
40+
Copying blobs from archive into hot or cool tiers is charged as read operations and data retrieval. A write operation is charged for the creation of the new blob copy. Early deletion fees don't apply when you copy to an online blob because the source blob remains unmodified in the archive tier. High priority retrieval charges do apply if selected.
4141

4242
Blobs in the archive tier should be stored for a minimum of 180 days. Deleting or rehydrating archived blobs before 180 days will incur early deletion fees.
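To make the 180-day minimum concrete, here is a minimal Python sketch that reports how many days remain before a blob can be deleted or rehydrated without an early deletion fee. The function name `early_deletion_days_remaining` is hypothetical, and the sketch assumes the period is counted in whole days from the date the blob entered the archive tier.

```python
from datetime import date

ARCHIVE_MINIMUM_DAYS = 180  # minimum storage period stated above

def early_deletion_days_remaining(archived_on: date, today: date) -> int:
    """Days left until the 180-day archive minimum has been met (0 if met)."""
    days_stored = (today - archived_on).days
    return max(0, ARCHIVE_MINIMUM_DAYS - days_stored)
```

A result of 0 means the minimum period has elapsed and no early deletion fee would be expected under this assumption.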
4343
