|
| 1 | +# Data Exports and Legacy CUR |
| 2 | + |
| 3 | +## Table of Contents |
| 4 | +- [Introduction](#introduction) |
| 5 | +- [Data Exports](#data-exports) |
| 6 | + - [Basic Architecture](#basic-architecture-of-data-exports) |
| 7 | + - [Advanced Architecture](#advanced-architecture-of-data-exports) |
| 8 | +- [Legacy Cost and Usage Report](#legacy-cost-and-usage-report) |
| 9 | +- [FAQ](#faq) |
| 10 | + |
| 11 | +## Introduction |
| 12 | +This document describes AWS Data Exports functionality and Legacy Cost and Usage Reports (CUR) for cost management and analysis purposes. |
| 13 | + |
| 14 | +## Data Exports |
| 15 | + |
| 16 | +For deployment instructions, please refer to the documentation at: https://catalog.workshops.aws/awscid/data-exports |
| 17 | + |
| 18 | +### Basic Architecture of Data Exports |
| 19 | + |
| 20 | + |
| 21 | +1. [AWS Data Exports](https://aws.amazon.com/aws-cost-management/aws-data-exports/) delivers the Cost & Usage Report (CUR2) and other reports daily to an [Amazon S3 bucket](https://aws.amazon.com/s3/) in the Management Account. |
| 22 | +2. An [Amazon S3](https://aws.amazon.com/s3/) replication rule automatically copies the export data to a dedicated S3 bucket in the Data Collection Account. |
| 23 | +3. [Amazon Athena](https://aws.amazon.com/athena/) allows querying data directly from the S3 bucket using an [AWS Glue](https://aws.amazon.com/glue/) table schema definition. |
| 24 | +4. [Amazon QuickSight](https://aws.amazon.com/quicksight/) datasets read from [Amazon Athena](https://aws.amazon.com/athena/). See the Cloud Intelligence Dashboards documentation for more details. |
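
As an illustration of step 3, the following is a minimal sketch of the kind of query Athena could run against the replicated CUR2 data. The database and table names passed in are hypothetical, and the column names (`line_item_usage_account_id`, `line_item_unblended_cost`) and the `billing_period` partition are assumptions about the export's schema; check them against the Glue table actually deployed in your account.

```python
import re


def monthly_cost_query(database: str, table: str, billing_period: str) -> str:
    """Build an Athena SQL string that sums unblended cost per account.

    `database` and `table` are hypothetical names supplied by the caller;
    the column and partition names are assumptions about the CUR2 schema.
    """
    # Only allow simple identifiers and a YYYY-MM period, so the
    # interpolated values cannot change the shape of the query.
    for name in (database, table):
        if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", name):
            raise ValueError(f"invalid identifier: {name!r}")
    if not re.fullmatch(r"\d{4}-\d{2}", billing_period):
        raise ValueError("billing_period must look like YYYY-MM")

    return (
        "SELECT line_item_usage_account_id AS account_id, "
        "SUM(line_item_unblended_cost) AS unblended_cost "
        f'FROM "{database}"."{table}" '
        f"WHERE billing_period = '{billing_period}' "
        "GROUP BY line_item_usage_account_id "
        "ORDER BY unblended_cost DESC"
    )


if __name__ == "__main__":
    print(monthly_cost_query("example_database", "example_cur2", "2024-05"))
```

Filtering on the partition column keeps the scan limited to one billing period, which matters because Athena charges by data scanned.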
| 25 | + |
| 26 | +### Advanced Architecture of Data Exports |
| 27 | +For customers with additional requirements, an enhanced architecture is available: |
| 28 | + |
| 29 | + |
| 30 | + |
| 31 | +1. The [AWS Data Exports](https://aws.amazon.com/aws-cost-management/aws-data-exports/) service delivers the [Cost & Usage Report (CUR2)](https://docs.aws.amazon.com/cur/latest/userguide/what-is-cur.html) daily to an [Amazon S3](https://aws.amazon.com/s3/) bucket in your AWS Account (either the Management/Payer Account or a regular Linked Account). In the us-east-1 region, the CloudFormation template creates native resources; in other regions, it uses AWS Lambda and a Custom Resource to provision Data Exports in us-east-1. |
| 32 | + |
| 33 | +2. [Amazon S3 replication](https://docs.aws.amazon.com/AmazonS3/latest/userguide/replication.html) rules copy Export data to a dedicated Data Collection Account automatically. This replication filters out all metadata and makes the file structure on the S3 bucket compatible with [Amazon Athena](https://aws.amazon.com/athena/) and [AWS Glue](https://aws.amazon.com/glue/) requirements. |
| 34 | + |
| 35 | +3. A [Bucket Policy](https://docs.aws.amazon.com/AmazonS3/latest/userguide/bucket-policies.html) controls which accounts can replicate data to the destination bucket. |
| 36 | + |
| 37 | +4. An [AWS Glue Crawler](https://docs.aws.amazon.com/glue/latest/dg/components-overview.html#crawling-component) runs daily at midnight UTC to update the partitions of the table definition in the [AWS Glue Data Catalog](https://docs.aws.amazon.com/glue/latest/dg/components-overview.html#data-catalog-component). |
| 38 | + |
| 39 | +5. [Amazon QuickSight](https://aws.amazon.com/quicksight/) pulls data from Amazon Athena to its SPICE (Super-fast, Parallel, In-memory Calculation Engine). |
| 40 | + |
| 41 | +6. When collecting data exports for Linked Accounts (not for Management Accounts), you may also want to collect data exports for the Data Collection Account itself. In this case, specify the Data Collection Account as the first entry in the list of Source Accounts. Replication is still required to remove metadata. |
| 42 | + |
| 43 | +7. Athena reads can be affected by concurrent writes. If replicated files arrive while Athena is reading, dataset updates might fail, especially with high volumes of data. In such cases, consider scheduling a temporary disable and re-enable of the Amazon S3 bucket policy that allows replication. Since exports typically arrive three times daily, this temporary deactivation has minimal side effects. |
| 44 | + |
| 45 | +8. Some customers might need to store data exports to secondary destinations for archiving or reporting at a higher organizational level. You can specify a secondary bucket to receive the data in these cases. |
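
Steps 3 and 7 above can be sketched in code. The example below is an assumption-laden illustration, not the policy the CID CloudFormation template actually deploys: the function names, the statement `Sid` values, and the use of account root principals are all hypothetical. It builds a destination bucket policy that lets the listed source accounts replicate objects, and derives a "paused" variant with the replication statement removed, which is one way to model the temporary deactivation described in step 7.

```python
import copy
import re

# Hypothetical Sid used to find the replication statement later.
REPLICATION_SID = "AllowReplicationFromSourceAccounts"


def build_destination_policy(bucket: str, source_account_ids: list[str]) -> dict:
    """Return a bucket policy document allowing replication from source accounts."""
    for account_id in source_account_ids:
        if not re.fullmatch(r"\d{12}", account_id):
            raise ValueError(f"not a 12-digit account ID: {account_id!r}")
    # Assumption: whole accounts are trusted; a real setup would more
    # likely name the specific replication role in each source account.
    principals = [f"arn:aws:iam::{a}:root" for a in source_account_ids]
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": REPLICATION_SID,
                "Effect": "Allow",
                "Principal": {"AWS": principals},
                "Action": ["s3:ReplicateObject", "s3:ReplicateDelete"],
                "Resource": f"arn:aws:s3:::{bucket}/*",
            },
            {
                "Sid": "AllowVersioningChecks",
                "Effect": "Allow",
                "Principal": {"AWS": principals},
                "Action": ["s3:GetBucketVersioning", "s3:PutBucketVersioning"],
                "Resource": f"arn:aws:s3:::{bucket}",
            },
        ],
    }


def pause_replication(policy: dict) -> dict:
    """Return a copy of the policy without the replication statement."""
    paused = copy.deepcopy(policy)
    paused["Statement"] = [
        s for s in paused["Statement"] if s.get("Sid") != REPLICATION_SID
    ]
    return paused


if __name__ == "__main__":
    import json

    active = build_destination_policy("example-destination-bucket", ["111122223333"])
    print(json.dumps(pause_replication(active), indent=2))
```

A scheduler could apply the paused document before a dataset refresh and the full document afterwards; applying either one to the bucket (for example with an S3 `PutBucketPolicy` call) is left out of this sketch.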
| 46 | + |
| 47 | +## Legacy Cost and Usage Report |
| 48 | +Legacy AWS Cost and Usage Reports (Legacy CUR) can still be used for Cloud Intelligence Dashboards and other use cases. |
| 49 | + |
| 50 | +The CID project provides a CloudFormation template for Legacy CUR. Unlike the Data Exports CloudFormation template, it does not provide AWS Glue tables. You can use this template to replicate CUR and aggregate CUR from multiple source accounts (Management or Linked). |
| 51 | + |
| 52 | +## FAQ |
| 53 | + |
| 54 | +### Why replicate data instead of providing cross-account access? |
| 55 | +Cross-account access is possible but can be difficult to maintain, considering the many different roles that require this access, especially when dealing with multiple accounts. |
| 56 | + |
| 57 | +### We only have one AWS Organization. Do we still need this? |
| 58 | +Yes. Throughout an organization's lifecycle, mergers and acquisitions may occur, so this approach prepares you for potential future scenarios. |