Commit 987b250

Update README.md
1 parent bdcd2f3 commit 987b250

1 file changed: +13 -1 lines changed

data-exports/README.md

Lines changed: 13 additions & 1 deletion
@@ -72,4 +72,16 @@ Check code here: [cur-aggregation.yaml](deploy/cur-aggregation.yaml)
Cross-account access is possible but can be difficult to maintain, considering the many different roles that require this access, especially when dealing with multiple accounts.

### We only have one AWS Organization. Do we still need this?
Yes. Throughout an organization's lifecycle, mergers and acquisitions may occur, so this approach prepares you for potential future scenarios.

### Can I use S3 Intelligent-Tiering or S3 Infrequent Access (IA) for my CUR data connected to Athena?
We strongly recommend **against** using S3 IA or other storage tiers for CUR data that is connected to Athena, especially if you have active FinOps users querying this data. Here's why:
- CUDOS typically retrieves data only for the last 7 months, so in theory older data could be moved to S3 IA or managed with Intelligent-Tiering.
- Moving older CUR Parquet files to IA could reduce storage costs by up to 45%.
- **However**, this only saves money if the data isn't frequently accessed. With S3 IA, you're charged $0.01 per GB retrieved.
- Athena uses multiple compute nodes in parallel, and complex queries can multiply data reads dramatically. For every 1 GB of data you want to scan, Athena might perform up to 75 GB of S3 reads.
- If someone runs a query without properly limiting it to specific billing periods, the retrieval costs can be astronomical. For example:
  * Scanning a full CUR of 600 GB: `600 GB × 75 × $0.01/GB` = `$450.00` for just one query!
- Because of this risk of human error, we do not use storage tiering by default and strongly advise against it for CUR data connected to Athena.

If you still want to optimize storage costs, consider replicating your CUR to another bucket and configuring a lifecycle policy to delete older data from the main bucket connected to Athena.
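The retrieval-cost arithmetic from the list above can be sketched in a few lines of Python. This is only an illustration: the $0.01/GB retrieval charge, the "up to 75×" read amplification, and the "up to 45%" storage saving are the figures quoted in this FAQ, while the function names and the $0.023/GB-month S3 Standard price are assumptions added for the comparison (check current pricing for your region).

```python
def ia_retrieval_cost_usd(scanned_gb, read_amplification=75, price_per_gb_usd=0.01):
    """Estimate the worst-case S3 IA retrieval charge for one Athena query.

    scanned_gb: size of the CUR data the query scans, in GB.
    read_amplification: assumed ratio of S3 reads to data scanned
        (the FAQ's "up to 75" upper bound, not a measured value).
    price_per_gb_usd: assumed S3 IA retrieval price per GB.
    """
    return scanned_gb * read_amplification * price_per_gb_usd


def ia_monthly_storage_saving_usd(stored_gb, standard_price_per_gb_usd=0.023,
                                  saving_fraction=0.45):
    """Estimate the best-case monthly storage saving from moving data to IA.

    standard_price_per_gb_usd is an assumed S3 Standard list price;
    saving_fraction is the FAQ's "up to 45%" figure.
    """
    return stored_gb * standard_price_per_gb_usd * saving_fraction


# The FAQ's example: one unfiltered query over a 600 GB CUR,
# compared with the most that tiering the same 600 GB could save per month.
query_cost = ia_retrieval_cost_usd(600)
monthly_saving = ia_monthly_storage_saving_usd(600)
print(f"One unfiltered query: ${query_cost:.2f}")
print(f"Best-case monthly saving: ${monthly_saving:.2f}")
```

Under these assumptions, a single unbounded query costs far more than a month of storage savings on the same data, which is the risk the recommendation is guarding against.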

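The lifecycle-policy suggestion above could look roughly like the following. This is a minimal sketch, not this project's deployment code: the prefix, bucket name, and retention period are hypothetical, and the retention value is only loosely based on the 7-month window mentioned earlier. It builds an S3 lifecycle configuration as a plain dictionary; the boto3 call that would apply it is left in a comment so the snippet runs without AWS credentials.

```python
import json


def build_expiration_rule(prefix, retention_days):
    """Build an S3 lifecycle configuration that deletes objects under
    `prefix` once they are older than `retention_days` (object age)."""
    if retention_days <= 0:
        raise ValueError("retention_days must be positive")
    return {
        "Rules": [
            {
                "ID": f"expire-cur-older-than-{retention_days}-days",
                "Status": "Enabled",
                "Filter": {"Prefix": prefix},
                "Expiration": {"Days": retention_days},
            }
        ]
    }


# Hypothetical values: a "cur/" prefix and roughly 8 months of retention,
# leaving a margin beyond the 7 months the dashboards are said to query.
lifecycle = build_expiration_rule(prefix="cur/", retention_days=245)
print(json.dumps(lifecycle, indent=2))

# To apply it (requires boto3 and credentials; the bucket name is a placeholder):
#   import boto3
#   boto3.client("s3").put_bucket_lifecycle_configuration(
#       Bucket="my-athena-cur-bucket",
#       LifecycleConfiguration=lifecycle,
#   )
```

A rule like this would go only on the bucket Athena reads from, and only once replication to the second bucket is confirmed to be working. Expiration is based on object age rather than billing period, so the retention value should be chosen with some margin.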