This Terraform module exports Azure cost-related data and forwards it to AWS S3. The supported data sets are described below:
- Cost Data: Daily parquet files containing standardized cost and usage details in FOCUS format
- Utilization Data: Daily CSV files with resource usage metrics and recommendations
- Carbon Emissions Data: Monthly JSON reports with carbon footprint metrics across Scope 1 and Scope 3 emissions
> **Note**
> There is currently an issue with publishing Function App code on the Flex Consumption plan using a managed identity. We have had to revert to using the storage account connection string for now. More details can be found here (behind a paywall, sadly).
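For context, the workaround amounts to configuring the Flex Consumption app's deployment storage with key-based (connection-string style) authentication rather than a managed identity. A minimal sketch follows; the resource and setting names here are illustrative, not the module's actual configuration:

```hcl
# Illustrative only: shows the shape of the workaround, not the module's
# real resources. Deployment storage is authenticated with an access key
# instead of "SystemAssignedIdentity" until the publishing issue is fixed.
resource "azurerm_function_app_flex_consumption" "example" {
  name                = "func-cost-export" # hypothetical name
  resource_group_name = azurerm_resource_group.example.name
  location            = azurerm_resource_group.example.location
  service_plan_id     = azurerm_service_plan.example.id

  storage_container_type     = "blobContainer"
  storage_container_endpoint = "${azurerm_storage_account.example.primary_blob_endpoint}deployments"

  # Reverted from "SystemAssignedIdentity" for now (see the note above)
  storage_authentication_type = "StorageAccountConnectionString"
  storage_access_key          = azurerm_storage_account.example.primary_access_key

  runtime_name    = "python" # assumed runtime
  runtime_version = "3.11"

  site_config {}
}
```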
This module creates a fully integrated solution for exporting multiple Azure datasets and forwarding them to AWS S3. The following diagram illustrates the data flow and component architecture for all three export types:
```mermaid
graph TD
    subgraph "Data Sources"
        CMF[Cost Management<br/>FOCUS Export]
        CMU[Cost Management<br/>Utilization Export]
        COA[Carbon Optimization API<br/>Monthly Timer]
    end

    subgraph "Azure Storage"
        SA[Storage Account]
    end

    subgraph "Processing"
        QF[Queue: FOCUS]
        QU[Queue: Utilization]
        FAF[CostExportProcessor<br/>Function App]
        FAU[UtilizationProcessor<br/>Function App]
        FAC[CarbonExporter<br/>Function App]
    end

    subgraph "AWS"
        S3[S3 Bucket]
        APP[Entra ID App<br/>Registration<br/>for Upload Auth]
    end

    %% Data Flow
    CMF -->|Daily Parquet| SA
    CMU -->|Daily CSV.GZ| SA
    COA -->|Monthly Timer| FAC
    SA -->|Blob Event| QF
    SA -->|Blob Event| QU
    QF -->|Trigger| FAF
    QU -->|Trigger| FAU

    %% Upload Flow with App Registration Authentication
    FAF -->|Upload via<br/>App Registration| S3
    FAU -->|Upload via<br/>App Registration| S3
    FAC -->|Upload via<br/>App Registration| S3
    FAF -.->|Uses for Auth| APP
    FAU -.->|Uses for Auth| APP
    FAC -.->|Uses for Auth| APP

    %% Styling
    classDef datasource fill:#4285f4,color:#fff
    classDef storage fill:#4285f4,color:#fff
    classDef queue fill:#00d4aa,color:#fff
    classDef function fill:#4285f4,color:#fff
    classDef aws fill:#ff9900,color:#fff
    classDef auth fill:#28a745,color:#fff
    class CMF,CMU,COA datasource
    class SA storage
    class QF,QU queue
    class FAF,FAU,FAC function
    class S3 aws
    class APP auth
```
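The blob-event routing shown in the diagram is typically implemented with Event Grid subscriptions that deliver `BlobCreated` events to storage queues. A hedged sketch of one such subscription; the subscription name, container path, and queue reference are illustrative, not the module's actual values:

```hcl
# Illustrative sketch: routes BlobCreated events for the FOCUS container
# to the storage queue that triggers the CostExportProcessor function.
resource "azurerm_eventgrid_event_subscription" "focus_blob_created" {
  name  = "focus-blob-created" # hypothetical name
  scope = azurerm_storage_account.example.id

  included_event_types = ["Microsoft.Storage.BlobCreated"]

  subject_filter {
    # Assumed container name; only react to blobs in the FOCUS container
    subject_begins_with = "/blobServices/default/containers/focus/"
  }

  storage_queue_endpoint {
    storage_account_id = azurerm_storage_account.example.id
    queue_name         = azurerm_storage_queue.focus.name
  }
}
```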
The module creates three distinct export pipelines, one per data set:

**Cost Data Pipeline (FOCUS)**

- Daily Export: Cost Management exports daily FOCUS-format cost data (Parquet files) to Azure Storage (see the export sketch after this list)
- Event Trigger: Blob creation events trigger the `CostExportProcessor` function via a storage queue
- Processing: The function processes and transforms the data (removes sensitive columns, restructures paths)
- Upload: Processed data is uploaded to S3 in a partitioned structure: `billing_period=YYYYMMDD/`

**Utilization Data Pipeline**

- Daily Export: Cost Management exports daily usage data (compressed CSV files) to Azure Storage
- Event Trigger: Blob creation events trigger the `UtilizationExportProcessor` function via a storage queue
- Processing: The function processes the CSV.GZ files and transforms the file paths
- Upload: Raw data is uploaded to S3 in a partitioned structure: `billing_period=YYYYMMDD/`

**Carbon Emissions Pipeline**

- Monthly Trigger: The `CarbonEmissionsExporter` function runs monthly on the 20th (timer trigger)
- API Call: The function calls the Azure Carbon Optimization API for the previous month's Scope 1 & 3 emissions
- Processing: Response data is formatted as JSON with date range validation (2024-06-01 to 2025-06-01)
- Upload: JSON data is uploaded to S3 in a partitioned structure: `billing_period=YYYYMMDD/`
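FOCUS exports are defined through the Cost Management exports API, which is presumably why the module requires the `azapi` provider alongside `azurerm`. A simplified sketch of such an export definition; the API version, timeframe, schedule window, and container/folder names are assumptions, not the module's exact configuration (`report_scope` and `focus_dataset_version` are real module inputs):

```hcl
# Illustrative sketch of a daily FOCUS cost export via azapi; the exact
# property values used by the module may differ.
resource "azapi_resource" "focus_export" {
  type      = "Microsoft.CostManagement/exports@2023-07-01-preview"
  name      = "daily-focus-export" # hypothetical name
  parent_id = var.report_scope     # billing account or subscription scope

  body = jsonencode({
    properties = {
      definition = {
        type      = "FocusCost"
        timeframe = "MonthToDate" # assumed timeframe
        dataSet = {
          granularity   = "Daily"
          configuration = { dataVersion = var.focus_dataset_version }
        }
      }
      deliveryInfo = {
        destination = {
          resourceId     = azurerm_storage_account.example.id
          container      = "focus"   # assumed container name
          rootFolderPath = "exports" # assumed folder
        }
      }
      format = "Parquet"
      schedule = {
        status     = "Active"
        recurrence = "Daily"
        recurrencePeriod = {
          from = "2025-01-01T00:00:00Z" # assumed window
          to   = "2026-01-01T00:00:00Z"
        }
      }
    }
  })
}
```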
- Function Apps use managed identity to authenticate with the Entra ID application
- The Entra ID application uses OIDC federation to assume an AWS IAM role (see the sketch after this list)
- All data transfers are secured with cross-cloud federation (no long-lived AWS credentials)
- Application Insights provides telemetry and monitoring for all pipelines
- Private Networking: All components use private endpoints and VNet integration
- Zero Trust: No public network access (except during deployment if `deploy_from_external_network = true`)
- Managed Identity: Azure resources authenticate using system-assigned managed identities
- Cross-Cloud Federation: OIDC federation eliminates the need for long-lived AWS credentials
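On the AWS side, this federation pattern corresponds to an IAM role that trusts the Entra ID tenant as an OIDC identity provider, with the app registration's client ID as the expected audience. A minimal sketch under those assumptions; the role name and the `azure_tenant_id`, `entra_app_client_id`, and `entra_oidc_thumbprint` variables are placeholders, not part of this module's interface:

```hcl
# Illustrative only: an IAM role assumable via AssumeRoleWithWebIdentity
# using tokens issued by Entra ID to the app registration.
resource "aws_iam_openid_connect_provider" "entra" {
  url             = "https://sts.windows.net/${var.azure_tenant_id}/" # Entra ID issuer
  client_id_list  = [var.entra_app_client_id]                         # expected audience
  thumbprint_list = [var.entra_oidc_thumbprint]                       # issuer cert thumbprint
}

resource "aws_iam_role" "azure_uploader" {
  name = "azure-cost-uploader" # hypothetical name

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect    = "Allow"
      Principal = { Federated = aws_iam_openid_connect_provider.entra.arn }
      Action    = "sts:AssumeRoleWithWebIdentity"
      Condition = {
        StringEquals = {
          # Only tokens minted for the app registration may assume the role
          "sts.windows.net/${var.azure_tenant_id}/:aud" = var.entra_app_client_id
        }
      }
    }]
  })
}
```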
This example assumes you have an existing virtual network with two subnets, one of which has a delegation for `Microsoft.App/environments`:
provider "azurerm" {
# These need to be explicitly registered
resource_providers_to_register = ["Microsoft.CostManagementExports", "Microsoft.App"]
features {}
}
module "example" {
source = "git::https://github.com/co-cddo/terraform-azure-focus?ref=<ref>" # TODO: Add commit SHA
aws_account_id = "<aws-account-id>"
report_scope = "/providers/Microsoft.Billing/billingAccounts/<billing-account-id>:<billing-profile-id>_2019-05-31"
subnet_id = "/subscriptions/<subscription-id>/resourceGroups/existing-infra/providers/Microsoft.Network/virtualNetworks/existing-vnet/subnets/default"
function_app_subnet_id = "/subscriptions/<subscription-id>/resourceGroups/existing-infra/providers/Microsoft.Network/virtualNetworks/existing-vnet/subnets/functionapp"
virtual_network_name = "existing-vnet"
virtual_network_resource_group_name = "existing-infra"
resource_group_name = "rg-cost-export"
# Setting to false or omitting this argument assumes that you have private GitHub runners configured in the existing virtual network. It is not recommended to set this to true in production.
deploy_from_external_network = false
}
> **Tip**
> If you don't have a suitable existing virtual network with two subnets (one of which has a delegation to `Microsoft.App/environments`), please refer to the example configuration here, which provisions the prerequisite baseline infrastructure before consuming the module.
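For reference, the delegated subnet looks roughly like this; the subnet name, address range, and parent network references are illustrative:

```hcl
# Illustrative sketch of a subnet delegated to Microsoft.App/environments,
# as required for the function app's VNet integration.
resource "azurerm_subnet" "function_app" {
  name                 = "functionapp" # hypothetical name
  resource_group_name  = azurerm_virtual_network.example.resource_group_name
  virtual_network_name = azurerm_virtual_network.example.name
  address_prefixes     = ["10.0.2.0/23"] # assumed address range

  delegation {
    name = "appenv"
    service_delegation {
      name    = "Microsoft.App/environments"
      actions = ["Microsoft.Network/virtualNetworks/subnets/join/action"]
    }
  }
}
```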
The `terraform-docs` utility is used to generate this README. Follow the steps below to update it:

- Make changes to the `.terraform-docs.yml` file
- Fetch the `terraform-docs` binary (https://terraform-docs.io/user-guide/installation/)
- Run `terraform-docs markdown table --output-file ${PWD}/README.md --output-mode inject .`
| Name | Version |
|------|---------|
| archive | >= 2.0 |
| azapi | >= 1.7.0 |
| azuread | > 2.0 |
| azurerm | > 4.0 |
| null | >= 3.0 |
| random | >= 3.0 |
| time | >= 0.7.0 |
| Name | Description | Type | Default | Required |
|------|-------------|------|---------|:--------:|
| aws_account_id | AWS account ID to use for the S3 bucket | `string` | n/a | yes |
| function_app_subnet_id | ID of the subnet to connect the function app to. This subnet must have delegation configured for Microsoft.App/environments and must be in the same virtual network as the private endpoints | `string` | n/a | yes |
| report_scope | Scope of the cost report, e.g. '/providers/Microsoft.Billing/billingAccounts/00000000-0000-0000-0000-000000000000' | `string` | n/a | yes |
| resource_group_name | Name of the new resource group | `string` | n/a | yes |
| subnet_id | ID of the subnet to deploy the private endpoints to. Must be a subnet in the existing virtual network | `string` | n/a | yes |
| virtual_network_name | Name of the existing virtual network | `string` | n/a | yes |
| virtual_network_resource_group_name | Name of the existing resource group where the virtual network is located | `string` | n/a | yes |
| aws_region | AWS region for the S3 bucket | `string` | `"eu-west-2"` | no |
| aws_s3_bucket_name | Name of the AWS S3 bucket to store cost data | `string` | `"uk-gov-gds-cost-inbound-azure"` | no |
| deploy_from_external_network | If you don't have existing GitHub runners in the same virtual network, set this to true. This will enable 'public' access to the function app during deployment. This is added for convenience and is not recommended in production environments | `bool` | `false` | no |
| focus_dataset_version | Version of the cost and usage details (FOCUS) dataset to use | `string` | `"1.0r2"` | no |
| location | The Azure region where resources will be created | `string` | `"uksouth"` | no |
| Name | Description |
|------|-------------|
| aws_app_client_id | The AWS app client ID |
| backfill_export_names | The names of the backfill FOCUS cost exports for historical data |
| carbon_container_name | The storage container name for carbon data (not used - carbon data goes directly to S3) |
| carbon_export_name | The name of the carbon optimization export (timer-triggered function) |
| focus_container_name | The storage container name for FOCUS cost data |
| focus_export_name | The name of the FOCUS cost export |
| utilization_container_name | The storage container name for utilization data |
| utilization_export_name | The name of the cost utilization export |