co-cddo/terraform-azure-focus

Description

This Terraform module exports Azure cost-related data and forwards it to AWS S3. The supported data sets are described below:

  • Cost Data: Daily parquet files containing standardized cost and usage details in FOCUS format
  • Utilization Data: Daily CSV files with resource usage metrics and recommendations
  • Carbon Emissions Data: Monthly JSON reports with carbon footprint metrics across Scope 1 and Scope 3 emissions

Note

There is currently an issue with publishing Function App code on the Flex Consumption Plan using a managed identity. We have had to revert to using the storage account connection string for now. More details can be found here (behind a paywall, sadly).

Architecture

This module creates a fully integrated solution for exporting multiple Azure datasets and forwarding them to AWS S3. The following diagram illustrates the data flow and component architecture for all three export types:

```mermaid
graph TD
    subgraph "Data Sources"
        CMF[Cost Management<br/>FOCUS Export]
        CMU[Cost Management<br/>Utilization Export]
        COA[Carbon Optimization API<br/>Monthly Timer]
    end

    subgraph "Azure Storage"
        SA[Storage Account]
    end

    subgraph "Processing"
        QF[Queue: FOCUS]
        QU[Queue: Utilization]

        FAF[CostExportProcessor<br/>Function App]
        FAU[UtilizationProcessor<br/>Function App]
        FAC[CarbonExporter<br/>Function App]
    end

    subgraph "AWS"
        S3[S3 Bucket]
        APP[Entra ID App<br/>Registration<br/>for Upload Auth]
    end

    %% Data Flow
    CMF -->|Daily Parquet| SA
    CMU -->|Daily CSV.GZ| SA
    COA -->|Monthly Timer| FAC

    SA -->|Blob Event| QF
    SA -->|Blob Event| QU

    QF -->|Trigger| FAF
    QU -->|Trigger| FAU

    %% Upload Flow with App Registration Authentication
    FAF -->|Upload via<br/>App Registration| S3
    FAU -->|Upload via<br/>App Registration| S3
    FAC -->|Upload via<br/>App Registration| S3

    FAF -.->|Uses for Auth| APP
    FAU -.->|Uses for Auth| APP
    FAC -.->|Uses for Auth| APP

    %% Styling
    classDef datasource fill:#4285f4,color:#fff
    classDef storage fill:#4285f4,color:#fff
    classDef queue fill:#00d4aa,color:#fff
    classDef function fill:#4285f4,color:#fff
    classDef aws fill:#ff9900,color:#fff
    classDef auth fill:#28a745,color:#fff

    class CMF,CMU,COA datasource
    class SA storage
    class QF,QU queue
    class FAF,FAU,FAC function
    class S3 aws
    class APP auth
```

Data Flow

The module creates a distinct export pipeline for each of the three data sets:

FOCUS Cost Data Pipeline

  1. Daily Export: Cost Management exports daily FOCUS-format cost data (Parquet files) to Azure Storage
  2. Event Trigger: Blob creation events trigger the CostExportProcessor function via storage queue
  3. Processing: Function processes and transforms the data (removes sensitive columns, restructures paths)
  4. Upload: Processed data uploaded to S3 in partitioned structure: billing_period=YYYYMMDD/
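The partitioned layout in step 4 can be sketched as follows. This is a hypothetical helper for illustration only, not code from the module's CostExportProcessor function app:

```python
from datetime import date

def partition_prefix(billing_period_start: date) -> str:
    """Build the S3 key prefix for a daily FOCUS export.

    Hypothetical helper illustrating the billing_period=YYYYMMDD/
    layout described above; the actual path restructuring happens
    inside the module's function app code.
    """
    return f"billing_period={billing_period_start:%Y%m%d}/"

# A file exported for the March 2025 billing period would land under:
print(partition_prefix(date(2025, 3, 1)))  # billing_period=20250301/
```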

Utilization Data Pipeline

  1. Daily Export: Cost Management exports daily usage data (compressed CSV files) to Azure Storage
  2. Event Trigger: Blob creation events trigger the UtilizationExportProcessor function via storage queue
  3. Processing: Function processes CSV.GZ files and transforms file paths
  4. Upload: Raw data uploaded to S3 in partitioned structure: billing_period=YYYYMMDD/

Carbon Emissions Pipeline

  1. Monthly Trigger: CarbonEmissionsExporter function runs monthly on the 20th (timer trigger)
  2. API Call: Function calls Azure Carbon Optimization API for previous month's Scope 1 & 3 emissions
  3. Processing: Response data formatted as JSON with date range validation (2024-06-01 to 2025-06-01)
  4. Upload: JSON data uploaded to S3 in partitioned structure: billing_period=YYYYMMDD/
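The previous-month window requested in step 2 can be sketched like this. `previous_month_range` is an illustrative helper, not the module's actual function code:

```python
from datetime import date, timedelta

def previous_month_range(run_date: date) -> tuple[date, date]:
    """Return the first and last day of the month before run_date.

    Illustrative only: approximates the reporting window the
    CarbonEmissionsExporter requests from the Carbon Optimization
    API when its timer fires on the 20th.
    """
    first_of_current = run_date.replace(day=1)
    last_of_previous = first_of_current - timedelta(days=1)
    return last_of_previous.replace(day=1), last_of_previous

# A run on 2025-07-20 covers 2025-06-01 through 2025-06-30.
start, end = previous_month_range(date(2025, 7, 20))
```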

Common Authentication Flow

  • Function Apps use Managed Identity to authenticate with Entra ID Application
  • Entra ID Application uses OIDC federation to assume AWS IAM Role
  • All data transfers secured with cross-cloud federation (no long-lived AWS credentials)
  • Application Insights provides telemetry and monitoring for all pipelines
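The token-for-role exchange can be illustrated with the STS query API. This is a minimal sketch: the role ARN, session name, and token values are placeholders, and the module's function apps perform this exchange internally:

```python
from urllib.parse import urlencode

def assume_role_request_url(role_arn: str, session_name: str, entra_token: str) -> str:
    """Build an AWS STS AssumeRoleWithWebIdentity query-API URL.

    Sketch only: shows how an Entra ID token issued to the app
    registration is presented to STS in place of long-lived AWS keys.
    """
    params = {
        "Action": "AssumeRoleWithWebIdentity",
        "Version": "2011-06-15",
        "RoleArn": role_arn,
        "RoleSessionName": session_name,
        "WebIdentityToken": entra_token,
    }
    return "https://sts.amazonaws.com/?" + urlencode(params)
```

STS validates the token's issuer and audience against the IAM role's OIDC trust policy before returning short-lived credentials, which is what removes the need for stored AWS secrets.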

Security Features

  • Private Networking: All components use private endpoints and VNet integration
  • Zero Trust: No public network access (except during deployment if deploy_from_external_network=true)
  • Managed Identity: Azure resources authenticate using system-assigned managed identities
  • Cross-Cloud Federation: OIDC federation eliminates need for long-lived AWS credentials

Usage

This example assumes you have an existing virtual network with two subnets, one of which has a delegation for Microsoft.App/environments:

```hcl
provider "azurerm" {
  # These need to be explicitly registered
  resource_providers_to_register = ["Microsoft.CostManagementExports", "Microsoft.App"]
  features {}
}

module "example" {
  source                              = "git::https://github.com/co-cddo/terraform-azure-focus?ref=<ref>" # TODO: Add commit SHA

  aws_account_id                      = "<aws-account-id>"
  report_scope                        = "/providers/Microsoft.Billing/billingAccounts/<billing-account-id>:<billing-profile-id>_2019-05-31"
  subnet_id                           = "/subscriptions/<subscription-id>/resourceGroups/existing-infra/providers/Microsoft.Network/virtualNetworks/existing-vnet/subnets/default"
  function_app_subnet_id              = "/subscriptions/<subscription-id>/resourceGroups/existing-infra/providers/Microsoft.Network/virtualNetworks/existing-vnet/subnets/functionapp"
  virtual_network_name                = "existing-vnet"
  virtual_network_resource_group_name = "existing-infra"
  resource_group_name                 = "rg-cost-export"
  # Setting to false or omitting this argument assumes that you have private GitHub runners configured in the existing virtual network. It is not recommended to set this to true in production.
  deploy_from_external_network        = false
}
```

Tip

If you don't have a suitable existing virtual network with two subnets (one of which has a delegation to Microsoft.App/environments), please refer to the example configuration here, which provisions the prerequisite baseline infrastructure before consuming the module.

Update Documentation

The terraform-docs utility is used to generate this README. Follow the steps below to update it:

  1. Make changes to the .terraform-docs.yml file
  2. Fetch the terraform-docs binary (https://terraform-docs.io/user-guide/installation/)
  3. Run `terraform-docs markdown table --output-file ${PWD}/README.md --output-mode inject .`

Providers

| Name | Version |
|------|---------|
| archive | >= 2.0 |
| azapi | >= 1.7.0 |
| azuread | > 2.0 |
| azurerm | > 4.0 |
| null | >= 3.0 |
| random | >= 3.0 |
| time | >= 0.7.0 |

Inputs

| Name | Description | Type | Default | Required |
|------|-------------|------|---------|----------|
| aws_account_id | AWS account ID to use for the S3 bucket | string | n/a | yes |
| function_app_subnet_id | ID of the subnet to connect the function app to. This subnet must have delegation configured for Microsoft.App/environments and must be in the same virtual network as the private endpoints | string | n/a | yes |
| report_scope | Scope of the cost report, e.g. '/providers/Microsoft.Billing/billingAccounts/00000000-0000-0000-0000-000000000000' | string | n/a | yes |
| resource_group_name | Name of the new resource group | string | n/a | yes |
| subnet_id | ID of the subnet to deploy the private endpoints to. Must be a subnet in the existing virtual network | string | n/a | yes |
| virtual_network_name | Name of the existing virtual network | string | n/a | yes |
| virtual_network_resource_group_name | Name of the existing resource group where the virtual network is located | string | n/a | yes |
| aws_region | AWS region for the S3 bucket | string | "eu-west-2" | no |
| aws_s3_bucket_name | Name of the AWS S3 bucket to store cost data | string | "uk-gov-gds-cost-inbound-azure" | no |
| deploy_from_external_network | If you don't have existing GitHub runners in the same virtual network, set this to true to enable 'public' access to the function app during deployment. Added for convenience; not recommended in production environments | bool | false | no |
| focus_dataset_version | Version of the cost and usage details (FOCUS) dataset to use | string | "1.0r2" | no |
| location | The Azure region where resources will be created | string | "uksouth" | no |

Outputs

| Name | Description |
|------|-------------|
| aws_app_client_id | The AWS app client ID |
| backfill_export_names | The names of the backfill FOCUS cost exports for historical data |
| carbon_container_name | The storage container name for carbon data (not used; carbon data goes directly to S3) |
| carbon_export_name | The name of the carbon optimization export (timer-triggered function) |
| focus_container_name | The storage container name for FOCUS cost data |
| focus_export_name | The name of the FOCUS cost export |
| utilization_container_name | The storage container name for utilization data |
| utilization_export_name | The name of the cost utilization export |
