
Update Airflow DAG Cadence #724

@hunterpack

Description

This story aims to reduce Airflow "on-call noise" by optimizing task schedules and sensor timeouts. Recent on-call rotations highlighted several intermittent failures caused by overly sensitive sensors and misaligned DAG start times, particularly regarding the MGI file arrival (15:10 UTC).

Tasks

1. Schedule & Sensor Optimizations

  • Partner Pipeline DAG: Update start time to 15:00 UTC. The file arrives at 15:10 UTC, so polling for two hours before arrival is unnecessary.
  • DBT Stellar Marts DAG: Increase the sensor timeout from 30 minutes, which is too sensitive, to 2 hours 30 minutes.
  • Generate Avro Files Daily: Increase sensor timeout to 6 hours OR shift start time to 16:00 UTC with a 1-hour timeout to account for late upstream data.
  • DBT Enriched Base Tables: Change cadence from 30-minute intervals to hourly to reduce long-running DAG overlaps and resource contention.
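
The timing argument behind the sensor timeout change can be checked with a little arithmetic. The sketch below is illustrative only: it assumes the sensor in the 13:00 UTC dbt_stellar_marts DAG waits on the 15:10 UTC MGI file, and the helper name `slack_after_arrival` is hypothetical. It uses only the Python standard library, not Airflow.

```python
from datetime import datetime, timedelta, timezone

# Arrival time taken from the issue text; the calendar date is arbitrary.
MGI_ARRIVAL = datetime(2024, 1, 1, 15, 10, tzinfo=timezone.utc)


def slack_after_arrival(start: datetime, timeout: timedelta,
                        arrival: datetime = MGI_ARRIVAL) -> timedelta:
    """Return how long the sensor keeps polling after the file arrives.

    A negative result means the sensor times out before the file lands.
    """
    return (start + timeout) - arrival


# Assumption: the sensor starts polling when the DAG starts at 13:00 UTC.
marts_start = datetime(2024, 1, 1, 13, 0, tzinfo=timezone.utc)

current = slack_after_arrival(marts_start, timedelta(minutes=30))
proposed = slack_after_arrival(marts_start, timedelta(hours=2, minutes=30))

# Express both results in minutes for readability.
print(current / timedelta(minutes=1))   # -100.0
print(proposed / timedelta(minutes=1))  # 20.0
```

Under these assumptions the 30-minute timeout expires 100 minutes before the file lands, while the 2 hour 30 minute timeout leaves 20 minutes of slack for a late file.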

2. Dependency Management

  • Verify that dbt_stellar_marts remains at 13:00 UTC so that non-MGI-dependent tasks (Asset Prices, Balances) stay available early for Eastern Time users.
  • Audit individual sensors within the Marts DAG to ensure they correctly gate only MGI-dependent tasks.
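
One way to approach the sensor audit is to walk the task graph and list everything downstream of the MGI sensor. The sketch below is a rough illustration with a hand-written graph: every task name in it is a placeholder, not a real task id from dbt_stellar_marts, and `gated_by` is a hypothetical helper.

```python
from collections import deque

# Hypothetical task graph: task -> list of direct downstream tasks.
# Task names are placeholders, not the real dbt_stellar_marts task ids.
DOWNSTREAM = {
    "wait_for_mgi_file": ["mgi_transactions"],
    "mgi_transactions": ["mgi_report"],
    "mgi_report": [],
    "asset_prices": [],
    "balances": [],
}


def gated_by(sensor: str, graph: dict[str, list[str]]) -> set[str]:
    """Return every task reachable downstream of the sensor (breadth-first)."""
    seen: set[str] = set()
    queue = deque(graph.get(sensor, []))
    while queue:
        task = queue.popleft()
        if task not in seen:
            seen.add(task)
            queue.extend(graph.get(task, []))
    return seen


gated = gated_by("wait_for_mgi_file", DOWNSTREAM)

# Tasks that must not wait on the MGI file, per the issue text.
must_stay_ungated = {"asset_prices", "balances"}
wrongly_gated = gated & must_stay_ungated

print(sorted(gated))          # ['mgi_report', 'mgi_transactions']
print(sorted(wrongly_gated))  # []
```

An empty `wrongly_gated` set means the sensor blocks only MGI-dependent work. In a real DAG, the mapping could probably be built from each task's `downstream_task_ids` rather than written by hand.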

Acceptance Criteria

  • DAG failure alerts for "Sensor Timeout" on MGI/Partner pipelines are significantly reduced during the 13:00–15:00 UTC window.
  • dbt_enriched_base_tables runs successfully on an hourly cadence without intermittent overlaps.
  • Data availability for Eastern Time users (Asset Prices/Balances) is not delayed by the MGI schedule shift.
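
The overlap criterion for dbt_enriched_base_tables could be checked against run history along these lines. This is a minimal sketch, not the team's tooling: the 45-minute run duration is an invented figure for illustration, and both helper names are hypothetical.

```python
from datetime import datetime, timedelta


def scheduled_runs(first_start, interval, duration, count):
    """Build (start, end) pairs for runs of a fixed duration on a fixed cadence."""
    return [
        (first_start + i * interval, first_start + i * interval + duration)
        for i in range(count)
    ]


def overlapping_runs(runs):
    """Return consecutive run pairs where the later run starts before the earlier one ends."""
    ordered = sorted(runs)
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b[0] < a[1]]


day = datetime(2024, 1, 1)
duration = timedelta(minutes=45)  # assumed run time, not taken from the issue

half_hourly = scheduled_runs(day, timedelta(minutes=30), duration, count=4)
hourly = scheduled_runs(day, timedelta(hours=1), duration, count=4)

print(len(overlapping_runs(half_hourly)))  # 3
print(len(overlapping_runs(hourly)))       # 0
```

With a run time between 30 and 60 minutes, every 30-minute run collides with the next one, while hourly runs never do. Real start and end times from the Airflow run history would replace the generated pairs.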

Labels: enhancement
