-
Notifications
You must be signed in to change notification settings - Fork 4.5k
Description
What happened?
Environment:
Beam SDK: 2.66.0 (stable) vs 2.71.0 (problematic)
Runner: Google Cloud Dataflow (Streaming)
Language: Java
Source: KafkaIO
Pipeline Structure
KafkaIO
→ MapElements (KafkaRecordToKV)
→ Window.into(...)
→ GroupByKey
→ Sort
→ BigtableIO read/ write
→ Write to Kafka
Window / Trigger configuration
window
.triggering(
AfterWatermark.pastEndOfWindow()
.withEarlyFirings(
AfterPane.elementCountAtLeast(elementCount))
.withLateFirings(
AfterPane.elementCountAtLeast(lateElementCount)))
.discardingFiredPanes()
.withAllowedLateness(Duration.standardSeconds(allowedLateness));
window duration: 1 sec
allowed lateness: 600 sec
early count: 100
late count: 1
Observed behavior
Beam 2.66
Stable message delay, not a big accumulation at GroupAndSort step. Data Freshness graph shows a stable pattern
Beam 2.71
Delay shows saw-tooth pattern (buffer → flush cycles). Seems accumulate significantly larger panes before flushing
Important observation
No code or traffic change — only Beam version upgrade.
Issue Priority
Priority: 2 (default / most bugs should be filed as P2)
Issue Components
- Component: Python SDK
- Component: Java SDK
- Component: Go SDK
- Component: Typescript SDK
- Component: IO connector
- Component: Beam YAML
- Component: Beam examples
- Component: Beam playground
- Component: Beam katas
- Component: Website
- Component: Infrastructure
- Component: Spark Runner
- Component: Flink Runner
- Component: Samza Runner
- Component: Twister2 Runner
- Component: Hazelcast Jet Runner
- Component: Google Cloud Dataflow Runner