-
Notifications
You must be signed in to change notification settings - Fork 1.4k
Add Kafka ingestion support for subset partitions #17587
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
xiangfu0
commented
Jan 27, 2026
- pinot-kafka-base: Add stream.kafka.partition.ids config and KafkaPartitionSubsetUtils parser
- pinot-kafka-2.0/3.0: Override fetchPartitionCount, fetchPartitionIds, computePartitionGroupMetadata to respect partition subset; validate configured IDs against topic
- Add unit tests for subset parsing and metadata provider (KafkaPartitionSubsetUtilsTest, KafkaPartitionLevelConsumerTest)
- Add stream subset example (subsetPartitions) and Kafka 2.0 README docs
- QuickStart: Add fineFoodReviews-part-0 and fineFoodReviews-part-1 realtime tables, each consuming one partition of fineFoodReviews topic
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR adds support for configuring Kafka ingestion to consume only a subset of topic partitions via the stream.kafka.partition.ids configuration property. This enables multiple tables to share a single Kafka topic by consuming different partitions.
Changes:
- Added
stream.kafka.partition.idsconfiguration property and parsing utilities - Modified Kafka metadata providers (2.0 and 3.0) to validate and respect partition subsets
- Updated instance assignment logic to support non-contiguous partition IDs
- Added comprehensive unit tests and example configurations
Reviewed changes
Copilot reviewed 19 out of 19 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| pinot-kafka-base/KafkaStreamConfigProperties.java | Defines new PARTITION_IDS constant for subset configuration |
| pinot-kafka-base/KafkaPartitionSubsetUtils.java | Implements parsing, validation, and deduplication of partition ID lists |
| pinot-kafka-base/KafkaPartitionSubsetUtilsTest.java | Comprehensive unit tests for partition ID parsing |
| pinot-kafka-base/KafkaPartitionLevelStreamConfig.java | Exposes stream config map for partition subset utilities |
| pinot-kafka-2.0/KafkaStreamMetadataProvider.java | Overrides partition methods to validate and return subset partitions |
| pinot-kafka-3.0/KafkaStreamMetadataProvider.java | Mirrors 2.0 implementation for Kafka 3.0 compatibility |
| pinot-kafka-2.0/KafkaPartitionLevelConsumerTest.java | Tests subset validation and partition count/ID fetching |
| pinot-kafka-2.0/README.md | Documents the partition subset feature |
| InstanceReplicaGroupPartitionSelector.java | Supports explicit partition IDs in instance assignment |
| ImplicitRealtimeTablePartitionSelector.java | Fetches and uses stream partition IDs for instance assignment |
| RealtimeSegmentAssignment.java | Updates segment assignment to handle non-contiguous partition IDs |
| InstanceAssignmentTest.java | Tests single-partition subset with non-zero ID |
| QuickStartBase.java | Adds fineFoodReviews-part-0 and fineFoodReviews-part-1 examples |
| examples/stream/subsetPartitions/* | Example configuration and documentation |
| examples/stream/fineFoodReviews-part-/ | Demo tables consuming single partitions |
...e/pinot/controller/helix/core/assignment/instance/InstanceReplicaGroupPartitionSelector.java
Outdated
Show resolved
Hide resolved
...e/pinot/controller/helix/core/assignment/instance/InstanceReplicaGroupPartitionSelector.java
Outdated
Show resolved
Hide resolved
...ava/org/apache/pinot/controller/helix/core/assignment/segment/RealtimeSegmentAssignment.java
Outdated
Show resolved
Hide resolved
❌ 2 Tests Failed:
View the top 2 failed test(s) by shortest run time
To view more test analytics, go to the Test Analytics Dashboard |
82ba313 to
ab92eca
Compare
- pinot-kafka-base: Add stream.kafka.partition.ids config and KafkaPartitionSubsetUtils parser - pinot-kafka-2.0/3.0: Override fetchPartitionCount, fetchPartitionIds, computePartitionGroupMetadata to respect partition subset; validate configured IDs against topic - Add unit tests for subset parsing and metadata provider (KafkaPartitionSubsetUtilsTest, KafkaPartitionLevelConsumerTest) - Add stream subset example (subsetPartitions) and Kafka 2.0 README docs - QuickStart: Add fineFoodReviews-part-0 and fineFoodReviews-part-1 realtime tables, each consuming one partition of fineFoodReviews topic
ab92eca to
3760d06
Compare