@anshul98ks123
Contributor

@anshul98ks123 anshul98ks123 commented Jan 30, 2026

Issue(s)

Inconsistency between FileFormat enum and PluginManager input format mappings causing protobuf record reader lookup failures

Description

This pull request fixes a discrepancy between two parallel record reader registration mechanisms in Pinot.

Background - Two Registration Systems:

Pinot has two places that maintain record reader class name mappings:

  1. RecordReaderFactory (uses FileFormat enum with UPPERCASE keys)

    • Used by: SegmentIndexCreationDriverImpl, RecordReaderFileConfig, IngestionUtils, CLI tools (CreateSegmentCommand), and most test classes
    • Purpose: Direct RecordReader instantiation via getRecordReader(FileFormat, ...)
  2. PluginManager (uses lowercase string keys like "protobuf")

    • Used by: SegmentGenerationAndPushTaskGenerator, FileIngestionTaskConfigUtils, BatchTableSampler, DeltaTableIngestionTaskExecutor
    • Purpose: Dynamic plugin loading and class name resolution for task configs

The Problem:

| Component | Protobuf Key | Lookup Method |
|---|---|---|
| FileFormat enum | PROTO | N/A (enum constant) |
| PluginManager | "protobuf" | .toLowerCase() |
| RecordReaderFactory | "PROTO" (from enum) | .toUpperCase() |

When code receives the input format as "protobuf" or "PROTOBUF" (following PluginManager's convention) and tries to convert it to FileFormat via FileFormat.valueOf(), the call throws an IllegalArgumentException because FileFormat defined only PROTO as the enum constant for protobuf.

This caused issues in components like BatchTableSampler where the input format string comes from task configs (using PluginManager's "protobuf" convention) but needs to be converted to FileFormat enum.
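
The failure mode can be reproduced in isolation. The following is a minimal sketch, not Pinot's actual code: the nested `FileFormat` enum is a simplified stand-in for the pre-change enum (only a few constants shown), and `tryParse` is a hypothetical helper that upper-cases the input before calling `valueOf`.

```java
import java.util.Locale;

public class FileFormatLookupDemo {
    // Simplified stand-in for the FileFormat enum before this change (assumption: subset of constants).
    enum FileFormat { AVRO, CSV, JSON, PROTO }

    // Hypothetical helper: upper-cases the input and converts it with valueOf.
    // Returns null instead of propagating IllegalArgumentException, so the result is easy to inspect.
    static FileFormat tryParse(String inputFormat) {
        try {
            return FileFormat.valueOf(inputFormat.toUpperCase(Locale.ROOT));
        } catch (IllegalArgumentException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(tryParse("proto"));    // resolves to the PROTO constant
        System.out.println(tryParse("protobuf")); // no PROTOBUF constant exists, so this is null
    }
}
```

Upper-casing alone does not help here: "protobuf" becomes "PROTOBUF", which still has no matching constant.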

Solution:

Added PROTOBUF as an enum constant in FileFormat and registered it in RecordReaderFactory to ensure consistency with PluginManager's naming convention.

Changes:

  • Added PROTOBUF to FileFormat enum
  • Registered PROTOBUF in RecordReaderFactory static initializer mapping to ProtoBufRecordReader and ProtoBufRecordReaderConfig
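
A rough sketch of the shape of these two changes, under stated assumptions: the enum below is reduced to the two relevant constants, the `register` and `readerClassFor` helpers and the map fields are hypothetical names, and the reader and config classes are represented by their simple names as strings rather than Pinot's real fully qualified class names.

```java
import java.util.EnumMap;
import java.util.Locale;
import java.util.Map;

public class RecordReaderRegistrySketch {
    // Stand-in enum: PROTOBUF is added alongside the existing PROTO constant.
    enum FileFormat { PROTO, PROTOBUF }

    // Hypothetical registry fields; declared before the static block so they are initialized first.
    private static final Map<FileFormat, String> READER_CLASS = new EnumMap<>(FileFormat.class);
    private static final Map<FileFormat, String> CONFIG_CLASS = new EnumMap<>(FileFormat.class);

    static void register(FileFormat format, String readerClass, String configClass) {
        READER_CLASS.put(format, readerClass);
        CONFIG_CLASS.put(format, configClass);
    }

    static {
        // Both constants point at the same reader and config classes.
        register(FileFormat.PROTO, "ProtoBufRecordReader", "ProtoBufRecordReaderConfig");
        register(FileFormat.PROTOBUF, "ProtoBufRecordReader", "ProtoBufRecordReaderConfig");
    }

    // Resolves a reader class name from a case-insensitive input format string.
    static String readerClassFor(String inputFormat) {
        return READER_CLASS.get(FileFormat.valueOf(inputFormat.toUpperCase(Locale.ROOT)));
    }

    // Resolves the matching config class name the same way.
    static String configClassFor(String inputFormat) {
        return CONFIG_CLASS.get(FileFormat.valueOf(inputFormat.toUpperCase(Locale.ROOT)));
    }

    public static void main(String[] args) {
        System.out.println(readerClassFor("proto"));
        System.out.println(readerClassFor("protobuf"));
    }
}
```

With both constants registered, the "proto" and "protobuf" spellings resolve to the same reader in this sketch.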

@anshul98ks123 force-pushed the add-protobuf-entry-in-recordreaderfactory branch 2 times, most recently from ebfaac1 to 4921201, January 30, 2026 07:10
@anshul98ks123 force-pushed the add-protobuf-entry-in-recordreaderfactory branch from 4921201 to 8ef26aa, January 30, 2026 07:43
@codecov-commenter
codecov-commenter commented Jan 30, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 63.16%. Comparing base (0a951fa) to head (8ef26aa).

Additional details and impacted files
@@             Coverage Diff              @@
##             master   #17601      +/-   ##
============================================
+ Coverage     63.11%   63.16%   +0.05%     
- Complexity     1478     1479       +1     
============================================
  Files          3173     3173              
  Lines        189915   189918       +3     
  Branches      29064    29064              
============================================
+ Hits         119872   119971      +99     
+ Misses        60711    60631      -80     
+ Partials       9332     9316      -16     
| Flag | Coverage Δ | |
|---|---|---|
| custom-integration1 | 100.00% <ø> (ø) | |
| integration | 100.00% <ø> (ø) | |
| integration1 | 100.00% <ø> (ø) | |
| integration2 | 0.00% <ø> (ø) | |
| java-11 | 63.14% <100.00%> (-36.86%) | ⬇️ |
| java-21 | 63.12% <100.00%> (+<0.01%) | ⬆️ |
| temurin | 63.16% <100.00%> (+0.05%) | ⬆️ |
| unittests | 63.16% <100.00%> (+0.05%) | ⬆️ |
| unittests1 | 55.54% <100.00%> (+0.05%) | ⬆️ |
| unittests2 | 34.02% <100.00%> (+0.01%) | ⬆️ |

@Jackie-Jiang
Contributor

This is not very robust. I'd suggest adding a map from input format to FileFormat enum into PluginManager
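
One possible reading of this suggestion, as a hypothetical sketch only: a single alias map from lower-cased input format strings to FileFormat, so that "protobuf" resolves to the existing PROTO constant without a second enum constant. The class name, the map name, the `resolve` helper, and the reduced enum are all assumptions; the sketch does not show where such a map would live in PluginManager.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

public class InputFormatAliasSketch {
    // Simplified stand-in for FileFormat without a PROTOBUF constant (assumption: subset of constants).
    enum FileFormat { AVRO, CSV, JSON, PROTO }

    // Hypothetical alias map keyed by lower-case input format strings.
    private static final Map<String, FileFormat> INPUT_FORMAT_TO_FILE_FORMAT = new HashMap<>();

    static {
        // Every enum constant is reachable by its own lower-cased name.
        for (FileFormat format : FileFormat.values()) {
            INPUT_FORMAT_TO_FILE_FORMAT.put(format.name().toLowerCase(Locale.ROOT), format);
        }
        // Extra alias following the lowercase "protobuf" convention.
        INPUT_FORMAT_TO_FILE_FORMAT.put("protobuf", FileFormat.PROTO);
    }

    // Case-insensitive lookup; unknown or null inputs yield an empty Optional instead of an exception.
    static Optional<FileFormat> resolve(String inputFormat) {
        if (inputFormat == null) {
            return Optional.empty();
        }
        return Optional.ofNullable(
            INPUT_FORMAT_TO_FILE_FORMAT.get(inputFormat.trim().toLowerCase(Locale.ROOT)));
    }

    public static void main(String[] args) {
        System.out.println(resolve("protobuf"));
        System.out.println(resolve("PROTO"));
        System.out.println(resolve("unknown"));
    }
}
```

The trade-off in this sketch is that aliases are declared in one place and the enum stays free of duplicate constants, at the cost of callers going through the map instead of FileFormat.valueOf().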
