@xiangfu0 xiangfu0 commented Nov 30, 2025

Summary

Add first-class support for querying system tables through the broker, with an SPI/registry for providers and initial built-in tables (system.tables, system.instances).

Motivation

Today, cluster/table/instance introspection largely requires controller APIs. System tables provide a SQL-native way to inspect Pinot metadata and operational stats.

Key Changes

  • Add SystemTableProvider SPI + @SystemTable marker and SystemTableRegistry for discovery/lifecycle management at broker start/stop.
  • Wire the broker to detect system.* references (Calcite AST traversal via SystemTableQueryDetector) and route them to a new SystemTableBrokerRequestHandler.
  • Execute system table queries fully on the broker (single-stage engine) by planning against an in-memory IndexSegment and reducing via BrokerReduceService.
  • Add new broker config pinot.broker.systemtable.executor.pool.size (default: 2) for system table query execution.
  • Add pinot-system-table plugin with initial providers:
    • system.tables: table metadata from broker TableCache + controller /tables/{table}/size stats (segments/totalDocs/reportedSize/estimatedSize), with TTL caching and configurable controller timeout.
    • system.instances: instance metadata/state/tags from controller APIs, with configurable controller timeout.
    • InMemorySystemTableSegment: lightweight in-memory segment used by providers.
  • Allow system.* table names to bypass database-prefix validation in DatabaseUtils.translateTableName so system tables can be queried regardless of database header; access is still governed by broker AccessControl (providers must avoid exposing sensitive data).
  • Extend pinot-java-client:
    • Add PinotAdminClient#getTableSize / PinotTableAdminClient#getTableSize.
    • URL-encode path segments for segment-related endpoints so that names containing special characters (for example, spaces) are handled safely.
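
As a rough illustration of the provider/registry idea in the first bullet, here is a minimal sketch. The names `SketchSystemTableProvider` and `SketchSystemTableRegistry` are hypothetical stand-ins, and case-insensitive lookup is an assumption; the actual SPI in this PR may look different.

```java
import java.util.Locale;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical, simplified stand-in for the SystemTableProvider SPI; the real interface may differ.
interface SketchSystemTableProvider extends AutoCloseable {
  String getTableName();

  // Providers that hold no resources can rely on this no-op default.
  @Override
  default void close() {
  }
}

// Hypothetical registry: maps table names to providers and closes them when the broker stops.
final class SketchSystemTableRegistry {
  private final Map<String, SketchSystemTableProvider> _providers = new ConcurrentHashMap<>();

  // Assumption: table names are matched case-insensitively.
  private static String key(String tableName) {
    return tableName.toLowerCase(Locale.ROOT);
  }

  void register(SketchSystemTableProvider provider) {
    String key = key(provider.getTableName());
    if (_providers.putIfAbsent(key, provider) != null) {
      throw new IllegalStateException("Duplicate system table provider: " + key);
    }
  }

  boolean isRegistered(String tableName) {
    return _providers.containsKey(key(tableName));
  }

  // Called on broker stop: close every provider, then forget them.
  void close() {
    for (SketchSystemTableProvider provider : _providers.values()) {
      provider.close();
    }
    _providers.clear();
  }
}
```

Rejecting duplicate registrations up front keeps a second plugin from silently shadowing a built-in table.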

Limitations

  • System tables are only supported in the single-stage engine. If useMultistageEngine=true, the broker returns a validation error instructing users to disable it.
  • Joins are not supported for system table queries (single table reference enforced).
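
The two limitations above amount to a pre-execution check. The following is a sketch under stated assumptions, not the broker's actual code: the class name, the method name and the error messages are hypothetical, and only the `useMultistageEngine` option name comes from the description.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;

final class SystemTableQueryValidationSketch {
  private SystemTableQueryValidationSketch() {
  }

  // Returns an error message if the query cannot run on the system table path, or empty if it can.
  static Optional<String> validate(List<String> referencedTables, Map<String, String> queryOptions) {
    if (Boolean.parseBoolean(queryOptions.getOrDefault("useMultistageEngine", "false"))) {
      return Optional.of("System tables are only supported in the single-stage engine; "
          + "disable useMultistageEngine");
    }
    if (referencedTables.size() != 1) {
      return Optional.of("System table queries must reference exactly one table; joins are not supported");
    }
    return Optional.empty();
  }
}
```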

Testing

  • Unit tests: SystemTableQueryDetectorTest, SystemTableBrokerRequestHandlerTest, SystemTableRegistryTest, TablesSystemTableProviderTest, DatabaseUtilsTest.
  • Integration test: SystemTableIntegrationTest validates system.tables size/totalDocs fields against controller size + segment metadata.

Example Queries

SELECT tableName, type, segments, totalDocs, reportedSize, estimatedSize FROM system.tables ORDER BY tableName
SELECT instanceId, type, state, tags FROM system.instances ORDER BY instanceId
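
For readers who want to try these from code, here is a sketch that builds the HTTP request for the broker's SQL endpoint using only the JDK. The class name is hypothetical, and the broker address used in the usage note is an assumption about a local setup. The JSON escaping covers only backslashes and double quotes, so it suits single-line queries like the ones above.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;

final class SystemTableQueryRequestSketch {
  private SystemTableQueryRequestSketch() {
  }

  // Wraps the SQL text in a {"sql": "..."} JSON body, escaping backslashes and double quotes only.
  static String toJsonPayload(String sql) {
    String escaped = sql.replace("\\", "\\\\").replace("\"", "\\\"");
    return "{\"sql\":\"" + escaped + "\"}";
  }

  // Builds (but does not send) a POST request to the broker's /query/sql endpoint.
  static HttpRequest buildRequest(String brokerBaseUrl, String sql) {
    return HttpRequest.newBuilder(URI.create(brokerBaseUrl + "/query/sql"))
        .header("Content-Type", "application/json")
        .POST(HttpRequest.BodyPublishers.ofString(toJsonPayload(sql), StandardCharsets.UTF_8))
        .build();
  }
}
```

The request can then be sent with `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())`, for example against `http://localhost:8099` if a broker is listening there.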

Release Notes

  • New config: pinot.broker.systemtable.executor.pool.size
  • New plugin: pinot-system-table
  • New Java client APIs: PinotAdminClient#getTableSize, PinotTableAdminClient#getTableSize

Copilot AI left a comment

Pull request overview

This PR introduces a system table SPI and infrastructure to expose virtual metadata tables (like system.tables and system.instances) queryable via standard SQL through the broker. It includes:

  • Core SPI classes for system tables (request, response, filter, provider, config utils) in pinot-spi
  • A registry and provider implementations for system.tables (with table size/metadata fetching) and system.instances (stub)
  • Broker wiring to handle system table queries with projection, filtering, and offset/limit support
  • Database utility updates to allow system tables to bypass database prefix validation
  • Admin client enhancement to fetch table sizes from the controller
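
The database-prefix bypass mentioned above can be pictured with a small sketch. This is a simplified, hypothetical version of the translation logic, not the code in `DatabaseUtils`: the prefix constant, the exact-case prefix match and the mismatch error text are all assumptions.

```java
final class TableNameTranslationSketch {
  // Assumed prefix; the PR describes the bypass as applying to system.* table names.
  private static final String SYSTEM_TABLE_PREFIX = "system.";

  private TableNameTranslationSketch() {
  }

  static String translateTableName(String tableName, String databaseName) {
    // System tables skip database-prefix validation regardless of the database header.
    if (tableName.startsWith(SYSTEM_TABLE_PREFIX)) {
      return tableName;
    }
    boolean hasDatabase = databaseName != null && !databaseName.isEmpty();
    String[] parts = tableName.split("\\.");
    if (parts.length > 2) {
      throw new IllegalArgumentException("Table name '" + tableName + "' contains more than one '.'");
    }
    if (parts.length == 2) {
      // A fully qualified name must agree with the database header when one is present.
      if (hasDatabase && !databaseName.equals(parts[0])) {
        throw new IllegalArgumentException("Database '" + parts[0] + "' in table name does not match '"
            + databaseName + "' from the request header");
      }
      return tableName;
    }
    return hasDatabase ? databaseName + "." + tableName : tableName;
  }
}
```

Because the bypass only skips name validation, authorization still has to happen separately, which matches the description's note that access remains governed by broker AccessControl.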

Reviewed changes

Copilot reviewed 20 out of 20 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
pinot-spi/src/main/java/org/apache/pinot/spi/systemtable/*.java Introduces system table SPI classes (Request, Response, Filter, Provider, ConfigUtils)
pinot-common/src/main/java/org/apache/pinot/common/systemtable/SystemTableRegistry.java Adds registry to manage system table providers
pinot-common/src/main/java/org/apache/pinot/common/systemtable/provider/*.java Implements TablesSystemTableProvider and InstancesSystemTableProvider
pinot-common/src/main/java/org/apache/pinot/common/utils/DatabaseUtils.java Adds bypass logic for system tables to skip database prefix validation
pinot-broker/src/main/java/org/apache/pinot/broker/requesthandler/*.java Updates request handlers to route system table queries through new system table path
pinot-broker/src/main/java/org/apache/pinot/broker/broker/helix/BaseBrokerStarter.java Wires system table providers into broker startup
pinot-clients/pinot-java-client/src/main/java/org/apache/pinot/client/admin/PinotTableAdminClient.java Adds getTableSize method to fetch table size from controller
pinot-broker/pom.xml Adds pinot-java-client dependency to support admin client usage
pinot-integration-tests/src/test/java/org/apache/pinot/integration/tests/SystemTableIntegrationTest.java Adds integration test for system.tables querying
pinot-common/src/test/java/org/apache/pinot/common/systemtable/provider/TablesSystemTableProviderTest.java Adds unit test for TablesSystemTableProvider
pinot-broker/src/test/java/org/apache/pinot/broker/requesthandler/SystemTableBrokerRequestHandlerTest.java Adds unit test for system table request handling
pinot-broker/src/test/java/org/apache/pinot/broker/requesthandler/*.java Updates existing tests to inject SystemTableRegistry

codecov-commenter commented Nov 30, 2025

Codecov Report

❌ Patch coverage is 34.16974% with 892 lines in your changes missing coverage. Please review.
✅ Project coverage is 62.99%. Comparing base (9ef03d0) to head (e514a52).
⚠️ Report is 7 commits behind head on master.

Files with missing lines Patch % Lines
...ystemtable/provider/TablesSystemTableProvider.java 39.14% 160 Missing and 39 partials ⚠️
...equesthandler/SystemTableBrokerRequestHandler.java 25.94% 169 Missing and 8 partials ⚠️
...emtable/provider/InstancesSystemTableProvider.java 11.70% 163 Missing and 3 partials ⚠️
...emtable/datasource/InMemorySystemTableSegment.java 32.79% 118 Missing and 7 partials ⚠️
...roker/requesthandler/SystemTableQueryDetector.java 43.66% 28 Missing and 12 partials ⚠️
...ble/provider/BrokerRuntimeSystemTableProvider.java 71.27% 23 Missing and 4 partials ⚠️
...not/systemtable/provider/HelixControllerUtils.java 0.00% 27 Missing ⚠️
...pache/pinot/client/SystemTableDataTableClient.java 39.53% 24 Missing and 2 partials ⚠️
...he/pinot/client/admin/PinotSegmentAdminClient.java 0.00% 26 Missing ⚠️
...t/common/systemtable/SystemTableResponseUtils.java 0.00% 26 Missing ⚠️
... and 6 more
Additional details and impacted files
@@             Coverage Diff              @@
##             master   #17291      +/-   ##
============================================
- Coverage     63.19%   62.99%   -0.21%     
- Complexity     1480     1541      +61     
============================================
  Files          3173     3186      +13     
  Lines        189896   191259    +1363     
  Branches      29057    29280     +223     
============================================
+ Hits         120011   120482     +471     
- Misses        60558    61372     +814     
- Partials       9327     9405      +78     
Flag Coverage Δ
custom-integration1 100.00% <ø> (ø)
integration 100.00% <ø> (ø)
integration1 100.00% <ø> (ø)
integration2 0.00% <ø> (ø)
java-11 62.95% <34.16%> (-0.23%) ⬇️
java-21 62.95% <34.16%> (-0.18%) ⬇️
temurin 62.99% <34.16%> (-0.21%) ⬇️
unittests 62.99% <34.16%> (-0.21%) ⬇️
unittests1 55.51% <28.82%> (-0.03%) ⬇️
unittests2 35.26% <33.72%> (+1.21%) ⬆️

@xiangfu0 xiangfu0 requested a review from Copilot December 1, 2025 00:36
Copilot AI left a comment

Pull request overview

Copilot reviewed 20 out of 20 changed files in this pull request and generated 6 comments.

@xiangfu0 xiangfu0 force-pushed the system-table branch 4 times, most recently from 4ed0ef7 to c50b6f3 Compare December 6, 2025 23:16
@xiangfu0 xiangfu0 force-pushed the system-table branch 2 times, most recently from 0b3775f to 32d43a8 Compare December 14, 2025 23:17
@xiangfu0 xiangfu0 requested a review from Copilot December 14, 2025 23:18
Copilot AI left a comment

Pull request overview

Copilot reviewed 20 out of 20 changed files in this pull request and generated 10 comments.

Copilot AI left a comment

Pull request overview

Copilot reviewed 20 out of 20 changed files in this pull request and generated 5 comments.

Comments suppressed due to low confidence (1)

pinot-common/src/main/java/org/apache/pinot/common/systemtable/provider/TablesSystemTableProvider.java:1

  • The initialCapacity logic is incorrect when hasLimit is false. Setting capacity to 0 when no limit is specified will cause ArrayList to grow dynamically, potentially causing multiple resizes. Consider using sortedTableNames.size() as the initial capacity when no limit is specified.
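
One way to act on this suggestion is sketched below. The helper names and the `offset` parameter are hypothetical; the idea is simply to size the list to the number of rows that will actually be produced, which equals `sortedTableNames.size()` when there is no limit and no offset.

```java
import java.util.ArrayList;
import java.util.List;

final class InitialCapacitySketch {
  private InitialCapacitySketch() {
  }

  // Number of rows that will be returned: everything after the offset, capped by the limit if present.
  static int initialCapacity(int candidateCount, boolean hasLimit, int offset, int limit) {
    int remaining = Math.max(0, candidateCount - Math.max(0, offset));
    return hasLimit ? Math.min(remaining, Math.max(0, limit)) : remaining;
  }

  // Copies the requested page into a list that never needs to resize.
  static List<String> page(List<String> sortedTableNames, boolean hasLimit, int offset, int limit) {
    int capacity = initialCapacity(sortedTableNames.size(), hasLimit, offset, limit);
    int start = Math.min(Math.max(0, offset), sortedTableNames.size());
    List<String> rows = new ArrayList<>(capacity);
    rows.addAll(sortedTableNames.subList(start, start + capacity));
    return rows;
  }
}
```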

Copilot AI left a comment

Pull request overview

Copilot reviewed 20 out of 20 changed files in this pull request and generated 3 comments.

@xiangfu0 xiangfu0 marked this pull request as ready for review December 17, 2025 12:00
@xiangfu0 xiangfu0 force-pushed the system-table branch 6 times, most recently from c39949d to 1811dfa Compare December 20, 2025 21:04
Copilot AI left a comment

Pull request overview

Copilot reviewed 24 out of 25 changed files in this pull request and generated 4 comments.

Comments suppressed due to low confidence (1)

pinot-common/src/main/java/org/apache/pinot/common/systemtable/provider/TablesSystemTableProvider.java:1

  • Converting arbitrary objects to bytes via toString().getBytes() may not produce meaningful results for complex objects. Consider documenting the expected input types or adding validation to ensure only appropriate types are converted to bytes.
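
A sketch of the validation this comment asks for follows. The class name and the set of accepted input types are assumptions for illustration; the provider in this PR may accept a different set.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

final class BytesConversionSketch {
  private BytesConversionSketch() {
  }

  // Converts only types with an unambiguous byte representation; everything else is rejected.
  static byte[] toBytes(Object value) {
    if (value == null) {
      return new byte[0];
    }
    if (value instanceof byte[]) {
      return (byte[]) value;
    }
    if (value instanceof ByteBuffer) {
      // Duplicate so the caller's buffer position is left untouched.
      ByteBuffer copy = ((ByteBuffer) value).duplicate();
      byte[] out = new byte[copy.remaining()];
      copy.get(out);
      return out;
    }
    if (value instanceof CharSequence) {
      return value.toString().getBytes(StandardCharsets.UTF_8);
    }
    throw new IllegalArgumentException("Unsupported type for a BYTES value: " + value.getClass().getName());
  }
}
```

Failing fast on unexpected types avoids storing the default `Object.toString()` output, which carries no useful information.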

@xiangfu0 xiangfu0 force-pushed the system-table branch 3 times, most recently from df73191 to 5bab828 Compare December 28, 2025 15:07
@xiangfu0 xiangfu0 requested a review from Copilot December 28, 2025 15:09
Copilot AI left a comment

Pull request overview

Copilot reviewed 30 out of 31 changed files in this pull request and generated 2 comments.

Copilot AI left a comment

Pull request overview

Copilot reviewed 30 out of 31 changed files in this pull request and generated 3 comments.

@xiangfu0 xiangfu0 requested a review from Copilot December 29, 2025 11:21
Copilot AI left a comment

Pull request overview

Copilot reviewed 30 out of 31 changed files in this pull request and generated 1 comment.

@xiangfu0 xiangfu0 force-pushed the system-table branch 2 times, most recently from 5540416 to 70a02ab Compare December 29, 2025 16:37
@xiangfu0 xiangfu0 requested a review from Copilot December 29, 2025 16:38
Copilot AI left a comment

Pull request overview

Copilot reviewed 35 out of 36 changed files in this pull request and generated 5 comments.

Comments suppressed due to low confidence (1)

pinot-plugins/pinot-system-table/src/main/java/org/apache/pinot/systemtable/provider/TablesSystemTableProvider.java:68

  • The cache TTL configuration is read once at class load time, making it impossible to update without restarting the broker. Consider implementing dynamic configuration reloading to allow runtime adjustments of cache behavior.
  private static final long SIZE_CACHE_TTL_MS = getNonNegativeLongProperty(SIZE_CACHE_TTL_MS_PROPERTY,
      DEFAULT_SIZE_CACHE_TTL_MS);
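
If runtime-adjustable TTL is wanted, one option is to read the TTL on every lookup instead of capturing it in a static final field. The sketch below is hypothetical and not the provider's cache: `TtlCacheSketch`, the injected clock and the rule that a TTL of 0 disables caching are all assumptions.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.LongSupplier;

final class TtlCacheSketch<K, V> {
  private static final class Entry<V> {
    final V _value;
    final long _loadedAtMs;

    Entry(V value, long loadedAtMs) {
      _value = value;
      _loadedAtMs = loadedAtMs;
    }
  }

  private final Map<K, Entry<V>> _entries = new ConcurrentHashMap<>();
  private final LongSupplier _ttlMs;
  private final LongSupplier _clockMs;

  // The TTL is supplied lazily so a configuration change takes effect on the next lookup.
  TtlCacheSketch(LongSupplier ttlMs, LongSupplier clockMs) {
    _ttlMs = ttlMs;
    _clockMs = clockMs;
  }

  V get(K key, Function<K, V> loader) {
    long now = _clockMs.getAsLong();
    long ttlMs = _ttlMs.getAsLong();
    Entry<V> entry = _entries.get(key);
    // Assumption: a TTL of 0 (or less) disables caching entirely.
    if (entry != null && ttlMs > 0 && now - entry._loadedAtMs < ttlMs) {
      return entry._value;
    }
    V loaded = loader.apply(key);
    _entries.put(key, new Entry<>(loaded, now));
    return loaded;
  }
}
```

In production the clock would typically be `System::currentTimeMillis`; injecting it keeps the expiry logic testable.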

@xiangfu0 xiangfu0 force-pushed the system-table branch 2 times, most recently from 76900df to 7936728 Compare January 2, 2026 03:40
@xiangfu0 xiangfu0 requested a review from Copilot January 3, 2026 16:16
Copilot AI left a comment

Pull request overview

Copilot reviewed 35 out of 35 changed files in this pull request and generated 2 comments.

@xiangfu0 xiangfu0 force-pushed the system-table branch 2 times, most recently from 3fae010 to a824def Compare January 15, 2026 19:17
@xiangfu0 xiangfu0 force-pushed the system-table branch 2 times, most recently from c82bb16 to 0e78c43 Compare January 29, 2026 17:05
@xiangfu0 xiangfu0 requested a review from Copilot January 30, 2026 07:31
Copilot AI left a comment

Pull request overview

Copilot reviewed 35 out of 35 changed files in this pull request and generated 7 comments.

Copilot AI left a comment

Pull request overview

Copilot reviewed 37 out of 37 changed files in this pull request and generated 6 comments.

Comment on lines +83 to +84
.addMetric("reportedSize", FieldSpec.DataType.LONG)
.addMetric("estimatedSize", FieldSpec.DataType.LONG)
Copilot AI Jan 30, 2026

The column names reportedSize and estimatedSize do not state their unit. Consider renaming them to reportedSizeBytes and estimatedSizeBytes to make the unit explicit and keep them consistent with other size-related fields.

Comment on lines +448 to +449
long fetched = fetchTotalDocsFromSegmentMetadata(tableNameWithType, sizeFromController._offlineSegments._segments,
controllerBaseUrls);
Copilot AI Jan 30, 2026

The method fetches totalDocs by iterating through all segments and making individual controller API calls for each segment's metadata. For tables with many segments, this could result in a large number of network round trips. Consider optimizing by batching segment metadata requests or using a controller API that returns totalDocs for all segments in a single call.

new NamedThreadFactory("system-table-scatter-gather-executor")));
SSLContext sslContext = BrokerContext.getInstance().getClientHttpsContext();
int timeoutMs = (int) Math.min(Integer.MAX_VALUE, _brokerTimeoutMs);
if (timeoutMs < 1) {
Copilot AI Jan 30, 2026

The timeout clamping logic (minimum 1ms) could lead to unexpected behavior if _brokerTimeoutMs is 0 or negative, which might be a valid configuration for certain use cases. Consider logging a warning when the timeout is clamped, or documenting why 1ms is the minimum acceptable value.

Suggested change
-    if (timeoutMs < 1) {
+    if (timeoutMs < 1) {
+      // Enforce a minimum timeout of 1ms; zero or negative timeouts are not meaningful for client connections.
+      LOGGER.warn("Configured broker timeout {}ms is non-positive; clamping to 1ms for system table client "
+          + "connections", _brokerTimeoutMs);
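
For reference, the clamping behaviour under discussion can be isolated into a small helper. This is a sketch only: the class and method names are hypothetical, and it uses `java.util.logging` rather than the broker's logger.

```java
import java.util.logging.Logger;

final class TimeoutClampSketch {
  private static final Logger LOGGER = Logger.getLogger(TimeoutClampSketch.class.getName());

  private TimeoutClampSketch() {
  }

  // Narrows a long timeout to an int in the range [1, Integer.MAX_VALUE], warning when it clamps upward.
  static int toClientTimeoutMs(long brokerTimeoutMs) {
    if (brokerTimeoutMs < 1) {
      LOGGER.warning("Configured broker timeout " + brokerTimeoutMs
          + "ms is non-positive; clamping to 1ms for system table client connections");
      return 1;
    }
    return (int) Math.min(Integer.MAX_VALUE, brokerTimeoutMs);
  }
}
```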

Comment on lines +151 to +153
_systemTableDataTableClient =
new SystemTableDataTableClient(connectionTimeouts, sslContext);
Copilot AI Jan 30, 2026

The SystemTableDataTableClient is instantiated with specific connection timeouts and SSL context, but there are no tests covering error handling when SSL context is null or when connection timeouts are invalid. Consider adding test coverage for these edge cases.

final class PinotAdminPathUtils {
private PinotAdminPathUtils() {
}

Copilot AI Jan 30, 2026

The method lacks a Javadoc comment explaining why the + character is replaced with %20. This is important because URLEncoder.encode() follows application/x-www-form-urlencoded encoding (where space becomes +), but URL path segments require percent-encoding (where space becomes %20). Add a comment explaining this distinction.

Suggested change
/**
* Encodes a single URL path segment using UTF-8.
* <p>
* {@link URLEncoder} implements {@code application/x-www-form-urlencoded} encoding, where spaces
* are converted to {@code +}. For URL path segments, spaces must instead be percent-encoded as
* {@code %20}, so this method post-processes the encoded value to replace {@code +} with
* {@code %20}.
*/
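
To make the distinction concrete, here is a minimal sketch of the encoding the Javadoc describes. The class and method names are hypothetical, since the excerpt does not show the real method's name.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

final class PathSegmentEncodingSketch {
  private PathSegmentEncodingSketch() {
  }

  // URLEncoder emits form encoding (space -> '+'); path segments need '%20' instead.
  // A literal '+' in the input is already encoded as '%2B', so the replacement is safe.
  static String encodePathSegment(String segment) {
    return URLEncoder.encode(segment, StandardCharsets.UTF_8).replace("+", "%20");
  }
}
```

For example, `encodePathSegment("my segment/1")` yields `my%20segment%2F1`, so a slash inside a segment name cannot be mistaken for a path separator.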

Comment on lines +115 to +122
Map<SystemTableProvider, Boolean> providersToClose = new IdentityHashMap<>();
synchronized (PROVIDERS) {
for (SystemTableProvider provider : PROVIDERS.values()) {
providersToClose.put(provider, Boolean.TRUE);
}
}
try {
for (SystemTableProvider provider : providersToClose.keySet()) {
Copilot AI Jan 30, 2026

Using IdentityHashMap<SystemTableProvider, Boolean> to track unique providers for closing is unconventional. Consider using Collections.newSetFromMap(new IdentityHashMap<>()) to create an identity-based Set, which more clearly expresses the intent of tracking unique provider instances.

Suggested change
-    Map<SystemTableProvider, Boolean> providersToClose = new IdentityHashMap<>();
-    synchronized (PROVIDERS) {
-      for (SystemTableProvider provider : PROVIDERS.values()) {
-        providersToClose.put(provider, Boolean.TRUE);
-      }
-    }
-    try {
-      for (SystemTableProvider provider : providersToClose.keySet()) {
+    Set<SystemTableProvider> providersToClose =
+        Collections.newSetFromMap(new IdentityHashMap<SystemTableProvider, Boolean>());
+    synchronized (PROVIDERS) {
+      for (SystemTableProvider provider : PROVIDERS.values()) {
+        providersToClose.add(provider);
+      }
+    }
+    try {
+      for (SystemTableProvider provider : providersToClose) {
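
The point of an identity-based set here is that one provider instance registered under several names is closed only once, without relying on the provider's `equals`/`hashCode`. A self-contained sketch with a hypothetical helper name:

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;

final class IdentitySetSketch {
  private IdentitySetSketch() {
  }

  // Collects distinct instances by reference equality (==), ignoring equals()/hashCode().
  static <T> Set<T> uniqueInstances(Iterable<T> values) {
    Set<T> unique = Collections.newSetFromMap(new IdentityHashMap<>());
    for (T value : values) {
      unique.add(value);
    }
    return unique;
  }
}
```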

SystemTableDataProvider now exposes getDataSource() returning an IndexSegment.
Add InMemorySystemTableSegment and update system.tables/system.instances providers
Broker runs system table queries using the v1 query engine and reduce path.
3 participants