[Feature](paimon) Refactor Paimon system tables to use native table execution path #60556
base: master
run buildall |
TPC-H: Total hot run time: 31965 ms |
ClickBench: Total hot run time: 28.33 s |
run buildall |
run buildall |
TPC-H: Total hot run time: 30551 ms |
ClickBench: Total hot run time: 28.72 s |
run buildall |
TPC-H: Total hot run time: 30294 ms |
ClickBench: Total hot run time: 28.28 s |
regression-test/suites/external_table_p0/paimon/paimon_data_system_table.groovy
ee463aa to 1448d94
run buildall |
TPC-H: Total hot run time: 30161 ms |
ClickBench: Total hot run time: 28.41 s |
run buildall |
What problem does this PR solve?
Summary
- Refactor Paimon system tables (`$snapshots`, `$files`, `$schemas`, `$partitions`, etc.) to use the native table execution path (`PaimonScanNode`) instead of the TVF path (`MetadataScanNode` / `paimon_meta()`)
- Introduce a `SysTable` type hierarchy (`NativeSysTable` / `TvfSysTable`) and a centralized `SysTableResolver` to cleanly separate native vs TVF execution paths
- Remove the `paimon_meta` TVF, `PaimonSysTableJniScanner`, and `PaimonSysTableJniReader`; all Paimon system table queries now go through the unified `PaimonScanNode` + `PaimonJniScanner` path

Motivation
Previously, Paimon system tables were queried via a Table-Valued Function (`paimon_meta()`), which:
- Required a Java scanner (`PaimonSysTableJniScanner`) and a C++ reader (`PaimonSysTableJniReader`) dedicated to system tables
- Used `MetadataScanNode` instead of `PaimonScanNode`, missing optimizations available in the native path (predicate pushdown, projection pushdown, etc.)
- Left `SysTable` tightly coupled with the TVF execution model, making it hard to add new system table types

Architecture After Refactoring
SysTable Type Hierarchy
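As a rough illustration of the hierarchy named in the summary, the following minimal Java sketch models `SysTable`, `NativeSysTable`, and `TvfSysTable`. The method names `useNativeTablePath()`, `createSysExternalTable()`, `createFunction()`, and `findSysTable()` come from this PR description; their parameter and return types, the `PaimonSysTableSketch` class, and the hard-coded table names are assumptions made for illustration and are not the actual Doris code.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

// Simplified stand-ins for the classes named in this PR. Method names follow
// the description; parameter and return types are assumptions.
abstract class SysTable {
    private final String name;

    SysTable(String name) {
        this.name = name;
    }

    String getName() {
        return name;
    }

    // true: plan as a regular table scan; false: plan through a TVF.
    abstract boolean useNativeTablePath();
}

abstract class NativeSysTable extends SysTable {
    NativeSysTable(String name) {
        super(name);
    }

    @Override
    boolean useNativeTablePath() {
        return true;
    }

    // Placeholder: returns a label instead of a real external table object.
    abstract String createSysExternalTable(String sourceTable);
}

abstract class TvfSysTable extends SysTable {
    TvfSysTable(String name) {
        super(name);
    }

    @Override
    boolean useNativeTablePath() {
        return false;
    }

    // Placeholder: returns a label instead of a real function object.
    abstract String createFunction(String sourceTable);
}

// Hypothetical Paimon entry; only the four names from the summary are listed.
final class PaimonSysTableSketch extends NativeSysTable {
    static final Map<String, SysTable> SUPPORTED_SYS_TABLES = buildSupported();

    private PaimonSysTableSketch(String name) {
        super(name);
    }

    private static Map<String, SysTable> buildSupported() {
        Map<String, SysTable> tables = new LinkedHashMap<>();
        for (String n : new String[] {"snapshots", "files", "schemas", "partitions"}) {
            tables.put(n, new PaimonSysTableSketch(n));
        }
        return Collections.unmodifiableMap(tables);
    }

    @Override
    String createSysExternalTable(String sourceTable) {
        return sourceTable + "$" + getName();
    }

    // Map-based lookup, mirroring the findSysTable() idea from the description.
    static Optional<SysTable> findSysTable(String name) {
        return Optional.ofNullable(SUPPORTED_SYS_TABLES.get(name));
    }
}
```

Keeping the native/TVF decision on the type itself means callers can branch on `useNativeTablePath()` without knowing which catalog the table belongs to.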
Each table type declares its supported system tables via `Map<String, SysTable>`:
- `PaimonExternalTable.getSupportedSysTables()` → `PaimonSysTable.SUPPORTED_SYS_TABLES`
- `IcebergExternalTable.getSupportedSysTables()` → `IcebergSysTable.SUPPORTED_SYS_TABLES`
- `HMSExternalTable.getSupportedSysTables()` → varies by `DLAType` (HIVE/ICEBERG)

Query Execution Flow
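The sketch below gives one possible reading of how a `table$systemTable` reference could be routed, based on the notes in this PR that the native path yields `LogicalFileScan` and the TVF path yields `LogicalTVFRelation`, and that tables with `$` in their own name must keep working. The class `SysTableNameSketch`, the split on the last `$`, and the string plan labels are assumptions for illustration; the real logic lives in `SysTableResolver` and `BindRelation` and may differ.

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical helper, not part of Doris.
final class SysTableNameSketch {
    enum ExecPath { NATIVE, TVF }

    // Result of splitting "base$sys" into its parts plus the chosen path.
    static final class Resolved {
        final String baseTable;
        final String sysTable;
        final ExecPath path;

        Resolved(String baseTable, String sysTable, ExecPath path) {
            this.baseTable = baseTable;
            this.sysTable = sysTable;
            this.path = path;
        }
    }

    // Assumption: the system table name is whatever follows the last '$'.
    static Optional<Resolved> resolve(String name, Map<String, ExecPath> supported) {
        int idx = name.lastIndexOf('$');
        if (idx <= 0 || idx == name.length() - 1) {
            return Optional.empty(); // no usable "base$suffix" shape
        }
        String suffix = name.substring(idx + 1);
        ExecPath path = supported.get(suffix);
        if (path == null) {
            // Ordinary table whose name merely contains '$'.
            return Optional.empty();
        }
        return Optional.of(new Resolved(name.substring(0, idx), suffix, path));
    }

    // Plan node labels taken from the PR description.
    static String planNodeFor(ExecPath path) {
        return path == ExecPath.NATIVE ? "LogicalFileScan" : "LogicalTVFRelation";
    }
}
```

An empty result means the caller should treat the name as a regular table, which is how a name such as `my$table` would fall through under these assumptions.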
DESCRIBE Flow
Key New Classes
- `SysTable`
- `NativeSysTable`: `useNativeTablePath()=true`, factory method `createSysExternalTable()`
- `TvfSysTable`: `useNativeTablePath()=false`, factory methods `createFunction()` / `createFunctionRef()`
- `PaimonSysTable`: `SystemTableLoader.SYSTEM_TABLES`
- `SysTableResolver`: `resolveForPlan()` / `resolveForDescribe()` / `validateForQuery()`
- `PaimonSysExternalTable`

Key Modified Classes
- `TableIf`: `getSupportedSysTables()` returns `Map<String, SysTable>` (was `List`); added `findSysTable()` for O(1) lookup
- `BindRelation`: `handleMetaTable()` uses `SysTableResolver`; native path → `LogicalFileScan`, TVF path → `LogicalTVFRelation`
- `DescribeCommand`: uses `SysTableResolver.resolveForDescribe()`; native path returns column schema directly
- `PaimonScanNode`: `getSplits()` separates `DataSplit` vs non-`DataSplit`; non-`DataSplit` always uses JNI reader
- `PaimonSplit`: now handles `Split` (not just `DataSplit`); added `getDataSplit()`
- `PaimonSource`: `resolvePaimonTable()` handles both `PaimonExternalTable` and `PaimonSysExternalTable`
- `PhysicalPlanTranslator`: uses a `TableType.PAIMON_EXTERNAL_TABLE` check instead of `instanceof PaimonExternalTable`
- `RelationUtil`: `getDbAndTable()` uses `SysTableResolver.validateForQuery()`
- `IcebergScanNode`: handling of `BaseTable` metadata tables adjusted to avoid `ClassCastException`

Removed Classes
- `PaimonTableValuedFunction` → `PaimonSysExternalTable` + `PaimonScanNode`
- `PaimonMeta` (Nereids TVF) → `LogicalFileScan` with `PaimonSysExternalTable`
- `PaimonSysTableJniScanner` (Java) → `PaimonJniScanner`
- `PaimonSysTableJniReader` (C++) → `PaimonJniReader`
- `SupportedSysTables` → `PaimonSysTable.SUPPORTED_SYS_TABLES`, etc.

Test Plan
- `paimon_system_table.groovy`, `test_paimon_system_table_auth.groovy`
- `paimon_meta()` TVF calls replaced with direct `table$systemTable` syntax
- `DESCRIBE table$snapshots` returns correct schema via native path
- System table queries (`$snapshots`, `$files`, `$schemas`, `$partitions`) return correct results
- `paimon_meta` TVF (removed in this PR)
- `test_table_name_with_dollar.groovy` verifies tables with `$` in their name still work correctly

Check List (For Author)
Test
Behavior changed:
Does this need documentation?
Check List (For Reviewer who merge this PR)