|
| 1 | +# **RFC0011 for Presto** |
| 2 | + |
| 3 | +## Introduce new SPI for SQL invoked functions
| 4 | + |
| 5 | +Proposers |
| 6 | +* Pratik Joseph Dabre |
| 7 | +* Tim Meehan |
| 8 | + |
| 9 | +## Related Issues |
| 10 | + |
| 11 | +* https://github.com/prestodb/presto/issues/24964 |
| 12 | +* https://github.com/prestodb/presto/pull/25597 |
| 13 | + |
| 14 | +## Summary |
| 15 | + |
| 16 | +This RFC proposes a new SPI method, `getSqlInvokedFunctions`, that enables plugin-based registration of SQL invoked functions in Presto.
| 17 | +This SPI is supported by a new function namespace manager, `BuiltInPluginFunctionNamespaceManager`, which isolates SQL invoked functions loaded through a plugin from native and built-in functions.
| 18 | +To support the different runtime environments (sidecar-enabled native versus Java or sidecar-disabled native), two separate plugin modules will be introduced:
| 19 | +- One for Java-only and sidecar-disabled native deployments that includes all inlined SQL invoked functions currently registered under `BuiltInTypeAndFunctionNamespaceManager`.
| 20 | +- One for native-sidecar deployments that includes all inlined SQL invoked functions currently registered under `BuiltInTypeAndFunctionNamespaceManager` but excludes functions overridden natively.
| 21 | + |
| 22 | +This design aims to prevent conflicts in function resolution, especially when native functions are integrated into the coordinator in a sidecar-enabled cluster. |
| 23 | + |
| 24 | + |
| 25 | +## Background |
| 26 | + |
| 27 | +In sidecar-enabled clusters, users can configure the default function namespace that will be used to serve functions. The C++ functions are then loaded via an endpoint under this specified namespace. |
| 28 | +Previously, inlined SQL invoked functions were registered under the `presto.default` namespace (the default Java built-in namespace). Native functions pulled from the C++ sidecar were registered under a separate namespace such as `native.default`.
| 29 | +These native functions include optimized implementations that override some of the inlined SQL invoked functions, and users would like to take advantage of them.
| 30 | +However, this setup complicated conflict resolution: Presto would need to prioritize between the two namespaces, `presto.default` and `native.default`, when resolving functions, and such namespace-based prioritization is opaque and not ideal.
| 31 | + |
| 32 | +To address this, the proposed change moves all inlined SQL invoked functions into a plugin-based namespace manager, `BuiltInPluginFunctionNamespaceManager`, which serves whatever the current default function namespace is.
| 33 | +This ensures: |
| 34 | +- All inlined SQL invoked functions are grouped together cleanly and loaded via a plugin. |
| 35 | +- Overridden native implementations of inlined SQL invoked functions are excluded from plugin registration. |
| 36 | +- The default serving catalog resolves functions seamlessly, since all SQL invoked functions still live under the same default catalog but are now loaded through a plugin and managed by `BuiltInPluginFunctionNamespaceManager`.
| 37 | + |
| 38 | +### Goals |
| 39 | +- Introduce a new SPI method, `getSqlInvokedFunctions`, for plugin-based registration of SQL invoked functions.
| 40 | +- Create and wire a new plugin-based namespace manager, `BuiltInPluginFunctionNamespaceManager`, to manage these functions.
| 41 | +- Enable native sidecar environments to load only the relevant subset of inlined SQL-invoked functions (i.e., excluding ones overridden natively). |
| 42 | +- Guarantee that inlined SQL invoked functions still work under the default catalog using the new plugin namespace manager. |
| 43 | + |
| 44 | +### Non-goals |
| 45 | +- This RFC does not change the execution model for inlined SQL invoked functions; they are still evaluated in Java as before.
| 46 | +- It does not modify function resolution rules outside of registration-time separation (i.e., no new prioritization logic is added). |
| 47 | +- It does not impact native (functions pulled from sidecar) or Java built-in function registration, which remains handled by the `NativeFunctionNamespaceManager` and `BuiltInTypeAndFunctionNamespaceManager` respectively. |
| 48 | + |
| 49 | +## Proposed Implementation |
| 50 | + |
| 51 | +This section describes the internal changes made to safely register SQL invoked functions via a new plugin SPI and isolate them from built-in functions. It covers the SPI definition,
| 52 | +the design of the new namespace manager, conflict detection, and the lazy resolution mechanism.
| 53 | +- New SPI: `getSqlInvokedFunctions`
| 54 | + |
| 55 | + A new method was introduced in the plugin SPI to allow SQL invoked functions to be loaded: |
| 56 | + |
| 57 | + ``` |
| 58 | + default Set<Class<?>> getSqlInvokedFunctions() |
| 59 | + { |
| 60 | + return emptySet(); |
| 61 | + } |
| 62 | + ``` |
| 63 | + |
| 64 | + This SPI is picked up by the plugin manager at Presto startup. These functions are routed to a new plugin-specific namespace manager described below. |
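As a rough illustration of how a plugin might opt in, the sketch below overrides the new method. The `Plugin` interface here is a local stand-in that carries only the method proposed in this RFC, and `ExampleSqlInvokedFunctionsPlugin`, `ExampleOtherPlugin`, and `ExampleArraySqlFunctions` are hypothetical names, not classes from the Presto code base.

```java
import java.util.Collections;
import java.util.Set;

class PluginSpiSketch
{
    public static void main(String[] args)
    {
        // The opting-in plugin contributes its function holder class.
        System.out.println(new ExampleSqlInvokedFunctionsPlugin().getSqlInvokedFunctions());
        // A plugin that does not override the method contributes nothing.
        System.out.println(new ExampleOtherPlugin().getSqlInvokedFunctions().isEmpty());
    }
}

// Local stand-in for the plugin SPI; only the method proposed in this RFC is shown.
interface Plugin
{
    default Set<Class<?>> getSqlInvokedFunctions()
    {
        return Collections.emptySet();
    }
}

// Hypothetical holder class; a real one would contain the SQL invoked function definitions.
final class ExampleArraySqlFunctions
{
    private ExampleArraySqlFunctions() {}
}

// Hypothetical plugin that registers SQL invoked functions through the new method.
class ExampleSqlInvokedFunctionsPlugin
        implements Plugin
{
    @Override
    public Set<Class<?>> getSqlInvokedFunctions()
    {
        return Set.of(ExampleArraySqlFunctions.class);
    }
}

// Hypothetical plugin that relies on the default (empty) implementation.
class ExampleOtherPlugin
        implements Plugin
{
}
```

Because the method has a default implementation, existing plugins keep compiling and simply contribute no SQL invoked functions.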
| 65 | +
|
| 66 | +
|
| 67 | +- New class: `BuiltInPluginFunctionNamespaceManager`
| 68 | +
|
| 69 | + A new class, `BuiltInPluginFunctionNamespaceManager`, was introduced to hold all SQL invoked functions loaded from plugins. |
| 70 | + |
| 71 | + - Responsibilities: |
| 72 | + - Manage SQL invoked functions from plugin modules |
| 73 | + - Serve only the configured default serving namespace |
| 74 | + - Ensure no duplicate function signatures are registered |
| 75 | + - Support lazy caching and resolution in sidecar environments |
| 76 | +
|
| 77 | + - #### Constructor and Field initialization: |
| 78 | +
|
| 79 | + - ``` |
| 80 | + private final Supplier<FunctionMap> cachedFunctions = |
| 81 | + Suppliers.memoize(this::checkForNamingConflicts); |
| 82 | + ``` |
| 83 | + - ``` |
| 84 | + private synchronized FunctionMap checkForNamingConflicts() |
| 85 | + { |
| 86 | + Optional<FunctionNamespaceManager<?>> functionNamespaceManager = |
| 87 | + functionAndTypeManager.getServingFunctionNamespaceManager(functionAndTypeManager.getDefaultNamespace()); |
| 88 | + checkArgument(functionNamespaceManager.isPresent(), "Cannot find function namespace for catalog '%s'", functionAndTypeManager.getDefaultNamespace().getCatalogName()); |
| 89 | + checkForNamingConflicts(functionNamespaceManager.get().listFunctions(Optional.empty(), Optional.empty())); |
| 90 | + return functions; |
| 91 | + } |
| 92 | + ``` |
| 93 | + This defers the conflict check until functions are actually used, which is particularly useful in sidecar-enabled clusters.
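The deferral relies on memoization: the supplier runs its delegate on the first `get()` and returns the cached result afterwards. The sketch below shows that behaviour with a small hand-written `memoize` helper standing in for Guava's `Suppliers.memoize`; `validateAndGetFunctions` and the string it returns are placeholders for `checkForNamingConflicts()` and the `FunctionMap`.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

class LazyConflictCheckSketch
{
    // Counts how often the (potentially expensive) validation actually runs.
    static final AtomicInteger validationRuns = new AtomicInteger();

    // Minimal memoizing supplier, used here in place of Guava's Suppliers.memoize.
    static <T> Supplier<T> memoize(Supplier<T> delegate)
    {
        return new Supplier<T>()
        {
            private T value;
            private boolean initialized;

            @Override
            public synchronized T get()
            {
                if (!initialized) {
                    value = delegate.get();
                    initialized = true;
                }
                return value;
            }
        };
    }

    // Placeholder for checkForNamingConflicts(): validate, then return the function map.
    static String validateAndGetFunctions()
    {
        validationRuns.incrementAndGet();
        return "validated-function-map";
    }

    public static void main(String[] args)
    {
        Supplier<String> cachedFunctions = memoize(LazyConflictCheckSketch::validateAndGetFunctions);
        // Creating the supplier does not run the validation.
        System.out.println("runs after construction: " + validationRuns.get());
        cachedFunctions.get();
        cachedFunctions.get();
        // Repeated lookups reuse the cached result.
        System.out.println("runs after two lookups: " + validationRuns.get());
    }
}
```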
| 94 | +
|
| 95 | + - #### Function Registration workflow: |
| 96 | + Functions from each plugin implementing the new SPI are registered as follows: |
| 97 | + - ``` |
| 98 | + public void registerPluginFunctions(List<? extends SqlFunction> functions) |
| 99 | + { |
| 100 | + builtInPluginFunctionNamespaceManager.registerPluginFunctions(functions); |
| 101 | + } |
| 102 | + ``` |
| 103 | + - ``` |
| 104 | + public synchronized void registerPluginFunctions(List<? extends SqlFunction> functions) |
| 105 | + { |
| 106 | + checkForNamingConflicts(functions); |
| 107 | + this.functions = new FunctionMap(this.functions, functions); |
| 108 | + } |
| 109 | + ``` |
| 110 | + |
| 111 | + This ensures plugin-provided SQL functions are separated from built-in/native functions and properly validated. |
| 112 | +
|
| 113 | + - #### Conflict Detection and signature validation: |
| 114 | +
|
| 115 | + Two layers of conflict checks are added: |
| 116 | + 1. During registration, against SQL invoked functions from other plugins
| 117 | + - ``` |
| 118 | + private synchronized void checkForNamingConflicts(Collection<? extends SqlFunction> functions) |
| 119 | + { |
| 120 | + for (SqlFunction function : functions) { |
| 121 | + for (SqlFunction existingFunction : this.functions.list()) { |
| 122 | + checkArgument(!function.getSignature().equals(existingFunction.getSignature()), "Function already registered: %s", function.getSignature()); |
| 123 | + } |
| 124 | + } |
| 125 | + } |
| 126 | + ``` |
| 127 | + |
| 128 | + 2. Against current default serving namespace functions |
| 129 | + |
| 130 | + - ``` |
| 131 | + private synchronized FunctionMap checkForNamingConflicts() |
| 132 | + { |
| 133 | + Optional<FunctionNamespaceManager<?>> functionNamespaceManager = |
| 134 | + functionAndTypeManager.getServingFunctionNamespaceManager(functionAndTypeManager.getDefaultNamespace()); |
| 135 | + checkArgument(functionNamespaceManager.isPresent(), "Cannot find function namespace for catalog '%s'", functionAndTypeManager.getDefaultNamespace().getCatalogName()); |
| 136 | + checkForNamingConflicts(functionNamespaceManager.get().listFunctions(Optional.empty(), Optional.empty())); |
| 137 | + return functions; |
| 138 | + } |
| 139 | + ``` |
| 140 | + |
| 141 | + This ensures that plugin-provided functions don't conflict with other registered functions, whether they come from other plugins or from the current default namespace manager.
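The two layers can be pictured with a simplified model in which plain strings stand in for `Signature` objects. This is only a sketch of the checks described above: `ConflictCheckSketch`, `validateAgainstDefaultNamespace`, and the signature strings are hypothetical and not taken from the Presto code base.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

class ConflictCheckSketch
{
    // Signatures already held by the plugin namespace manager.
    private final List<String> registered = new ArrayList<>();

    // Layer 1: reject signatures that another plugin has already registered.
    void registerPluginFunctions(Collection<String> signatures)
    {
        checkForNamingConflicts(signatures);
        registered.addAll(signatures);
    }

    // Layer 2: compare the registered plugin functions with the default serving namespace.
    void validateAgainstDefaultNamespace(Collection<String> defaultNamespaceSignatures)
    {
        checkForNamingConflicts(defaultNamespaceSignatures);
    }

    private void checkForNamingConflicts(Collection<String> signatures)
    {
        for (String signature : signatures) {
            if (registered.contains(signature)) {
                throw new IllegalArgumentException("Function already registered: " + signature);
            }
        }
    }

    public static void main(String[] args)
    {
        ConflictCheckSketch manager = new ConflictCheckSketch();
        manager.registerPluginFunctions(List.of("example_fn(bigint)"));

        // A default namespace without an overlapping signature passes validation.
        manager.validateAgainstDefaultNamespace(List.of("other_fn(bigint)"));

        // An overlapping signature is rejected.
        try {
            manager.validateAgainstDefaultNamespace(List.of("example_fn(bigint)"));
        }
        catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```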
| 142 | +
|
| 143 | + - #### Sidecar-Aware Behavior and Lazy Function Resolution: |
| 144 | + In clusters where native-sidecar functions are enabled, function registration and resolution must be carefully orchestrated to: |
| 145 | + - Avoid duplicate function signatures |
| 146 | + - Delay conflict checks until both plugin and native functions are available. |
| 147 | + - This is triggered only upon first use of: |
| 148 | + - `SHOW FUNCTIONS` |
| 149 | + - `DESCRIBE FUNCTION` |
| 150 | + - Internal resolution calls like `resolveFunction` |
| 151 | + - ``` |
| 152 | + public Collection<? extends SqlFunction> getFunctions(QualifiedObjectName functionName) |
| 153 | + { |
| 154 | + if (functions.list().isEmpty() || |
| 155 | + (!functionName.getCatalogSchemaName().equals(functionAndTypeManager.getDefaultNamespace()))) { |
| 156 | + return emptyList(); |
| 157 | + } |
| 158 | + return cachedFunctions.get().get(functionName); |
| 159 | + } |
| 160 | + ``` |
| 161 | + |
| 162 | + For Java clusters and native clusters without a sidecar, the first part of the above check helps: until the plugin functions are loaded, we don't compare them with the functions of the underlying
| 163 | + serving function namespace, i.e. `JAVA_BUILTIN_NAMESPACE` or `presto.default`.
| 164 | + 
| 165 | + For sidecar-enabled clusters, the second part of the above check helps: if the requested function is not under the default namespace, conflict validation against the sidecar-fetched functions is not triggered.
| 166 | + This matters because some functions need to be resolved while the coordinator is starting. We don't want to trigger conflict validation at that point, but rather delay it until a function from the `native.default` namespace is requested.
| 167 | + This strategy allows the coordinator to finish initializing and then call the sidecar lazily to trigger the conflict check.
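The effect of the two-part guard can be sketched as follows. Strings stand in for `QualifiedObjectName` and `CatalogSchemaName`, and the `validatedLookups` counter marks the point where the real code would call `cachedFunctions.get()`; `GuardedLookupSketch` and `example_fn` are hypothetical names.

```java
import java.util.Collections;
import java.util.List;

class GuardedLookupSketch
{
    private final String defaultNamespace;
    private final List<String> pluginFunctionNames;

    // Number of lookups that got past the guard and reached the deferred validation path.
    int validatedLookups;

    GuardedLookupSketch(String defaultNamespace, List<String> pluginFunctionNames)
    {
        this.defaultNamespace = defaultNamespace;
        this.pluginFunctionNames = pluginFunctionNames;
    }

    List<String> getFunctions(String namespace, String functionName)
    {
        // Part 1: nothing to validate if no plugin functions were loaded.
        // Part 2: lookups outside the default namespace never touch the sidecar.
        if (pluginFunctionNames.isEmpty() || !namespace.equals(defaultNamespace)) {
            return Collections.emptyList();
        }
        validatedLookups++;
        if (pluginFunctionNames.contains(functionName)) {
            return List.of(functionName);
        }
        return Collections.emptyList();
    }

    public static void main(String[] args)
    {
        GuardedLookupSketch manager = new GuardedLookupSketch("native.default", List.of("example_fn"));

        // A lookup in another namespace, such as one made during startup, skips validation.
        manager.getFunctions("presto.default", "example_fn");
        System.out.println("validated lookups: " + manager.validatedLookups);

        // A lookup in the default namespace reaches the validation path.
        manager.getFunctions("native.default", "example_fn");
        System.out.println("validated lookups: " + manager.validatedLookups);
    }
}
```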
| 168 | +
|
| 169 | + - #### Metadata and execution support: |
| 170 | + This section explains how SQL invoked functions registered via the plugin SPI are integrated into Presto's function resolution and execution infrastructure. Both metadata fetching and execution are handled with full support for plugin-based SQL invoked functions. |
| 171 | +
|
| 172 | + - When a function is resolved (e.g. during analysis or planning), Presto now checks both: |
| 173 | + - The current default serving namespace functions (handled by `BuiltInTypeAndFunctionNamespaceManager` in Java/non-sidecar-enabled clusters and `NativeFunctionNamespaceManager` in sidecar-enabled clusters).
| 174 | + - The new `BuiltInPluginFunctionNamespaceManager`, which contains all plugin-provided SQL invoked functions.
| 175 | + This logic is encapsulated in the methods below:
| 176 | + |
| 177 | + ``` |
| 178 | + private Collection<? extends SqlFunction> getFunctions( |
| 179 | + QualifiedObjectName functionName, |
| 180 | + Optional<? extends FunctionNamespaceTransactionHandle> transactionHandle, |
| 181 | + FunctionNamespaceManager<?> functionNamespaceManager) |
| 182 | + { |
| 183 | + return ImmutableList.<SqlFunction>builder() |
| 184 | + .addAll(functionNamespaceManager.getFunctions(transactionHandle, functionName)) |
| 185 | + .addAll(builtInPluginFunctionNamespaceManager.getFunctions(functionName)) |
| 186 | + .build(); |
| 187 | + } |
| 188 | + |
| 189 | + /** |
| 190 | + * Gets the function handle of the function if there is a match. We enforce explicit naming for dynamic function namespaces. |
| 191 | + * All unqualified function names will only be resolved against the built-in default function namespace. We get all the candidates |
| 192 | + * from the current default namespace and additionally all the candidates from builtInPluginFunctionNamespaceManager. |
| 193 | + * |
| 194 | + * @throws PrestoException if there are no matches or multiple matches |
| 195 | + */ |
| 196 | + private FunctionHandle getMatchingFunctionHandle( |
| 197 | + QualifiedObjectName functionName, |
| 198 | + Optional<? extends FunctionNamespaceTransactionHandle> transactionHandle, |
| 199 | + FunctionNamespaceManager<?> functionNamespaceManager, |
| 200 | + List<TypeSignatureProvider> parameterTypes, |
| 201 | + boolean coercionAllowed) |
| 202 | + { |
| 203 | + Optional<Signature> matchingDefaultFunctionSignature = |
| 204 | + getMatchingFunction(functionNamespaceManager.getFunctions(transactionHandle, functionName), parameterTypes, coercionAllowed); |
| 205 | + Optional<Signature> matchingPluginFunctionSignature = |
| 206 | + getMatchingFunction(builtInPluginFunctionNamespaceManager.getFunctions(functionName), parameterTypes, coercionAllowed); |
| 207 | + |
| 208 | + if (matchingDefaultFunctionSignature.isPresent() && matchingPluginFunctionSignature.isPresent()) { |
| 209 | + throw new PrestoException(AMBIGUOUS_FUNCTION_CALL, format("Function '%s' has two matching signatures. Please specify parameter types. \n" + |
| 210 | + "First match : '%s', Second match: '%s'", functionName, matchingDefaultFunctionSignature.get(), matchingPluginFunctionSignature.get())); |
| 211 | + } |
| 212 | + |
| 213 | + if (matchingDefaultFunctionSignature.isPresent()) { |
| 214 | + return functionNamespaceManager.getFunctionHandle(transactionHandle, matchingDefaultFunctionSignature.get()); |
| 215 | + } |
| 216 | + |
| 217 | + if (matchingPluginFunctionSignature.isPresent()) { |
| 218 | + return builtInPluginFunctionNamespaceManager.getFunctionHandle(matchingPluginFunctionSignature.get()); |
| 219 | + } |
| 220 | + |
| 221 | + throw new PrestoException(FUNCTION_NOT_FOUND, constructFunctionNotFoundErrorMessage(functionName, parameterTypes, |
| 222 | + getFunctions(functionName, transactionHandle, functionNamespaceManager))); |
| 223 | + } |
| 224 | + ``` |
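The decision made by `getMatchingFunctionHandle` reduces to four cases. The sketch below models them with `Optional<String>` in place of `Optional<Signature>` and standard exceptions in place of `PrestoException`; `ResolutionSketch`, `resolve`, and the signature strings are hypothetical.

```java
import java.util.Optional;

class ResolutionSketch
{
    // Returns which manager would serve the function, given the best match found in each.
    static String resolve(Optional<String> defaultMatch, Optional<String> pluginMatch)
    {
        if (defaultMatch.isPresent() && pluginMatch.isPresent()) {
            // Both managers match: the call is ambiguous.
            throw new IllegalStateException(
                    "two matching signatures: " + defaultMatch.get() + " and " + pluginMatch.get());
        }
        if (defaultMatch.isPresent()) {
            return "default:" + defaultMatch.get();
        }
        if (pluginMatch.isPresent()) {
            return "plugin:" + pluginMatch.get();
        }
        // Neither manager matches.
        throw new IllegalArgumentException("function not found");
    }

    public static void main(String[] args)
    {
        System.out.println(resolve(Optional.of("example_fn(bigint)"), Optional.empty()));
        System.out.println(resolve(Optional.empty(), Optional.of("example_fn(bigint)")));
    }
}
```

With non-conflicting registrations the ambiguous branch should not be reachable for identical signatures, since the conflict checks reject those earlier; it can still trigger when coercion makes different signatures match the same call.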
| 225 | +
|
| 226 | +## Adoption Plan |
| 227 | +
|
| 228 | +Users need to load the newly introduced plugin modules if they wish to use the inlined SQL functions that Presto currently supports.
| 229 | +This change applies to Java-based clusters and to native clusters, with or without a sidecar.
| 230 | +- In Java-only and non-sidecar-enabled native clusters, users can load the `presto-sql-invoked-functions-plugin` plugin module, which contains all the inlined SQL invoked functions.
| 231 | +- In native clusters with the sidecar enabled, users should load the trimmed-down plugin module `presto-native-sql-invoked-functions-plugin`, which excludes functions overridden by native implementations to avoid signature conflicts.
| 232 | +
|
| 233 | +This design ensures a smooth transition, enabling explicit control over which functions are available at runtime based on the deployment context. |
| 234 | +
|
| 235 | +## Test Plan |
| 236 | +
|
| 237 | +1. Unit Tests |
| 238 | + - Register plugin functions |
| 239 | + - Trigger signature conflicts |
| 240 | + - Validate the new function namespace manager's behaviour and caching; ensure the memoized conflict check runs only once.
| 241 | + - Trigger resolution via SHOW FUNCTIONS, etc. |
| 242 | +
|
| 243 | +2. Integration Tests |
| 244 | + - Java-only / non-sidecar-enabled native clusters:
| 245 | + - Register the `presto-sql-invoked-functions-plugin` plugin module.
| 246 | + - Run queries using the inlined SQL invoked functions present under `presto-sql-invoked-functions-plugin`.
| 247 | +
|
| 248 | + - Sidecar-enabled clusters: |
| 249 | + - Use the sidecar-safe plugin module `presto-native-sql-invoked-functions-plugin`.
| 250 | + - Validate conflict detection (e.g. `test123` is not double-registered); if conflicts occur, they must fail gracefully.