Support overloaded computed column functions#2997
Support overloaded computed column functions#2997GeertJohan wants to merge 3 commits intographile:mainfrom
Conversation
|
benjie
left a comment
There was a problem hiding this comment.
Thanks for the contribution; here's some guidance on how to help move this forward. I think we can end up with a much smaller PR that still achieves your goals 🤞
| // When overloaded functions target distinct composite types, each | ||
| // is allowed through the overload check in pgIntrospection_proc. | ||
| // However, the resource registry keys by name (see datasource.ts), | ||
| // so we must ensure each resource has a unique name. We suffix with | ||
| // the target table's schema and name, for example `code(a.pets)` | ||
| // becomes `code_a_pets`. | ||
| // GraphQL field names are unaffected since computedAttributeField | ||
| // inflection uses resource.extensions.pg.name (the raw SQL name). | ||
| { | ||
| const introspection = ( | ||
| await info.helpers.pgIntrospection.getIntrospection() | ||
| ).find((n) => n.pgService.name === serviceName)!.introspection; | ||
| // Check if another proc in the same namespace shares this name | ||
| let hasOverload = false; | ||
| for (const p of introspection.procs) { | ||
| if ( | ||
| p.pronamespace === pgProc.pronamespace && | ||
| p.proname === pgProc.proname && | ||
| p._id !== pgProc._id | ||
| ) { | ||
| hasOverload = true; | ||
| break; | ||
| } | ||
| } | ||
| if (hasOverload) { | ||
| // Resolve the first argument to its target table type | ||
| const firstArgType = pgProc.proargtypes?.[0]; | ||
| if (firstArgType) { | ||
| const composite = resolveToCompositeType( | ||
| firstArgType, | ||
| introspection.types, | ||
| ); | ||
| if (composite) { | ||
| // Look up the pg_class and pg_namespace for the target | ||
| // table so we can build the suffix | ||
| let className: string | undefined; | ||
| let schemaName: string | undefined; | ||
| for (const c of introspection.classes) { | ||
| if (c._id === composite.typrelid) { | ||
| className = c.relname; | ||
| for (const n of introspection.namespaces) { | ||
| if (n._id === c.relnamespace) { | ||
| schemaName = n.nspname; | ||
| break; | ||
| } | ||
| } | ||
| break; | ||
| } | ||
| } | ||
| if (className && schemaName) { | ||
| name = `${name}_${schemaName}_${className}`; | ||
| } | ||
| } | ||
| } | ||
| } | ||
| } |
There was a problem hiding this comment.
This logic should be moved to the functionResourceName inflector; otherwise other parts of the code that use the inflector to determine the name of the function (e.g. so they can read it from the pgRegistry) will not get the same name and might extract the wrong function.
Further, renaming a function when it gets an overload makes me uncomfortable... But I think we'll allow it because the alternatives seem worse.
Though having said this... the inflector shouldn't need to look through the rest of the registry to determine what something is called... so... Yeah this is a non-trivial issue.
If we're doing opt-in for this feature, I think just renaming all procedures so they're named by their signature (${fn.name}_${params.map(param => param.type.name).join('_')}) might be the best bet. Or at least all functions that look like computed columns.
Actually we already have a naming convention that encourages ${tableName}_${fieldName} as a name - so I think we can fix this by just enforcing that:
- If this is a "computed column function" (stable/immutable, first argument is composite, belongs to same schema as composite type), then:
- If the function name starts with the composite type name followed by underscore, use current logic;
- Otherwise, pretend the function was named such that it started with the composite type name followed by an underscore, and the follow current logic.
If we follow this, the vast majority of PostGraphile users won't be impacted at all, and only functions that don't follow our established naming conventions will be renamed, and they'll be renamed whether or not they are overloads which means adding an overload later won't break/rename them.
Yes, this feels much better.
There was a problem hiding this comment.
belongs to same schema as composite type
Although for our use-case we actually place all computed columns in a schema specifically for that purpose. This creates nice explicit SQL, e.g.:
SELECT id, computed.code(orders.*) FROM sales.orders;
This works fine already, except for computed columns that are overloaded. e.g. if we would add a computed.code(orderlines.*), then suddenly the computed.code(orders.*) computed column disappears.
So I would prefer to have cross-schema computed columns remain possible with this approach as well, I'm working on a diff for this PR right now :)
There was a problem hiding this comment.
I see this would mean existing functions that take a compound argument will change name.. That's not very nice as it's a breaking change. On the other hand the current behavior where computed.code(orders.*) works fine as long as there is no second computed.code(?) is not very nice either..
Should we accept the breaking change in favor of consistent behavior? Or perhaps be looking at a different solution/direction?
There was a problem hiding this comment.
We cannot accept a breaking change at this point; but it can be handled through a plugin/preset.
There was a problem hiding this comment.
Ah ofcourse, good idea!
I have reverted the existing tests to their original state so we don't have breaking changes any more.
Ive extracted a small part of the functionResourceName into an inflector which can be overridden by plugin/preset. Does it make sense to do it like this? Or should we move the whole functionResourceName to be plugin-overridable?
Do you want me to add a global plugin/preset that's available to the user or just leave that implementation to the user.
I did add a warning log when a computed column is thrown out when it is overloaded. I think it is warranted since, from the user's perspective, its confusing that a field on resource A suddenly disappears when an overload for resource B is added.
| { | ||
| let hasOverload = false; | ||
| for (const p of introspection.procs) { | ||
| if ( | ||
| p.pronamespace === pgProc.pronamespace && | ||
| p.proname === pgProc.proname && | ||
| p._id !== pgProc._id | ||
| ) { | ||
| hasOverload = true; | ||
| break; | ||
| } | ||
| } | ||
| if (hasOverload) { | ||
| // This proc has overloads — check if it qualifies as a computed |
There was a problem hiding this comment.
Seems unnecessary to have rewritten this? Does it need a naked block? In general I'm not a fan of let either, const should be preferred where possible, and in this case the performance cost of find vs for isn't a major concern.
| p.pronamespace === pgProc.pronamespace && | ||
| p.proname === pgProc.proname && | ||
| p._id !== pgProc._id |
There was a problem hiding this comment.
The issue we're really trying to avoid here is when the multiple (overloaded) functions end up with the same name in inflection. Maybe we should encode that specifically as the rule and skip the remaining complexity?
| p.pronamespace === pgProc.pronamespace && | |
| p.proname === pgProc.proname && | |
| p._id !== pgProc._id | |
| p.pronamespace === pgProc.pronamespace && | |
| p.proname === pgProc.proname && | |
| p._id !== pgProc._id && | |
| // Check if name conflicts: | |
| info.inflection.functionResourceName({ | |
| serviceName, | |
| pgProc: p, | |
| }) === name |
| -- Test overloaded computed column functions targeting different tables. | ||
| create table a.pets (id serial primary key, name text); | ||
| create table a.buildings (id serial primary key, address text); | ||
| create function a.code(a.pets) returns text as $$ select 'P' || $1.id::text; $$ language sql stable; | ||
| create function a.code(a.buildings) returns text as $$ select 'B' || $1.id::text; $$ language sql stable; | ||
| comment on function a.code(a.pets) is E'@behavior +typeField'; | ||
| comment on function a.code(a.buildings) is E'@behavior +typeField'; |
There was a problem hiding this comment.
The reason the snapshots have changed so much is you've added stuff to one of our most used schemas (a); please instead create a new schema ("function_overloads") and create these elements in there.
Also, you may need to change the behavior to +typeField -queryField - I can't remember. Check in the generated schema for unexpected changes (e.g. this function showing up twice or causing a clash).
There was a problem hiding this comment.
The __tests__/schema/... is where you'd add a schema test for this. You can use the defaultOptions test as a template, just change the schema name.
…simplify overload detection
|
I have pushed changes to address a number of comments. It also moves towards the consistent prefix behavior, but that does break existing functions (see the thread). In general I'm strictly against breaking changes, however since we are still pre 5.0 stable and since it makes behavior more consistent across the board, I think it may be a good direction. The breaking change will probably have a very limited impact since it's a rare feature? |
|
As a result of these changes, the I've made the test succeed now. But if we think this is the good way forward then it may make sense to rename |
c731613 to
090c885
Compare
benjie
left a comment
There was a problem hiding this comment.
Thanks for the patience whilst we were working through the V5 release!
| functionResourceNameShouldPrefixCompositeType( | ||
| this: Inflection, | ||
| details: { | ||
| pgProc: PgProc; | ||
| firstArgCompositeType: PgType; | ||
| }, | ||
| ): boolean; |
There was a problem hiding this comment.
Inflectors always return string, so this isn't an inflector.
Instead, use something like functionResourceNameCompositeTypePrefix(pgProc) which would return the empty string "" for the falsy path, and the prefix to use (e.g. firstArg.type.typname + "_") for the truthy path. Then you would compose it in as normal:
const computedPrefix = functionResourceNameCompositeTypePrefix(pgProc);
return `${schemaPrefix}${computedPrefix}${pgProc.proname}`;Also, firstArgCompositeType can be derived directly from pgProc and so should not be passed.
Description
Support overloaded computed column functions, where the first argument is a table.
Fixes #2972
Adds a test for this specific scenario.
I deliberately did not modify the PgV4BehaviorPlugin as I suspect users will not appreciate a sudden change in behavior. So this still requires smart tags, just like computed column functions that are NOT overloaded.
AI disclaimer: I used Claude to investigate the codebase and implement parts of this PR, but am actively trying to avoid slop. I personally read all the changes line-by-line, except for the snapshots, and validate the changes to the best of my ability.
Regardless of the use of AI; this is my first PR on this project and first experience with the inner workings of graphile. Although I've been a user for a couple of years now and wrote some custom plugins, diving into a large project like this one always a little overwhelming at first. So please consider carefully and do not hesitate to just close my PR if it's bad, I'll understand.
About the snapshots, I'm not sure if all those changes are correct, this is what recreating the snapshots resulted in but I wonder if it's supposed to do that? It feels like the diff is way too large... I am not familiar enough (yet) with this project to asses this; help needed.
Since procedures need a unique name it probably makes sense to add the table the overloaded procedure is targeting.
For
FUNCTION code(finance.invoices)we could go with either:code_finance_invoicesorfinance_invoices_codeI decided to go with the first since it has the same order of identifiers as the function definition (
a(b.c)>a_b_c), and less likely to be confused/collide with thefinance.invoices_code(..)approach of v4.Performance impact
Should be relatively low. I used manual loops instead of functional to keep allocations low.
Security impact
unknown
Checklist
yarn lint:fixpasses.yarn testpasses.RELEASE_NOTES.mdfile (if one exists).