Conversation
This was not caught before because PHPStan does not report an error when we use an internal method from the same root namespace.
| assert($colName !== ''); | ||
|
|
||
| return new IndexedColumn( | ||
| UnqualifiedName::unquoted($colName), |
There was a problem hiding this comment.
Unsure if I should use this or quoted here.
There was a problem hiding this comment.
It depends on the value of $colName. It could be first_name (unquoted) or "first_name" (quoted, the quotes are part of the value). If the ORM accepts names with quotes in the string form and wants to pass them to the DBAL as objects, it needs to parse them. See the implementation of Index#parseColumns() as a BC layer in DBAL 4.
There was a problem hiding this comment.
I see that getIndexColumns resorts to a "quote strategy", and that the default implementation supports both ways:
orm/src/Mapping/DefaultQuoteStrategy.php
Lines 29 to 31 in d3b47d2
Let me parse them then.
There was a problem hiding this comment.
Hang on… Parsers::getUnqualifiedNameParser() is internal. Is it OK to use it?
And should we deprecate the need for parsing, that is, disallow quoting at the ORM metadata level? Since DBAL is going to quote everything, if I understood correctly.
There was a problem hiding this comment.
The term "quoted" is a bit overloaded, even in the SQL standard. This article and the one it references in the beginning should answer all the questions on quoting in the DBAL.
There was a problem hiding this comment.
Hang on…
Parsers::getUnqualifiedNameParser()is internal. Is it OK to use it?
I guess, yes, by the ORM. The plan is to migrate all DBAL APIs to names as objects – at that point parsing will be no longer necessary, but this is a lengthy process which will likely span major releases.
There was a problem hiding this comment.
I see that getIndexColumns resorts to a "quote strategy", and that the default implementation supports both ways [...]
Let me parse them then.
Well, maybe you don't need to. Let's at least try to explore this possibility.
In the DefaultQuoteStrategy, the actual value of the column name is taken from one property and whether it needs to be quoted is taken from another. This is exactly the structure and the separation of concerns used in the new DBAL API.
In the AnsiQuoteStrategy, the "quoted" property is ignored, so the name effectively will be put into the SQL as is. Um... that's where you have to parse the name.
So yeah, in the worst case scenario you need to parse. Not parsing in the default case would be solely a runtime optimization, it won't allow to save the engineering effort.
|
|
||
| return new IndexedColumn( | ||
| UnqualifiedName::unquoted($colName), | ||
| null, |
There was a problem hiding this comment.
I'm not sure whether I should specify a length here.
There was a problem hiding this comment.
If the ORM supports $indexData['options']['lengths'], then it needs to parse them and pass the length of each column with the column definition (the BC layer for this is also implemented in Index#parseColumns()).
There was a problem hiding this comment.
It is supported, I have been able to specify a negative length in a test and see a deprecation. Then I wondered how this was even possible given I did not write code to parse the length. I dug deeper and found that even though we are creating an Index instance, we don't pass it to a DBAL API. Instead, we… extract columns from it, and pass options alongside it a few lines below:
Line 368 in d3b47d2
So… parsing options is useless? 😐
There was a problem hiding this comment.
Well, it's worse than that. Table::addUniqueIndex() isn't yet deprecated formally (because there's no full upgrade path) but it shouldn't be used since it mutates the table object. One of the goals of the DBAL API rework is to make the schema objects immutable.
Instead of calling Table::addUniqueIndex(), you need to:
- Create the index
- Edit the table
- Add the index via the editor
- Create a new instance of the table.
Note that the index constructor is internal, and its signature will change in 5.0 (use the editor instead).
There was a problem hiding this comment.
The table is created like this:
Line 207 in d3b47d2
So now that I have a new instance created in 4., I need to 5. Replace the previous instance with the new one in $schema. Is that even possible?
There was a problem hiding this comment.
That's one of the reasons why the old mutable API is not yet fully deprecated. Yes, we would replace the table in the schema in the same way as we would perform any modifications in the table itself, but there's no API for that.
The Schema#createTable() method itself as part of the DBAL API doesn't make sense to me personally. It adds a table with no columns leaving the schema and the table in an invalid state. Ideally, the ORM (as well as any DBAL API consumer) should build the schema at once w/o mutations: create the unique index or constraint definitions for all tables, create all tables and then create a schema with all those tables.
You can edit tables but you cannot edit the schema (yet), so you at least need to pre-create all the tables and then create the schema.
| $editor->setColumns(...$columns); | ||
| $editor->setType(IndexType::UNIQUE); | ||
| if (isset($indexData['options']['predicate'])) { | ||
| $editor->setPredicate($indexData['options']['predicate']); |
There was a problem hiding this comment.
Are there other options that I should be handling here? And does predicate actually need to be handled?
There was a problem hiding this comment.
Are there other options that I should be handling here?
I don't think so. The Index constructor in DBAL 4 is the source of truth about supported options.
And does
predicateactually need to be handled?
To take a step back, why does the ORM translate "unique constraints" to "unique indexes"? Speaking of indexes, I don't think a unique index can have a predicate (i.e. this value must be unique if it matches this and this predicate).
There was a problem hiding this comment.
why does the ORM translate "unique constraints" to "unique indexes"
Because it can span several columns I suppose?
There was a problem hiding this comment.
A constraint can also span several columns. I think the real reason is that support for unique constraints was added later than in was supported by the ORM. Or just because they are functionally very close (the difference is primarily in semantics).
There was a problem hiding this comment.
I'm ashamed to say I just did not know about unique constraints other than as implemented with keywords added to a single column definition. Or maybe I did but I forgot 😅
Or just because they are functionally very close (the difference is primarily in semantics).
Oh? Do constraints help with performance? If not I'd rather say that would be the primary difference.
There was a problem hiding this comment.
TL;DR: semantically, unique constraints are about intention (I want the values in these columns to be unique), unique indexes are about the physical implementation (build me a B-tree or hash and make sure all values are unique and use it for the queries where appropriate).
For example (Postgres):
PostgreSQL automatically creates a unique index when a unique constraint or primary key is defined for a table. The index [...] is the mechanism that enforces the constraint.
IIRC, for MySQL the two are just synonyms (you can create one and then drop the other).
Do constraints help with performance?
Not the constraints themselves, but the underlying unique indexes, as a side effect, definitely. But if you want to optimize queries, it's better if you create an index because you know which queries you want to optimize and what type of index to create. If you only want to enforce uniqueness, you create a unique constraint.
Relying on unique constraints for performance is similar to using transient dependencies in your code w/o declaring the dependency explicitly. It works, but may have its issues.
|
I believe that a purely mechanical translation of the existing ORM API to the DBAL 5 API may be painful. The DBAL 5 is very strict in terms of types and capabilities while the ORM API is as loose as the DBAL 4 API. If the requirement is to allow loose ORM API on top of the strict DBAL API, then the solution will be to duplicate the whole BC layer from DBAL 4 to the ORM. This BC layer could be thought of as a reference implementation. |
ORM may become stricter in 4.0.x, but on 3.7.x I think it's going to need to stay as loose as possible. Unless we decide to do conditional breaking changes 🤔 In any case, I need to take a step back to understand when those indices come from (probably some metadata), and how loose the current API is. Thanks for the review! |
This was not caught before because PHPStan does not report an error when we use an internal method from the same root namespace.