Skip to content

Conversation

mvzink
Copy link
Contributor

@mvzink mvzink commented Aug 6, 2025

This PR improves the situation for defining indexes using CREATE INDEX, ALTER TABLE, and CREATE TABLE, by parsing more MySQL options after the column list. (It doesn't really affect key definitions written as options on column definitions.)

The main benefit is supporting MySQL's preferred syntax for USING { BTREE | HASH }, whic is to place it after the column list, instead of before:

-- before: SQL standard, used for example by Postgres
CREATE TABLE t (x INT, KEY idx_name USING BTREE (x));
-- after: preferred by MySQL
CREATE TABLE t (x INT, KEY idx_name (x) USING BTREE);

By "preferred", I mean that MySQL, accepts both forms, but (1) the documentation states the former is deprecated, and (2) if you run SHOW CREATE TABLE on a table with such an index, it will rewrite the USING to come after the column list.

This is accomplished by parsing index options generically after the column list, which currently includes COMMENT and USING. This means these locations will also benefit from future expansion of index option parsing, such as VISIBLE and WITH PARSER. The downside to this approach is that to continue to support the standard syntax which requires USING before the column list, there are now two places where USING could be stored in the AST (the distinct index_type field or in the index_options list). I think this is acceptable to enable broad support and good fidelity.

A secondary benefit is also supporting a subset of MySQL's ALTER options for CREATE INDEX: it accepts ALGORITHM and LOCK options after the index options.

A final, tertiary benefit is rationalizing the code organization by moving CreateIndex and CreateTable from dml.rs to ddl.rs.

mvzink added 5 commits August 6, 2025 09:43
The column list in `CREATE INDEX` now matches the style used elsewhere,
e.g. in `TableConstraint`, which is to use spaces after commas.

```sql
-- before:
CREATE INDEX idx_name ON table_name (column1,column2,column3);
-- after:
CREATE INDEX idx_name ON table_name (column1, column2, column3);
```

When `CreateIndex` was added, there was no explanation for the lack of
spaces, so I assume it was just author preference. But standard style in
all documentation I've seen is to use spaces after commas (including
[MSSQL]'s documentation of `INCLUDE`, which copied the no-spaces style
when added).

[MSSQL]: https://learn.microsoft.com/en-us/sql/t-sql/statements/create-index-transact-sql?view=sql-server-ver17#i-create-an-index-with-included-non-key-columns
It's a bit disruptive and pointless to move things around for no reason
other than code organization, but imports look a lot cleaner after this
(and for future use of `IndexOptions` within `CreateIndex`). Plus it's,
you know, correct.
For MySQL, the preferred position to put `USING BTREE` is after the
column list rather than before; before is still supported but considered
[deprecated]:

> The preferred position is after the column list. Expect support for
  use of the option before the column list to be removed in a future
  MySQL release.

[deprecated]: https://dev.mysql.com/doc/refman/8.4/en/alter-table.html

We already parse this correctly for `ALTER TABLE` statements to add
keys, and in table-level constraint parsing for `CREATE TABLE`. We can
just reuse the same parsing to support the newer style for `CREATE
INDEX` statements as well.
Copy link
Contributor

@iffyio iffyio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Thanks @mvzink!
cc @alamb

@iffyio iffyio merged commit 67fca82 into apache:main Aug 8, 2025
10 checks passed
@mvzink mvzink deleted the push-zsmyukoyzrpr branch August 13, 2025 15:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants