Apache Doris dialect: add parsing support for inverted-index MATCH operators #6607

@nico-gsantos

Apache Doris supports full-text search through inverted indexes, exposed via the SQL functions and operators MATCH, MATCH_ALL, and MATCH_ANY. These constructs are essential for enabling inverted-index acceleration and are first-class features for text search workloads in Doris.

However, sqlglot currently fails to parse queries that use MATCH, MATCH_ALL, or MATCH_ANY, raising parsing errors such as “Unexpected token: MATCH”.
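A minimal reproduction sketch, assuming sqlglot is installed. The table name `logs`, the column `message`, and the helper `try_parse` are made up for illustration; the exact error text depends on the sqlglot version, so the helper only reports whether parsing succeeded.

```python
def try_parse(sql: str, dialect: str = "doris") -> str:
    """Parse `sql` with sqlglot and report the outcome as a short string."""
    try:
        import sqlglot
        from sqlglot.errors import SqlglotError
    except ImportError:
        # sqlglot is a third-party package; report its absence instead of crashing.
        return "sqlglot not installed"
    try:
        sqlglot.parse_one(sql, read=dialect)
    except SqlglotError as exc:
        # ParseError and TokenError both derive from SqlglotError.
        return f"error: {exc}"
    return "parsed"


# Hypothetical table and column names, used only for this reproduction.
queries = [
    "SELECT * FROM logs WHERE message MATCH_ANY 'error timeout'",
    "SELECT * FROM logs WHERE message MATCH_ALL 'error timeout'",
]

for query in queries:
    print(query, "->", try_parse(query))
```

On an affected sqlglot version, each line should print an `error: ...` result rather than `parsed`.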

This limitation becomes a practical issue when sqlglot is used as a SQL parser or validator (for example, in Apache Superset). As a consequence, valid and highly performant Apache Doris SQL queries cannot be used, even though they execute correctly in Doris itself.

Using LIKE is not a viable workaround, because Apache Doris does not apply inverted-index acceleration or term normalization to LIKE predicates. As documented by Apache Doris, full-text search using MATCH_ANY with inverted indexes can yield significant performance improvements (e.g., up to a 9× speedup in documented examples) and may return different result sets due to index-level normalization (such as automatic lowercasing). Therefore, LIKE is equivalent to MATCH_* queries in neither function nor performance.
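To illustrate why the result sets can differ, here is a toy model in plain Python. It is not Doris's actual analyzer: it assumes a case-sensitive substring test for `LIKE '%error%'` and a simple lowercase, alphanumeric tokenizer for MATCH_ANY. The sample rows and function names are invented for the example.

```python
import re


def like_contains(text: str, needle: str) -> bool:
    # Models LIKE '%needle%' as a case-sensitive substring test (assumption).
    return needle in text


def tokenize(text: str) -> set[str]:
    # Toy analyzer: lowercase, then split into alphanumeric terms (assumption).
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def match_any(text: str, query: str) -> bool:
    # Models MATCH_ANY: true if the row shares at least one term with the query.
    return bool(tokenize(text) & tokenize(query))


# Invented sample rows.
rows = ["Disk ERROR on node 3", "error: timeout", "terror alert"]

like_hits = [r for r in rows if like_contains(r, "error")]
match_hits = [r for r in rows if match_any(r, "error")]

print(like_hits)   # ['error: timeout', 'terror alert']
print(match_hits)  # ['Disk ERROR on node 3', 'error: timeout']
```

In this model the substring test misses the uppercase row and picks up "terror", while the term-based match does the opposite, so rewriting one form as the other changes which rows come back.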
