Commit 61464b0

Don't use lower() for citext columns
When an index is present, using `lower()` prevents the index from being used. The index is typically present for columns with uniqueness, and `lower()` is added for `validates_uniqueness_of ..., case_sensitive: false`.

However, if the index is defined with `lower()`, the query without `lower()` wouldn't use the index either.

Setup:

```
CREATE EXTENSION citext;
CREATE TABLE citexts (cival citext);
INSERT INTO citexts (SELECT MD5(random()::text) FROM generate_series(1,1000000));
```

Without index:

```
EXPLAIN ANALYZE SELECT * from citexts WHERE cival = 'f00';
Gather  (cost=1000.00..14542.43 rows=1 width=33) (actual time=165.923..169.065 rows=0 loops=1)
  Workers Planned: 2
  Workers Launched: 2
  ->  Parallel Seq Scan on citexts  (cost=0.00..13542.33 rows=1 width=33) (actual time=158.218..158.218 rows=0 loops=3)
        Filter: (cival = 'f00'::citext)
        Rows Removed by Filter: 333333
Planning Time: 0.070 ms
Execution Time: 169.089 ms
Time: 169.466 ms

EXPLAIN ANALYZE SELECT * from citexts WHERE lower(cival) = lower('f00');
Gather  (cost=1000.00..16084.00 rows=5000 width=33) (actual time=166.896..169.881 rows=0 loops=1)
  Workers Planned: 2
  Workers Launched: 2
  ->  Parallel Seq Scan on citexts  (cost=0.00..14584.00 rows=2083 width=33) (actual time=157.348..157.349 rows=0 loops=3)
        Filter: (lower((cival)::text) = 'f00'::text)
        Rows Removed by Filter: 333333
Planning Time: 0.084 ms
Execution Time: 169.905 ms
Time: 170.338 ms
```

With index:

```
CREATE INDEX val_citexts ON citexts (cival);

EXPLAIN ANALYZE SELECT * from citexts WHERE cival = 'f00';
Index Only Scan using val_citexts on citexts  (cost=0.42..4.44 rows=1 width=33) (actual time=0.051..0.052 rows=0 loops=1)
  Index Cond: (cival = 'f00'::citext)
  Heap Fetches: 0
Planning Time: 0.118 ms
Execution Time: 0.082 ms
Time: 0.616 ms

EXPLAIN ANALYZE SELECT * from citexts WHERE lower(cival) = lower('f00');
Gather  (cost=1000.00..16084.00 rows=5000 width=33) (actual time=167.029..170.401 rows=0 loops=1)
  Workers Planned: 2
  Workers Launched: 2
  ->  Parallel Seq Scan on citexts  (cost=0.00..14584.00 rows=2083 width=33) (actual time=157.180..157.181 rows=0 loops=3)
        Filter: (lower((cival)::text) = 'f00'::text)
        Rows Removed by Filter: 333333
Planning Time: 0.132 ms
Execution Time: 170.427 ms
Time: 170.946 ms

DROP INDEX val_citexts;
```

An index defined with `lower()` has the reverse effect, and a query with `lower()` performs better:

```
CREATE INDEX val_citexts ON citexts (lower(cival));

EXPLAIN ANALYZE SELECT * from citexts WHERE cival = 'f00';
Gather  (cost=1000.00..14542.43 rows=1 width=33) (actual time=174.138..177.311 rows=0 loops=1)
  Workers Planned: 2
  Workers Launched: 2
  ->  Parallel Seq Scan on citexts  (cost=0.00..13542.33 rows=1 width=33) (actual time=165.983..165.984 rows=0 loops=3)
        Filter: (cival = 'f00'::citext)
        Rows Removed by Filter: 333333
Planning Time: 0.080 ms
Execution Time: 177.333 ms
Time: 177.701 ms

EXPLAIN ANALYZE SELECT * from citexts WHERE lower(cival) = lower('f00');
                                                            QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on citexts  (cost=187.18..7809.06 rows=5000 width=33) (actual time=0.021..0.022 rows=0 loops=1)
  Recheck Cond: (lower((cival)::text) = 'f00'::text)
  ->  Bitmap Index Scan on lower_val_on_citexts  (cost=0.00..185.93 rows=5000 width=0) (actual time=0.018..0.018 rows=0 loops=1)
        Index Cond: (lower((cival)::text) = 'f00'::text)
Planning Time: 0.102 ms
Execution Time: 0.048 ms
(6 rows)

Time: 0.491 ms
```
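The choice the commit message argues for can be sketched in plain Ruby. The helper name `case_insensitive_predicate` and its arguments are hypothetical and are not part of Active Record; the sketch only illustrates which predicate shape is intended for a `citext` column versus another text type, assuming the index is defined on the bare column.

```ruby
# Hypothetical helper, not Active Record API: returns the WHERE predicate
# a case-insensitive lookup would use for the given column type.
def case_insensitive_predicate(column, sql_type)
  if sql_type == "citext"
    # citext compares case-insensitively on its own, so the bare column is
    # used and a plain index on that column stays usable.
    "#{column} = $1"
  else
    # Other text types need both sides folded to lower case.
    "LOWER(#{column}) = LOWER($1)"
  end
end

puts case_insensitive_predicate("cival", "citext")
puts case_insensitive_predicate("email", "character varying")
```

As the timings above show, this is only a win when the index is on the bare column; an expression index on `lower(cival)` is matched by the `LOWER()` form instead.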
1 parent c6b227b commit 61464b0

3 files changed: +19 -1 lines changed

activerecord/CHANGELOG.md

Lines changed: 9 additions & 0 deletions
@@ -1,3 +1,12 @@
+* Stop using `LOWER()` for case-insensitive queries on `citext` columns
+
+  Previously, `LOWER()` was added for e.g. uniqueness validations with
+  `case_sensitive: false`.
+  It wasn't mentioned in the documentation that the index without `LOWER()`
+  wouldn't be used in this case.
+
+  *Phil Pirozhkov*
+
 * Extract `#sync_timezone_changes` method in AbstractMysqlAdapter to enable subclasses
   to sync database timezone changes without overriding `#raw_execute`.
 

activerecord/lib/active_record/connection_adapters/postgresql_adapter.rb

Lines changed: 4 additions & 1 deletion
@@ -1031,7 +1031,10 @@ def build_statement_pool
         end
 
         def can_perform_case_insensitive_comparison_for?(column)
-          @case_insensitive_cache ||= {}
+          # NOTE: citext is an exception. It is possible to perform a
+          #       case-insensitive comparison using `LOWER()`, but it is
+          #       unnecessary, as `citext` is case-insensitive by definition.
+          @case_insensitive_cache ||= { "citext" => false }
           @case_insensitive_cache.fetch(column.sql_type) do
             @case_insensitive_cache[column.sql_type] = begin
               sql = <<~SQL
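
The change works by pre-seeding the memoization hash, so `fetch` finds an entry for `"citext"` and never runs the block that queries the database. A minimal standalone sketch of that pattern follows; the class `CaseInsensitiveCheck` and its `probe` block are hypothetical stand-ins for the adapter and its SQL lookup, not Rails code.

```ruby
# Simplified stand-in for the adapter's memoized capability check.
# `probe` is a hypothetical callable standing in for the database query.
class CaseInsensitiveCheck
  attr_reader :probe_calls

  def initialize(&probe)
    @probe = probe
    @probe_calls = 0
  end

  def can_perform?(sql_type)
    # Seeding the cache means "citext" is answered without calling the probe.
    @cache ||= { "citext" => false }
    @cache.fetch(sql_type) do
      @probe_calls += 1
      @cache[sql_type] = @probe.call(sql_type)
    end
  end
end

check = CaseInsensitiveCheck.new { |_sql_type| true }
puts check.can_perform?("citext") # answered from the seeded entry
puts check.can_perform?("text")   # probe runs once
puts check.can_perform?("text")   # served from the cache
puts check.probe_calls
```

Seeding the hash keeps the special case in one place and leaves the lookup path for every other type unchanged.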

activerecord/test/cases/adapters/postgresql/citext_test.rb

Lines changed: 6 additions & 0 deletions
@@ -71,6 +71,12 @@ def test_select_case_insensitive
     assert_equal "Cased Text", x.cival
   end
 
+  def test_case_insensitiveness
+    attr = Citext.arel_table[:cival]
+    comparison = @connection.case_insensitive_comparison(attr, nil)
+    assert_no_match(/lower/i, comparison.to_sql)
+  end
+
   def test_schema_dump_with_shorthand
     output = dump_table_schema("citexts")
     assert_match %r[t\.citext "cival"], output
