Commit e414d4c

sqlite: Remove -DSQLITE_ENABLE_FTS3_TOKENIZER and add tests for compile options (#791)
As noted in the discussion in #562, compiling SQLite with the -DSQLITE_ENABLE_FTS3_TOKENIZER flag is equivalent to using `connection.setconfig(sqlite3.SQLITE_DBCONFIG_ENABLE_FTS3_TOKENIZER)` at runtime. The purpose of this option, in either syntax, is to disable a security measure to provide backwards compatibility for older code.

Specifically, the `fts3_tokenizer()` function can accept or return a native-code pointer to a structure containing callback functions, which makes it an attractive target for SQL injection attacks to escalate to arbitrary native code execution. The more-secure behavior is to require the use of bound parameters with this function; the backwards-compatible behavior allows the function to be called with blob literals or computed values.

Because of a documentation shortcoming, some applications thought they needed this option on at compile time, and so Debian's SQLite build, used by e.g. the `python` container on Dockerhub, has it on. But there is no functionality that is only enabled by having this option on at compile time.

Ideally, applications should use bound parameters when calling this function. If that code change is hard, they can alternatively set the option themselves at runtime to preserve compatibility with existing code, but that still doesn't need anything turned on at compile time. So the right decision for us is not to enable this flag at compile time and preserve the secure behavior.

Add a test that `fts3_tokenizer()` is usable with bound parameters but not with blob literals, and also add tests for a couple of other previously-requested SQLite flags for compatibility with other implementations:

* #309: -DSQLITE_ENABLE_DBSTAT_VTAB
* #449: serialize/deserialize (on by default, was just a compile-time detection issue)
* #550: -DSQLITE_ENABLE_FTS3_PARENTHESIS
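The difference between bound parameters and literals described above can be observed from Python without registering any pointer, by using the one-argument form of `fts3_tokenizer()`, which only looks up an existing tokenizer. The following is a minimal sketch, not code from this commit. The helper names `classify_fts3_tokenizer_policy` and `enable_legacy_fts3_tokenizer` are hypothetical, and it assumes the built-in `simple` tokenizer is registered whenever FTS3 is available. The runtime opt-in relies on `Connection.setconfig()`, which only exists on Python 3.12 and later, so it is guarded.

```python
import sqlite3


def classify_fts3_tokenizer_policy(conn: sqlite3.Connection) -> str:
    """Return 'unavailable', 'strict', or 'legacy' for this connection.

    Hypothetical helper: it only looks up the built-in 'simple' tokenizer
    and never registers a pointer.
    """
    try:
        # Tokenizer name passed as a bound parameter.
        bound = conn.execute("SELECT fts3_tokenizer(?)", ("simple",)).fetchone()[0]
    except sqlite3.OperationalError:
        return "unavailable"  # fts3_tokenizer() is probably not compiled in
    if bound is None:
        return "unavailable"
    try:
        # Same lookup, but with the name written as a SQL literal.
        literal = conn.execute("SELECT fts3_tokenizer('simple')").fetchone()[0]
    except sqlite3.OperationalError:
        return "strict"
    return "legacy" if literal is not None else "strict"


def enable_legacy_fts3_tokenizer(conn: sqlite3.Connection) -> bool:
    """Opt in to the backwards-compatible behavior at runtime, if possible."""
    op = getattr(sqlite3, "SQLITE_DBCONFIG_ENABLE_FTS3_TOKENIZER", None)
    if op is None or not hasattr(conn, "setconfig"):
        return False  # Python older than 3.12: no setconfig() support
    conn.setconfig(op, True)
    return True


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    print("default policy:", classify_fts3_tokenizer_policy(conn))
    if enable_legacy_fts3_tokenizer(conn):
        print("after opt-in:", classify_fts3_tokenizer_policy(conn))
    conn.close()
```

On a build that matches this commit, the default policy would be expected to report `strict`; a build with the compile-time flag on (such as the Debian build mentioned above) would be expected to report `legacy` without any opt-in.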
1 parent aeba083 commit e414d4c

2 files changed: +40 -1 lines changed

cpython-unix/build-sqlite.sh

Lines changed: 0 additions & 1 deletion
@@ -32,7 +32,6 @@ CFLAGS="${EXTRA_TARGET_CFLAGS} \
     -DSQLITE_ENABLE_DBSTAT_VTAB \
     -DSQLITE_ENABLE_FTS3 \
     -DSQLITE_ENABLE_FTS3_PARENTHESIS \
-    -DSQLITE_ENABLE_FTS3_TOKENIZER \
     -DSQLITE_ENABLE_FTS4 \
     -DSQLITE_ENABLE_FTS5 \
     -DSQLITE_ENABLE_GEOPOLY \
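Whether a given `-D` flag from this list ended up in a particular build can be checked at runtime with `PRAGMA compile_options`. This is a minimal sketch rather than part of the commit; the helper names `sqlite_compile_options` and `has_compile_option` are hypothetical, and the pragma returns no rows on builds that omit compile-option diagnostics, in which case every lookup reports `False`.

```python
import sqlite3


def sqlite_compile_options(conn: sqlite3.Connection) -> set:
    """Return the compile options reported by the linked SQLite library.

    Names are reported without the SQLITE_ prefix, e.g. 'ENABLE_FTS3'
    or 'THREADSAFE=1'.
    """
    return {row[0] for row in conn.execute("PRAGMA compile_options")}


def has_compile_option(conn: sqlite3.Connection, name: str) -> bool:
    """Check for an option by name, with or without the SQLITE_ prefix."""
    key = name[len("SQLITE_"):] if name.startswith("SQLITE_") else name
    return any(
        opt == key or opt.startswith(key + "=")
        for opt in sqlite_compile_options(conn)
    )


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    for flag in (
        "SQLITE_ENABLE_DBSTAT_VTAB",
        "SQLITE_ENABLE_FTS3_PARENTHESIS",
        "SQLITE_ENABLE_FTS3_TOKENIZER",
    ):
        print(flag, has_compile_option(conn, flag))
    conn.close()
```

This only reports how the library was compiled. It says nothing about options toggled later on a connection, which is why the commit's tests exercise the behavior directly instead.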

src/verify_distribution.py

Lines changed: 40 additions & 0 deletions
@@ -3,6 +3,7 @@
 # file, You can obtain one at https://mozilla.org/MPL/2.0/.
 
 import os
+import struct
 import sys
 import unittest
 

@@ -137,6 +138,45 @@ def test_sqlite(self):
                 cursor.execute(
                     f"CREATE VIRTUAL TABLE test{extension} USING {extension}(a, b, c);"
                 )
+
+        # Test various SQLite flags and features requested / expected by users.
+        # The DBSTAT virtual table shows some metadata about disk usage.
+        # https://www.sqlite.org/dbstat.html
+        self.assertNotEqual(
+            cursor.execute("SELECT COUNT(*) FROM dbstat;").fetchone()[0],
+            0,
+        )
+
+        # The serialize/deserialize API is configurable at compile time.
+        if sys.version_info[0:2] >= (3, 11):
+            self.assertEqual(conn.serialize()[:15], b"SQLite format 3")
+
+        # The "enhanced query syntax" (-DSQLITE_ENABLE_FTS3_PARENTHESIS) allows parenthesizable
+        # AND, OR, and NOT operations. The "standard query syntax" only has OR as a keyword, so we
+        # can test for the difference with a query using AND.
+        # https://www.sqlite.org/fts3.html#_set_operations_using_the_enhanced_query_syntax
+        cursor.execute("INSERT INTO testfts3 VALUES('hello world', '', '');")
+        self.assertEqual(
+            cursor.execute(
+                "SELECT COUNT(*) FROM testfts3 WHERE a MATCH 'hello AND world';"
+            ).fetchone()[0],
+            1,
+        )
+
+        # fts3_tokenizer() takes/returns native pointers. Newer SQLite versions require the use of
+        # bound parameters with this function to avoid the risk of a SQL injection escalating into a
+        # full RCE. This requirement can be disabled at either compile time or runtime for
+        # backwards compatibility. Ensure that the check is enabled (more secure) by default but
+        # applications can still use fts3_tokenizer() with a bound parameter. See discussion at
+        # https://github.com/astral-sh/python-build-standalone/pull/562#issuecomment-3254522958
+        wild_pointer = struct.pack("P", 0xDEADBEEF)
+        with self.assertRaises(sqlite3.OperationalError) as caught:
+            cursor.execute(
+                f"SELECT fts3_tokenizer('mytokenizer', x'{wild_pointer.hex()}')"
+            )
+        self.assertEqual(str(caught.exception), "fts3tokenize disabled")
+        cursor.execute("SELECT fts3_tokenizer('mytokenizer', ?)", (wild_pointer,))
+
         conn.close()
 
     def test_ssl(self):
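The `AND` trick used in the test above can be reused outside the test suite to detect which FTS3 query syntax a build provides. This is a minimal sketch under stated assumptions: the function name `fts3_enhanced_query_syntax_enabled` and the table name `probe_fts3` are hypothetical, and the probe uses its own in-memory database so it leaves no tables behind.

```python
import sqlite3


def fts3_enhanced_query_syntax_enabled():
    """Return True/False for enhanced vs. standard syntax, None without FTS3."""
    conn = sqlite3.connect(":memory:")
    try:
        try:
            conn.execute("CREATE VIRTUAL TABLE probe_fts3 USING fts3(body)")
        except sqlite3.OperationalError:
            return None  # the fts3 module is not available in this build
        conn.execute("INSERT INTO probe_fts3(body) VALUES ('hello world')")
        # With the enhanced syntax, AND is an operator and the row matches.
        # With the standard syntax, AND is not a keyword, so the query is
        # not expected to match a row that lacks that word.
        count = conn.execute(
            "SELECT COUNT(*) FROM probe_fts3 WHERE body MATCH 'hello AND world'"
        ).fetchone()[0]
        return count == 1
    finally:
        conn.close()


if __name__ == "__main__":
    print("enhanced FTS3 query syntax:", fts3_enhanced_query_syntax_enabled())
```

Probing behavior rather than reading compile options keeps the check meaningful even on builds that do not expose compile-option diagnostics.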
