Commit b94cf78

Copilot and joocer committed
Add documentation and update tests for wildcard support
Co-authored-by: joocer <[email protected]>
1 parent bfc0589 commit b94cf78

3 files changed: +33 -1 lines changed


README.md

Lines changed: 31 additions & 0 deletions
@@ -201,6 +201,37 @@ _this example requires a data file, [space_missions.parquet](https://storage.goo

 </details>

+<details>
+<summary>Query Multiple Files with Wildcards</summary>
+
+In this example, we are querying multiple files using wildcard patterns. Opteryx supports `*` (any characters), `?` (single character), and `[range]` patterns in file paths.
+
+~~~python
+# Import the Opteryx query engine.
+import opteryx
+
+# Execute a SQL query to select data from all parquet files in a directory.
+# The wildcard '*' matches any characters in the filename.
+result = opteryx.query("SELECT * FROM 'data/*.parquet' LIMIT 10;")
+
+# Display the result.
+result.head()
+~~~
+
+You can also use more specific patterns:
+
+~~~python
+# Query files matching a range pattern, e.g., file1.parquet through file9.parquet
+result = opteryx.query("SELECT COUNT(*) FROM 'data/file[1-9].parquet';")
+
+# Query files with specific naming patterns
+result = opteryx.query("SELECT * FROM 'logs/2024-01-*.jsonl';")
+~~~
+
+_Wildcards work with all supported file formats (Parquet, JSONL, CSV, etc.) and prevent path traversal for security._
+
+</details>
+
 <details>
 <summary>Query Data in SQLite</summary>
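To make the three pattern forms named in the README text concrete, here is a minimal sketch using Python's standard-library `fnmatch` module, which implements the same shell-style `*`, `?`, and `[range]` syntax. It illustrates the pattern semantics only: the commit does not show how Opteryx resolves wildcards internally, and the file names and the `matching` helper below are made up for the example.

```python
from fnmatch import fnmatchcase

# Hypothetical file names, used only to illustrate the pattern syntax.
names = [
    "file1.parquet",
    "file9.parquet",
    "file10.parquet",
    "fileA.parquet",
    "notes.csv",
]


def matching(pattern: str) -> list[str]:
    """Return the names that match a shell-style wildcard pattern."""
    return [name for name in names if fnmatchcase(name, pattern)]


# '*' matches any run of characters.
star = matching("*.parquet")

# '?' matches exactly one character.
single = matching("file?.parquet")

# '[1-9]' matches exactly one character in the given range.
ranged = matching("file[1-9].parquet")
```

Under these semantics `file[1-9].parquet` does not match `file10.parquet`, because a bracket expression stands for a single character.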

tests/unit/connectors/__init__.py

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+# Test module for connectors

tests/unit/connectors/test_wildcard_paths.py

Lines changed: 1 addition & 1 deletion
@@ -40,7 +40,7 @@ def test_wildcard_no_matches():
     stats = MockStatistics()

     with pytest.raises(DatasetNotFoundError):
-        FileConnector(dataset="/nonexistent/path/*.parquet", statistics=stats)
+        FileConnector(dataset="nonexistent/path/*.parquet", statistics=stats)


 def test_path_traversal_protection():
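The test name above refers to path traversal protection, which the README note also mentions. This commit does not show how that check is implemented, so the following is only a sketch of one common approach: reject any dataset path that contains a `..` segment before wildcards are expanded. The function name `has_path_traversal` is hypothetical and is not part of Opteryx.

```python
from pathlib import PurePosixPath


def has_path_traversal(dataset: str) -> bool:
    """Return True if any path segment is exactly '..' (hypothetical helper)."""
    # Treat backslashes as separators so Windows-style paths are checked too.
    normalized = dataset.replace("\\", "/")
    # PurePosixPath keeps '..' segments as-is; it does not resolve them.
    return ".." in PurePosixPath(normalized).parts


# Illustrative inputs only; these paths do not come from the test suite.
traversal_in_middle = has_path_traversal("data/../secret/*.parquet")
traversal_at_start = has_path_traversal("../etc/*.csv")
plain_wildcard = has_path_traversal("data/*.parquet")
dots_in_filename = has_path_traversal("data/file..parquet")
```

Comparing whole segments rather than searching for the substring `..` means a file name that merely contains two dots, such as `file..parquet`, is not rejected.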
