Commit aacdaa5

Merge pull request #19 from pavanjava/qql13
full docs and SEO optimization for QQL
2 parents d234c5f + bebe93f commit aacdaa5

14 files changed

Lines changed: 1770 additions & 1453 deletions

README.md

Lines changed: 50 additions & 1449 deletions
Large diffs are not rendered by default.

docs/_config.yml

Lines changed: 10 additions & 0 deletions
@@ -0,0 +1,10 @@
1+
theme: minima
2+
title: "QQL — Qdrant Query Language"
3+
description: "SQL-like query language and CLI for Qdrant vector database — INSERT, SEARCH, hybrid search, reranking, quantization, and more."
4+
url: "https://pavanjava.github.io/qql"
5+
baseurl: "/qql"
6+
repository: "pavanjava/qql"
7+
8+
# Disable Jekyll processing of the HTML file (it has its own styling)
9+
include:
10+
- index.html

docs/collections.md

Lines changed: 216 additions & 0 deletions
@@ -0,0 +1,216 @@
1+
# Managing Collections
2+
3+
---
4+
5+
## SHOW COLLECTIONS — list collections
6+
7+
Lists all collections in the connected Qdrant instance.
8+
9+
```sql
10+
SHOW COLLECTIONS
11+
```
12+
13+
**Output:**
14+
```
15+
✓ 3 collection(s) found
16+
┌──────────────────┐
17+
│ Collection │
18+
├──────────────────┤
19+
│ articles │
20+
│ notes │
21+
│ products │
22+
└──────────────────┘
23+
```
24+
25+
---
26+
27+
## CREATE COLLECTION — create a collection
28+
29+
Explicitly creates a new empty collection. Collections are also created automatically on the first INSERT, so this command is optional — use it when you want to pre-create a collection before inserting data.
30+
31+
**Syntax:**
32+
```
33+
CREATE COLLECTION <collection_name>
34+
CREATE COLLECTION <collection_name> HYBRID
35+
CREATE COLLECTION <collection_name> USING MODEL '<model_name>'
36+
CREATE COLLECTION <collection_name> USING HYBRID
37+
CREATE COLLECTION <collection_name> USING HYBRID DENSE MODEL '<model>'
38+
```
39+
40+
Any of the above forms can be followed by an optional `QUANTIZE` clause — see [Quantization](#quantization--quantize-clause) below.
41+
42+
**Examples:**
43+
44+
Dense-only collection (standard, uses default model dimensions):
45+
```sql
46+
CREATE COLLECTION research_papers
47+
```
48+
49+
Dense-only collection pinned to a specific model (768-dimensional):
50+
```sql
51+
CREATE COLLECTION research_papers USING MODEL 'BAAI/bge-base-en-v1.5'
52+
```
53+
54+
Hybrid collection (dense + sparse BM25, default models):
55+
```sql
56+
CREATE COLLECTION research_papers HYBRID
57+
```
58+
59+
Hybrid collection with a custom dense model:
60+
```sql
61+
CREATE COLLECTION research_papers USING HYBRID DENSE MODEL 'BAAI/bge-base-en-v1.5'
62+
```
63+
64+
When `USING MODEL` is omitted, the collection uses the **default embedding model's dimensions** (384 for `all-MiniLM-L6-v2`). If the collection already exists, the command succeeds with a message and does nothing.
65+
66+
---
67+
68+
## Quantization — QUANTIZE clause
69+
70+
Quantization reduces the memory footprint of vector collections and speeds up search at the cost of a small, controllable accuracy loss. QQL supports all three Qdrant quantization strategies via an optional `QUANTIZE` clause appended to `CREATE COLLECTION`.
71+
72+
**Three strategies:**
73+
74+
| Type | Compression | Accuracy Loss | Best For |
75+
|---|---|---|---|
76+
| `SCALAR` | 4× (float32 → int8) | < 1% | Most collections — best balance |
77+
| `BINARY` | 32× (float32 → 1-bit) | Higher | High-dimensional vectors (768+), speed priority |
78+
| `PRODUCT` | 4× (configurable) | Variable | Memory-constrained deployments |
79+
80+
**Full syntax:**
81+
```
82+
CREATE COLLECTION <name> ... QUANTIZE SCALAR [QUANTILE <0.0–1.0>] [ALWAYS RAM]
83+
CREATE COLLECTION <name> ... QUANTIZE BINARY [ALWAYS RAM]
84+
CREATE COLLECTION <name> ... QUANTIZE PRODUCT [ALWAYS RAM]
85+
```
86+
87+
- **`QUANTILE <float>`** — (scalar only) calibration quantile for the INT8 conversion; defaults to Qdrant's built-in default (0.99) when omitted.
88+
- **`ALWAYS RAM`** — keep the **quantized** vectors in RAM at all times, regardless of the collection's `on_disk` setting. Improves search throughput at the cost of higher RAM usage for the compressed index. The original full-precision vectors are stored and managed independently of this flag. Supported by all three quantization types.
89+
- **`QUANTIZE`** always appears **after** all other clauses (`HYBRID`, `USING MODEL`, etc.).
90+
- For `PRODUCT`, the compression ratio is fixed at **4×** in this version.
91+
- When used with `HYBRID` collections, quantization applies only to the **dense** vector.
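
The compression ratios in the strategy table follow from the bit width of each stored dimension. The Python sketch below is only back-of-the-envelope arithmetic, not part of QQL: `vector_bytes` is a hypothetical helper, and the 384-dimension figure is the default model size quoted earlier. It counts the quantized copy only; the full-precision vectors are still stored alongside it.

```python
def vector_bytes(dims: int, bits_per_dim: int) -> int:
    """Bytes needed to store one vector at the given bit width per dimension."""
    return dims * bits_per_dim // 8


DIMS = 384  # dimensions of the default model mentioned in this document

full = vector_bytes(DIMS, 32)   # float32: 4 bytes per dimension
scalar = vector_bytes(DIMS, 8)  # SCALAR: int8, 1 byte per dimension
binary = vector_bytes(DIMS, 1)  # BINARY: 1 bit per dimension

print(full, scalar, binary)            # 1536 384 48
print(full // scalar, full // binary)  # 4 32
```

For one million such vectors this works out to roughly 1.5 GB at full precision, 384 MB with `SCALAR`, and 48 MB with `BINARY`, before any index overhead.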
92+
93+
**Examples:**
94+
95+
Scalar quantization (recommended default):
96+
```sql
97+
CREATE COLLECTION research_papers QUANTIZE SCALAR
98+
```
99+
100+
Scalar with explicit calibration and quantized vectors pinned to RAM:
101+
```sql
102+
CREATE COLLECTION research_papers QUANTIZE SCALAR QUANTILE 0.95 ALWAYS RAM
103+
```
104+
105+
Binary quantization for large high-dimensional embeddings:
106+
```sql
107+
CREATE COLLECTION research_papers QUANTIZE BINARY
108+
```
109+
110+
Product quantization for maximum memory savings:
111+
```sql
112+
CREATE COLLECTION research_papers QUANTIZE PRODUCT ALWAYS RAM
113+
```
114+
115+
Combined with hybrid collection:
116+
```sql
117+
CREATE COLLECTION research_papers HYBRID QUANTIZE SCALAR
118+
```
119+
120+
Combined with a pinned model:
121+
```sql
122+
CREATE COLLECTION research_papers USING MODEL 'BAAI/bge-base-en-v1.5' QUANTIZE SCALAR QUANTILE 0.99
123+
```
124+
125+
**Valid combinations:**
126+
127+
| Base form | + QUANTIZE SCALAR | + QUANTIZE BINARY | + QUANTIZE PRODUCT |
128+
|---|---|---|---|
129+
| `CREATE COLLECTION name` | ✓ | ✓ | ✓ |
130+
| `... HYBRID` | ✓ | ✓ | ✓ |
131+
| `... USING MODEL 'x'` | ✓ | ✓ | ✓ |
132+
| `... USING HYBRID` | ✓ | ✓ | ✓ |
133+
| `... USING HYBRID DENSE MODEL 'x'` | ✓ | ✓ | ✓ |
134+
135+
> INSERT and SEARCH on quantized collections work exactly the same as on non-quantized ones — no changes to INSERT or SEARCH syntax are needed.
136+
137+
---
138+
139+
## CREATE INDEX — create a payload index
140+
141+
Creates a payload index on a collection field. Payload indexes speed up `WHERE` clause filtering by allowing Qdrant to efficiently match on indexed fields.
142+
143+
**Syntax:**
144+
```
145+
CREATE INDEX ON COLLECTION <collection_name> FOR <field_name> TYPE <schema_type>
146+
```
147+
148+
**Supported schema types:**
149+
150+
| Type | Description |
151+
|---|---|
152+
| `keyword` | Exact string match (e.g. status, category) |
153+
| `integer` | Whole numbers |
154+
| `float` | Decimal numbers |
155+
| `bool` | Boolean values |
156+
| `text` | Full-text search (enables `MATCH` operators) |
157+
| `geo` | Geospatial coordinates |
158+
| `datetime` | Date/time values |
159+
160+
**Examples:**
161+
162+
```sql
163+
CREATE INDEX ON COLLECTION articles FOR category TYPE keyword
164+
CREATE INDEX ON COLLECTION articles FOR year TYPE integer
165+
CREATE INDEX ON COLLECTION articles FOR title TYPE text
166+
CREATE INDEX ON COLLECTION articles FOR meta.author TYPE keyword
167+
```
168+
169+
**Rules:**
170+
- The collection must already exist; otherwise the command raises an error.
171+
- Index creation is idempotent: creating the same index twice succeeds silently.
172+
173+
---
174+
175+
## DROP COLLECTION — delete a collection
176+
177+
Permanently deletes a collection and **all points inside it**. This operation is irreversible.
178+
179+
```sql
180+
DROP COLLECTION old_experiments
181+
```
182+
183+
Raises an error if the collection does not exist.
184+
185+
---
186+
187+
## DELETE — remove points
188+
189+
Deletes one or more points from a collection by specific ID or by a `WHERE` filter.
190+
191+
**Syntax:**
192+
```
193+
DELETE FROM <collection_name> WHERE id = '<point_id>'
194+
DELETE FROM <collection_name> WHERE id = <integer_id>
195+
DELETE FROM <collection_name> WHERE <filter>
196+
```
197+
198+
**Examples:**
199+
200+
```sql
201+
-- Delete by UUID
202+
DELETE FROM articles WHERE id = '3f2e1a4b-8c91-4d0e-b123-abc123def456'
203+
204+
-- Delete by integer ID
205+
DELETE FROM articles WHERE id = 42
206+
207+
-- Delete all points matching a filter
208+
DELETE FROM articles WHERE category = 'archived'
209+
210+
-- Delete with a compound filter
211+
DELETE FROM articles WHERE year < 2020 AND status = 'draft'
212+
```
213+
214+
**Notes:**
215+
- If no points match the filter or ID, the operation still succeeds and reports a count of 0.
216+
- The collection itself must exist; deleting from a non-existent collection raises an error.

docs/filters.md

Lines changed: 166 additions & 0 deletions
@@ -0,0 +1,166 @@
1+
# WHERE Clause Filters
2+
3+
The `WHERE` clause lets you filter on any payload field using SQL-style predicates. All standard comparison, range, membership, null-check, and full-text operators are supported.
4+
5+
`WHERE` works on `SEARCH`, `RECOMMEND`, and `DELETE` statements.
6+
7+
---
8+
9+
## Equality and inequality
10+
11+
```sql
12+
-- Exact match
13+
SEARCH articles SIMILAR TO 'ml' LIMIT 10 WHERE category = 'paper'
14+
15+
-- Not equal
16+
SEARCH articles SIMILAR TO 'ml' LIMIT 10 WHERE status != 'draft'
17+
```
18+
19+
---
20+
21+
## Range comparisons
22+
23+
```sql
24+
SEARCH articles SIMILAR TO 'ai' LIMIT 5 WHERE score > 0.8
25+
SEARCH articles SIMILAR TO 'ai' LIMIT 5 WHERE year < 2024
26+
SEARCH articles SIMILAR TO 'ai' LIMIT 5 WHERE score >= 0.75
27+
SEARCH articles SIMILAR TO 'ai' LIMIT 5 WHERE year <= 2023
28+
```
29+
30+
---
31+
32+
## BETWEEN … AND
33+
34+
```sql
35+
-- Inclusive range (equivalent to year >= 2018 AND year <= 2023)
36+
SEARCH articles SIMILAR TO 'history of ai' LIMIT 10 WHERE year BETWEEN 2018 AND 2023
37+
```
38+
39+
---
40+
41+
## IN and NOT IN
42+
43+
```sql
44+
SEARCH articles SIMILAR TO 'retrieval' LIMIT 10 WHERE status IN ('published', 'reviewed')
45+
SEARCH articles SIMILAR TO 'retrieval' LIMIT 10 WHERE status NOT IN ('deleted', 'archived')
46+
```
47+
48+
---
49+
50+
## IS NULL and IS NOT NULL
51+
52+
```sql
53+
SEARCH articles SIMILAR TO 'peer review' LIMIT 5 WHERE reviewer IS NULL
54+
SEARCH articles SIMILAR TO 'peer review' LIMIT 5 WHERE reviewer IS NOT NULL
55+
```
56+
57+
---
58+
59+
## IS EMPTY and IS NOT EMPTY
60+
61+
```sql
62+
SEARCH articles SIMILAR TO 'untagged' LIMIT 5 WHERE tags IS EMPTY
63+
SEARCH articles SIMILAR TO 'categorized' LIMIT 5 WHERE tags IS NOT EMPTY
64+
```
65+
66+
---
67+
68+
## Full-text MATCH
69+
70+
```sql
71+
-- All terms must appear in the field (requires a Qdrant full-text index)
72+
SEARCH articles SIMILAR TO 'search' LIMIT 10 WHERE title MATCH 'vector database'
73+
74+
-- Any term can match
75+
SEARCH articles SIMILAR TO 'search' LIMIT 10 WHERE title MATCH ANY 'embedding retrieval'
76+
77+
-- Exact phrase must appear
78+
SEARCH articles SIMILAR TO 'search' LIMIT 10 WHERE title MATCH PHRASE 'semantic search'
79+
```
80+
81+
> To use `MATCH` operators efficiently, create a full-text index first:
82+
> ```sql
83+
> CREATE INDEX ON COLLECTION articles FOR title TYPE text
84+
> ```
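
The difference between the three `MATCH` variants can be pictured with a small Python sketch. This is an illustration only: the helper names are hypothetical, and the lowercase whitespace tokenizer is an assumption made for the example. The tokenization Qdrant actually applies depends on how the full-text index is configured.

```python
def tokens(text: str) -> list[str]:
    """Naive tokenizer (assumption): lowercase, split on whitespace."""
    return text.lower().split()


def match_all(field: str, query: str) -> bool:
    """MATCH: every query term appears somewhere in the field."""
    return set(tokens(query)) <= set(tokens(field))


def match_any(field: str, query: str) -> bool:
    """MATCH ANY: at least one query term appears in the field."""
    return bool(set(tokens(query)) & set(tokens(field)))


def match_phrase(field: str, query: str) -> bool:
    """MATCH PHRASE: the query terms appear adjacent and in order."""
    f, q = tokens(field), tokens(query)
    return any(f[i:i + len(q)] == q for i in range(len(f) - len(q) + 1))


title = "A vector search database primer"
print(match_all(title, "vector database"))     # True
print(match_phrase(title, "vector database"))  # False
print(match_any(title, "embedding retrieval")) # False
```

The title contains both "vector" and "database", so the all-terms check passes, but the words are not adjacent, so the phrase check fails.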
85+
86+
---
87+
88+
## AND, OR, NOT — logical operators
89+
90+
Operator precedence: `NOT` (highest) > `AND` > `OR` (lowest). Use parentheses to override.
91+
92+
```sql
93+
-- AND: both conditions must be true
94+
SEARCH articles SIMILAR TO 'nlp' LIMIT 10 WHERE category = 'paper' AND year >= 2020
95+
96+
-- OR: either condition can be true
97+
SEARCH articles SIMILAR TO 'llm' LIMIT 10 WHERE source = 'arxiv' OR source = 'pubmed'
98+
99+
-- NOT: negate a condition
100+
SEARCH articles SIMILAR TO 'benchmark' LIMIT 10 WHERE NOT status = 'draft'
101+
102+
-- Parentheses to group OR inside AND
103+
SEARCH articles SIMILAR TO 'conference paper' LIMIT 10
104+
WHERE (source = 'arxiv' OR source = 'ieee') AND year >= 2022
105+
106+
-- NOT on a parenthesized group
107+
SEARCH articles SIMILAR TO 'x' LIMIT 5 WHERE NOT (status = 'draft' OR status = 'deleted')
108+
```
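
Python's `not`, `and`, and `or` happen to follow the same relative precedence as the one stated above, so the grouping can be checked with ordinary boolean expressions. The payload and the filter in the comment are made up for this sketch; it only illustrates how an unparenthesized filter would group, not how QQL evaluates it internally.

```python
# Hypothetical payload of a single point.
payload = {"status": "draft", "source": "arxiv", "year": 2019}

is_draft = payload["status"] == "draft"
recent = payload["year"] >= 2020
from_arxiv = payload["source"] == "arxiv"

# WHERE NOT status = 'draft' AND year >= 2020 OR source = 'arxiv'
implicit = not is_draft and recent or from_arxiv

# The same filter with the precedence spelled out: ((NOT a) AND b) OR c
explicit = ((not is_draft) and recent) or from_arxiv

# Parentheses change the meaning: (NOT a) AND (b OR c)
regrouped = (not is_draft) and (recent or from_arxiv)

print(implicit, explicit, regrouped)  # True True False
```

The point matches the unparenthesized filter through the `OR` branch alone, which is why adding parentheses around the `OR` changes the result.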
109+
110+
---
111+
112+
## Dot-notation for nested fields
113+
114+
```sql
115+
SEARCH articles SIMILAR TO 'wikipedia' LIMIT 5 WHERE meta.source = 'web'
116+
SEARCH cities SIMILAR TO 'large city' LIMIT 5 WHERE country.cities[].population > 1000000
117+
```
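
As a rough mental model, a dotted path walks into nested payload objects, and a `[]` suffix fans out over the elements of a list. The Python sketch below mimics that lookup on a plain dictionary. `resolve` is a hypothetical helper written for this example, and treating the condition as satisfied when any resolved value passes is an assumption about how array paths are matched.

```python
def resolve(payload: dict, path: str) -> list:
    """Return every value addressed by a dotted path; `[]` fans out over lists."""
    current = [payload]
    for part in path.split("."):
        fan_out = part.endswith("[]")
        key = part[:-2] if fan_out else part
        next_values = []
        for node in current:
            if not isinstance(node, dict) or key not in node:
                continue  # path does not exist on this branch
            value = node[key]
            if fan_out and isinstance(value, list):
                next_values.extend(value)
            else:
                next_values.append(value)
        current = next_values
    return current


payload = {
    "meta": {"source": "web"},
    "country": {"cities": [{"population": 9_000_000}, {"population": 500_000}]},
}

print(resolve(payload, "meta.source"))                  # ['web']
print(resolve(payload, "country.cities[].population"))  # [9000000, 500000]
print(any(p > 1_000_000 for p in resolve(payload, "country.cities[].population")))  # True
```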
118+
119+
---
120+
121+
## WHERE also works in hybrid mode
122+
123+
```sql
124+
SEARCH articles SIMILAR TO 'deep learning' LIMIT 10
125+
USING HYBRID WHERE year BETWEEN 2020 AND 2024 AND status = 'published'
126+
```
127+
128+
---
129+
130+
## WHERE in DELETE
131+
132+
```sql
133+
-- Delete by filter
134+
DELETE FROM articles WHERE category = 'archived'
135+
136+
-- Delete with compound filter
137+
DELETE FROM articles WHERE year < 2020 AND status = 'draft'
138+
```
139+
140+
---
141+
142+
## Full filter reference
143+
144+
| WHERE syntax | Description |
145+
|---|---|
146+
| `field = 'x'` | Exact match |
147+
| `field != 'x'` | Not equal |
148+
| `field > n` | Greater than |
149+
| `field >= n` | Greater than or equal |
150+
| `field < n` | Less than |
151+
| `field <= n` | Less than or equal |
152+
| `field BETWEEN a AND b` | Inclusive range |
153+
| `field IN ('a', 'b')` | Value in list |
154+
| `field NOT IN ('a', 'b')` | Value not in list |
155+
| `field IS NULL` | Field absent or null |
156+
| `field IS NOT NULL` | Field present and non-null |
157+
| `field IS EMPTY` | Field is an empty list |
158+
| `field IS NOT EMPTY` | Field is a non-empty list |
159+
| `field MATCH 'text'` | All terms present (full-text) |
160+
| `field MATCH ANY 'text'` | Any term present (full-text) |
161+
| `field MATCH PHRASE 'text'` | Exact phrase present (full-text) |
162+
| `A AND B` | Both conditions must hold |
163+
| `A OR B` | At least one condition must hold |
164+
| `NOT A` | Condition must not hold |
165+
| `(A OR B) AND C` | Parentheses for grouping |
166+
| `meta.source = 'x'` | Dot-notation nested field |
