Commit 3178018

committed
Add documentation on how to define fields with vector indexing
1 parent 99ae3fb commit 3178018

4 files changed

+40
-1
lines changed

docs/SUMMARY.md

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -122,6 +122,7 @@
122122
* [Storage Algorithm](technical-details/reference/storage-algorithm.md)
123123
* [Release Notes](technical-details/release-notes/README.md)
124124
* [Harper Tucker (Version 4)](technical-details/release-notes/4.tucker/README.md)
125+
* [4.6.0](technical-details/release-notes/4.tucker/4.6.0.md)
125126
* [4.5.2](technical-details/release-notes/4.tucker/4.5.2.md)
126127
* [4.5.1](technical-details/release-notes/4.tucker/4.5.1.md)
127128
* [4.5.0](technical-details/release-notes/4.tucker/4.5.0.md)

docs/developers/applications/defining-schemas.md

Lines changed: 29 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -169,7 +169,35 @@ The `@primaryKey` directive specifies that an attribute is the primary key for a
169169

170170
#### `@indexed`
171171

172-
The `@indexed` directive specifies that an attribute should be indexed. This is necessary if you want to execute queries using this attribute (whether that is through RESTful query parameters, SQL, or NoSQL operations).
172+
The `@indexed` directive specifies that an attribute should be indexed. When an attribute is indexed, Harper creates a secondary index from the data in this field, which allows fast, efficient querying by this field. This is necessary if you want to execute queries using this attribute (whether through RESTful query parameters, SQL, or NoSQL operations).
173+
174+
A standard index will index the values in each field, so you can query directly by those values. If the field's value is an array, each of the values in the array will be indexed (you can query by any individual value).
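To make the array behavior concrete, here is a minimal in-memory sketch of a value-to-key index. It is purely illustrative and is not how Harper stores its indexes; `buildIndex`, the `tags` attribute, and the sample records are hypothetical names made up for this example.

```javascript
// Illustrative toy index only; not Harper's storage engine.
// Maps each indexed value to the set of primary keys that contain it.
function buildIndex(records, attribute) {
  const index = new Map();
  for (const record of records) {
    const value = record[attribute];
    // An array-valued field contributes one index entry per element.
    const values = Array.isArray(value) ? value : [value];
    for (const v of values) {
      if (!index.has(v)) index.set(v, new Set());
      index.get(v).add(record.id);
    }
  }
  return index;
}

const records = [
  { id: 1, tags: ['red', 'large'] },
  { id: 2, tags: ['red'] },
  { id: 3, tags: 'small' },
];
const tagIndex = buildIndex(records, 'tags');
// Looking up a single value finds every record whose array contains it.
const redIds = [...tagIndex.get('red')];
```

Here a lookup for `'red'` finds records 1 and 2, even though record 1 stores `'red'` as just one element of an array.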
175+
176+
#### Vector Indexing
177+
178+
The `@indexed` directive can also specify a `type`. To use vector indexing, you can specify the `type` as `HNSW` for Hierarchical Navigable Small World indexing. This will create a vector index for the attribute. For example:
179+
```graphql
type Product @table {
  id: Long @primaryKey
  textEmbeddings: [Float] @indexed(type: "HNSW")
}
```
185+
186+
HNSW indexing finds the nearest neighbors to a search vector. To use this, you can query with a `sort` parameter, for example:
187+
```javascript
let results = Product.search({
  sort: { attribute: 'textEmbeddings', target: searchVector },
  limit: 5, // get the five nearest neighbors
});
```
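For intuition, the result of such a query can be compared with an exact brute-force search: compute the distance from every stored vector to the target, sort ascending, and keep the first `limit` records. HNSW approximates this ordering without scanning every record. The sketch below is plain JavaScript and does not use any Harper API; `euclidean`, `cosine`, and `nearestNeighbors` are hypothetical helpers, and the cosine distance is assumed to be the negative of cosine similarity.

```javascript
// Exact (brute-force) nearest-neighbor reference; an HNSW index approximates this.
function euclidean(a, b) {
  let sum = 0;
  for (let i = 0; i < a.length; i++) sum += (a[i] - b[i]) ** 2;
  return Math.sqrt(sum);
}

// Negative cosine similarity: smaller values mean more similar vectors.
function cosine(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return -dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

function nearestNeighbors(records, attribute, target, limit, distance = cosine) {
  return records
    .map((record) => ({ record, d: distance(record[attribute], target) }))
    .sort((x, y) => x.d - y.d) // ascending distance: nearest first
    .slice(0, limit)
    .map((entry) => entry.record);
}

const products = [
  { id: 1, textEmbeddings: [1, 0] },
  { id: 2, textEmbeddings: [0, 1] },
  { id: 3, textEmbeddings: [0.9, 0.1] },
];
const nearest = nearestNeighbors(products, 'textEmbeddings', [1, 0], 2);
```

With these sample vectors, records 1 and 3 are the two nearest neighbors of `[1, 0]` under either distance function.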
193+
194+
HNSW supports several additional arguments to the `@indexed` directive to adjust the HNSW parameters:
195+
* `distance` - Defines the distance function. This can be set to `euclidean` or `cosine` (which uses the negative of the cosine similarity). The default is `cosine`.
196+
* `efConstruction` - Maximum number of nodes to keep in the list for finding nearest neighbors. A higher value can yield better recall, and a lower value can have better performance. If `efSearchConstruction` is set, this is only applied to indexing. The default is 100.
197+
* `M` - The preferred number of connections at each layer in the HNSW graph. A higher number uses more space but can be helpful when the intrinsic dimensionality of the data is higher. A lower number can be more efficient. The default is 16.
198+
* `optimizeRouting` - This uses a heuristic to avoid graph connections that match existing indirect connections (connections through another node). This can yield more efficient graph traversals for the same M setting. This is a number between 0 and 1 and a higher value will more aggressively omit connections with alternate paths. Setting this to 0 will disable route optimizing and follow the traditional HNSW algorithm for creating connections. The default is 0.5.
199+
* `mL` - The normalization factor for level generation. By default, this is computed from `M`.
200+
* `efSearchConstruction` - Maximum number of nodes to keep in the list of nearest-neighbor candidates when searching. The default is 50.
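As background on `mL`, the original HNSW paper suggests `mL = 1 / ln(M)` and assigns each new node a top layer of `floor(-ln(u) * mL)` for a uniform random `u` in (0, 1]. The sketch below illustrates that formula only. Whether Harper computes its default in exactly this way is an assumption, and `defaultLevelNormalization` and `randomLevel` are hypothetical names.

```javascript
// Level generation as described in the original HNSW paper.
// Assumption: Harper's default mL may be derived differently from M.
function defaultLevelNormalization(M) {
  return 1 / Math.log(M);
}

function randomLevel(mL, random = Math.random) {
  // 1 - random() lies in (0, 1], which avoids Math.log(0).
  const u = 1 - random();
  return Math.floor(-Math.log(u) * mL);
}

const mL = defaultLevelNormalization(16); // roughly 0.36 for M = 16
const level = randomLevel(mL); // usually 0; higher layers are exponentially rarer
```

With this choice of `mL`, the probability that a node reaches layer 1 or above is `1 / M`, so the upper layers stay sparse and act as a coarse routing structure.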
173201

174202
#### `@createdTime`
175203

Lines changed: 8 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,8 @@
1+
# 4.6.0
2+
3+
#### HarperDB 4.6.0
4+
5+
6/13/2025
6+
7+
### Vector Indexing
8+
4.6 introduces vector indexing support with the Hierarchical Navigable Small World (HNSW) algorithm. This provides powerful, efficient vector-based search for semantic and AI-based querying. HNSW balances a high recall rate with efficient, high-performance execution.

docs/technical-details/release-notes/README.md

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -4,6 +4,8 @@
44

55
[Meet Tucker](4.tucker/tucker.md) Our 4th Release Pup
66

7+
[4.6.0 Tucker](4.tucker/4.6.0.md)
8+
79
[4.5.2 Tucker](4.tucker/4.5.2.md)
810

911
[4.5.1 Tucker](4.tucker/4.5.1.md)
