@@ -123,6 +123,7 @@ Properties of `hnswlib.Index` that support reading and writing:
 
 
 #### Python bindings examples
+[See more examples here](examples/EXAMPLES.md)
 ```python
 import hnswlib
 import numpy as np
@@ -229,104 +230,6 @@ labels, distances = p.knn_query(data, k=1)
 print("Recall for two batches:", np.mean(labels.reshape(-1) == np.arange(len(data))), "\n")
 ```
 
-An example with a filter:
-```python
-import hnswlib
-import numpy as np
-
-dim = 16
-num_elements = 10000
-
-# Generating sample data
-data = np.float32(np.random.random((num_elements, dim)))
-
-# Declaring index
-hnsw_index = hnswlib.Index(space='l2', dim=dim)  # possible options are l2, cosine or ip
-
-# Initiating index
-# max_elements - the maximum number of elements, should be known beforehand
-#     (probably will be made optional in the future)
-#
-# ef_construction - controls index search speed/build speed tradeoff
-# M - is tightly connected with internal dimensionality of the data
-#     strongly affects the memory consumption
-
-hnsw_index.init_index(max_elements=num_elements, ef_construction=100, M=16)
-
-# Controlling the recall by setting ef:
-# higher ef leads to better accuracy, but slower search
-hnsw_index.set_ef(10)
-
-# Set number of threads used during batch search/construction
-# By default using all available cores
-hnsw_index.set_num_threads(4)
-
-print("Adding %d elements" % (len(data)))
-# Added elements will have consecutive ids
-hnsw_index.add_items(data, ids=np.arange(num_elements))
-
-print("Querying only even elements")
-# Define filter function that allows only even ids
-filter_function = lambda idx: idx%2 == 0
-# Query the elements for themselves and search only for even elements:
-labels, distances = hnsw_index.knn_query(data, k=1, filter=filter_function)
-# labels contain only elements with even id
-```
-
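
Review note: the removed example relies on the `filter` callable restricting results to ids for which it returns true. As a rough reference for checking that behaviour, here is a minimal exact brute-force sketch in plain Python. It does not use hnswlib: the helper name `filtered_knn`, the squared-L2 distance, and the small data size are assumptions made for illustration, and a real HNSW search is approximate rather than exact.

```python
import random


def filtered_knn(data, query, k, allow):
    """Exact k-NN by squared L2 distance over the ids accepted by `allow`.

    Hypothetical reference helper for illustration; not part of hnswlib.
    """
    candidates = [
        (sum((a - b) ** 2 for a, b in zip(vec, query)), idx)
        for idx, vec in enumerate(data)
        if allow(idx)  # ids rejected by the filter are never candidates
    ]
    candidates.sort()
    top = candidates[:k]
    return [idx for _, idx in top], [dist for dist, _ in top]


random.seed(0)
dim, num_elements = 16, 200
data = [[random.random() for _ in range(dim)] for _ in range(num_elements)]


def is_even(idx):
    return idx % 2 == 0


# Querying with an even-id vector: it is its own nearest neighbour.
labels, distances = filtered_knn(data, data[4], k=1, allow=is_even)

# Querying with an odd-id vector: the vector itself is filtered out,
# so the nearest even id is returned instead.
odd_labels, odd_distances = filtered_knn(data, data[5], k=1, allow=is_even)
```

Comparing such a reference against `knn_query(..., filter=...)` output on a small dataset is one way to confirm that every returned label passes the filter.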
-An example with replacing of deleted elements:
-```python
-import hnswlib
-import numpy as np
-
-dim = 16
-num_elements = 1_000
-max_num_elements = 2 * num_elements
-
-# Generating sample data
-labels1 = np.arange(0, num_elements)
-data1 = np.float32(np.random.random((num_elements, dim)))  # batch 1
-labels2 = np.arange(num_elements, 2 * num_elements)
-data2 = np.float32(np.random.random((num_elements, dim)))  # batch 2
-labels3 = np.arange(2 * num_elements, 3 * num_elements)
-data3 = np.float32(np.random.random((num_elements, dim)))  # batch 3
-
-# Declaring index
-hnsw_index = hnswlib.Index(space='l2', dim=dim)
-
-# Initiating index
-# max_elements - the maximum number of elements, should be known beforehand
-#     (probably will be made optional in the future)
-#
-# ef_construction - controls index search speed/build speed tradeoff
-# M - is tightly connected with internal dimensionality of the data
-#     strongly affects the memory consumption
-
-# Enable replacing of deleted elements
-hnsw_index.init_index(max_elements=max_num_elements, ef_construction=200, M=16, allow_replace_deleted=True)
-
-# Controlling the recall by setting ef:
-# higher ef leads to better accuracy, but slower search
-hnsw_index.set_ef(10)
-
-# Set number of threads used during batch search/construction
-# By default using all available cores
-hnsw_index.set_num_threads(4)
-
-# Add batch 1 and 2 data
-hnsw_index.add_items(data1, labels1)
-hnsw_index.add_items(data2, labels2)  # Note: maximum number of elements is reached
-
-# Delete data of batch 2
-for label in labels2:
-    hnsw_index.mark_deleted(label)
-
-# Replace deleted elements
-# Maximum number of elements is reached therefore we cannot add new items,
-# but we can replace the deleted ones by using replace_deleted=True
-hnsw_index.add_items(data3, labels3, replace_deleted=True)
-# hnsw_index contains the data of batch 1 and batch 3 only
-```
-
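
Review note: the comments in the removed example describe a capacity rule: marking elements as deleted does not free room for new items, but `replace_deleted=True` lets new items take over the deleted ones. The toy model below sketches only that bookkeeping. `ToyIndex` is a hypothetical class, it stores no graph, and raising `RuntimeError` when full is an assumption of the sketch; none of it describes hnswlib internals.

```python
class ToyIndex:
    """Toy fixed-capacity store; a conceptual model, not hnswlib internals."""

    def __init__(self, max_elements):
        self.max_elements = max_elements
        self.slots = {}       # label -> vector; each entry occupies one slot
        self.deleted = set()  # labels marked deleted that still hold a slot

    def mark_deleted(self, label):
        # Deletion is only a mark: the slot stays occupied.
        self.deleted.add(label)

    def add_items(self, vectors, labels, replace_deleted=False):
        for vec, label in zip(vectors, labels):
            if len(self.slots) >= self.max_elements:
                if not (replace_deleted and self.deleted):
                    raise RuntimeError("index is full")
                # Reuse the slot held by one of the deleted elements.
                victim = self.deleted.pop()
                del self.slots[victim]
            self.slots[label] = vec

    def live_labels(self):
        return sorted(lbl for lbl in self.slots if lbl not in self.deleted)


n = 5
index = ToyIndex(max_elements=2 * n)
index.add_items([[0.0]] * n, range(0, n))      # batch 1
index.add_items([[1.0]] * n, range(n, 2 * n))  # batch 2, capacity reached

for label in range(n, 2 * n):
    index.mark_deleted(label)                  # delete batch 2

try:
    index.add_items([[2.0]] * n, range(2 * n, 3 * n))
    plain_add_failed = False
except RuntimeError:
    plain_add_failed = True                    # still full after deletion

# Batch 3 fits only by replacing the deleted elements.
index.add_items([[2.0]] * n, range(2 * n, 3 * n), replace_deleted=True)
remaining = index.live_labels()
```

In this model `remaining` holds the labels of batch 1 and batch 3 only, matching the final comment of the removed example.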
 ### Bindings installation
 
 You can install from sources:
@@ -346,9 +249,9 @@ Contributions are highly welcome!
 
 Please make pull requests against the `develop` branch.
 
-When making changes please run tests (and please add a test to `python_bindings/tests` in case there is new functionality):
+When making changes please run tests (and please add a test to `tests/python` in case there is new functionality):
 ```bash
-python -m unittest discover --start-directory python_bindings/tests --pattern "*_test*.py"
+python -m unittest discover --start-directory tests/python --pattern "bindings_test*.py"
 ```
 
 
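
Review note: the new command narrows discovery from `*_test*.py` to `bindings_test*.py`, so a test file is collected only if its name starts with `bindings_test`. The sketch below shows that effect using the standard library's programmatic equivalent of `unittest discover`. The temporary directory and both file names are hypothetical stand-ins for `tests/python`, chosen only to illustrate the pattern.

```python
import os
import tempfile
import unittest

# Source of a trivial test module written to disk for the demonstration.
TEST_SOURCE = "\n".join([
    "import unittest",
    "",
    "class ExampleTestCase(unittest.TestCase):",
    "    def test_passes(self):",
    "        self.assertTrue(True)",
    "",
])

with tempfile.TemporaryDirectory() as start_dir:
    # Hypothetical file names: only the first one matches the new pattern.
    for name in ("bindings_test_example.py", "other_test_example.py"):
        with open(os.path.join(start_dir, name), "w") as fh:
            fh.write(TEST_SOURCE)

    # Programmatic equivalent of:
    #   python -m unittest discover --start-directory <dir> --pattern "bindings_test*.py"
    suite = unittest.defaultTestLoader.discover(start_dir, pattern="bindings_test*.py")
    collected = suite.countTestCases()

    result = unittest.TestResult()
    suite.run(result)
```

Only the `bindings_test_example.py` module is collected here. A new test added under a name that does not match the pattern would be silently skipped by the documented command.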
@@ -373,7 +276,7 @@ https://github.com/dbaranchuk/ivf-hnsw
 ### 200M SIFT test reproduction
 To download and extract the bigann dataset (from root directory):
 ```bash
-python3 download_bigann.py
+python tests/cpp/download_bigann.py
 ```
 To compile:
 ```bash
@@ -393,7 +296,7 @@ The size of the BigANN subset (in millions) is controlled by the variable **subs
 ### Updates test
 To generate testing data (from root directory):
 ```bash
-cd examples
+cd tests/cpp
 python update_gen_data.py
 ```
 To compile (from root directory):