forked from erikbern/ann-benchmarks
Multiclient tool #2
Open
GuyAv46 wants to merge 109 commits into redisearch_benchmark from multiclient_tool
Changes from 2 commits
Commits
9a39023 added support for other algorithms (GuyAv46)
d565635 re-write milvus.py file for thier new API (GuyAv46)
d6c06af added --run-group (GuyAv46)
a80dd70 more updates for milvus algorithm (GuyAv46)
eebac85 added support for multi-client build (GuyAv46)
f52e623 milvus.py improvement (GuyAv46)
4b07c48 default values update (GuyAv46)
25ff7ae rename TOP_K to KNN (GuyAv46)
00f51b5 moved from L2 to either L2 or IP
05caf7e added drop collection
253e5a8 Update ann_benchmarks/algorithms/milvus.py
94fc098 Merge pull request #3 from RedisAI/fix_milvus_metric
6ae9bff Changes towards redisbench_admin integration. Workdir fix (filipecosta90)
579b25a added yandex 1B subset dataset generator
db852a5 moved to main folder
d0d5917 empty line
e714dd8 Merge pull request #4 from RedisAI/dataset_generator
947c6a9 hybrid datasets generator
d7ed689 write the id buckets to the hd5f file
0c0d073 empty line
1a8e116 Ensure workdir is used when creating build_stats results dir (filipecosta90)
55b2e84 Merge pull request #6 from RedisAI/makedirs.fix (filipecosta90)
f1d80d0 fix for passing number of runs (GuyAv46)
5b98837 hybrid dataset generator. Redisearch hybrid load and run
766941b fixed dataset name. redisearch fixes
0ba58d1 testers clients now read build stats and (GuyAv46)
4f6e597 fixing types in multirun.py (GuyAv46)
a6956c0 Merge pull request #5 from RedisAI/create_hybrid_datasets
bfa141e Merge pull request #7 from RedisAI/build_stats (GuyAv46)
e75dc79 added try..except (GuyAv46)
78d73cc Merge pull request #8 from RedisAI/build_stats (GuyAv46)
bb906f1 Fixed redisearch query() on non hybrid runs (filipecosta90)
73b674f Merge pull request #9 from RedisAI/run.fix.text (filipecosta90)
fc9ae71 aggregate clients (GuyAv46)
aced151 Merge pull request #10 from RedisAI/aggregate_testers (GuyAv46)
1e9c684 improved assertion log (GuyAv46)
afe607d Merge pull request #11 from RedisAI/aggregate_testers (GuyAv46)
7a5bc76 fix hybrid creation. added big ann
ddeb8f8 updated big ann bucket
b9a1897 fixed big ann hybrid datasets name
2dd0684 Merge pull request #12 from RedisAI/fix_hybrid_creation_big_ann
d2d91b7 fixed initial capacity on FT.Create
316d308 Merge pull request #13 from RedisAI/fix_initial_cap
e98a337 fixed race condition (GuyAv46)
a3bce91 wip
43a3e21 sirealized run-groups (GuyAv46)
93f1344 skips aggregate files when running with 1 client (GuyAv46)
19f82a5 added dialect 2 for redisreach (GuyAv46)
1b9e3fb added comments (GuyAv46)
394109a Merge pull request #15 from RedisAI/sirealize_run_groups (GuyAv46)
a6f8345 In multirun change to the proper workdir asap (filipecosta90)
41d3848 Merge pull request #16 from RedisAI/fix.get_run_groups (filipecosta90)
e1a4d38 report memory in kb
95b2c18 Merge pull request #17 from RedisAI/memory_in_kb
ab08e39 fix float conversion of vector_index_sz_mb before multiplying (filipecosta90)
0ce7a5a Fixes per PR review (filipecosta90)
fc17cb2 Merge pull request #18 from RedisAI/fix.memory_in_kb (filipecosta90)
fe2e4b2 Merge remote-tracking branch 'origin/multiclient_tool' into fix_multi… (filipecosta90)
5b6ade4 Merge pull request #14 from RedisAI/fix_multiclient_flow (filipecosta90)
1b7f862 Revert "wip" (filipecosta90)
55d4575 Revert 'wip' (filipecosta90)
79aeedc changed watcher to watch results dir
423ad07 Update multirun.py
4be13fb Merge pull request #19 from RedisAI/fix_watcher
495bbba dbpedia
0dd4689 fixed PR comment
0c56189 Merge pull request #20 from RedisAI/dbpedia_dataset
f9970c2 amazon reviews
7aeed36 fixed amazon review dataset creation
70b3a73 Merge pull request #21 from RedisAI/amazon_reviews
4566945 added shards aux arg
2316942 Merge pull request #22 from RedisAI/shards_arg
ab5ceb0 Fixed shards arg usage on multirun/redisearch (filipecosta90)
7ee8bd5 Merge pull request #23 from RedisAI/fix.shards (filipecosta90)
c21597d redisearch ef runtime in algo name
8acc9f1 print qps to stdout
47c7ddd new line
ca755d0 Merge pull request #24 from RedisAI/add_qps_and_redisearch_efruntime (filipecosta90)
eaa4626 fix dbpedia download
5b7ada9 Merge pull request #25 from RedisAI/fix_dbpedia_download
2d6a223 Enable recall/latency charts on results (filipecosta90)
3cd6fbb removed create command optimizations
1bb2b77 Merge pull request #27 from RedisAI/fix_tf_create_command (filipecosta90)
cf55960 Fixes per PR review (filipecosta90)
46c5430 Merge pull request #26 from RedisAI/multiclient_latencies (filipecosta90)
cbe99b0 Milvus update (#28) (GuyAv46)
592f3ed Added pinecone client (#29) (GuyAv46)
12183f1 splitting the test load between test clients (#30) (GuyAv46)
58c10f6 Update requirements.txt
accc744 Update requirements.txt
9bbf401 Merge pull request #31 from RedisAI/multirun-patch-1
fb3a46c fixing double fetching(#33) (GuyAv46)
020ad08 fixing bulk insertion (#36) (GuyAv46)
3b5012d Elastic client update (#34) (GuyAv46)
80e8bf1 added vecsim lib algo (GuyAv46)
d19900f add dummy docker tag (GuyAv46)
e092992 removing password from filename (GuyAv46)
2f3000f skipping using multiprocessing when `parallelism == 1` (GuyAv46)
f9b5422 added throughput metric (collect start and end time) (GuyAv46)
63d08d0 Updated throughput tracking (filipecosta90)
f68fc3c Merge pull request #38 from RedisAI/guyav-throughput_graph (filipecosta90)
9d31f9f Fixed algorithm str when using FLAT on redisearch (filipecosta90)
0bcd51c Merge pull request #39 from RedisAI/fix.ef (filipecosta90)
fb02421 Ensure all primaries receive the FT.CREATE due to 'missing index erro… (filipecosta90)
f92c29b Merge pull request #40 from RedisAI/fix.ft.create (filipecosta90)
166d1f4 Fix unary typo on redisearch __str__ (filipecosta90)
26c4daa Merge pull request #41 from RedisAI/fix.unary.str (filipecosta90)
cc36963 disable query timeout on redisearch (filipecosta90)
765f2bf Merge pull request #42 from RedisAI/timeout.0 (filipecosta90)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,45 +1,71 @@ | ||
| from __future__ import absolute_import | ||
| import milvus | ||
| from sqlite3 import paramstyle | ||
| from pymilvus import ( | ||
| connections, | ||
| utility, | ||
| FieldSchema, | ||
| CollectionSchema, | ||
| DataType, | ||
| IndexType, | ||
| Collection, | ||
| ) | ||
| import numpy | ||
| import sklearn.preprocessing | ||
| from ann_benchmarks.algorithms.base import BaseANN | ||
|
|
||
|
|
||
| class Milvus(BaseANN): | ||
| def __init__(self, metric, index_type, nlist): | ||
| self._nlist = nlist | ||
| def __init__(self, metric, conn_params, index_type, method_params): | ||
| self._host = conn_params['host'] | ||
| self._port = conn_params['port'] # 19530 | ||
| # connections.connect(host=conn_params['host'], port=conn_params['port']) | ||
| # fields = [ | ||
| # FieldSchema(name="pk", dtype=DataType.INT64, is_primary=True, auto_id=False), | ||
| # FieldSchema(name="vector", dtype=DataType.FLOAT_VECTOR, dim=100) | ||
| # ] | ||
| # schema = CollectionSchema(fields) | ||
| # self._milvus = Collection('milvus', schema) | ||
| self._index_type = index_type | ||
| self._method_params = method_params | ||
| self._nprobe = None | ||
| self._metric = metric | ||
| self._milvus = milvus.Milvus() | ||
| self._milvus.connect(host='localhost', port='19530') | ||
| self._table_name = 'test01' | ||
| self._index_type = index_type | ||
|
|
||
| def fit(self, X): | ||
| if self._metric == 'angular': | ||
| X = sklearn.preprocessing.normalize(X, axis=1, norm='l2') | ||
| X = sklearn.preprocessing.normalize(X, axis=1) | ||
|
|
||
| # TODO: if we can set the dim later, maybe return this to the init func | ||
| connections.connect(host=self._host, port=self._port) | ||
| fields = [ | ||
| FieldSchema(name="pk", dtype=DataType.INT64, is_primary=True, auto_id=False), | ||
| FieldSchema(name="vector", dtype=DataType.FLOAT_VECTOR, dim=len(X[0])) | ||
| ] | ||
| schema = CollectionSchema(fields) | ||
| self._milvus = Collection('milvus', schema) | ||
|
|
||
| self._milvus.create_table({'table_name': self._table_name, 'dimension': X.shape[1]}) | ||
| vector_ids = [id for id in range(len(X))] | ||
| self._milvus.insert(table_name=self._table_name, records=X.tolist(), ids=vector_ids) | ||
| index_type = getattr(milvus.IndexType, self._index_type) # a bit hacky but works | ||
| self._milvus.create_index(self._table_name, {'index_type': index_type, 'nlist': self._nlist}) | ||
| self._milvus.insert([[id for id in range(len(X))], X.tolist()]) | ||
| self._milvus.create_index('vector', {'index_type': self._index_type, 'metric_type':'L2', 'params':self._method_params}) | ||
GuyAv46 marked this conversation as resolved.
|
||
| self._milvus.load() | ||
|
|
||
| def set_query_arguments(self, nprobe): | ||
| if nprobe > self._nlist: | ||
| print('warning! nprobe > nlist') | ||
| nprobe = self._nlist | ||
| self._nprobe = nprobe | ||
| def set_query_arguments(self, param): | ||
| self._query_params = dict() | ||
| if 'IVF_' in self._index_type: | ||
| if param > self._method_params['nlist']: | ||
| print('warning! nprobe > nlist') | ||
| param = self._method_params['nlist'] | ||
| self._query_params['nprobe'] = param | ||
| if 'HNSW' in self._index_type: | ||
| self._query_params['ef'] = param | ||
|
|
||
| def query(self, v, n): | ||
| if self._metric == 'angular': | ||
| v /= numpy.linalg.norm(v) | ||
| v = v.tolist() | ||
| status, results = self._milvus.search(table_name=self._table_name, query_records=[v], top_k=n, nprobe=self._nprobe) | ||
| results = self._milvus.search([v], 'vector', {'metric_type':'L2', 'params':self._query_params}, limit=n) | ||
| if not results: | ||
| return [] # Seems to happen occasionally, not sure why | ||
| result_ids = [result.id for result in results[0]] | ||
| return result_ids | ||
|
|
||
| def __str__(self): | ||
| return 'Milvus(index_type=%s, nlist=%d, nprobe=%d)' % (self._index_type, self._nlist, self._nprobe) | ||
| return 'Milvus(index_type=%s, method_params=%s, query_params=%s)' % (self._index_type, str(self._method_params), str(self._nprobe)) | ||
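
The new `set_query_arguments` in the diff above turns one tuning value into index-specific search parameters: `nprobe` (capped at `nlist`) for `IVF_*` indexes and `ef` for `HNSW`. A minimal standalone sketch of that mapping follows, assuming nothing about pymilvus itself. The function name `build_query_params` is hypothetical and does not appear in the PR; it only mirrors the branching shown in the diff so the behaviour can be checked without a running Milvus server.

```python
def build_query_params(index_type, method_params, param):
    """Map a single tuning value to index-specific search parameters.

    Hypothetical helper mirroring the branching in set_query_arguments
    from the diff; it is not part of the PR or of pymilvus.
    """
    query_params = {}
    if 'IVF_' in index_type:
        nlist = method_params['nlist']
        if param > nlist:
            # Same warning and clamping as in the diff: nprobe is capped at nlist.
            print('warning! nprobe > nlist')
            param = nlist
        query_params['nprobe'] = param
    if 'HNSW' in index_type:
        # For HNSW the value is passed through unchanged as the search-time ef.
        query_params['ef'] = param
    return query_params


if __name__ == '__main__':
    # IVF index: a requested nprobe of 256 is clamped to nlist=128.
    print(build_query_params('IVF_FLAT', {'nlist': 128}, 256))
    # HNSW index: the value becomes ef; the build parameters are illustrative.
    print(build_query_params('HNSW', {'M': 16, 'efConstruction': 200}, 64))
```

In the diff, the resulting dict is what gets passed as `'params'` in the search call, next to `'metric_type'`.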