Skip to content

Conversation

@PavelShilin89
Copy link
Contributor

Type of Change (select one):

  • New feature

Description of the Change:

  • Add basic KNN filtering test coverage for remote auto-embeddings in MCL test suite.
  • Test KNN search with text query parameter (auto-embeddings approach)
  • Test KNN search with additional WHERE filtering
    The tests verify that KNN filtering works correctly with remote embedding providers by querying existing test data and validating result format and count.

Related Issue (provide the link):

@github-actions
Copy link
Contributor

clt

❌ CLT tests in test/clt-tests/mcl/
✅ OK: 12
❌ Failed: 2
⏳ Duration: 406s
👉 Check Action Results for commit dd3f893

Failed tests:

🔧 Edit failed tests in UI:

test/clt-tests/mcl/auto-embeddings-voyage-remote.rec
––– input –––
rm -f /var/log/manticore/searchd.log; stdbuf -oL searchd --stopwait > /dev/null; stdbuf -oL searchd ${SEARCHD_ARGS:-} > /dev/null
––– output –––
OK
––– input –––
if timeout 10 grep -qm1 'accepting connections' <(tail -n 1000 -f /var/log/manticore/searchd.log); then echo 'Accepting connections!'; else echo 'Timeout or failed!'; fi
––– output –––
OK
––– input –––
cosine_similarity() {
    local file1="$1" file2="$2"

    awk '
    NR==FNR { a[NR]=$1; suma2+=$1*$1; next }
    {
        dot += a[FNR]*$1
        sumb2 += $1*$1
    }
    END {
        print dot / (sqrt(suma2) * sqrt(sumb2))
    }' "$file1" "$file2"
}
––– output –––
OK
––– input –––
export -f cosine_similarity
––– output –––
OK
––– input –––
mysql -h0 -P9306 -e "CREATE TABLE test_invalid_model (title TEXT, embedding FLOAT_VECTOR KNN_TYPE='hnsw' HNSW_SIMILARITY='l2' MODEL_NAME = 'voyage/invalid-model-name-12345' FROM = 'title') " 2>&1
––– output –––
OK
––– input –––
mysql -h0 -P9306 -e "CREATE TABLE test_valid_model_no_api_key (title TEXT, embedding FLOAT_VECTOR KNN_TYPE='hnsw' HNSW_SIMILARITY='l2' MODEL_NAME = 'voyage/voyage-3.5-lite' FROM = 'title') " 2>&1
––– output –––
OK
––– input –––
mysql -h0 -P9306 -e "CREATE TABLE test_voyage_remote (title TEXT, content TEXT, description TEXT, embedding FLOAT_VECTOR KNN_TYPE='hnsw' HNSW_SIMILARITY='l2' MODEL_NAME = 'voyage/voyage-3.5-lite' FROM = 'title, content' API_KEY='${VOYAGE_API_KEY}') "; echo $?
––– output –––
OK
––– input –––
mysql -h0 -P9306 -e "SHOW CREATE TABLE test_voyage_remote"
––– output –––
OK
––– input –––
mysql -h0 -P9306 -e "INSERT INTO test_voyage_remote (id, title, content, description) VALUES(1, 'machine learning algorithms', 'deep neural networks and artificial intelligence', 'advanced AI research')"; echo $?
––– output –––
OK
––– input –––
mysql -h0 -P9306 -e "SELECT COUNT(*) as record_count FROM test_voyage_remote WHERE id=1"
––– output –––
OK
––– input –––
mysql -h0 -P9306 -e "INSERT INTO test_voyage_remote (id, title, content, description) VALUES(2, 'machine learning algorithms', 'deep neural networks and artificial intelligence', 'different description')"

mysql -h0 -P9306 -e "SELECT embedding FROM test_voyage_remote WHERE id=1" | \
    grep -v embedding | \
    sed 's/[0-9]\+\(\.[0-9]\+\)\?/\n&\n/g' | \
    grep -E '^[0-9]+(\.[0-9]+)?$' | \
    awk '{printf "%.5f\n", $1}' > /tmp/vector1.txt

mysql -h0 -P9306 -e "SELECT embedding FROM test_voyage_remote WHERE id=2" | \
    grep -v embedding | \
    sed 's/[0-9]\+\(\.[0-9]\+\)\?/\n&\n/g' | \
    grep -E '^[0-9]+(\.[0-9]+)?$' | \
    awk '{printf "%.5f\n", $1}' > /tmp/vector2.txt

SIMILARITY=$(cosine_similarity /tmp/vector1.txt /tmp/vector2.txt)

echo "Cosine similarity: $SIMILARITY"

RESULT=$(awk -v sim="$SIMILARITY" 'BEGIN {
    if (sim > 0.99)
        print "SUCCESS: Same FROM fields produce similar vectors (similarity: " sim ")"
    else
        print "FAIL: Different vectors (FROM does not include description field and should not change generated vector value) (similarity: " sim ")"
}')

echo "$RESULT"

rm -f /tmp/vector1.txt /tmp/vector2.txt
––– output –––
OK
––– input –––
mysql -h0 -P9306 -e "CREATE TABLE test_voyage_title_only (title TEXT, content TEXT, embedding FLOAT_VECTOR KNN_TYPE='hnsw' HNSW_SIMILARITY='l2' MODEL_NAME = 'voyage/voyage-3.5-lite' FROM = 'title' API_KEY='${VOYAGE_API_KEY}') "; mysql -h0 -P9306 -e "INSERT INTO test_voyage_title_only (id, title, content) VALUES(1, 'machine learning algorithms', 'completely different content here')"; MD5_MULTI=$(mysql -h0 -P9306 -e "SELECT embedding FROM test_voyage_remote WHERE id=1" | grep -v embedding | md5sum | awk '{print $1}'); MD5_SINGLE=$(mysql -h0 -P9306 -e "SELECT embedding FROM test_voyage_title_only WHERE id=1" | grep -v embedding | md5sum | awk '{print $1}'); echo "multi_field_md5: $MD5_MULTI"; echo "single_field_md5: $MD5_SINGLE"; if [ "$MD5_MULTI" != "$MD5_SINGLE" ]; then echo "SUCCESS: Different FROM specifications produce different vectors"; else echo "INFO: FROM field comparison result"; fi
––– output –––
OK
––– input –––
mysql -h0 -P9306 -e "CREATE TABLE test__invalid_field (title TEXT, embedding FLOAT_VECTOR KNN_TYPE='hnsw' HNSW_SIMILARITY='l2' MODEL_NAME = 'voyage/text-embedding-ada-002' FROM = 'nonexistent_field') " 2>&1
––– output –––
OK
––– input –––
if mysql -h0 -P9306 -e "SHOW TABLES LIKE 'test_voyage_no_from'" | grep -q test_voyage_no_from; then mysql -h0 -P9306 -e "INSERT INTO test__no_from (id, title, embedding) VALUES(1, 'test title', '(0.1, 0.2, 0.3, 0.4, 0.5)')"; echo "insert_result: $?"; else echo "insert_result: skipped (table not created)"; fi
––– output –––
OK
––– input –––
if mysql -h0 -P9306 -e "SHOW TABLES LIKE 'test__no_from'" | grep -q test_voyage_no_from; then mysql -h0 -P9306 -e "SHOW CREATE TABLE test_voyage_no_from"; else echo "table_structure: skipped (table not created)"; fi
––– output –––
OK
––– input –––
if [ -n "$VOYAGE_API_KEY" ] && [ "$VOYAGE_API_KEY" != "dummy_key_for_testing" ]; then echo "API key is available for testing"; else echo "API key not available - using dummy for error testing"; fi
––– output –––
OK
––– input –––
mysql -h0 -P9306 -e "SELECT id, knn_dist() FROM test_voyage_remote WHERE knn(embedding, 3, 'machine learning and artificial intelligence')"
––– output –––
- #!/.*id.*knn_dist().*/!#
+ +------+------------+
- #!/.*[12].*[0-9]+\.[0-9]+.*/!#
+ | id   | knn_dist() |
- #!/.*[12].*[0-9]+\.[0-9]+.*/!#
+ +------+------------+
+ |    2 | 0.35459742 |
+ |    1 | 0.35459742 |
+ +------+------------+
––– input –––
mysql -h0 -P9306 -e "SELECT COUNT(*) as count FROM test_voyage_remote WHERE knn(embedding, 5, 'technology and AI') AND id > 0"
––– output –––
OK
test/clt-tests/mcl/auto-embeddings-openai-remote.rec
––– input –––
rm -f /var/log/manticore/searchd.log; stdbuf -oL searchd --stopwait > /dev/null; stdbuf -oL searchd ${SEARCHD_ARGS:-} > /dev/null
––– output –––
OK
––– input –––
if timeout 10 grep -qm1 'accepting connections' <(tail -n 1000 -f /var/log/manticore/searchd.log); then echo 'Accepting connections!'; else echo 'Timeout or failed!'; fi
––– output –––
OK
––– input –––
cosine_similarity() {
    local file1="$1" file2="$2"

    awk '
    NR==FNR { a[NR]=$1; suma2+=$1*$1; next }
    {
        dot += a[FNR]*$1
        sumb2 += $1*$1
    }
    END {
        print dot / (sqrt(suma2) * sqrt(sumb2))
    }' "$file1" "$file2"
}
––– output –––
OK
––– input –––
export -f cosine_similarity
––– output –––
OK
––– input –––
mysql -h0 -P9306 -e "CREATE TABLE test_invalid_model (title TEXT, embedding FLOAT_VECTOR KNN_TYPE='hnsw' HNSW_SIMILARITY='l2' MODEL_NAME = 'openai/invalid-model-name-12345' FROM = 'title') " 2>&1
––– output –––
OK
––– input –––
mysql -h0 -P9306 -e "CREATE TABLE test_valid_model_no_api_key (title TEXT, embedding FLOAT_VECTOR KNN_TYPE='hnsw' HNSW_SIMILARITY='l2' MODEL_NAME = 'openai/text-embedding-ada-002' FROM = 'title') " 2>&1
––– output –––
OK
––– input –––
mysql -h0 -P9306 -e "CREATE TABLE test_openai_remote (title TEXT, content TEXT, description TEXT, embedding FLOAT_VECTOR KNN_TYPE='hnsw' HNSW_SIMILARITY='l2' MODEL_NAME = 'openai/text-embedding-ada-002' FROM = 'title, content' API_KEY='${OPENAI_API_KEY}') "; echo $?
––– output –––
OK
––– input –––
mysql -h0 -P9306 -e "SHOW CREATE TABLE test_openai_remote"
––– output –––
OK
––– input –––
mysql -h0 -P9306 -e "INSERT INTO test_openai_remote (id, title, content, description) VALUES(1, 'machine learning algorithms', 'deep neural networks and artificial intelligence', 'advanced AI research')"; echo $?
––– output –––
OK
––– input –––
mysql -h0 -P9306 -e "SELECT COUNT(*) as record_count FROM test_openai_remote WHERE id=1"
––– output –––
OK
––– input –––
mysql -h0 -P9306 -e "INSERT INTO test_openai_remote (id, title, content, description) VALUES(2, 'machine learning algorithms', 'deep neural networks and artificial intelligence', 'different description')"

mysql -h0 -P9306 -e "SELECT embedding FROM test_openai_remote WHERE id=1" | \
    grep -v embedding | \
    sed 's/[0-9]\+\(\.[0-9]\+\)\?/\n&\n/g' | \
    grep -E '^[0-9]+(\.[0-9]+)?$' | \
    awk '{printf "%.5f\n", $1}' > /tmp/vector1.txt

mysql -h0 -P9306 -e "SELECT embedding FROM test_openai_remote WHERE id=2" | \
    grep -v embedding | \
    sed 's/[0-9]\+\(\.[0-9]\+\)\?/\n&\n/g' | \
    grep -E '^[0-9]+(\.[0-9]+)?$' | \
    awk '{printf "%.5f\n", $1}' > /tmp/vector2.txt

SIMILARITY=$(cosine_similarity /tmp/vector1.txt /tmp/vector2.txt)

echo "Cosine similarity: $SIMILARITY"

RESULT=$(awk -v sim="$SIMILARITY" 'BEGIN {
    if (sim > 0.99)
        print "SUCCESS: Same FROM fields produce similar vectors (similarity: " sim ")"
    else
        print "FAIL: Different vectors (FROM does not include description field and should not change generated vector value) (similarity: " sim ")"
}')

echo "$RESULT"

rm -f /tmp/vector1.txt /tmp/vector2.txt
––– output –––
OK
––– input –––
mysql -h0 -P9306 -e "CREATE TABLE test_openai_title_only (title TEXT, content TEXT, embedding FLOAT_VECTOR KNN_TYPE='hnsw' HNSW_SIMILARITY='l2' MODEL_NAME = 'openai/text-embedding-ada-002' FROM = 'title' API_KEY='${OPENAI_API_KEY}') "; mysql -h0 -P9306 -e "INSERT INTO test_openai_title_only (id, title, content) VALUES(1, 'machine learning algorithms', 'completely different content here')"; MD5_MULTI=$(mysql -h0 -P9306 -e "SELECT embedding FROM test_openai_remote WHERE id=1" | grep -v embedding | md5sum | awk '{print $1}'); MD5_SINGLE=$(mysql -h0 -P9306 -e "SELECT embedding FROM test_openai_title_only WHERE id=1" | grep -v embedding | md5sum | awk '{print $1}'); echo "multi_field_md5: $MD5_MULTI"; echo "single_field_md5: $MD5_SINGLE"; if [ "$MD5_MULTI" != "$MD5_SINGLE" ]; then echo "SUCCESS: Different FROM specifications produce different vectors"; else echo "INFO: FROM field comparison result"; fi
––– output –––
OK
––– input –––
mysql -h0 -P9306 -e "CREATE TABLE test_openai_invalid_field (title TEXT, embedding FLOAT_VECTOR KNN_TYPE='hnsw' HNSW_SIMILARITY='l2' MODEL_NAME = 'openai/text-embedding-ada-002' FROM = 'nonexistent_field') " 2>&1
––– output –––
OK
––– input –––
if mysql -h0 -P9306 -e "SHOW TABLES LIKE 'test_openai_no_from'" | grep -q test_openai_no_from; then mysql -h0 -P9306 -e "INSERT INTO test_openai_no_from (id, title, embedding) VALUES(1, 'test title', '(0.1, 0.2, 0.3, 0.4, 0.5)')"; echo "insert_result: $?"; else echo "insert_result: skipped (table not created)"; fi
––– output –––
OK
––– input –––
if mysql -h0 -P9306 -e "SHOW TABLES LIKE 'test_openai_no_from'" | grep -q test_openai_no_from; then mysql -h0 -P9306 -e "SHOW CREATE TABLE test_openai_no_from"; else echo "table_structure: skipped (table not created)"; fi
––– output –––
OK
––– input –––
if [ -n "$OPENAI_API_KEY" ] && [ "$OPENAI_API_KEY" != "dummy_key_for_testing" ]; then echo "API key is available for testing"; else echo "API key not available - using dummy for error testing"; fi
––– output –––
OK
––– input –––
mysql -h0 -P9306 -e "SELECT id, knn_dist() FROM test_openai_remote WHERE knn(embedding, 3, 'machine learning and artificial intelligence')"
––– output –––
- #!/.*id.*knn_dist().*/!#
+ +------+------------+
- #!/.*[12].*[0-9]+\.[0-9]+.*/!#
+ | id   | knn_dist() |
- #!/.*[12].*[0-9]+\.[0-9]+.*/!#
+ +------+------------+
+ |    2 | 0.05504094 |
+ |    1 | 0.05504094 |
+ +------+------------+
––– input –––
mysql -h0 -P9306 -e "SELECT COUNT(*) as count FROM test_openai_remote WHERE knn(embedding, 5, 'technology and AI') AND id > 0"
––– output –––
OK

@github-actions
Copy link
Contributor

clt

❌ CLT tests in test/clt-tests/sharding/cluster/ test/clt-tests/sharding/functional/ test/clt-tests/sharding/replication/ test/clt-tests/sharding/syntax/
✅ OK: 18
❌ Failed: 1
⏳ Duration: 281s
👉 Check Action Results for commit b880bbe

Failed tests:

🔧 Edit failed tests in UI:

test/clt-tests/sharding/functional/functional-sharding-positive.rec
––– input –––
export INSTANCE=1
––– output –––
OK
––– input –––
mkdir -p /var/{run,lib,log}/manticore-${INSTANCE}
––– output –––
OK
––– input –––
stdbuf -oL searchd -c test/clt-tests/base/searchd-with-flexible-ports.conf > /dev/null
––– output –––
OK
––– input –––
if timeout 10 grep -qm1 '\[BUDDY\] started' <(tail -n 1000 -f /var/log/manticore-${INSTANCE}/searchd.log); then echo 'Buddy started!'; else echo 'Timeout or failed!'; cat /var/log/manticore-${INSTANCE}/searchd.log; fi
––– output –––
OK
––– input –––
export INSTANCE=2
––– output –––
OK
––– input –––
mkdir -p /var/{run,lib,log}/manticore-${INSTANCE}
––– output –––
OK
––– input –––
stdbuf -oL searchd -c test/clt-tests/base/searchd-with-flexible-ports.conf > /dev/null
––– output –––
OK
––– input –––
if timeout 10 grep -qm1 '\[BUDDY\] started' <(tail -n 1000 -f /var/log/manticore-${INSTANCE}/searchd.log); then echo 'Buddy started!'; else echo 'Timeout or failed!'; cat /var/log/manticore-${INSTANCE}/searchd.log; fi
––– output –––
OK
––– input –––
export INSTANCE=3
––– output –––
OK
––– input –––
mkdir -p /var/{run,lib,log}/manticore-${INSTANCE}
––– output –––
OK
––– input –––
stdbuf -oL searchd -c test/clt-tests/base/searchd-with-flexible-ports.conf > /dev/null
––– output –––
OK
––– input –––
if timeout 10 grep -qm1 '\[BUDDY\] started' <(tail -n 1000 -f /var/log/manticore-${INSTANCE}/searchd.log); then echo 'Buddy started!'; else echo 'Timeout or failed!'; cat /var/log/manticore-${INSTANCE}/searchd.log; fi
––– output –––
OK
––– input –––
export CLUSTER_NAME=replication
––– output –––
OK
––– input –––
mysql -h0 -P1306 -e "create cluster ${CLUSTER_NAME}"
––– output –––
OK
––– input –––
mysql -h0 -P1306 -e "show status like 'cluster_${CLUSTER_NAME}_status'\G"
––– output –––
OK
––– input –––
for n in `seq 2 $INSTANCE`; do mysql -h0 -P${n}306 -e "join cluster ${CLUSTER_NAME} at '127.0.0.1:1312'"; done;
––– output –––
OK
––– input –––
mysql -h0 -P${INSTANCE}306 -e "show status like 'cluster_${CLUSTER_NAME}_status'\G"
––– output –––
OK
––– input –––
for port in 1306 2306 3306; do timeout 60 mysql -h0 -P$port -e "SHOW STATUS LIKE 'cluster_replication_status'\G" > /tmp/status_$port.log && grep -q "Value: primary" /tmp/status_$port.log && echo "Port $port: Node synced"; done && cat /tmp/status_1306.log
––– output –––
OK
––– input –––
mysql -h0 -P1306 -e "create table ${CLUSTER_NAME}:tbl1(id bigint) shards='5' rf='2'"; echo $?
––– output –––
OK
––– input –––
for i in 1 2 3; do mysql -h0 -P${i}306 -e "show tables from system\G"; done | grep 'system.tbl1_s' | sort -V
––– output –––
OK
––– input –––
mysql -h0 -P2306 -e "create table ${CLUSTER_NAME}:tbl2(id bigint) shards='4' rf='1'"; echo $?
––– output –––
OK
––– input –––
for i in 1 2 3; do mysql -h0 -P${i}306 -e "show tables from system\G"; done | grep 'system.tbl2_s' | sort -V
––– output –––
OK
––– input –––
mysql -h0 -P3306 -e "create table ${CLUSTER_NAME}:tbl3(id bigint) shards='5' rf='2'"; echo $?
––– output –––
OK
––– input –––
for i in 1 2 3; do mysql -h0 -P${i}306 -e "show tables from system\G"; done | grep 'system.tbl3_s' | sort -V
––– output –––
OK
––– input –––
mysql -h0 -P3306 -e "DROP TABLE ${CLUSTER_NAME}:tbl3"; echo $?
––– output –––
- 0
+ ERROR 1064 (42000) at line 1: Waiting timeout exceeded.
+ 1
––– input –––
mysql -h0 -P3306 -e "create table ${CLUSTER_NAME}:tbl3(id bigint) shards='10' rf='3'"; echo $?
––– output –––
OK
––– input –––
for i in 1 2 3; do mysql -h0 -P${i}306 -e "show tables from system\G"; done | grep 'system.tbl3_s' | sort -V
––– output –––
- Table: system.tbl3_s0
- Table: system.tbl3_s0
- Table: system.tbl3_s0
- Table: system.tbl3_s1
- Table: system.tbl3_s1
- Table: system.tbl3_s1
- Table: system.tbl3_s2
- Table: system.tbl3_s2
- Table: system.tbl3_s2
- Table: system.tbl3_s3
- Table: system.tbl3_s3
- Table: system.tbl3_s3
- Table: system.tbl3_s4
- Table: system.tbl3_s4
- Table: system.tbl3_s4
- Table: system.tbl3_s5
- Table: system.tbl3_s5
- Table: system.tbl3_s5
- Table: system.tbl3_s6
- Table: system.tbl3_s6
- Table: system.tbl3_s6
- Table: system.tbl3_s7
- Table: system.tbl3_s7
- Table: system.tbl3_s7
- Table: system.tbl3_s8
- Table: system.tbl3_s8
- Table: system.tbl3_s8
- Table: system.tbl3_s9
- Table: system.tbl3_s9
- Table: system.tbl3_s9
––– input –––
mysql -h0 -P3306 -e "create table ${CLUSTER_NAME}:tbl4(id bigint) shards='3000' rf='3' timeout='60'"; echo $?
––– output –––
OK
––– input –––
for i in 1 2 3; do mysql -h0 -P${i}306 -e "show tables from system\G"; done | grep -c 'system.tbl4_s'
––– output –––
OK
––– input –––
mysql -h0 -P3306 -e "create table ${CLUSTER_NAME}:tbl5(id bigint) shards='3001' rf='3'"?
––– output –––
OK

@github-actions
Copy link
Contributor

clt

❌ CLT tests in test/clt-tests/bugs/
✅ OK: 11
❌ Failed: 1
⏳ Duration: 123s
👉 Check Action Results for commit c9d45ee

Failed tests:

🔧 Edit failed tests in UI:

test/clt-tests/bugs/3847-conflict-handling-verification.rec
––– input –––
set -b +m
––– output –––
OK
––– input –––
grep -q 'threads = 4' test/clt-tests/base/searchd-with-flexible-ports.conf || sed -i '/searchd {/a\	threads = 4' test/clt-tests/base/searchd-with-flexible-ports.conf
––– output –––
OK
––– input –––
export INSTANCE=1
––– output –––
OK
––– input –––
mkdir -p /var/{run,lib,log}/manticore-${INSTANCE}
––– output –––
OK
––– input –––
stdbuf -oL searchd -c test/clt-tests/base/searchd-with-flexible-ports.conf > /dev/null
––– output –––
OK
––– input –––
if timeout 10 grep -qm1 '\[BUDDY\] started' <(tail -n 1000 -f /var/log/manticore-${INSTANCE}/searchd.log); then echo 'Buddy started!'; else echo 'Timeout or failed!'; cat /var/log/manticore-${INSTANCE}/searchd.log; fi
––– output –––
OK
––– input –––
export INSTANCE=2
––– output –––
OK
––– input –––
mkdir -p /var/{run,lib,log}/manticore-${INSTANCE}
––– output –––
OK
––– input –––
stdbuf -oL searchd -c test/clt-tests/base/searchd-with-flexible-ports.conf > /dev/null
––– output –––
OK
––– input –––
if timeout 10 grep -qm1 '\[BUDDY\] started' <(tail -n 1000 -f /var/log/manticore-${INSTANCE}/searchd.log); then echo 'Buddy started!'; else echo 'Timeout or failed!'; cat /var/log/manticore-${INSTANCE}/searchd.log; fi
––– output –––
OK
––– input –––
export INSTANCE=3
––– output –––
OK
––– input –––
mkdir -p /var/{run,lib,log}/manticore-${INSTANCE}
––– output –––
OK
––– input –––
stdbuf -oL searchd -c test/clt-tests/base/searchd-with-flexible-ports.conf > /dev/null
––– output –––
OK
––– input –––
if timeout 10 grep -qm1 '\[BUDDY\] started' <(tail -n 1000 -f /var/log/manticore-${INSTANCE}/searchd.log); then echo 'Buddy started!'; else echo 'Timeout or failed!'; cat /var/log/manticore-${INSTANCE}/searchd.log; fi
––– output –––
OK
––– input –––
wait_for_sync() { sleep 0.5; for i in {1..10}; do c1=$(mysql -h0 -P1306 -sN -e "SELECT COUNT(*) FROM test:tbl1" 2>/dev/null | grep -oE '[0-9]+' | head -1); c2=$(mysql -h0 -P2306 -sN -e "SELECT COUNT(*) FROM test:tbl1" 2>/dev/null | grep -oE '[0-9]+' | head -1); c3=$(mysql -h0 -P3306 -sN -e "SELECT COUNT(*) FROM test:tbl1" 2>/dev/null | grep -oE '[0-9]+' | head -1); if [ "$c1" = "$c2" ] && [ "$c2" = "$c3" ] && [ -n "$c1" ]; then return 0; fi; sleep 0.5; done; return 1; }
––– output –––
OK
––– input –––
mkdir /var/{lib,log}/manticore-{1,2,3}/test
––– output –––
OK
––– input –––
mysql -h0 -P1306 -e "CREATE CLUSTER test 'test' as path"; echo $?
––– output –––
OK
––– input –––
mysql -h0 -P2306 -e "JOIN CLUSTER test at '127.0.0.1:1312' 'test' as path"; echo $?
––– output –––
OK
––– input –––
mysql -h0 -P3306 -e "JOIN CLUSTER test at '127.0.0.1:1312' 'test' as path"; echo $?
––– output –––
OK
––– input –––
sleep 2
––– output –––
OK
––– input –––
mysql -h0 -P1306 -e "CREATE TABLE tbl1 (id bigint, attr1 int)"; echo $?
––– output –––
OK
––– input –––
mysql -h0 -P1306 -e "ALTER CLUSTER test ADD tbl1"; echo $?
––– output –––
OK
––– input –––
mysql -h0 -P1306 -e "INSERT INTO test:tbl1 (id, attr1) VALUES (1,1), (3,2), (10,3), (11,4), (12,5), (13,6), (14,7), (15,8), (20,9)"; echo $?
––– output –––
OK
––– input –––
wait_for_sync && echo "Cluster synchronized" || echo "Sync timeout"
––– output –––
OK
––– input –––
mysql -h0 -P1306 -NB -e "SELECT COUNT(*) FROM test:tbl1\G"
––– output –––
OK
––– input –––
mysql -h0 -P2306 -NB -e "SELECT COUNT(*) FROM test:tbl1\G"
––– output –––
OK
––– input –––
mysql -h0 -P3306 -NB -e "SELECT COUNT(*) FROM test:tbl1\G"
––– output –––
OK
––– input –––
manticore-load --host=127.0.0.1 --threads=4 --port=1306 --total=1000000 --query="REPLACE INTO test:tbl1 (id, attr1) VALUES (%RAND, %RAND)" --together --host=127.0.0.1 --threads=4 --port=2306 --total=1000000 --query="REPLACE INTO test:tbl1 (id, attr1) VALUES (%RAND, %RAND)" > /dev/null 2>&1 & LOAD_PID=$!; sleep 1; echo "Load started: $LOAD_PID"
––– output –––
OK
––– input –––
mysql -h0 -P2306 -e "UPDATE test:tbl1 SET attr1=1 WHERE id=13" & sleep 0.05; mysql -h0 -P1306 -e "REPLACE INTO test:tbl1 (id, attr1) VALUES (10, 999)" & wait
––– output –––
OK
––– input –––
mysql -h0 -P2306 -e "REPLACE INTO test:tbl1 (id, attr1) VALUES (11, 111)" & sleep 0.05; mysql -h0 -P1306 -e "REPLACE INTO test:tbl1 (id, attr1) VALUES (10, 101)" & wait
––– output –––
OK
––– input –––
mysql -h0 -P2306 -e "DELETE FROM test:tbl1 WHERE id=3" & sleep 0.05; mysql -h0 -P1306 -e "REPLACE INTO test:tbl1 (id, attr1) VALUES (10, 102)" & wait
––– output –––
OK
––– input –––
mysql -h0 -P2306 -e "INSERT INTO test:tbl1 (id, attr1) VALUES (100, 1)" & sleep 0.05; mysql -h0 -P1306 -e "INSERT INTO test:tbl1 (id, attr1) VALUES (200, 2)" & wait
––– output –––
OK
––– input –––
conflicts=0; for i in {1..30}; do result=$( (mysql -h0 -P2306 -e "UPDATE test:tbl1 SET attr1=1 WHERE id=13" 2>&1 & mysql -h0 -P1306 -e "REPLACE INTO test:tbl1 (id, attr1) VALUES (13, 999)" 2>&1 & wait) ); if echo "$result" | grep -q "error at PostRollback"; then ((conflicts++)); fi; done; echo "Conflicts: $conflicts/30"; test $conflicts -ge 1 && echo "PASS" || echo "FAIL"
––– output –––
OK
––– input –––
conflicts=0; for i in {1..30}; do result=$( (mysql -h0 -P2306 -e "UPDATE test:tbl1 SET attr1=1 WHERE id>13" 2>&1 & mysql -h0 -P1306 -e "REPLACE INTO test:tbl1 (id, attr1) VALUES (14, 888)" 2>&1 & wait) ); if echo "$result" | grep -q "error at PostRollback"; then ((conflicts++)); fi; done; echo "Conflicts: $conflicts/30"; test $conflicts -ge 1 && echo "PASS" || echo "FAIL"
––– output –––
OK
––– input –––
mysql -h0 -P1306 -e "REPLACE INTO test:tbl1 (id, attr1) VALUES (3, 2)" > /dev/null 2>&1; sleep 2
––– output –––
OK
––– input –––
conflicts=0; for i in {1..30}; do result=$( (mysql -h0 -P2306 -e "UPDATE test:tbl1 SET attr1=1 WHERE attr1=2" 2>&1 & mysql -h0 -P1306 -e "REPLACE INTO test:tbl1 (id, attr1) VALUES (3, 333)" 2>&1 & wait) ); if echo "$result" | grep -q "error at PostRollback"; then ((conflicts++)); fi; done; echo "Conflicts: $conflicts/30"; test $conflicts -ge 1 && echo "PASS" || echo "FAIL"
––– output –––
OK
––– input –––
mysql -h0 -P1306 -e "REPLACE INTO test:tbl1 (id, attr1) VALUES (3, 2)" > /dev/null 2>&1; sleep 2
––– output –––
OK
––– input –––
conflicts=0; for i in {1..30}; do result=$( (mysql -h0 -P2306 -e "UPDATE test:tbl1 SET attr1=1 WHERE attr1=2" 2>&1 & mysql -h0 -P1306 -e "DELETE FROM test:tbl1 WHERE id=3" 2>&1 & wait) ); if echo "$result" | grep -q "error at PostRollback"; then ((conflicts++)); fi; done; echo "Conflicts: $conflicts/30"; test $conflicts -ge 1 && echo "PASS" || echo "FAIL"
––– output –––
OK
––– input –––
mysql -h0 -P1306 -e "REPLACE INTO test:tbl1 (id, attr1) VALUES (3, 2)" > /dev/null 2>&1; sleep 2
––– output –––
OK
––– input –––
conflicts=0; for i in {1..30}; do result=$( (mysql -h0 -P2306 -e "DELETE FROM test:tbl1 WHERE id=3" 2>&1 & mysql -h0 -P1306 -e "REPLACE INTO test:tbl1 (id, attr1) VALUES (3, 303)" 2>&1 & wait) ); if echo "$result" | grep -q "error at PostRollback"; then ((conflicts++)); fi; done; echo "Conflicts: $conflicts/30"; test $conflicts -ge 1 && echo "PASS" || echo "FAIL"
––– output –––
OK
––– input –––
mysql -h0 -P1306 -e "REPLACE INTO test:tbl1 (id, attr1) VALUES (1, 1)" > /dev/null 2>&1; sleep 2
––– output –––
OK
––– input –––
conflicts=0; for i in {1..30}; do result=$( (mysql -h0 -P2306 -e "DELETE FROM test:tbl1 WHERE id=1" 2>&1 & mysql -h0 -P1306 -e "DELETE FROM test:tbl1 WHERE id=1" 2>&1 & wait) ); if echo "$result" | grep -q "error at PostRollback"; then ((conflicts++)); fi; done; echo "Conflicts: $conflicts/30"; test $conflicts -ge 1 && echo "PASS" || echo "FAIL"
––– output –––
OK
––– input –––
conflicts=0; for i in {1..30}; do result=$( (mysql -h0 -P2306 -e "UPDATE test:tbl1 SET attr1=111 WHERE id=15" 2>&1 & mysql -h0 -P1306 -e "UPDATE test:tbl1 SET attr1=222 WHERE id=15" 2>&1 & wait) ); if echo "$result" | grep -q "error at PostRollback"; then ((conflicts++)); fi; done; echo "Conflicts: $conflicts/30"; test $conflicts -ge 1 && echo "PASS" || echo "FAIL"
––– output –––
OK
––– input –––
mysql -h0 -P1306 -e "REPLACE INTO test:tbl1 (id, attr1) VALUES (1, 1001)" & sleep 0.05; mysql -h0 -P2306 -e "REPLACE INTO test:tbl1 (id, attr1) VALUES (10, 1010)" & sleep 0.05; mysql -h0 -P3306 -e "REPLACE INTO test:tbl1 (id, attr1) VALUES (20, 1020)" & wait
––– output –––
OK
––– input –––
conflicts=0; for i in {1..30}; do result=$( (mysql -h0 -P1306 -e "UPDATE test:tbl1 SET attr1=100 WHERE id=12" 2>&1 & mysql -h0 -P2306 -e "UPDATE test:tbl1 SET attr1=200 WHERE id=12" 2>&1 & mysql -h0 -P3306 -e "UPDATE test:tbl1 SET attr1=300 WHERE id=12" 2>&1 & wait) ); if echo "$result" | grep -q "error at PostRollback"; then ((conflicts++)); fi; done; echo "Conflicts: $conflicts/30"; test $conflicts -ge 1 && echo "PASS" || echo "FAIL"
––– output –––
OK
––– input –––
mysql -h0 -P1306 -e "REPLACE INTO test:tbl1 (id, attr1) VALUES (14, 14)" > /dev/null 2>&1; sleep 2
––– output –––
OK
––– input –––
conflicts=0; for i in {1..30}; do result=$( (mysql -h0 -P1306 -e "DELETE FROM test:tbl1 WHERE id=14" 2>&1 & mysql -h0 -P2306 -e "DELETE FROM test:tbl1 WHERE id=14" 2>&1 & mysql -h0 -P3306 -e "REPLACE INTO test:tbl1 (id, attr1) VALUES (15, 1500)" 2>&1 & wait) ); if echo "$result" | grep -q "error at PostRollback"; then ((conflicts++)); fi; done; echo "Conflicts: $conflicts/30"; test $conflicts -ge 1 && echo "PASS" || echo "FAIL"
––– output –––
Conflicts: %{NUMBER}/30
- PASS
+ FAIL
––– input –––
kill $LOAD_PID 2>/dev/null; wait $LOAD_PID 2>/dev/null; echo "Load stopped"
––– output –––
OK
––– input –––
wait_for_sync && echo "Final sync successful" || echo "Final sync failed"
––– output –––
OK
––– input –––
mysql -h0 -P1306 -NB -e "SELECT COUNT(*) FROM test:tbl1\G"
––– output –––
OK
––– input –––
mysql -h0 -P2306 -NB -e "SELECT COUNT(*) FROM test:tbl1\G"
––– output –––
OK
––– input –––
mysql -h0 -P3306 -NB -e "SELECT COUNT(*) FROM test:tbl1\G"
––– output –––
OK
––– input –––
c1=$(mysql -h0 -P1306 -sN -e "SELECT COUNT(*) FROM test:tbl1" | grep -oE '[0-9]+' | head -1); c2=$(mysql -h0 -P2306 -sN -e "SELECT COUNT(*) FROM test:tbl1" | grep -oE '[0-9]+' | head -1); c3=$(mysql -h0 -P3306 -sN -e "SELECT COUNT(*) FROM test:tbl1" | grep -oE '[0-9]+' | head -1); if [ "$c1" = "$c2" ] && [ "$c2" = "$c3" ]; then echo "All nodes synchronized ($c1 rows)"; else echo "Discrepancies: node1=$c1, node2=$c2, node3=$c3"; fi
––– output –––
OK
––– input –––
for i in 1 2 3; do grep -q 'FATAL:' /var/log/manticore-${i}/searchd.log && echo "Node #$i has FATAL" || echo "Node #$i OK"; done
––– output –––
OK

@PavelShilin89 PavelShilin89 merged commit d73fcf4 into master Nov 28, 2025
115 checks passed
@PavelShilin89 PavelShilin89 deleted the test/add-knn-tests-for-auto-embeddings branch November 28, 2025 15:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add SELECT...WHERE KNN tests for Jina, OpenAI, and Voyage auto-embeddings

3 participants