Commit 22c8182

ai: add Milvus vector database benchmarking support
Add initial AI/ML workflow infrastructure starting with Milvus vector
database benchmarking. This provides a foundation for testing AI systems
with the same rigor as existing kernel testing workflows (fstests,
blktests).

Key features:
- Docker-based Milvus deployment with etcd and MinIO
- Supports using a dedicated drive for docker /var/lib/docker/ including
  custom filesystem configurations
- Python virtual environment management for benchmark dependencies
- Comprehensive benchmarking of vector operations (insert, search, delete)
- A/B testing support for baseline vs development comparisons
- Performance visualization focusing on key metrics (QPS, latency)
- Result collection and analysis infrastructure

Performance Metrics:
The benchmarks focus on two critical vector database metrics:
- QPS (Queries Per Second): Throughput measurement for search operations
- Latency: Response time percentiles (p50, p95, p99) for operations

Recall rate measurement is challenging without ground truth data - the
correct answers must be known beforehand to measure search accuracy.
Since we generate random vectors for testing, establishing meaningful
ground truth would require careful similarity calculations that would
essentially duplicate the work being tested.

Defconfigs:
- ai-milvus-docker: Standard Docker-based Milvus deployment
- ai-milvus-docker-ci: CI-optimized with minimal dataset (1000 vectors)

Workflow integration follows kdevops patterns:

  make defconfig-ai-milvus-docker
  make bringup
  make ai          # Setup infrastructure
  make ai-tests    # Run benchmarks
  make ai-results  # View results

The implementation handles proper cleanup, lock file management, and
comprehensive error handling to ensure reliable benchmark execution.

Generated-by: Claude AI
Signed-off-by: Luis Chamberlain <[email protected]>
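
For readers unfamiliar with the ground-truth issue raised in the commit message, the following is a minimal sketch of how recall@k is commonly computed: an exhaustive (brute-force) search supplies the exact neighbours, and the approximate result is scored against them. It is not part of this commit. The function names, the squared-L2 metric, and the tiny dataset are all illustrative assumptions, and the "approximate" result here is a hand-made stand-in rather than output from Milvus.

```python
import random


def l2_squared(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def exact_top_k(vectors, query, k):
    """Brute-force nearest neighbours: the ground truth for one query."""
    order = sorted(range(len(vectors)), key=lambda i: l2_squared(vectors[i], query))
    return order[:k]


def recall_at_k(approx_ids, exact_ids):
    """Fraction of the exact top-k that the approximate search also returned."""
    if not exact_ids:
        return 0.0
    return len(set(approx_ids) & set(exact_ids)) / len(exact_ids)


# Hypothetical data: 200 random 8-dimensional vectors and one random query.
rng = random.Random(0)
vectors = [[rng.random() for _ in range(8)] for _ in range(200)]
query = [rng.random() for _ in range(8)]

truth = exact_top_k(vectors, query, k=10)
# Stand-in for an ANN result: pretend the index missed two true neighbours.
approx = truth[:8] + [-1, -2]
print(recall_at_k(approx, truth))  # 8 of 10 true neighbours found -> 0.8
```

The brute-force pass costs one distance computation per stored vector per query, which is why it is usually run on a small query sample rather than the full benchmark workload.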
1 parent 4a16017 commit 22c8182

75 files changed: +10001 -1 lines changed

.gitignore

Lines changed: 2 additions & 1 deletion
@@ -32,7 +32,6 @@ scripts/workflows/fstests/lib/__pycache__/
 scripts/workflows/blktests/lib/__pycache__/
 scripts/workflows/lib/__pycache__/
 
-
 include/
 
 # You can override role specific stuff on these
@@ -48,7 +47,9 @@ playbooks/secret.yml
 playbooks/python/workflows/fstests/__pycache__/
 playbooks/python/workflows/fstests/lib/__pycache__/
 playbooks/python/workflows/fstests/gen_results_summary.pyc
+playbooks/roles/ai_run_benchmarks/files/__pycache__/
 
+workflows/ai/results/
 workflows/pynfs/results/
 
 workflows/fstests/new_expunge_files.txt

README.md

Lines changed: 18 additions & 0 deletions
@@ -14,6 +14,7 @@ Table of Contents
 * [reboot-limit](#reboot-limit)
 * [sysbench](#sysbench)
 * [fio-tests](#fio-tests)
+* [AI workflow](#ai-workflow)
 * [kdevops chats](#kdevops-chats)
 * [kdevops on discord](#kdevops-on-discord)
 * [kdevops IRC](#kdevops-irc)
@@ -273,6 +274,22 @@ A/B testing capabilities, and advanced graphing and visualization support. For
 detailed configuration and usage information, refer to the
 [kdevops fio-tests documentation](docs/fio-tests.md).
 
+### AI workflow
+
+kdevops now supports AI/ML system benchmarking, starting with vector databases
+like Milvus. Similar to fstests, you can quickly set up and benchmark AI
+infrastructure with just a few commands:
+
+```bash
+make defconfig-ai-milvus-docker
+make bringup
+make ai
+```
+
+The AI workflow supports A/B testing, filesystem performance impact analysis,
+and comprehensive benchmarking of vector similarity search workloads. For
+details, see the [kdevops AI workflow documentation](docs/ai/README.md).
+
 ## kdevops chats
 
 We use discord and IRC. Right now we have more folks on discord than on IRC.
@@ -324,6 +341,7 @@ want to just use the kernel that comes with your Linux distribution.
 * [kdevops NFS docs](docs/nfs.md)
 * [kdevops selftests docs](docs/selftests.md)
 * [kdevops reboot-limit docs](docs/reboot-limit.md)
+* [kdevops AI workflow docs](docs/ai/README.md)
 
 # kdevops general documentation

defconfigs/ai-milvus-docker

Lines changed: 113 additions & 0 deletions
@@ -0,0 +1,113 @@
+# AI benchmarking configuration for Milvus vector database testing
+CONFIG_KDEVOPS_FIRST_RUN=n
+CONFIG_LIBVIRT=y
+CONFIG_LIBVIRT_URI="qemu:///system"
+CONFIG_LIBVIRT_HOST_PASSTHROUGH=y
+CONFIG_LIBVIRT_MACHINE_TYPE_DEFAULT=y
+CONFIG_LIBVIRT_CPU_MODEL_PASSTHROUGH=y
+CONFIG_LIBVIRT_VCPUS=4
+CONFIG_LIBVIRT_RAM=8192
+CONFIG_LIBVIRT_OS_VARIANT="generic"
+CONFIG_LIBVIRT_STORAGE_POOL_PATH_CUSTOM=n
+CONFIG_LIBVIRT_STORAGE_POOL_CREATE=y
+CONFIG_LIBVIRT_EXTRA_STORAGE_DRIVE_NVME=y
+CONFIG_LIBVIRT_EXTRA_STORAGE_DRIVE_SIZE="100"
+
+# Network configuration
+CONFIG_KDEVOPS_NETWORK_TYPE_NATUAL_BRIDGE=y
+
+# Workflow configuration
+CONFIG_WORKFLOWS=y
+CONFIG_WORKFLOWS_TESTS=y
+CONFIG_WORKFLOWS_LINUX_TESTS=y
+CONFIG_WORKFLOWS_DEDICATED_WORKFLOW=y
+CONFIG_KDEVOPS_WORKFLOW_DEDICATE_AI=y
+
+# AI workflow configuration
+CONFIG_AI_TESTS_VECTOR_DATABASE=y
+CONFIG_AI_VECTOR_DB_MILVUS=y
+CONFIG_AI_VECTOR_DB_MILVUS_DOCKER=y
+
+# Milvus Docker configuration
+CONFIG_AI_VECTOR_DB_MILVUS_CONTAINER_IMAGE_2_5=y
+CONFIG_AI_VECTOR_DB_MILVUS_CONTAINER_IMAGE_STRING="milvusdb/milvus:v2.5.10"
+CONFIG_AI_VECTOR_DB_MILVUS_CONTAINER_NAME="milvus-ai-benchmark"
+CONFIG_AI_VECTOR_DB_MILVUS_ETCD_CONTAINER_IMAGE_STRING="quay.io/coreos/etcd:v3.5.18"
+CONFIG_AI_VECTOR_DB_MILVUS_ETCD_CONTAINER_NAME="milvus-etcd"
+CONFIG_AI_VECTOR_DB_MILVUS_MINIO_CONTAINER_IMAGE_STRING="minio/minio:RELEASE.2023-03-20T20-16-18Z"
+CONFIG_AI_VECTOR_DB_MILVUS_MINIO_CONTAINER_NAME="milvus-minio"
+CONFIG_AI_VECTOR_DB_MILVUS_MINIO_ACCESS_KEY="minioadmin"
+CONFIG_AI_VECTOR_DB_MILVUS_MINIO_SECRET_KEY="minioadmin"
+
+# Docker storage configuration
+CONFIG_AI_VECTOR_DB_MILVUS_DOCKER_DATA_PATH="/data/milvus-data"
+CONFIG_AI_VECTOR_DB_MILVUS_DOCKER_ETCD_DATA_PATH="/data/milvus-etcd"
+CONFIG_AI_VECTOR_DB_MILVUS_DOCKER_MINIO_DATA_PATH="/data/milvus-minio"
+CONFIG_AI_VECTOR_DB_MILVUS_DOCKER_NETWORK_NAME="milvus-network"
+
+# Docker ports
+CONFIG_AI_VECTOR_DB_MILVUS_PORT=19530
+CONFIG_AI_VECTOR_DB_MILVUS_WEB_UI_PORT=9091
+CONFIG_AI_VECTOR_DB_MILVUS_MINIO_API_PORT=9000
+CONFIG_AI_VECTOR_DB_MILVUS_MINIO_CONSOLE_PORT=9001
+CONFIG_AI_VECTOR_DB_MILVUS_ETCD_CLIENT_PORT=2379
+CONFIG_AI_VECTOR_DB_MILVUS_ETCD_PEER_PORT=2380
+
+# Docker resource limits
+CONFIG_AI_VECTOR_DB_MILVUS_MEMORY_LIMIT="8g"
+CONFIG_AI_VECTOR_DB_MILVUS_CPU_LIMIT="4.0"
+CONFIG_AI_VECTOR_DB_MILVUS_ETCD_MEMORY_LIMIT="1g"
+CONFIG_AI_VECTOR_DB_MILVUS_MINIO_MEMORY_LIMIT="2g"
+
+# Milvus connection configuration
+CONFIG_AI_VECTOR_DB_MILVUS_COLLECTION_NAME="benchmark_collection"
+CONFIG_AI_VECTOR_DB_MILVUS_DIMENSION=768
+CONFIG_AI_VECTOR_DB_MILVUS_DATASET_SIZE=1000000
+CONFIG_AI_VECTOR_DB_MILVUS_BATCH_SIZE=10000
+CONFIG_AI_VECTOR_DB_MILVUS_NUM_QUERIES=10000
+
+# Benchmark configuration
+CONFIG_AI_BENCHMARK_ITERATIONS=3
+# Vector dataset configuration
+CONFIG_AI_VECTOR_DB_MILVUS_DIMENSION=128
+
+# Test runtime configuration
+CONFIG_AI_BENCHMARK_RUNTIME="180"
+CONFIG_AI_BENCHMARK_WARMUP_TIME="30"
+
+# Query patterns for CI testing
+CONFIG_AI_BENCHMARK_QUERY_TOPK_1=y
+CONFIG_AI_BENCHMARK_QUERY_TOPK_10=y
+CONFIG_AI_BENCHMARK_QUERY_TOPK_100=n
+
+# Batch size configuration for CI
+CONFIG_AI_BENCHMARK_BATCH_1=y
+CONFIG_AI_BENCHMARK_BATCH_10=y
+CONFIG_AI_BENCHMARK_BATCH_100=n
+
+# Index configuration
+CONFIG_AI_INDEX_HNSW=y
+CONFIG_AI_INDEX_TYPE="HNSW"
+CONFIG_AI_INDEX_HNSW_M=16
+CONFIG_AI_INDEX_HNSW_EF_CONSTRUCTION=200
+CONFIG_AI_INDEX_HNSW_EF=64
+
+# Results and graphing
+CONFIG_AI_BENCHMARK_RESULTS_DIR="/data/ai-benchmark"
+CONFIG_AI_BENCHMARK_ENABLE_GRAPHING=y
+CONFIG_AI_BENCHMARK_GRAPH_FORMAT="png"
+CONFIG_AI_BENCHMARK_GRAPH_DPI=300
+CONFIG_AI_BENCHMARK_GRAPH_THEME="default"
+
+# Filesystem configuration
+CONFIG_AI_FILESYSTEM_XFS=y
+CONFIG_AI_FILESYSTEM="xfs"
+CONFIG_AI_FSTYPE="xfs"
+CONFIG_AI_XFS_MKFS_OPTS="-f -s size=4096"
+CONFIG_AI_XFS_MOUNT_OPTS="rw,relatime,attr2,inode64,logbufs=8,logbsize=32k,noquota"
+
+# Baseline/dev testing setup
+CONFIG_KDEVOPS_BASELINE_AND_DEV=y
+# Build Linux
+CONFIG_WORKFLOW_LINUX_CUSTOM=y
+CONFIG_BOOTLINUX_AB_DIFFERENT_REF=y
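
Review note: the defconfig above assigns `CONFIG_AI_VECTOR_DB_MILVUS_DIMENSION` twice, first `768` and later `128`. Kconfig front ends generally honor the later assignment, but that is worth confirming against the kdevops tooling rather than assuming. A small checker along the following lines could flag such repeats before they land. This is only a sketch: `find_reassigned_symbols` is a hypothetical helper, not something this commit provides, and it assumes plain `KEY=value` lines.

```python
from collections import defaultdict


def find_reassigned_symbols(defconfig_text):
    """Return {symbol: [values...]} for symbols assigned more than once."""
    seen = defaultdict(list)
    for line in defconfig_text.splitlines():
        line = line.strip()
        # Skip blanks, comments, and anything that is not KEY=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        symbol, value = line.split("=", 1)
        seen[symbol].append(value)
    return {sym: vals for sym, vals in seen.items() if len(vals) > 1}


# Excerpt reproducing the two assignments found in defconfigs/ai-milvus-docker.
sample = """\
CONFIG_AI_VECTOR_DB_MILVUS_DIMENSION=768
CONFIG_AI_BENCHMARK_ITERATIONS=3
# Vector dataset configuration
CONFIG_AI_VECTOR_DB_MILVUS_DIMENSION=128
"""

print(find_reassigned_symbols(sample))
```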

defconfigs/ai-milvus-docker-ci

Lines changed: 51 additions & 0 deletions
@@ -0,0 +1,51 @@
+# SPDX-License-Identifier: copyleft-next-0.3.1
+#
+# AI vector database benchmarking for CI testing
+# Uses minimal dataset size and short runtime for quick verification
+
+CONFIG_KDEVOPS_FIRST_RUN=y
+CONFIG_GUESTFS=y
+CONFIG_GUESTFS_DEBIAN=y
+CONFIG_GUESTFS_DEBIAN_TRIXIE=y
+
+# Enable AI workflow
+CONFIG_WORKFLOWS_TESTS=y
+CONFIG_WORKFLOWS_LINUX_TESTS=y
+CONFIG_WORKFLOWS_DEDICATED_WORKFLOW=y
+CONFIG_KDEVOPS_WORKFLOW_DEDICATE_AI=y
+CONFIG_AI_TESTS_VECTOR_DATABASE=y
+
+# Docker deployment
+CONFIG_AI_VECTOR_DB_MILVUS=y
+CONFIG_AI_VECTOR_DB_MILVUS_DOCKER=y
+
+# CI-optimized: Use custom small dataset
+CONFIG_AI_DATASET_CUSTOM=y
+
+# Small vector dimensions for faster processing
+CONFIG_AI_VECTOR_DIM_128=y
+
+# Minimal query configurations
+CONFIG_AI_BENCHMARK_QUERY_TOPK_1=y
+CONFIG_AI_BENCHMARK_BATCH_1=y
+
+# Fast HNSW indexing
+CONFIG_AI_INDEX_HNSW=y
+
+# Short runtime for CI
+# These will be overridden by environment variables in CI:
+# AI_VECTOR_DATASET_SIZE=1000
+# AI_BENCHMARK_RUNTIME=30
+
+# Reduced resource limits for CI
+CONFIG_AI_VECTOR_DB_MILVUS_MEMORY_LIMIT="2g"
+CONFIG_AI_VECTOR_DB_MILVUS_CPU_LIMIT="2.0"
+
+# Enable graphing for result verification
+CONFIG_AI_BENCHMARK_ENABLE_GRAPHING=y
+
+# XFS filesystem (fastest for AI workloads)
+CONFIG_AI_FILESYSTEM_XFS=y
+
+# A/B testing enabled for baseline/dev comparison
+CONFIG_KDEVOPS_BASELINE_AND_DEV=y
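
The CI defconfig above says `AI_VECTOR_DATASET_SIZE` and `AI_BENCHMARK_RUNTIME` are overridden through environment variables, but the part of the diff shown here does not reveal how kdevops consumes them. The sketch below only illustrates the general pattern of "environment value wins, otherwise fall back to a default". `resolve_settings` and the `DEFAULTS` table are hypothetical, and the fallback values are borrowed from `defconfigs/ai-milvus-docker` purely for illustration.

```python
import os

# Assumed fallbacks, taken from the non-CI defconfig for illustration only.
DEFAULTS = {
    "AI_VECTOR_DATASET_SIZE": 1_000_000,
    "AI_BENCHMARK_RUNTIME": 180,
}


def resolve_settings(environ=None):
    """Return benchmark settings, letting environment variables override defaults."""
    environ = os.environ if environ is None else environ
    settings = {}
    for name, default in DEFAULTS.items():
        raw = environ.get(name)
        # Treat unset and empty variables the same way: use the default.
        settings[name] = int(raw) if raw not in (None, "") else default
    return settings


# Simulate the CI invocation without touching the real process environment.
ci_env = {"AI_VECTOR_DATASET_SIZE": "1000", "AI_BENCHMARK_RUNTIME": "30"}
print(resolve_settings(ci_env))
```

Passing the environment as an argument keeps the function easy to exercise in tests; calling `resolve_settings()` with no argument reads `os.environ`.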

docs/ai/README.md

Lines changed: 108 additions & 0 deletions
@@ -0,0 +1,108 @@
+# AI Workflow Documentation
+
+The kdevops AI workflow provides infrastructure for benchmarking and testing AI/ML systems, with initial support for vector databases.
+
+## Quick Start
+
+Just like other kdevops workflows (fstests, blktests), the AI workflow follows the same pattern:
+
+```bash
+make defconfig-ai-milvus-docker # Configure for AI vector database testing
+make bringup # Bring up the test environment
+make ai # Run the AI benchmarks
+make ai-baseline # Establish baseline results
+make ai-results # View results
+```
+
+## Supported Components
+
+### Vector Databases
+- [Milvus](vector-databases/milvus.md) - High-performance vector database for AI applications
+
+### Future Components (Planned)
+- Language Models (LLMs)
+- Embedding Services
+- Training Infrastructure
+- Inference Servers
+
+## Configuration Options
+
+The AI workflow can be configured through `make menuconfig`:
+
+1. **Vector Database Selection**
+- Milvus (Docker or Native deployment)
+- Future: Weaviate, Qdrant, Pinecone
+
+2. **Dataset Configuration**
+- Dataset size (number of vectors)
+- Vector dimensions
+- Batch sizes
+
+3. **Benchmark Parameters**
+- Query patterns
+- Concurrency levels
+- Runtime duration
+
+4. **Filesystem Testing**
+- Test on different filesystems (XFS, ext4, btrfs)
+- Compare performance across storage configurations
+
+## Pre-built Configurations
+
+Quick configurations for common use cases:
+
+- `defconfig-ai-milvus-docker` - Docker-based Milvus deployment
+- `defconfig-ai-milvus-docker-ci` - CI-optimized with minimal dataset
+- `defconfig-ai-milvus-native` - Native Milvus installation from source
+- `defconfig-ai-milvus-multifs` - Multi-filesystem performance comparison
+
+## A/B Testing Support
+
+Like other kdevops workflows, AI supports baseline/dev comparisons:
+
+```bash
+# Configure with A/B testing
+make menuconfig # Enable CONFIG_KDEVOPS_BASELINE_AND_DEV
+make ai-baseline # Run on baseline
+make ai-dev # Run on dev
+make ai-results # Compare results
+```
+
+## Results and Analysis
+
+The AI workflow generates comprehensive performance metrics:
+
+- Throughput (operations/second)
+- Latency percentiles (p50, p95, p99)
+- Resource utilization
+- Performance graphs and trends
+
+Results are stored in the configured results directory (default: `/data/ai-results/`).
+
+## Integration with CI/CD
+
+The workflow includes CI-optimized configurations that use:
+- Minimal datasets for quick validation
+- `/dev/null` storage for I/O testing without disk requirements
+- Environment variable overrides for runtime configuration
+
+Example CI usage:
+```bash
+AI_VECTOR_DATASET_SIZE=1000 AI_BENCHMARK_RUNTIME=30 make defconfig-ai-milvus-docker-ci
+make bringup
+make ai
+```
+
+## Workflow Architecture
+
+The AI workflow follows kdevops patterns:
+
+1. **Configuration** - Kconfig-based configuration system
+2. **Provisioning** - Ansible-based infrastructure setup
+3. **Execution** - Standardized test execution
+4. **Collection** - Automated result collection and analysis
+5. **Reporting** - Performance visualization and comparison
+
+For detailed usage of specific components, see:
+- [Vector Databases Overview](vector-databases/README.md)
+- [Milvus Usage Guide](vector-databases/milvus.md)
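
The documentation above names throughput and p50/p95/p99 latency as the reported metrics and describes baseline/dev comparison. As an illustration of what those numbers mean, the sketch below derives them from a list of per-query latencies and then expresses the dev run as a percentage change from baseline. It is an assumption-laden example, not the commit's analysis code: `summarize`, `percent_change`, and the synthetic latencies are hypothetical, and the percentile method (linear interpolation, inclusive) is one of several reasonable choices.

```python
import statistics


def summarize(latencies_ms, wall_seconds):
    """Compute QPS and latency percentiles from per-query latencies."""
    # quantiles(n=100) returns 99 cut points; index i-1 is the i-th percentile.
    cuts = statistics.quantiles(latencies_ms, n=100, method="inclusive")
    return {
        "qps": len(latencies_ms) / wall_seconds,
        "p50": cuts[49],
        "p95": cuts[94],
        "p99": cuts[98],
    }


def percent_change(baseline, dev):
    """Relative change of each dev metric versus baseline, in percent."""
    return {key: 100.0 * (dev[key] - baseline[key]) / baseline[key] for key in baseline}


# Synthetic runs: dev latencies are 10% higher, same query count and wall time.
baseline = summarize([float(ms) for ms in range(1, 101)], wall_seconds=2.0)
dev = summarize([float(ms) * 1.1 for ms in range(1, 101)], wall_seconds=2.0)

print(baseline)
print(percent_change(baseline, dev))
```

Note that a positive change is a regression for latency but an improvement for QPS, so any real report would need to interpret the sign per metric. `percent_change` also assumes no baseline metric is zero.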
