- Supported Elasticsearch versions: 7.x and 8.x
- Supported Milvus 2.x versions: 2.2+

The tool migrates data from an ES index to a Milvus 2.x collection. The following ES field types are currently supported for migration:
| ES field Type | Mapping field type of Milvus2.x collection |
|---|---|
| keyword | VarChar |
| text | VarChar |
| long | Int64 |
| integer | Int32 |
| short | Int16 |
| double | Double |
| float | Float |
| boolean | Bool |
| dense_vector | FloatVector |
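The type mapping above can be captured in a small lookup table. The sketch below is purely illustrative (it is not part of the milvus-migration tool) and pre-checks the field types of a `meta.fields` list before you run a migration:

```python
# Hypothetical pre-check of ES field types against the supported mapping;
# this helper is illustrative and not part of milvus-migration itself.
ES_TO_MILVUS = {
    "keyword": "VarChar",
    "text": "VarChar",
    "long": "Int64",
    "integer": "Int32",
    "short": "Int16",
    "double": "Double",
    "float": "Float",
    "boolean": "Bool",
    "dense_vector": "FloatVector",
}

def check_fields(fields):
    """Map each ES field to its Milvus type, raising on unsupported types."""
    result = {}
    for f in fields:
        es_type = f["type"]
        if es_type not in ES_TO_MILVUS:
            raise ValueError(f"unsupported ES field type: {es_type}")
        result[f["name"]] = ES_TO_MILVUS[es_type]
    return result

# Field list taken from the sample migration.yaml in this document
fields = [
    {"name": "id", "type": "long", "pk": True},
    {"name": "data", "type": "dense_vector", "dims": 512},
    {"name": "other_field", "type": "keyword", "maxLen": 60},
]
print(check_fields(fields))
# {'id': 'Int64', 'data': 'FloatVector', 'other_field': 'VarChar'}
```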
- You must migrate one `dense_vector` ES field, which becomes the collection's FloatVector field.
- You must migrate a `keyword` or `long` ES field as the collection's primary key; if none is specified, the ES `_id` is used as the primary key.
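For reference, an ES index satisfying these constraints could have been created with a mapping like the one below (the field names match the sample config in this document; this is an illustrative sketch, not output from the tool):

```json
{
  "mappings": {
    "properties": {
      "id":          { "type": "long" },
      "data":        { "type": "dense_vector", "dims": 512 },
      "other_field": { "type": "keyword" }
    }
  }
}
```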
```yaml
dumper:
  worker:
    workMode: elasticsearch # migration mode: elasticsearch, faiss, milvus1x
    reader:
      bufferSize: 2500 # number of rows read from ES per batch
meta:
  mode: config # config: read the meta config from this file itself
  index: test_es_index # ES index
  fields: # ES index fields to migrate to Milvus 2.x
    - name: id # field name
      type: long
      pk: true # pk=true uses this field as the Milvus collection primary key
    - name: data
      type: dense_vector # the dense_vector field that must be migrated
      dims: 512 # vector dimensions
    - name: other_field
      type: keyword
      maxLen: 60 # maxLen property of the Milvus collection VarChar field
  milvus: # optional configuration
    collection: rename_test_name # if not set, the ES index name is used as the collection name
    closeDynamicField: false
    shardNum: 2
    consistencyLevel: Customized
source: # ES connection configuration
  es:
    urls:
      - http://localhost:9200
    username: x*****x
    password: x*****x
target:
  mode: remote
  remote:
    outputDir: migration/test/xx # temporary storage for source data; cannot start with '/'
    cloud: aws # cloud options: aws (s3, minio), gcp, ali (oss)
    endpoint: localhost:9000 # here, the MinIO server endpoint
    region: ap-southeast-1 # region; not needed for a local server
    bucket: a-bucket # bucket name; must match the Milvus 2.x bucket name
    ak: minioadmin # ak/sk are required when useIAM=false
    sk: minioadmin
    useIAM: false # if useIAM=true, ak/sk are not needed
    useSSL: false
    checkBucket: true # if checkBucket=true, the bucket is created when it does not exist
  milvus2x:
    endpoint: localhost:19530
    username: xxxxx
    password: xxxxx
```

You can place `migration.yaml` in the `configs/` directory; the tool then reads `configs/migration.yaml` automatically when you run:
```shell
./milvus-migration start
```

Alternatively, place `migration.yaml` in any directory and pass its path via the `--config` parameter:

```shell
./milvus-migration start --config=/{YourConfigFilePath}/migration.yaml
```

The migration has succeeded when you see log output like the following:
```
[migration/es_starter.go:25] ["[Starter] migration ES to Milvus finish!!!"] [Cost=80.009174459]
[starter/starter.go:106] ["[Starter] Migration Success!"] [Cost=80.00928425]
[cleaner/remote_cleaner.go:27] ["[Remote Cleaner] Begin to clean files"] [bucket=a-bucket] [rootPath=testfiles/output/zwh/migration]
[cmd/start.go:32] ["[Cleaner] clean file success!"]
```

If you want to verify the migrated data, you can use Attu to inspect your new collection.
- If you don't configure a `pk: true` field, the ES `_id` is used as the primary key by default, with type VarChar and maxLen 65535. This is usually not a good primary-key strategy, so you can configure the primary key explicitly as below:
```yaml
...
meta:
  fields:
    - name: _id # explicitly use _id as the primary key field
      type: keyword # primary key type: `keyword`
      maxLen: 60 # primary key maxLen set to 60
      pk: true
...
```

You can also change the type to `long`:
```yaml
...
meta:
  fields:
    - name: _id
      type: long # primary key type: `long`
      pk: true
...
```

- If your ES server is hosted on Elastic Cloud, you can connect with a config like this:
```yaml
...
source:
  es:
    cloudId: x*********x
    apiKey: x*******x
...
```

- If your ES server uses another auth style, such as a serviceToken, fingerprint, or CA file, add the corresponding authorization config:
```yaml
...
source:
  es:
    fingerprint: x********x
    serviceToken: x******x
    cert: /{YourPath}/http_ca.cert
...
```

- About `target`: if you use Aliyun OSS, your config will look like this:
```yaml
target:
  mode: remote
  remote:
    outputDir: "migration/test/xxx"
    cloud: ali # cloud set to ali
    endpoint: oss-{YourRegion}-internal.aliyuncs.com # if not set, the default OSS endpoint is oss.aliyuncs.com
    region: {YourRegion}
    bucket: {YourBucket}
    useIAM: true # with useIAM=true, ak/sk are not needed
    checkBucket: false
    useSSL: true
```

Or, if you use S3:
```yaml
target:
  mode: remote
  remote:
    outputDir: "migration/test/xxx"
    cloud: aws
    region: {YourRegion}
    bucket: {YourBucket}
    useIAM: true # use IAM to connect to S3
    checkBucket: false
    useSSL: true
```

- If you want to migrate data into a Milvus collection in a database other than the `default` database, add `target.milvus2x.database` to specify the database name. The database is created automatically if it does not exist.
```yaml
...
target:
  milvus2x:
    ...
    database: my_database
...
```

The `dumper` module mainly sets tuning parameters for the migration job.
| Parameter | Description | Example |
|---|---|---|
| dumper.worker.workMode | Work mode | elasticsearch |
| dumper.worker.reader.bufferSize | Number of rows read from ES per batch | Suggested: 2000-4000 |
The `meta` module describes the source data (the ES index) and the target data to create (the Milvus 2.x collection).
| Parameter | Description | Example |
|---|---|---|
| meta.mode | Where to read the meta config from; currently only `config` is supported | config: read from migration.yaml itself |
| meta.index | The ES index to read data from | test_es_index |
| meta.fields | The ES index fields to migrate | see the field parameters below |
| meta.fields.-name | ES field name | id |
| meta.fields.-pk | Whether to use this ES field as the primary key | true, default: false |
| meta.fields.-type | ES field type | long, integer, keyword, float, dense_vector... |
| meta.fields.-maxLen | maxLen in the 2.x collection for a keyword or text ES field | 100, default: 65535 |
| meta.fields.-dims | Dimension of a dense_vector field | 512 |
| meta.milvus | Optional; properties for creating the 2.x collection | see below |
| meta.milvus.collection | 2.x collection name | if empty, the ES index name is used |
| meta.milvus.closeDynamicField | Whether to disable the 2.x collection's dynamic field feature | default: false |
| meta.milvus.consistencyLevel | 2.x collection consistency level | default: the collection's default level |
The `source` module sets the ES connection info.
| Parameter | Description | Example |
|---|---|---|
| source.es.urls | ES server address list | http://localhost:9200 |
| source.es.username | ES server username | xxx |
| source.es.password | ES server password | xxx |
| source.es.cert | ES cert file path | /xxx/http_ca.crt |
| source.es.fingerprint | ES fingerprint | xxxxxxxxxxxx |
| source.es.serviceToken | ES server serviceToken | Bearer xxxxxxxxxx |
| source.es.cloudId | Elastic Cloud cloudId | xx |
| source.es.apiKey | Elastic Cloud apiKey | xxx |
The `target` module sets the dumped-file storage and the Milvus 2.x server info.
| Parameter | Description | Example |
|---|---|---|
| target.mode | Where to store the dumped files | local: store on local disk; remote: store on S3 |
| target.remote.outputDir | Directory path in the bucket for storing files | output/ |
| target.remote.cloud | Which cloud storage holds the Milvus 2.x data | aws (if using MinIO, use aws), gcp, azure, or ali |
| target.remote.endpoint | Endpoint of the Milvus 2.x storage | 127.0.0.1:9000 |
| target.remote.region | Region of the Milvus 2.x storage | any value works for a local MinIO |
| target.remote.bucket | Bucket of the Milvus 2.x storage | must match the bucket configured in milvus.yaml for Milvus 2.x |
| target.remote.ak | Access key of the Milvus 2.x storage | minioadmin |
| target.remote.sk | Secret key of the Milvus 2.x storage | minioadmin |
| target.remote.useIAM | Whether to use an IAM role to connect to the storage | false |
| target.remote.useSSL | Whether to use SSL when connecting to the storage | false for local MinIO; true for remote S3 |
| target.remote.checkBucket | Whether to check that the bucket exists in the storage | true, to verify you can connect to the Milvus 2.x storage |
| target.milvus2x.endpoint | Endpoint of Milvus 2.x | xxxxxx:19530 |
| target.milvus2x.username | Username of Milvus 2.x | root |
| target.milvus2x.password | Password of Milvus 2.x | xxxxxxx |
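The table above lists `target.mode: local`, but the walkthrough only shows `remote` configs. Assuming the local layout mirrors the remote one with a `local.outputDir` key (verify against the tool's own examples before relying on this), a local-disk sketch might look like:

```yaml
target:
  mode: local # store dumped files on local disk instead of object storage
  local:
    outputDir: /tmp/migration-output # hypothetical key; check the milvus-migration examples
```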