 In the **Create Index** area for the collection, next to **Vector Fields**, click **Edit Index**. Make sure that for the
 `embeddings` field, the **Field Type** is set to **FLOAT_VECTOR** and the **Metric Type** is set to **Cosine**.
 
+<Warning>
+The number of dimensions for the `embeddings` field must match the number of dimensions for the embedding model that you plan to use.
+</Warning>
+
 - For Milvus on IBM watsonx.data, you will need:
 
 <iframe
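Review note: the dimension requirement stated in the added warning can be checked on the client side before any vectors are written. The following is a minimal sketch only; the helper name `check_embedding_dim` and the value `384` in the usage line are illustrative assumptions, not part of this change or of any Milvus API.

```python
def check_embedding_dim(vector, expected_dim):
    """Return the vector unchanged, or raise if its length differs from expected_dim.

    Hypothetical helper: expected_dim should be the dimension configured for the
    collection's "embeddings" field.
    """
    actual = len(vector)
    if actual != expected_dim:
        raise ValueError(
            f"embedding has {actual} dimensions, "
            f"but the collection expects {expected_dim}"
        )
    return vector


# Illustrative usage: 384 is an assumed dimension, chosen only for the example.
check_embedding_dim([0.0] * 384, 384)
```

Failing early in this way gives a clearer message than a rejected insert, but it does not replace setting the correct dimension on the field itself.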
@@ -69,7 +74,7 @@ The following video shows how to fulfill the minimum set of requirements for Mil
 [Get the instance's GRPC host and GRPC port](https://cloud.ibm.com/docs/watsonxdata?topic=watsonxdata-conn-to-milvus).
 - The name of the [database](https://milvus.io/docs/manage_databases.md) in the instance.
 - The name of the [collection](https://milvus.io/docs/manage-collections.md) in the database. Note the collection requirements at the end of this section.
-- The uername and password to access the instance.
+- The username and password to access the instance.
 The username for Milvus on IBM watsonx.data is always `ibmlhapikey`.
 The password for Milvus on IBM watsonx.data is in the form of an IBM Cloud user API key.
 [Get the user API key](https://cloud.ibm.com/docs/account?topic=account-userapikey&interface=ui).
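Review note: the connection details listed in this hunk (GRPC host, GRPC port, the fixed `ibmlhapikey` username, and the API key as the password) can be assembled into a single connection string. The sketch below assumes the `https://user:password@host:port` form and the `MILVUS_*` environment variable names used by the Python example later in this diff. The helper name `milvus_uri_from_env` is hypothetical, and the sketch does no escaping of special characters in the credentials.

```python
import os

# Fixed username for Milvus on IBM watsonx.data, per the documentation text.
WATSONX_MILVUS_USER = "ibmlhapikey"


def milvus_uri_from_env(env=os.environ):
    """Build a connection URI from environment-style settings.

    Hypothetical helper: raises a clear error when a required setting is
    missing, instead of failing later during string concatenation.
    """
    required = ("MILVUS_GRPC_HOST", "MILVUS_GRPC_PORT", "MILVUS_PASSWORD")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    user = env.get("MILVUS_USER", WATSONX_MILVUS_USER)
    host = env["MILVUS_GRPC_HOST"]
    port = env["MILVUS_GRPC_PORT"]
    return f"https://{user}:{env['MILVUS_PASSWORD']}@{host}:{port}"
```

Passing a plain dictionary instead of `os.environ` makes the helper easy to exercise without touching real credentials.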
@@ -84,75 +89,95 @@ The following video shows how to fulfill the minimum set of requirements for Mil
 - The [username and password, or token](https://milvus.io/docs/authenticate.md) to access the instance.
 
 All Milvus instances require the target collection to have a defined schema before Unstructured can write to the collection. The minimum viable
-schema for Unstructured contains only the fields `element_id`, `embeddings`, and `record_id`, as follows. Adding a `text` field is optional but highly recommended.This example code demonstrates the use of the
+schema for Unstructured contains only the fields `element_id`, `embeddings`, `record_id`, and `text`, as follows. This example code demonstrates the use of the
 [Python SDK for Milvus](https://pypi.org/project/pymilvus/) to create a collection with this schema,
-targeting Milvus on IBM watsonx.data. For the `connections.connect` arguments to connect to other types of Milvus deployments, see your Milvus provider's documentation:
+targeting Milvus on IBM watsonx.data. For the `MilvusClient` arguments to connect to other types of Milvus deployments, see your Milvus provider's documentation:
 
 ```python Python
 import os
+
 from pymilvus import (
-    connections,
+    MilvusClient,
     FieldSchema,
     DataType,
-    CollectionSchema,
-    Collection,
+    CollectionSchema
 )
 
-connections.connect(
-    alias="default",
-    host=os.getenv("MILVUS_GRPC_HOST"),
-    port=os.getenv("MILVUS_GRPC_PORT"),
-    user=os.getenv("MILVUS_USER"),
-    password=os.getenv("MILVUS_PASSWORD"),
-    secure=True
+DATABASE_NAME="default"
+COLLECTION_NAME="my_collection"
+
+client = MilvusClient(
+    uri="https://"+
+        os.getenv("MILVUS_USER") +
+        ":"+
+        os.getenv("MILVUS_PASSWORD") +
+        "@"+
+        os.getenv("MILVUS_GRPC_HOST") +
+        ":"+
+        os.getenv("MILVUS_GRPC_PORT"),
+    db_name=DATABASE_NAME
 )
 
-primary_key= FieldSchema(
+primary_key_field= FieldSchema(
     name="element_id",
     dtype=DataType.VARCHAR,
     is_primary=True,
     max_length=200
 )
 
-vector = FieldSchema(
+# IMPORTANT: The number of dimensions for the "embeddings" field
+# must match the number of dimensions for the embedding model
+# that you plan to use.
+embeddings_field = FieldSchema(
     name="embeddings",
     dtype=DataType.FLOAT_VECTOR,
-    dim=3072
+    dim=384
 )
 
-record_id= FieldSchema(
+record_id_field= FieldSchema(
     name="record_id",
     dtype=DataType.VARCHAR,
     max_length=200
 )
 
-text= FieldSchema(
+text_field= FieldSchema(
     name="text",
     dtype=DataType.VARCHAR,
-    max_length=65536
+    max_length=65535
 )
 
 schema = CollectionSchema(
-    fields=[primary_key, vector, record_id, text],
-    enable_dynamic_field=True
+    fields=[
+        primary_key_field,
+        embeddings_field,
+        record_id_field,
+        text_field
+    ]
 )
 
-collection = Collection(
-    name="my_collection",
+client.create_collection(
+    collection_name=COLLECTION_NAME,
     schema=schema,
-    using="default"
+    using=DATABASE_NAME
 )
 
-index_params = {
-    "metric_type": "L2",
-    "index_type": "IVF_FLAT",
-    "params": {"nlist": 1024}
-}
+index_params = client.prepare_index_params()
 
-collection.create_index(
+index_params.add_index(
     field_name="embeddings",
+    metric_type="COSINE",
+    index_type="IVF_FLAT",
+    params={"nlist": 1024}
+)
+
+client.create_index(
+    collection_name=COLLECTION_NAME,
     index_params=index_params
 )
+
+client.load_collection(
+    collection_name=COLLECTION_NAME
+)
 ```
 
 Other approaches, such as [creating collections instantly](https://milvus.io/docs/create-collection-instantly.md) or