|
19 | 19 | - A Weaviate database instance. The following information assumes that you have a Weaviate Cloud (WCD) account with a Weaviate database cluster in that account. |
20 | 20 | [Create a WCD account](https://weaviate.io/developers/wcs/quickstart#create-a-wcd-account). [Create a database cluster](https://weaviate.io/developers/wcs/quickstart#create-a-weaviate-cluster). For other database options, [learn more](https://weaviate.io/developers/weaviate/installation). |
21 | 21 | - The URL and API key for the database cluster. [Get the URL and API key](https://weaviate.io/developers/wcs/quickstart#explore-the-details-panel). |
22 | | - - The name of the target collection in the database. [Create a collection](https://weaviate.io/developers/wcs/tools/collections-tool). |
| 22 | + - The name of the target collection in the database. [Create a collection](https://weaviate.io/developers/wcs/tools/collections-tool). |
| 23 | + |
| 24 | + An existing collection is not required. At runtime, the collection behavior is as follows: |
23 | 25 |
|
24 | | -Weaviate requires the collection to have a data schema before you add data. At minimum, this schema must contain the `record_id` property, as follows: |
| 26 | + For the [Unstructured Platform](/platform/overview): |
| 27 | + |
| 28 | + - If an existing collection name is specified, and Unstructured generates embeddings, |
| 29 | + but the number of dimensions that are generated does not match the existing collection's embedding settings, the run will fail. |
| 30 | + You must change your Unstructured embedding settings or your existing collection's embedding settings to match, and try the run again. |
| 31 | + - If a collection name is not specified, Unstructured creates a new collection in your Weaviate cluster. If Unstructured generates embeddings, |
| 32 | + the new collection's name will be `U<short-workflow-id>_<short-embedding-model-name>_<number-of-dimensions>`. |
| 33 | + If Unstructured does not generate embeddings, the new collection's name will be `U<short-workflow-id>`. |
| 34 | + |
| 35 | + For [Unstructured Ingest](/ingestion/overview): |
| 36 | + |
| 37 | + - If an existing collection name is specified, and Unstructured generates embeddings, |
| 38 | + but the number of dimensions that are generated does not match the existing collection's embedding settings, the run will fail. |
| 39 | + You must change your Unstructured embedding settings or your existing collection's embedding settings to match, and try the run again. |
| 40 | + - If a collection name is not specified, Unstructured creates a new collection in your Weaviate cluster. The new collection's name will be `Elements`. |
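The collection behavior described above can be summarized in a minimal sketch. The function and parameter names here are hypothetical and are not part of any Unstructured or Weaviate API. The short workflow ID and short embedding model name are taken as inputs, because the text does not say how they are derived.

```python
from typing import Optional


def resolve_collection_name(
    product: str,
    specified_name: Optional[str] = None,
    short_workflow_id: Optional[str] = None,
    short_model_name: Optional[str] = None,
    dimensions: Optional[int] = None,
) -> str:
    """Return the collection a run would write to (hypothetical helper)."""
    # A specified collection name is always used as-is.
    if specified_name:
        return specified_name
    # Unstructured Ingest: the new collection is named "Elements".
    if product == "ingest":
        return "Elements"
    # Unstructured Platform: the name depends on whether embeddings are generated.
    if product == "platform":
        if short_workflow_id is None:
            raise ValueError("short_workflow_id is required for the platform")
        if short_model_name is not None and dimensions is not None:
            return f"U{short_workflow_id}_{short_model_name}_{dimensions}"
        return f"U{short_workflow_id}"
    raise ValueError(f"unknown product: {product!r}")


def check_dimensions(generated: int, existing: int) -> None:
    """Mirror the documented failure when embedding dimensions do not match."""
    if generated != existing:
        raise ValueError(
            f"generated embeddings have {generated} dimensions, "
            f"but the existing collection expects {existing}"
        )
```

For example, `resolve_collection_name("ingest")` returns `Elements`, and `check_dimensions(1536, 768)` raises an error, which corresponds to the failed run described above.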
| 41 | + |
| 42 | + If Unstructured creates a new collection and generates embeddings, you will not see an embeddings property in tools such as the Weaviate Cloud |
| 43 | + **Collections** user interface. To view the generated embeddings, you can run a Weaviate GraphQL query such as the following. In this query, replace `<collection-name>` with |
| 44 | + the name of the new collection, and replace `<property-name>` with the name of each additional available property that |
| 45 | + you want to return results for, such as `text`, `type`, `element_id`, `record_id`, and so on. The embeddings will be |
| 46 | + returned in the `vector` property. |
| 47 | + |
| 48 | + ```text |
| 49 | + { |
| 50 | + Get { |
| 51 | + <collection-name> { |
| 52 | + _additional { |
| 53 | + vector |
| 54 | + } |
| 55 | + <property-name> |
| 56 | + <property-name> |
| 57 | + } |
| 58 | + } |
| 59 | + } |
| 60 | + ``` |
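As a minimal sketch, the query above can be assembled programmatically for a given collection and set of properties. The helper name `build_vector_query` is hypothetical. The example only builds the query text and a JSON request body; how you send it (for instance, to your cluster's GraphQL endpoint with your API key) depends on your client and should be checked against the Weaviate documentation.

```python
import json
from typing import Sequence


def build_vector_query(collection_name: str, property_names: Sequence[str]) -> str:
    """Build a GraphQL Get query that returns the vector plus the named properties."""
    # One property per line, indented to sit inside the collection block.
    props = "\n".join(f"      {name}" for name in property_names)
    return (
        "{\n"
        "  Get {\n"
        f"    {collection_name} {{\n"
        "      _additional {\n"
        "        vector\n"
        "      }\n"
        f"{props}\n"
        "    }\n"
        "  }\n"
        "}"
    )


# Example values: "Elements" is the collection that Unstructured Ingest creates
# when no collection name is specified; the properties are those named above.
query = build_vector_query("Elements", ["text", "type", "element_id", "record_id"])

# GraphQL requests are commonly sent as a JSON body with a "query" field.
payload = json.dumps({"query": query})
print(query)
```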
| 61 | + |
| 62 | +Weaviate requires an existing collection to have a data schema before you add data. At minimum, this schema must contain the `record_id` property, as follows: |
25 | 63 |
|
26 | 64 | ```json |
27 | 65 | { |
|