| 1 | +--- |
| 2 | +title: Text indexes in Azure Cosmos DB for MongoDB vCore |
| 3 | +titleSuffix: Azure Cosmos DB for MongoDB vCore |
| 4 | +description: How to configure and use text indexes in Azure Cosmos DB for MongoDB vCore |
| 5 | +author: suvishodcitus |
| 6 | +ms.author: suvishod |
| 7 | +ms.reviewer: gahllevy |
| 8 | +ms.service: cosmos-db |
| 9 | +ms.subservice: mongodb-vcore |
| 10 | +ms.custom: build-2023 |
| 11 | +ms.topic: how-to |
| 12 | +ms.date: 07/26/2023 |
| 13 | +--- |
| 14 | + |
| 15 | +# Text indexes in Azure Cosmos DB for MongoDB vCore |
| 16 | + |
| 17 | +[!INCLUDE[MongoDB vCore](../../includes/appliesto-mongodb-vcore.md)] |
| 18 | + |
| 19 | +Azure Cosmos DB for MongoDB vCore provides text indexing for efficient searching and querying of text-based data. The service implements version 2 text indexes, which support case sensitivity but not diacritic sensitivity. This article explains how to use text indexes in Azure Cosmos DB for MongoDB vCore, with practical examples and syntax. |
| 20 | + |
| 21 | +## What are text indexes? |
| 22 | + |
| 23 | +Text indexes in Azure Cosmos DB for MongoDB are special data structures that make text-based queries faster and more efficient. They are designed for textual content such as documents, articles, comments, or any other text-heavy data. To build the index, they apply techniques such as tokenization, stemming, and stop-word removal. |
| 24 | + |
| 25 | +## Defining a text index |
| 26 | + |
| 27 | +For simplicity, consider a blog application that stores articles with the following document structure: |
| 28 | + |
| 29 | +```json |
| 30 | +{ |
| 31 | + "_id": ObjectId("617a34e7a867530bff1b2346"), |
| 32 | + "title": "Azure Cosmos DB - A Game Changer", |
| 33 | + "content": "Azure Cosmos DB is a globally distributed, multi-model database service.", |
| 34 | + "author": "John Doe", |
| 35 | + "category": "Technology", |
| 36 | + "published": true |
| 37 | +} |
| 38 | +``` |
| 39 | + |
| 40 | +To create a text index in Azure Cosmos DB for MongoDB, use the "createIndex" method and specify "text" as the index type for the field. The following example creates a text index on the "title" field in a collection named "articles": |
| 41 | + |
| 42 | +``` |
| 43 | +db.articles.createIndex({ title: "text" }) |
| 44 | +``` |
| 45 | + |
| 46 | +Although only one text index can be defined per collection, that single index can cover multiple fields, so you can run text searches across different fields in your documents. |
| 47 | + |
| 48 | +For example, to search both the "title" and "content" fields, define the text index as: |
| 49 | + |
| 50 | +``` |
| 51 | +db.articles.createIndex({ title: "text", content: "text" }) |
| 52 | +``` |
| 53 | + |
| 54 | +## Text index options |
| 55 | + |
| 56 | +Text indexes in Azure Cosmos DB for MongoDB come with several options to customize their behavior. For example, you can specify the language for text analysis, set weights to prioritize certain fields, and configure case-insensitive searches. Here's an example of creating a text index with options: |
| 57 | + |
| 58 | +``` |
| 59 | +db.articles.createIndex( |
| 60 | + { content: "text", title: "text" }, |
| 61 | + { default_language: "english", weights: { title: 10, content: 5 }, caseSensitive: false } |
| 62 | +) |
| 63 | +``` |
| 64 | +In this example, the text index covers both the "content" and "title" fields with English language support. The "title" field is assigned a higher weight than "content" so that matches in the title are prioritized in search results. |
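
The two arguments passed to "createIndex" above can also be assembled programmatically. The following is a minimal sketch, assuming a hypothetical helper named `buildTextIndexSpec` (it is not part of the MongoDB shell, any driver, or the service) that turns a field-to-weight map into the key and options objects. The check that each weight is a number of at least 1 is this sketch's own choice, not a documented rule of the service.

```javascript
// Hypothetical helper (an assumption for illustration, not a shell or driver API):
// builds the key and options arguments for createIndex from a map of
// field names to weights.
function buildTextIndexSpec(fieldWeights, language = "english") {
  const keys = {};
  const weights = {};
  for (const [field, weight] of Object.entries(fieldWeights)) {
    // Validation rule chosen for this sketch only.
    if (typeof weight !== "number" || !(weight >= 1)) {
      throw new RangeError(`Weight for "${field}" must be a number >= 1`);
    }
    keys[field] = "text"; // every listed field joins the single text index
    weights[field] = weight;
  }
  if (Object.keys(keys).length === 0) {
    throw new Error("At least one field is required");
  }
  return { keys, options: { default_language: language, weights } };
}

const spec = buildTextIndexSpec({ title: 10, content: 5 });
// In the shell, the result could then be passed along as:
// db.articles.createIndex(spec.keys, spec.options)
```

Keeping the field list and the weights in one map avoids the two objects drifting apart when a field is added or removed.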
| 65 | + |
| 66 | +## Significance of weights in text indexes |
| 67 | + |
| 68 | +When you create a text index, you can assign a different weight to each field in the index. A weight expresses how important a match in that field is relative to matches in the other indexed fields. |
| 69 | + |
| 70 | +When a text search query runs, Cosmos DB calculates a score for each document based on the search terms and the weights assigned to the indexed fields. The score represents the relevance of the document to the search query. |
| 71 | + |
| 72 | + |
| 73 | +``` |
| 74 | +db.articles.createIndex( |
| 75 | + { title: "text", content: "text" }, |
| 76 | + { weights: { title: 2, content: 1 } } |
| 77 | +) |
| 78 | +``` |
| 79 | + |
| 80 | +The index above covers two fields, "title" and "content", with a weight of 2 for "title" and a weight of 1 for "content". When a user runs a text search for the term "Cosmos DB", the score for each document is calculated from the presence and frequency of the term in both fields, and matches in the "title" field count for more because of its higher weight. |
| 81 | + |
| 82 | +To look at the score of documents in the query result, you can use the $meta projection operator along with the textScore field in your query projection. |
| 83 | + |
| 84 | + |
| 85 | +``` |
| 86 | +db.articles.find( |
| 87 | + { $text: { $search: "Cosmos DB" } }, |
| 88 | + { score: { $meta: "textScore" } } |
| 89 | +) |
| 90 | +``` |
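
Once the score is projected, it can be used to rank results on the client. The following is a minimal sketch, assuming the query results have already been collected into an array (for example with `toArray()`). The helper name `rankByScore`, the sample documents, and their score values are made up for illustration and do not come from the service.

```javascript
// Hypothetical helper: orders documents by their projected "score" field,
// highest first. Documents without a score are treated as 0.
function rankByScore(docs) {
  // Copy first so the caller's array is left untouched.
  return [...docs].sort((a, b) => (b.score ?? 0) - (a.score ?? 0));
}

// Illustrative data only; real scores are computed by the service.
const sample = [
  { title: "Intro to indexing", score: 0.6 },
  { title: "Azure Cosmos DB - A Game Changer", score: 1.5 },
  { title: "Unscored draft" },
];

const ranked = rankByScore(sample);
console.log(ranked.map((doc) => doc.title));
```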
| 91 | + |
| 92 | +## Performing a text search |
| 93 | + |
| 94 | +Once the text index is created, you can perform text searches using the "$text" operator in your queries. The "$text" operator takes a search string in its "$search" field and matches it against the text index to find relevant documents. Here's an example of a text search query: |
| 95 | + |
| 96 | +``` |
| 97 | +db.articles.find({ $text: { $search: "Azure Cosmos DB" } }) |
| 98 | +``` |
| 99 | + |
| 100 | +This query will return all documents in the "articles" collection that contain the terms "Azure" and "Cosmos DB" in any order. |
| 101 | + |
| 102 | +## Limitations |
| 103 | + |
| 104 | +* Only one text index can be defined on a collection. |
| 105 | +* Text indexes support simple text searches and do not provide advanced search capabilities like regular expression searches. |
| 106 | +* The hint() method is not supported in combination with a query that uses a $text expression. |
| 107 | +* Sort operations cannot leverage the ordering of the text index in MongoDB. |
| 108 | +* Text indexes can be relatively large, consuming significant storage space compared to other index types. |
| 109 | + |
| 110 | + |
| 111 | + |
| 112 | +## Dropping a text index |
| 113 | + |
| 114 | +To drop a text index in MongoDB, you can use the dropIndex() method on the collection and specify the index key or name for the text index you want to remove. |
| 115 | + |
| 116 | +``` |
| 117 | +db.articles.dropIndex({ title: "text" }) |
| 118 | +``` |
| 119 | +or |
| 120 | +``` |
| 121 | +db.articles.dropIndex("title_text") |
| 122 | +``` |
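
Because a collection holds at most one text index, its name can be looked up before dropping it. The following is a minimal sketch, assuming index metadata is available as an array of objects with `name` and `key` properties, as returned by `getIndexes()` in the MongoDB shell. The helper name `findTextIndexName` and the sample metadata are hypothetical, and the exact key shape reported by the service may differ.

```javascript
// Hypothetical helper: returns the name of the first index whose key
// specification contains a "text" entry, or null if there is none.
function findTextIndexName(indexes) {
  const match = indexes.find((index) =>
    Object.values(index.key ?? {}).includes("text")
  );
  return match ? match.name : null;
}

// Illustrative metadata only; the real shape comes from getIndexes().
const sampleIndexes = [
  { name: "_id_", key: { _id: 1 } },
  { name: "title_text", key: { title: "text" } },
];

const textIndexName = findTextIndexName(sampleIndexes);
// In the shell, the name could then be passed along as:
// db.articles.dropIndex(textIndexName)
```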