articles/managed-instance-apache-cassandra/search-lucene-index.md
Lines changed: 23 additions & 17 deletions
@@ -16,6 +16,11 @@ Cassandra Lucene Index, derived from Stratio Cassandra, is a plugin for Apache C
 > This feature is provided without a service level agreement, and it's not recommended for production workloads.
 > For more information, see [Supplemental Terms of Use for Microsoft Azure Previews](https://azure.microsoft.com/support/legal/preview-supplemental-terms/).
 
+> [!WARNING]
+> A limitation with Lucene index searches is that cross partition searches cannot be executed solely in the index (unlike elastic search or Solr). This can lead to issues with performance (memory and CPU load) for cross partition searches that may affect steady state workloads.
+>
+> As such, where search requirements are significant, if you intend to use this feature in production, we recommend deploying a dedicated secondary data center to be used only for searches, with a minimal number of nodes, each having a high number of cores (minimum 16). The keyspaces in your primary (operational) data center should then be configured to replicate data to your secondary (search) data center.
+
 ## Prerequisites
 
 - If you don't have an Azure subscription, create a [free account](https://azure.microsoft.com/free/?WT.mc_id=A261C142F) before you begin.
@@ -71,27 +76,11 @@ Insert the following sample tweets:
 INSERT INTO tweets (id,user,body,time,latitude,longitude) VALUES (5,'quetzal','Click my link, like my stuff!', '2023-04-01T11:21:59.001+0000', 40.3930, -3.7329);
 ```
 
-1. The index you created earlier will index all the columns in the table with the specified types, and it will be refreshed once per second. Alternatively, you can explicitly refresh all the index shards with an empty search with consistency ALL:
-
-```SQL
-CONSISTENCY ALL
-SELECT * FROM tweets WHERE expr(tweets_index, '{refresh:true}');
-CONSISTENCY QUORUM
-```
-
 1. Now, you can search for tweets within a certain date range:
 
 ```SQL
 SELECT * FROM tweets WHERE expr(tweets_index, '{filter: {type: "range", field: "time", lower: "2023/03/01", upper: "2023/05/01"}}');
 ```
-1. The same search can be performed forcing an explicit refresh of the involved index shards:
 1. Now, to search the top 100 more relevant tweets where body field contains the phrase “Click my link” within the aforementioned date range:
 
@@ -158,7 +147,24 @@ Insert the following sample tweets:
 }') limit 100;
 ```
 
-For more in-depth information and samples see [Stratio's Cassandra Lucene Index](https://github.com/Stratio/cassandra-lucene-index/blob/branch-3.0.14/doc/documentation.rst).
+1. The index you created earlier will index all the columns in the table with the specified types, and it will be refreshed once per second. Alternatively, you can explicitly refresh all the index shards with an empty search with consistency ALL:
+
+```SQL
+CONSISTENCY ALL
+SELECT * FROM tweets WHERE expr(tweets_index, '{refresh:true}');
+CONSISTENCY QUORUM
+```
+
+1. The same search can be performed forcing an explicit refresh of the involved index shards:
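
The warning added in the first hunk recommends replicating keyspaces from the primary (operational) data center to a dedicated search data center. As a rough sketch of what that replication setting could look like in CQL, assuming a hypothetical keyspace `my_keyspace` and hypothetical data center names `dc-operational` and `dc-search` (none of these names come from the article, and the replication factors are placeholders to adjust for your cluster):

```SQL
-- Hypothetical names: my_keyspace, dc-operational, dc-search.
-- NetworkTopologyStrategy takes one replication factor per data center,
-- so listing both data centers keeps a copy of the data in each.
ALTER KEYSPACE my_keyspace
  WITH REPLICATION = {
    'class': 'NetworkTopologyStrategy',
    'dc-operational': 3,
    'dc-search': 3
  };
```

With a layout like this, search clients would connect to the search data center only, so that cross partition Lucene queries do not compete with operational traffic for memory and CPU.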