Commit 640f7d3

author
Theo van Kraay
committed
edits
1 parent 0e4d354 commit 640f7d3

1 file changed

+23
-17
lines changed

articles/managed-instance-apache-cassandra/search-lucene-index.md

Lines changed: 23 additions & 17 deletions
@@ -16,6 +16,11 @@ Cassandra Lucene Index, derived from Stratio Cassandra, is a plugin for Apache C
1616
> This feature is provided without a service level agreement, and it's not recommended for production workloads.
1717
> For more information, see [Supplemental Terms of Use for Microsoft Azure Previews](https://azure.microsoft.com/support/legal/preview-supplemental-terms/).
1818
19+
> [!WARNING]
20+
> A limitation of Lucene index searches is that cross-partition searches can't be executed solely in the index (unlike Elasticsearch or Solr). As a result, cross-partition searches can place a heavy memory and CPU load on the cluster, which may affect steady-state workloads.
21+
>
22+
> Therefore, if your search requirements are significant and you intend to use this feature in production, we recommend deploying a dedicated secondary data center that is used only for searches, with a minimal number of nodes, each having a high number of cores (minimum 16). Then configure the keyspaces in your primary (operational) data center to replicate data to your secondary (search) data center.
23+
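
The replication setup recommended in the warning above can be sketched in CQL. This is a minimal illustration rather than a prescribed configuration: the keyspace name `demo`, the data center names `dc-operational` and `dc-search`, and the replication factors are all hypothetical placeholders, so substitute the values from your own cluster.

```SQL
-- Hypothetical names: keyspace "demo", data centers "dc-operational" and "dc-search".
-- Replicate the keyspace to both the operational and the search data center.
ALTER KEYSPACE demo WITH REPLICATION = {
  'class': 'NetworkTopologyStrategy',
  'dc-operational': 3,
  'dc-search': 3
};

-- From a cqlsh session connected to a node in the search data center,
-- a data-center-local consistency level keeps the coordinator from
-- waiting on replicas in the operational data center.
CONSISTENCY LOCAL_QUORUM
SELECT * FROM tweets WHERE expr(tweets_index, '{filter: {type: "range", field: "time", lower: "2023/03/01", upper: "2023/05/01"}}');
```

Changing the replication settings only affects new writes. Existing data generally has to be streamed to the new data center before it can serve searches, and how that is done depends on your deployment.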
1924
## Prerequisites
2025

2126
- If you don't have an Azure subscription, create a [free account](https://azure.microsoft.com/free/?WT.mc_id=A261C142F) before you begin.
@@ -71,27 +76,11 @@ Insert the following sample tweets:
7176
INSERT INTO tweets (id,user,body,time,latitude,longitude) VALUES (5,'quetzal','Click my link, like my stuff!', '2023-04-01T11:21:59.001+0000', 40.3930, -3.7329);
7277
```
7378

74-
1. The index you created earlier will index all the columns in the table with the specified types, and it will be refreshed once per second. Alternatively, you can explicitly refresh all the index shards with an empty search with consistency ALL:
75-
76-
```SQL
77-
CONSISTENCY ALL
78-
SELECT * FROM tweets WHERE expr(tweets_index, '{refresh:true}');
79-
CONSISTENCY QUORUM
80-
```
81-
8279
1. Now, you can search for tweets within a certain date range:
8380

8481
```SQL
8582
SELECT * FROM tweets WHERE expr(tweets_index, '{filter: {type: "range", field: "time", lower: "2023/03/01", upper: "2023/05/01"}}');
8683
```
87-
1. The same search can be performed forcing an explicit refresh of the involved index shards:
88-
89-
```SQL
90-
SELECT * FROM tweets WHERE expr(tweets_index, '{
91-
filter: {type: "range", field: "time", lower: "2023/03/01", upper: "2023/05/01"},
92-
refresh: true
93-
}') limit 100;
94-
```
9584

9685
1. Now, to search for the top 100 most relevant tweets whose body field contains the phrase “Click my link” within the same date range:
9786

@@ -158,7 +147,24 @@ Insert the following sample tweets:
158147
}') limit 100;
159148
```
160149

161-
For more in-depth information and samples see [Stratio's Cassandra Lucene Index](https://github.com/Stratio/cassandra-lucene-index/blob/branch-3.0.14/doc/documentation.rst).
150+
1. The index you created earlier will index all the columns in the table with the specified types, and it will be refreshed once per second. Alternatively, you can explicitly refresh all the index shards with an empty search with consistency ALL:
151+
152+
```SQL
153+
CONSISTENCY ALL
154+
SELECT * FROM tweets WHERE expr(tweets_index, '{refresh:true}');
155+
CONSISTENCY QUORUM
156+
```
157+
158+
1. The same search can be performed forcing an explicit refresh of the involved index shards:
159+
160+
```SQL
161+
SELECT * FROM tweets WHERE expr(tweets_index, '{
162+
filter: {type: "range", field: "time", lower: "2023/03/01", upper: "2023/05/01"},
163+
refresh: true
164+
}') limit 100;
165+
```
166+
167+
162168

163169
## Next steps
164170
