|
1 |
| -# Example Queries |
2 |
| - |
| 1 | +# OpenObserve Query Examples |
3 | 2 |
|
4 | 3 | We will use the k8s [sample logs data](https://zinc-public-data.s3.us-west-2.amazonaws.com/zinc-enl/sample-k8s-logs/k8slog_json.json.zip) to demonstrate the sample queries that you can use.
|
5 | 4 |
|
| 5 | +To ingest this sample data, refer to this [guide](../getting-started#load-sample-data). |
| 6 | + |
| 7 | + |
| 8 | +## Text Search Queries |
| 9 | + |
| 10 | +**Search all fields containing the word "error" using full-text index:** |
| 11 | + |
| 12 | +```sql |
| 13 | +match_all('error') |
| 14 | +``` |
| 15 | + |
| 16 | + |
| 17 | + |
| 18 | +- `match_all` searches only the fields configured for full-text search. By default, these include: `log`, `message`, `msg`, `content`, `data`, and `json`. |
| 19 | +- If you want more fields to be scanned, configure them under stream settings. |
| 20 | + |
| 21 | +**Search for "error" in just the `log` field (more efficient):** |
| 22 | +```sql |
| 23 | +str_match(log, 'error') |
| 24 | +``` |
| 25 | +  |
| 26 | + |
| 27 | +## Numeric Field Filters |
| 28 | + |
| 29 | +**Find logs where `code` is exactly 200:** |
| 30 | +```sql |
| 31 | +code = 200 |
| 32 | +``` |
| 33 | +  |
| 34 | + |
| 35 | +**Find logs where `code` is missing (`null`):** |
| 36 | +```sql |
| 37 | +code is null |
| 38 | +``` |
| 39 | +  |
| 40 | + |
| 41 | +**Find logs where `code` has any value:** |
| 42 | +```sql |
| 43 | +code is not null |
| 44 | +``` |
| 45 | +  |
| 46 | + |
| 47 | + |
| 48 | +**Avoid using `code = ''` or `code != ''`.** These do not yield correct results for numeric fields; use `code is null` or `code is not null` instead. |
| 49 | + |
| 50 | + |
| 51 | + |
| 52 | + |
| 53 | +**Logs where `code` is greater than 399:** |
| 54 | +```sql |
| 55 | +code > 399 |
| 56 | +``` |
| 57 | +  |
| 58 | + |
| 59 | + |
| 60 | +**Logs where `code` is greater than or equal to 400:** |
| 61 | +```sql |
| 62 | +code >= 400 |
| 63 | +``` |
| 64 | +  |
| 65 | + |
| 66 | +**`code => 400` is invalid syntax.** Always use SQL-compatible operators like **>=**. |
| 67 | + |
| 68 | + |
| 69 | + |
| 70 | + |
| 71 | +## Filtering Using the WHERE Clause |
| 72 | + |
| 73 | +**Filter by service and status code:** |
| 74 | +```sql |
| 75 | +SELECT * FROM your_stream_name |
| 76 | +WHERE service_name = 'api-gateway' |
| 77 | + AND code >= 500 |
| 78 | +``` |
| 79 | +  |
| 80 | + |
| 81 | + |
| 82 | +**Exclude health check logs:** |
| 83 | +```sql |
| 84 | +SELECT * FROM your_stream_name |
| 85 | +WHERE NOT str_match(log, 'health-check') |
| 86 | +``` |
| 87 | +  |
| 88 | + |
| 89 | +## Grouping and Counting |
| 90 | + |
| 91 | +**Group logs over time:** |
| 92 | + |
| 93 | +```sql |
| 94 | +SELECT histogram(_timestamp) as ts, count(*) as total_logs |
| 95 | +FROM your_stream_name |
| 96 | +GROUP BY ts |
| 97 | +``` |
| 98 | +  |
| 99 | + |
| 100 | + |
| 101 | +**Find top 10 IP addresses by request volume:** |
| 102 | +```sql |
| 103 | +SELECT |
| 104 | + client_ip, |
| 105 | + count(*) AS request_count |
| 106 | +FROM your_stream_name |
| 107 | +GROUP BY client_ip |
| 108 | +ORDER BY request_count DESC |
| 109 | +LIMIT 10 |
| 110 | +``` |
| 111 | +  |
| 112 | + |
| 113 | + |
| 114 | + |
| 115 | +## Aggregations & Complex Queries |
| 116 | + |
| 117 | +**Histogram of log timestamps with status code counts:** |
| 118 | +```sql |
| 119 | +SELECT |
| 120 | + histogram(_timestamp) AS ts_histogram, |
| 121 | + count(CASE WHEN code = 200 THEN 1 END) AS code_200_count, |
| 122 | + count(CASE WHEN code = 401 THEN 1 END) AS code_401_count, |
| 123 | + count(CASE WHEN code = 500 THEN 1 END) AS code_500_count |
| 124 | +FROM your_stream_name |
| 125 | +GROUP BY ts_histogram |
| 126 | +``` |
6 | 127 |
|
7 |
| -- To search for all the fields containing the word `error` using `Inverted Index`: |
8 |
| - - `match_all('error')` |
9 |
| - - - match_all searches only the fields that are configured for full text search. Default set of fields are `log`, `message`, `msg`, `content`, `data`, `json`. If you want more fields to be scanned during full text search, you can configure them under stream settings. You should use `str_match` for full text search in specific fields. |
10 |
| -- Search only `log` field for error. This is much more efficient than `match_all` as it search in a single field. |
11 |
| - - `str_match(log, 'error')` |
12 |
| -- To search for all log entries that have log entries where `code is 200` . code is a numeric field |
13 |
| - - `code=200` |
14 |
| -- To search for all log entries where code field does not contain any value |
15 |
| - - ✅ `code is null` |
16 |
| - - ❌ code=' ' will not yield right results |
17 |
| -- To search for all log entries where code field has some value |
18 |
| - - ✅ `code is not null` |
19 |
| - - ❌ code!=' ' will not yield right results |
20 |
| -- code > 399 |
21 |
| - - `code>399` |
22 |
| -- code >= 400 |
23 |
| - - ✅ `code >= 400` |
24 |
| - - ❌ code=>400 will not work |
25 |
| -- A mildly complex query |
26 |
| - - <pre> `SELECT histogram(_timestamp) as ts_histogram, |
27 |
| - count(case when code=200 then 1 end) as code_200_count, |
28 |
| - count(case when code=401 then 1 end) as code_401_count, |
29 |
| - count(case when code=500 then 1 end) as code_500_count FROM quickstart1 GROUP BY ts_histogram`</pre> |
30 |
| - - If you are looking to draw complex charts based on values in the logs (e.g. status code), you should use standard drag and drop charting functionality of OpenObserve which is very powerful and you do not have to write any SQL queries manually. Most users will be able to build 99% + of their required dashboards without writing any SQL. |
31 |
| - |
32 |
| -- Percentile P95 P99 |
33 |
| - |
34 |
| - ```sql |
35 |
| - SELECT histogram(_timestamp) as x_axis_1, |
36 |
| - approx_percentile_cont(duration, 0.95) as percentile_95, |
37 |
| - approx_percentile_cont(duration, 0.99) as percentile_99 |
38 |
| - FROM default |
39 |
| - where service_name='$service' |
40 |
| - GROUP BY x_axis_1 ORDER BY x_axis_1 |
41 |
| - ``` |
| 128 | +Replace `your_stream_name` with the actual stream name in your OpenObserve setup. |
| 129 | +- `histogram(_timestamp)` bins timestamps into uniform intervals (e.g. hourly). You can set the granularity in the UI or in the query if needed. |
| 130 | +  |
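The filtering and grouping patterns above can be combined in one query. The following is a minimal sketch, not a tested query: it assumes the placeholder stream `your_stream_name` and the fields `service_name` and `code` used in the earlier examples, and it counts server-error responses per time bucket for a single service.

```sql
-- Sketch only: your_stream_name, service_name, and code are placeholders
-- taken from the examples above; replace them with your own stream and fields.
SELECT
  histogram(_timestamp) AS ts,
  count(*) AS error_count
FROM your_stream_name
WHERE service_name = 'api-gateway'
  AND code >= 500
GROUP BY ts
ORDER BY ts
```

Filtering in the `WHERE` clause before grouping keeps each bucket's count limited to the matching rows, which may be simpler than a `CASE` expression when only one condition is of interest.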