Commit 34da318

authored
Merge pull request #75 from simranquirky/example-queries
Example queries
2 parents 4f498d5 + edadf65 commit 34da318

17 files changed

+126
-37
lines changed

docs/example-queries.md

Lines changed: 126 additions & 37 deletions
@@ -1,41 +1,130 @@
1-
# Example Queries
2-
1+
# OpenObserve Query Examples
32

43
We will use the k8s [sample logs data](https://zinc-public-data.s3.us-west-2.amazonaws.com/zinc-enl/sample-k8s-logs/k8slog_json.json.zip) to demonstrate the sample queries that you can use.
54

5+
To ingest this sample data, refer to this [guide](../getting-started#load-sample-data).
6+
7+
8+
## Text Search Queries
9+
10+
**Search all fields containing the word "error" using full-text index:**
11+
12+
```sql
13+
match_all('error')
14+
```
15+
16+
![Full text Search](../images/example-queries/match-all-error.png)
17+
18+
- `match_all` searches only the fields configured for full-text search. By default, these include: `log`, `message`, `msg`, `content`, `data`, and `json`.
19+
- If you want more fields to be scanned, configure them under stream settings.
20+
21+
**Search for "error" in just the `log` field (more efficient):**
22+
```sql
23+
str_match(log, 'error')
24+
```
25+
![String Match](../images/example-queries/log-error.png)
26+
27+
## Numeric Field Filters
28+
29+
**Find logs where `code` is exactly 200:**
30+
```sql
31+
code = 200
32+
```
33+
![Exact Numeric Match](../images/example-queries/code.png)
34+
35+
**Find logs where `code` is missing (`null`):**
36+
```sql
37+
code is null
38+
```
39+
![Null Numeric Match](../images/example-queries/code-is-null.png)
40+
41+
**Find logs where `code` has any value:**
42+
```sql
43+
code is not null
44+
```
45+
![Not-Null Numeric Match](../images/example-queries/is-not-null.png)
46+
47+
48+
**Avoid using `code = ''` or `code != ''`.** Comparing a numeric field to an empty string does not return the right results; use `code is null` or `code is not null` instead.
49+
50+
![Inappropriate Numeric Match](../images/example-queries/inappropriate.png)
51+
52+
53+
**Logs where `code` is greater than 399:**
54+
```sql
55+
code > 399
56+
```
57+
![Greater than Numeric Match](../images/example-queries/greater-than.png)
58+
59+
60+
**Logs where `code` is greater than or equal to 400:**
61+
```sql
62+
code >= 400
63+
```
64+
![Greater than Equal to Numeric Match](../images/example-queries/greater-than-equalto.png)
65+
66+
**`code => 400` is invalid syntax.** Always use SQL-compatible operators such as `>=`.
67+
68+
![Invalid Syntax](../images/example-queries/equalto-greaterthan-error.png)
69+
70+
71+
## Filtering using WHERE Clause
72+
73+
**Filter by service and status code:**
74+
```sql
75+
SELECT * FROM your_stream_name
76+
WHERE service_name = 'api-gateway'
77+
AND code >= 500
78+
```
79+
![Filtering Queries](../images/example-queries/filtering.png)
80+
81+
82+
**Exclude health check logs:**
83+
```sql
84+
SELECT * FROM your_stream_name
85+
WHERE NOT str_match(log, 'health-check')
86+
```
87+
![Filtering Queries](../images/example-queries/exclude-filtering.png)
88+
89+
## Grouping and Counting
90+
91+
**Group logs over time:**
92+
93+
```sql
94+
SELECT histogram(_timestamp) as ts, count(*) as total_logs
95+
FROM your_stream_name
96+
GROUP BY ts
97+
```
98+
![Group Logs](../images/example-queries/group-logs.png)
99+
100+
101+
**Find top 10 IP addresses by request volume:**
102+
```sql
103+
SELECT
104+
client_ip,
105+
count(*) AS request_count
106+
FROM your_stream_name
107+
GROUP BY client_ip
108+
ORDER BY request_count DESC
109+
LIMIT 10
110+
```
111+
![Top 10 by request volume](../images/example-queries/top-10.png)
112+
113+
114+
115+
## Aggregations & Complex Queries
116+
117+
**Histogram of log timestamps with status code counts:**
118+
```sql
119+
SELECT
120+
histogram(_timestamp) AS ts_histogram,
121+
count(CASE WHEN code = 200 THEN 1 END) AS code_200_count,
122+
count(CASE WHEN code = 401 THEN 1 END) AS code_401_count,
123+
count(CASE WHEN code = 500 THEN 1 END) AS code_500_count
124+
FROM your_stream_name
125+
GROUP BY ts_histogram
126+
```
6127

7-
- To search for all the fields containing the word `error` using `Inverted Index`:
8-
- `match_all('error')`
9-
- - match_all searches only the fields that are configured for full text search. Default set of fields are `log`, `message`, `msg`, `content`, `data`, `json`. If you want more fields to be scanned during full text search, you can configure them under stream settings. You should use `str_match` for full text search in specific fields.
10-
- Search only `log` field for error. This is much more efficient than `match_all` as it search in a single field.
11-
- `str_match(log, 'error')`
12-
- To search for all log entries that have log entries where `code is 200` . code is a numeric field
13-
- `code=200`
14-
- To search for all log entries where code field does not contain any value
15-
-`code is null`
16-
- ❌ code=' ' will not yield right results
17-
- To search for all log entries where code field has some value
18-
-`code is not null`
19-
- ❌ code!=' ' will not yield right results
20-
- code > 399
21-
- `code>399`
22-
- code >= 400
23-
-`code >= 400`
24-
- ❌ code=>400 will not work
25-
- A mildly complex query
26-
- <pre> `SELECT histogram(_timestamp) as ts_histogram,
27-
count(case when code=200 then 1 end) as code_200_count,
28-
count(case when code=401 then 1 end) as code_401_count,
29-
count(case when code=500 then 1 end) as code_500_count FROM quickstart1 GROUP BY ts_histogram`</pre>
30-
- If you are looking to draw complex charts based on values in the logs (e.g. status code), you should use standard drag and drop charting functionality of OpenObserve which is very powerful and you do not have to write any SQL queries manually. Most users will be able to build 99% + of their required dashboards without writing any SQL.
31-
32-
- Percentile P95 P99
33-
34-
```sql
35-
SELECT histogram(_timestamp) as x_axis_1,
36-
approx_percentile_cont(duration, 0.95) as percentile_95,
37-
approx_percentile_cont(duration, 0.99) as percentile_99
38-
FROM default
39-
where service_name='$service'
40-
GROUP BY x_axis_1 ORDER BY x_axis_1
41-
```
128+
Replace `your_stream_name` with the actual stream name in your OpenObserve setup.
129+
- `histogram(_timestamp)` bins timestamps into uniform intervals (e.g. hourly). If needed, you can adjust the interval in the UI or in the query itself.
130+
![Histogram of log timestamps](../images/example-queries/histogram.png)
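
To see what the histogram query above computes, here is a minimal sketch that reproduces the same idea locally with Python's built-in `sqlite3`. It is not OpenObserve's implementation: SQLite has no `histogram()` function, so the sketch approximates the bucketing with integer division. The table name `logs`, the sample rows, the one-minute bucket size, and the assumption that `_timestamp` holds microseconds are all illustrative. The `count(CASE WHEN ... THEN 1 END)` pattern is standard SQL and behaves the same way: rows that do not match produce `NULL`, which `count` ignores.

```python
import sqlite3

# Hypothetical sample rows: (_timestamp in microseconds, code).
ROWS = [
    (1_000_000, 200),
    (2_000_000, 200),
    (3_000_000, 500),
    (61_000_000, 401),
    (62_000_000, 200),
    (125_000_000, None),  # a log line with no status code
]

BUCKET_US = 60_000_000  # one-minute buckets, chosen only for this sketch

# Integer division stands in for histogram(_timestamp); this is an
# approximation for illustration, not how OpenObserve evaluates it.
QUERY = """
SELECT
    (_timestamp / :bucket) * :bucket AS ts_histogram,
    count(CASE WHEN code = 200 THEN 1 END) AS code_200_count,
    count(CASE WHEN code = 401 THEN 1 END) AS code_401_count,
    count(CASE WHEN code = 500 THEN 1 END) AS code_500_count
FROM logs
GROUP BY ts_histogram
ORDER BY ts_histogram
"""


def bucketed_counts(rows, bucket_us=BUCKET_US):
    """Return (bucket_start, count_200, count_401, count_500) per time bucket."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE logs (_timestamp INTEGER, code INTEGER)")
        conn.executemany("INSERT INTO logs VALUES (?, ?)", rows)
        return conn.execute(QUERY, {"bucket": bucket_us}).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in bucketed_counts(ROWS):
        print(row)
```

Each output row is one time bucket with its per-status counts. A bucket whose only log has a missing `code` still appears, with all three counts at zero.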

docs/images/example-queries/code.png

