
Commit cd331d3

adding Nginx Integration (opensearch-project#1493)

Authored by YANG-DB, committed by derek-ho
Signed-off-by: YANGDB <yang.db.dev@gmail.com>
1 parent 560bfc4 commit cd331d3

31 files changed: +21258 -0 lines changed

integrations/README.md

Lines changed: 113 additions & 0 deletions
# Definitions

## Bundle

An OpenSearch Integration Bundle may contain the following:
- dashboards
- visualisations
- configurations

These bundle assets are designed to assist in monitoring logs and metrics for a particular resource (device, network element, service) or group of related resources, such as "Nginx" or "System".

---

The Bundle consists of:

* Version
* Metadata configuration file
* Dashboards, visualisations and Notebooks
* Data stream index templates used for the signals' ingestion
* Documentation & information

## Integration

An integration is a type of _bundle_ defining data-streams for ingestion of a resource's observed signals using logs, metrics, and traces.

### Structure

As mentioned above, an integration is a collection of elements that formulate how to observe a specific data-emitting resource - in our case a telemetry data producer.

A typical Observability Integration consists of the following parts:

***Metadata***

* Observability data producer resource
* Supplement Indices (mapping & naming)
* Collection Agent Version
* Transformation schema
* Optional test harnesses repository
* Verified version and documentation
* Category & classification (logs/traces/alerts/metrics)

***Display components***

* Dashboards
* Maps
* Applications
* Notebooks
* Operations Panels
* Saved PPL/SQL/DQL Queries
* Alerts

Since structured data contributes enormously to the understanding of system behaviour, each resource will define a well-structured mapping it conforms with.

Once input content has form and shape, it can and will be used to calculate and correlate different pieces of data.

The next parts of this document will present **Integrations For Observability**, whose key concept is the Observability schema.

It will give an overview of the concepts of observability, describe the current issues customers are facing with observability, and elaborate on how to mitigate them using Integrations and structured schemas.

---

### Creating An Integration

```yaml
integration-template-name
  config.json
  display
    Application.json
    Maps.json
    Dashboard.json
  stored-queries
    Query.json
  transformation-schemas
    transformation.json
  samples
    resource.access logs
    resource.error logs
    resource.stats metrics
    expected_results
  info
    documentation
    images
```

**Definitions**

- `config.json` defines the general configuration for the entire integration component.
- `display` is the folder in which the actual visualization components are stored.
- `queries` is the folder in which the actual PPL queries are stored.
- `schemas` is the folder in which the schemas are stored - schemas for mapping translations or index mapping.
- `samples` is the folder containing sample logs and their translated counterparts.
- `metadata` is the folder containing additional metadata definitions such as security and policies.
- `info` is the folder containing documentation, licences and external references.

---

#### Config

The `config.json` file includes the Integration configuration; see [NginX config](nginx/config.json).

For additional information on the config structure, see [Structure](docs/Integration-structure.md).

#### Display

The display folder contains the relevant visual components associated with this integration.

The visual display components will need to be validated against the schema they are expected to work on - this may be part of the Integration validation flow.

#### Queries

Queries contains specific PPL queries that precisely demonstrate common and useful use-cases.
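The folder layout above lends itself to a simple programmatic check. The following is only an illustrative sketch - `missing_entries` and `EXPECTED_ENTRIES` are hypothetical names, not part of the integration spec:

```python
from pathlib import Path

# Top-level entries from the layout above; an assumption based on this
# document, not a published specification.
EXPECTED_ENTRIES = [
    "config.json",
    "display",
    "stored-queries",
    "transformation-schemas",
    "samples",
    "info",
]

def missing_entries(integration_dir: str) -> list[str]:
    """Return the expected top-level entries that are absent from the folder."""
    root = Path(integration_dir)
    return [name for name in EXPECTED_ENTRIES if not (root / name).exists()]
```

A validation step like this could run before an integration bundle is packaged, failing fast when a required folder is missing.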

integrations/nginx/assets/display/sso-logs-dashboard-new.ndjson

Lines changed: 266 additions & 0 deletions
Large diffs are not rendered by default.

integrations/nginx/config.json

Lines changed: 42 additions & 0 deletions
```json
{
  "name": "nginx",
  "version": {
    "integ": "0.1.0",
    "schema": "1.0.0",
    "resource": "^1.23.0"
  },
  "description": "Nginx HTTP server collector",
  "identification": "instrumentationScope.attributes.identification",
  "catalog": "observability",
  "components": [
    "communication", "http"
  ],
  "collection": [
    {
      "logs": [{
        "info": "access logs",
        "input_type": "logfile",
        "dataset": "nginx.access",
        "labels": ["nginx", "access"]
      },
      {
        "info": "error logs",
        "input_type": "logfile",
        "labels": ["nginx", "error"],
        "dataset": "nginx.error"
      }]
    },
    {
      "metrics": [{
        "info": "status metrics",
        "input_type": "metrics",
        "dataset": "nginx.status",
        "labels": ["nginx", "status"]
      }]
    }
  ],
  "repo": {
    "github": "https://github.com/opensearch-project/observability/tree/main/integrarions/nginx"
  }
}
```
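The fields above can be sanity-checked programmatically. This is only a sketch: the required-key set and the version pattern are assumptions drawn from this single example, not a published schema:

```python
import json
import re

# Keys present in the nginx config above; assumed required for illustration.
REQUIRED_KEYS = {"name", "version", "description", "catalog", "collection"}
# Matches "0.1.0", "1.0.0" and the range form "^1.23.0" used above.
SEMVER = re.compile(r"^\^?\d+\.\d+\.\d+$")

def validate_config(raw: str) -> list[str]:
    """Return a list of problems found in a config document (empty if OK)."""
    cfg = json.loads(raw)
    problems = [f"missing key: {k}" for k in sorted(REQUIRED_KEYS - cfg.keys())]
    for field, value in cfg.get("version", {}).items():
        if not SEMVER.match(value):
            problems.append(f"bad version field {field!r}: {value}")
    return problems
```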

integrations/nginx/info/README.md

Lines changed: 28 additions & 0 deletions
![](nginx.png)

# Nginx Integrations

## What is Nginx?

Nginx is a popular open-source web server used by millions of websites worldwide. It was developed to address the limitations of Apache, another popular web server. Nginx is known for its high performance, scalability, and reliability, and is widely used as a reverse proxy server, load balancer, and HTTP cache.

One of the primary advantages of Nginx is its ability to handle large numbers of concurrent connections and requests. It uses an event-driven architecture that allows it to handle multiple connections with minimal resources, making it an ideal choice for high-traffic websites. In addition, Nginx can also serve static content very efficiently, which further improves its performance.

Another important feature of Nginx is its ability to act as a reverse proxy server. This means that it can sit in front of web servers and route incoming requests to the appropriate server based on various criteria, such as the URL or the type of request. Reverse proxying can help improve website performance and security by caching static content, load balancing incoming traffic, and providing an additional layer of protection against attacks.

Nginx is also widely used as a load balancer. In this role, it distributes incoming traffic across multiple web servers to improve performance and ensure high availability. Nginx can balance traffic using a variety of algorithms, such as round-robin or least connections, and can also perform health checks to ensure that requests are only sent to healthy servers.

Finally, Nginx is also an effective HTTP cache. By caching frequently accessed content, Nginx can reduce the load on backend servers and improve website performance. Nginx can cache content based on a variety of criteria, such as the URL, response headers, or response body.

## What is an Nginx Integration?

As described in the [documentation](../../README.md), an Nginx integration is a bundle of resources, assets and documentation.

An Integration may have multiple ways of ingesting Observability signals; for example, nginx logs may arrive via a fluent-bit agent or an OTEL-logs collector.

## What are the Nginx Observability providers?

Observability Providers are agents which can collect nginx logs, metrics and traces, convert them to the `sso` observability schema, and send them to OpenSearch observability data-streams.

### Fluent-Bit

Fluent-bit has a `tail` input plugin which can be used to tail the Nginx access logs and send them to a destination of your choice.
The tail plugin reads log files line by line and sends them to the Fluent-bit engine to be processed.

See additional details [here](fluet-bit/README.md).
Lines changed: 65 additions & 0 deletions
![](fluentbit.png)

## Fluent-bit

Fluent-bit is a lightweight and flexible data collector and forwarder, designed to handle a large volume of log data in real-time.
It is an open-source project, part of the Cloud Native Computing Foundation (CNCF), and has gained popularity among developers for its simplicity and ease of use.

Fluent-bit is designed to be lightweight, which means that it has a small footprint and can be installed on resource-constrained environments like embedded systems or containers. It is written in the C language, making it fast and efficient, and it has a low memory footprint, which allows it to consume minimal system resources.

Fluent-bit is a versatile tool that can collect data from various sources, including files, standard input, syslog, and TCP/UDP sockets. It also supports parsing different log formats like JSON, Apache, and Syslog. Fluent-bit provides a flexible configuration system that allows users to tailor their log collection needs, which makes it easy to adapt to different use cases.

One of the main advantages of Fluent-bit is its ability to forward log data to various destinations, including OpenSearch, InfluxDB, and Kafka. Fluent-bit provides multiple output plugins that allow users to route their log data to different destinations based on their requirements. This feature makes Fluent-bit ideal for distributed systems where log data needs to be collected and centralized in a central repository.

Fluent-bit also provides a powerful filtering mechanism that allows users to manipulate log data in real-time. It supports various filter plugins, including record modifiers, parsers, and field extraction. With these filters, users can parse and enrich log data, extract fields, and modify records before sending them to their destination.

## Setting Up the Fluent-bit agent

To set up a fluent-bit agent for Nginx, follow these instructions:

- Install Fluent-bit on the Nginx server. You can download the latest package from the official Fluent-bit website or use your package manager to install it.

- Once Fluent-bit is installed, create a configuration file named fluent-bit.conf in the /etc/fluent-bit/ directory. Add the following configuration to the file:

```text
[SERVICE]
    Flush        1
    Log_Level    info
    Parsers_File parsers.conf

[FILTER]
    Name  lua
    Match *
    code  function cb_filter(a,b,c)local d={}local e=os.date("!%Y-%m-%dT%H:%M:%S.000Z")d["observerTime"]=e;d["body"]=c.remote.." "..c.host.." "..c.user.." ["..os.date("%d/%b/%Y:%H:%M:%S %z").."] \""..c.method.." "..c.path.." HTTP/1.1\" "..c.code.." "..c.size.." \""..c.referer.."\" \""..c.agent.."\""d["trace_id"]="102981ABCD2901"d["span_id"]="abcdef1010"d["attributes"]={}d["attributes"]["data_stream"]={}d["attributes"]["data_stream"]["dataset"]="nginx.access"d["attributes"]["data_stream"]["namespace"]="production"d["attributes"]["data_stream"]["type"]="logs"d["event"]={}d["event"]["category"]={"web"}d["event"]["name"]="access"d["event"]["domain"]="nginx.access"d["event"]["kind"]="event"d["event"]["result"]="success"d["event"]["type"]={"access"}d["http"]={}d["http"]["request"]={}d["http"]["request"]["method"]=c.method;d["http"]["response"]={}d["http"]["response"]["bytes"]=tonumber(c.size)d["http"]["response"]["status_code"]=c.code;d["http"]["flavor"]="1.1"d["http"]["url"]=c.path;d["communication"]={}d["communication"]["source"]={}d["communication"]["source"]["address"]="127.0.0.1"d["communication"]["source"]["ip"]=c.remote;return 1,b,d end
    call  cb_filter

[INPUT]
    Name            tail
    Path            /var/log/nginx/access.log
    Tag             nginx.access
    DB              /var/log/flb_input.access.db
    Mem_Buf_Limit   5MB
    Skip_Long_Lines On

[OUTPUT]
    Name  opensearch
    Match nginx.*
    Host  <OSS_HOST>
    Port  <OSS_PORT>
    Index sso_nginx-access-%Y.%m.%d
```

Here, we specify the input plugin as tail, set the path to the Nginx access log file, and specify a tag to identify the logs in Fluent-bit. We also set some additional parameters such as the memory buffer limit and skipping long lines.

For the output, we use the `opensearch` plugin to send the logs to OpenSearch. We specify the OpenSearch host, port, and index name.

- Modify the OpenSearch host and port in the configuration file to match your OpenSearch installation.
- Start the Fluent-bit service (the exact command depends on the system where Fluent-bit is installed), for example:

```text
sudo systemctl start fluent-bit
```
- Verify that Fluent-bit is running by checking its status:
```text
sudo systemctl status fluent-bit
```
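The Lua filter packed into the single `code` line above is hard to read. As an illustration only, the same record-to-document mapping can be sketched in Python - field names and the hard-coded values are taken directly from the Lua code, while `to_sso` is a hypothetical helper name:

```python
from datetime import datetime, timezone

def to_sso(rec: dict) -> dict:
    """Map a parsed nginx access-log record (keys as used by the Lua filter:
    remote, host, user, method, path, code, size, referer, agent) to the
    sso log document that cb_filter builds."""
    now = datetime.now(timezone.utc)
    return {
        "observerTime": now.strftime("%Y-%m-%dT%H:%M:%S.000Z"),
        # Reassembled combined-log-format line, as in the Lua "body" field.
        "body": (f'{rec["remote"]} {rec["host"]} {rec["user"]} '
                 f'[{now.strftime("%d/%b/%Y:%H:%M:%S %z")}] '
                 f'"{rec["method"]} {rec["path"]} HTTP/1.1" '
                 f'{rec["code"]} {rec["size"]} "{rec["referer"]}" "{rec["agent"]}"'),
        "trace_id": "102981ABCD2901",   # hard-coded in the Lua example
        "span_id": "abcdef1010",        # hard-coded in the Lua example
        "attributes": {"data_stream": {"dataset": "nginx.access",
                                       "namespace": "production",
                                       "type": "logs"}},
        "event": {"category": ["web"], "name": "access", "domain": "nginx.access",
                  "kind": "event", "result": "success", "type": ["access"]},
        "http": {"request": {"method": rec["method"]},
                 "response": {"bytes": int(rec["size"]),  # tonumber() in Lua
                              "status_code": rec["code"]},
                 "flavor": "1.1", "url": rec["path"]},
        "communication": {"source": {"address": "127.0.0.1", "ip": rec["remote"]}},
    }
```

Note that, like the Lua code, only `size` is converted to a number; `status_code` is passed through as-is.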
Lines changed: 20 additions & 0 deletions
```text
[INPUT]
    Name            tail
    Path            /var/log/nginx/access.log
    Tag             nginx.access
    DB              /var/log/flb_input.access.db
    Mem_Buf_Limit   5MB
    Skip_Long_Lines On

[FILTER]
    Name  lua
    Match *
    code  function cb_filter(a,b,c)local d={}local e=os.date("!%Y-%m-%dT%H:%M:%S.000Z")d["observerTime"]=e;d["body"]=c.remote.." "..c.host.." "..c.user.." ["..os.date("%d/%b/%Y:%H:%M:%S %z").."] \""..c.method.." "..c.path.." HTTP/1.1\" "..c.code.." "..c.size.." \""..c.referer.."\" \""..c.agent.."\""d["trace_id"]="102981ABCD2901"d["span_id"]="abcdef1010"d["attributes"]={}d["attributes"]["data_stream"]={}d["attributes"]["data_stream"]["dataset"]="nginx.access"d["attributes"]["data_stream"]["namespace"]="production"d["attributes"]["data_stream"]["type"]="logs"d["event"]={}d["event"]["category"]={"web"}d["event"]["name"]="access"d["event"]["domain"]="nginx.access"d["event"]["kind"]="event"d["event"]["result"]="success"d["event"]["type"]={"access"}d["http"]={}d["http"]["request"]={}d["http"]["request"]["method"]=c.method;d["http"]["response"]={}d["http"]["response"]["bytes"]=tonumber(c.size)d["http"]["response"]["status_code"]=c.code;d["http"]["flavor"]="1.1"d["http"]["url"]=c.path;d["communication"]={}d["communication"]["source"]={}d["communication"]["source"]["address"]="127.0.0.1"d["communication"]["source"]["ip"]=c.remote;return 1,b,d end
    call  cb_filter

[OUTPUT]
    Name  opensearch
    Match nginx.*
    Host  <OSS_HOST>
    Port  <OSS_PORT>
    Index sso_nginx-access-%Y.%m.%d
```
integrations/nginx/info/nginx.png (31.7 KB)
Lines changed: 8 additions & 0 deletions
# Samples

The samples folder contains any type of sampled data that explains and demonstrates the expected input signals.

Specifically, this folder contains two inner folders:
- **preloaded** - contains ready-made nginx access logs with detailed instructions on how to load them into the appropriate OpenSearch data-stream.
- **results** - contains the expected json structure that conforms to the `sso` simple schema for logs in OpenSearch.

Any other internal folder can be added to represent additional aspects of this integration's expected ingested content.
Lines changed: 36 additions & 0 deletions
# Nginx Dashboard Playground

To explore and review the nginx dashboard, this tutorial uses the preloaded nginx access-logs data. This sample data was generated using the nginx fluent-bit data generator repo and translated using the fluent-bit nginx lua parser that appears in the test mentioned below.
- [Fluent-bit](https://github.com/fluent/fluent-bit)
- [Services Playground](../../test/README.md)

The [sample logs](bulk_logs.json) are added here under the preloaded data folder and are ready to be ingested into OpenSearch.

## Demo Instructions

1. Start docker compose with `docker compose up --build`.
   This will load both the OpenSearch server & dashboards.
   - `$ docker compose up`
   - Ensure vm.max_map_count has been set to 262144 or higher (`sudo sysctl -w vm.max_map_count=262144`).

2. Load the Simple Schema Logs index templates - [Loading Logs](../../../../schema/observability/logs/Usage.md):

   - `curl -XPUT localhost:9200/_component_template/http_template -H "Content-Type: application/json" --data-binary @http.mapping`
   - `curl -XPUT localhost:9200/_component_template/communication_template -H "Content-Type: application/json" --data-binary @communication.mapping`
   - `curl -XPUT localhost:9200/_index_template/logs -H "Content-Type: application/json" --data-binary @logs.mapping`

3. Bulk load the Nginx access logs preloaded data into the `sso_logs-nginx-prod` data_stream:
   - `curl -XPOST "localhost:9200/sso_logs-nginx-prod/_bulk?pretty&refresh" -H "Content-Type: application/json" --data-binary @bulk_logs.json`

4. We can now load the Nginx dashboards to display the preloaded nginx access logs - [dashboards](../../assets/display/sso-logs-dashboard-new.ndjson):
   - First add an index pattern `sso_logs-*-*`:
     - `curl -X POST localhost:5601/api/saved_objects/index-pattern/sso_logs -H 'osd-xsrf: true' -H 'Content-Type: application/json' -d '{ "attributes": { "title": "sso_logs-*-*", "timeFieldName": "@timestamp" } }'`
   - Load the [dashboards](../../assets/display/sso-logs-dashboard-new.ndjson):
     - `curl -X POST "localhost:5601/api/saved_objects/_import?overwrite=true" -H "osd-xsrf: true" --form file=@sso-logs-dashboard.ndjson`

5. Open the dashboard and view the preloaded access logs:
   - Go to [Dashboards](http://localhost:5601/app/dashboards#/list?_g=(filters:!(),refreshInterval:(pause:!t,value:0),time:(from:'2023-02-24T17:10:34.442Z',to:'2023-02-24T17:46:44.056Z')))
   - data-stream name: `sso_logs-nginx-prod`

![](img/nginx-dashboard.png)
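The bulk file posted in step 3 is NDJSON. As a minimal sketch of how such a body is assembled (assuming the data-stream accepts only the `create` bulk action, and using a hypothetical `to_bulk_ndjson` helper):

```python
import json

def to_bulk_ndjson(docs: list[dict]) -> str:
    """Serialize documents into an OpenSearch _bulk NDJSON body for a
    data stream: data streams only accept the `create` action, and the
    body must end with a trailing newline."""
    lines = []
    for doc in docs:
        lines.append(json.dumps({"create": {}}))  # action line
        lines.append(json.dumps(doc))             # document line
    return "\n".join(lines) + "\n"
```

A file produced this way can be sent with the `curl ... --data-binary @bulk_logs.json` command shown in step 3.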
