Commit 240f1a0

Initial Commit
0 parents  commit 240f1a0

14 files changed, +1115 -0 lines changed

.dockerignore

Lines changed: 2 additions & 0 deletions
@@ -0,0 +1,2 @@
.git
.gitignore

.gitignore

Lines changed: 104 additions & 0 deletions
@@ -0,0 +1,104 @@
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST

# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
.hypothesis/
.pytest_cache/

# Translations
*.mo
*.pot

# Django stuff:
*.log
local_settings.py
db.sqlite3

# Flask stuff:
instance/
.webassets-cache

# Scrapy stuff:
.scrapy

# Sphinx documentation
docs/_build/

# PyBuilder
target/

# Jupyter Notebook
.ipynb_checkpoints

# pyenv
.python-version

# celery beat schedule file
celerybeat-schedule

# SageMath parsed files
*.sage.py

# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/

# Spyder project settings
.spyderproject
.spyproject

# Rope project settings
.ropeproject

# mkdocs documentation
/site

# mypy
.mypy_cache/

.travis.yml

Lines changed: 9 additions & 0 deletions
@@ -0,0 +1,9 @@
language: python
os: linux
dist: bionic
python:
- "3.6"
- "3.7"
- "3.8"
script:
- python -m unittest

Dockerfile

Lines changed: 21 additions & 0 deletions
@@ -0,0 +1,21 @@
FROM python:3.8-slim

WORKDIR /app

RUN apt-get update \
    && apt-get install -y git \
    && apt-get clean \
    && rm -rf /var/lib/apt/lists/*

COPY requirements.txt /app/
RUN pip install --no-cache-dir --upgrade pip \
    && pip install --no-cache-dir -r requirements.txt

COPY *.py /app/
COPY utils/*.py /app/utils/
COPY kong_log_bridge/*.py /app/kong_log_bridge/
COPY README.md /app/

EXPOSE 8080

ENTRYPOINT ["python", "-u", "main.py"]

LICENSE

Lines changed: 21 additions & 0 deletions
@@ -0,0 +1,21 @@
The MIT License (MIT)

Copyright (c) 2020 Braedon Vickers

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

README.md

Lines changed: 118 additions & 0 deletions
@@ -0,0 +1,118 @@
Kong Request Log Bridge
====
Transform Kong request logs and forward them to Elasticsearch. Redact request logs for improved privacy and security, and index them directly into Elasticsearch, without the need for complex and heavyweight tools like Logstash.

[Source Code](https://github.com/braedon/kong-log-bridge) | [Docker Image](https://hub.docker.com/r/braedon/kong-log-bridge)

# Usage
The service is distributed as a docker image. Released versions can be found on Docker Hub (note that no `latest` version is provided):

```bash
> sudo docker pull braedon/kong-log-bridge:<version>
```

The docker image exposes a REST API on port `8080`. It is configured by passing options after the image name:
```bash
> sudo docker run --rm --name kong-log-bridge \
    -p <host port>:8080 \
    braedon/kong-log-bridge:<version> \
    -e <elasticsearch node> \
    --convert-ts \
    --hash-ip \
    --hash-auth \
    --hash-cookie
```
Run with the `-h` flag to see details on all the available options.

Note that all options can be set via environment variables. The environment variable names are prefixed with `KONG_LOG_BRIDGE_OPT`, e.g. `KONG_LOG_BRIDGE_OPT_CONVERT_TS=true` is equivalent to `--convert-ts`. CLI options take precedence over environment variables.
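
The naming rule can be sketched in a few lines of Python. This is only an illustration inferred from the single example above (`--convert-ts` and `KONG_LOG_BRIDGE_OPT_CONVERT_TS`); the helper `env_var_for` is hypothetical and not part of the service.

```python
PREFIX = 'KONG_LOG_BRIDGE_OPT'


def env_var_for(option):
    """Map a long CLI option such as '--convert-ts' to an environment variable name.

    Assumption: the pattern follows the README example, i.e. strip the
    leading dashes, replace '-' with '_', upper-case, and join to the prefix.
    """
    name = option.lstrip('-').replace('-', '_').upper()
    return PREFIX + '_' + name


print(env_var_for('--convert-ts'))
```

Short options such as `-e` are not covered by this sketch; check the `-h` output for the exact variable names.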

## Input
Kong JSON request logs can be `POST`ed to the `/logs` endpoint. This is designed for logs to be sent by the [Kong HTTP Log plugin](https://docs.konghq.com/hub/kong-inc/http-log/). See the Kong documentation for details on how to enable and configure the plugin.

This is currently the only supported input method, but more may be added in the future.
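
For manual testing without Kong, a request of the right shape can be built with the Python standard library. This is a sketch only: the address `http://localhost:8080`, the sample log entry, and the helper `build_log_request` are assumptions for illustration. The handler in `kong_log_bridge/__init__.py` in this commit expects `Content-Type: application/json` and a JSON object as the body.

```python
import json
import urllib.request


def build_log_request(base_url, log):
    """Build (but do not send) a POST request for the /logs endpoint."""
    body = json.dumps(log).encode('utf-8')
    return urllib.request.Request(
        base_url.rstrip('/') + '/logs',
        data=body,
        headers={'Content-Type': 'application/json'},
        method='POST',
    )


# Hypothetical address and log entry, for illustration only.
req = build_log_request('http://localhost:8080', {'client_ip': '203.0.113.7'})
# urllib.request.urlopen(req) would send it to a running instance.
```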

## Transformation
Request logs are passed through unchanged by default, but you probably want to enable at least one transformation.

### Timestamp Conversion `--convert-ts`
Kong request logs include a number of UNIX timestamps (some in milliseconds rather than seconds). These are not human readable, and require explicit mappings to be used in Elasticsearch. Enabling this option will convert these timestamps to [RFC3339 date-time strings](https://www.ietf.org/rfc/rfc3339.txt) for readability and automatic Elasticsearch mapping.

Fields converted:
```
- service.created_at
- service.updated_at
- route.created_at
- route.updated_at
- started_at
- tries[].balancer_start
```
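
The conversion itself can be sketched as below. This is not the service's code (`transform.py` is not shown here); `to_rfc3339` is a hypothetical helper, and because this section does not say which fields are in milliseconds, the caller chooses via the `millis` flag.

```python
from datetime import datetime, timezone


def to_rfc3339(ts, millis=False):
    """Convert a UNIX timestamp to an RFC 3339 date-time string in UTC."""
    seconds = ts / 1000 if millis else ts
    return datetime.fromtimestamp(seconds, tz=timezone.utc).isoformat()


print(to_rfc3339(0))                  # 1970-01-01T00:00:00+00:00
print(to_rfc3339(1500, millis=True))  # 1970-01-01T00:00:01.500000+00:00
```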

### Client IP Hashing `--hash-ip`
This option enables hashing the `client_ip` field to avoid storing sensitive user IP addresses.

### Authorization Hashing `--hash-auth`
This option enables hashing the `credentials` part of the [`Authorization` request header](https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Authorization) (`request.headers.authorization` field) to avoid storing credentials/tokens.

```
Authorization: Bearer some_secret_token -> Bearer 7ftgstREEBqhHrQNgj6MVA
```
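
A minimal sketch of the idea follows. The hash scheme is an assumption: the 22-character example above is consistent with a 16-byte digest in unpadded URL-safe base64, so the sketch truncates SHA-256 to 16 bytes, but the service's actual algorithm is not shown in this section and the output will not necessarily match the example. `hash_value` and `hash_authorization` are hypothetical names.

```python
import base64
import hashlib


def hash_value(value):
    """Return a short URL-safe digest of `value` (assumed scheme, not the service's)."""
    digest = hashlib.sha256(value.encode('utf-8')).digest()[:16]
    return base64.urlsafe_b64encode(digest).rstrip(b'=').decode('ascii')


def hash_authorization(header):
    """Hash the credentials part of an Authorization header value, keeping the scheme."""
    scheme, sep, credentials = header.partition(' ')
    if not sep:
        # No "<scheme> <credentials>" split found: hash the whole value.
        return hash_value(header)
    return scheme + ' ' + hash_value(credentials)


print(hash_authorization('Bearer some_secret_token'))
```

Keeping the scheme readable preserves the ability to see which kind of authentication was used, while equal tokens still map to equal hashes.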

### Cookie Hashing `--hash-cookie`
This option enables hashing the `value` part of the [`Cookie` request header](https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Cookie) (`request.headers.cookie` field) and [`Set-Cookie` response header](https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Set-Cookie) (`response.headers.set-cookie` field) to avoid storing sensitive cookies.

```
Cookie: some_cookie=some_session -> some_cookie=q1EXmTUdD0Bvm8_jHrQizw
Set-Cookie: some_cookie=some_session; Secure; HttpOnly; SameSite=Lax -> some_cookie=q1EXmTUdD0Bvm8_jHrQizw; Secure; HttpOnly; SameSite=Lax
```
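
For the `Cookie` request header, the transformation can be sketched as follows. As with the other sketches, the hash scheme (SHA-256 truncated to 16 bytes, unpadded URL-safe base64) and the helper names are assumptions, so the output will not necessarily match the example above. `Set-Cookie` is not handled here, since its attributes (`Secure`, `SameSite=Lax`, and so on) must be left unhashed.

```python
import base64
import hashlib


def hash_value(value):
    """Return a short URL-safe digest of `value` (assumed scheme, not the service's)."""
    digest = hashlib.sha256(value.encode('utf-8')).digest()[:16]
    return base64.urlsafe_b64encode(digest).rstrip(b'=').decode('ascii')


def hash_cookie_header(header):
    """Hash each cookie value in a Cookie request header, keeping the cookie names."""
    pairs = []
    for pair in header.split(';'):
        name, sep, value = pair.strip().partition('=')
        # Pairs without '=' are passed through unchanged.
        pairs.append((name + sep + hash_value(value)) if sep else name)
    return '; '.join(pairs)


print(hash_cookie_header('some_cookie=some_session; theme=dark'))
```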

### Field Hashing and Nulling `--hash-path`/`--null-path`
Arbitrary request log fields can be hashed or converted to null by specifying their path with these options. Provide the desired option multiple times to specify multiple paths.

Paths describe how to traverse the JSON structure of the request logs to find a field. They consist of a hierarchy of object fields to traverse from the root JSON object, separated by periods (`.`). The `[]` suffix on a field indicates its value is an array, and should be iterated.

e.g. `--hash-path tries[].ip` will hash the `ip` of every upstream "try" in the `tries` array.

Paths don't need to end at a specific value - they can specify an entire object or array.

e.g. `--null-path request.headers` will convert the entire `request.headers` object to null, effectively removing it from the log.

If a path doesn't match any field in a given request log it will be ignored.
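
The path rules above can be sketched as a small recursive walk. This illustrates the described semantics only and is not the service's implementation; `apply_path` and `_apply` are hypothetical names, and the sample log is made up.

```python
def apply_path(log, path, fn):
    """Replace the value(s) found at `path` in `log` with fn(value), in place."""
    _apply(log, path.split('.'), fn)


def _apply(node, keys, fn):
    key, rest = keys[0], keys[1:]
    is_array = key.endswith('[]')
    if is_array:
        key = key[:-2]
    if not isinstance(node, dict) or key not in node:
        return  # Path doesn't match this log: ignore it.
    if is_array:
        items = node[key]
        if not isinstance(items, list):
            return
        if rest:
            for item in items:
                _apply(item, rest, fn)
        else:
            node[key] = [fn(item) for item in items]
    elif rest:
        _apply(node[key], rest, fn)
    else:
        node[key] = fn(node[key])


log = {'tries': [{'ip': '10.0.0.1'}, {'ip': '10.0.0.2'}],
       'request': {'headers': {'host': 'example.com'}}}
apply_path(log, 'tries[].ip', lambda value: 'hashed')   # like --hash-path
apply_path(log, 'request.headers', lambda value: None)  # like --null-path
apply_path(log, 'no.such.field', lambda value: None)    # no match, ignored
```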

## Output
Transformed logs are indexed in Elasticsearch.

This is currently the only supported output method, but more may be added in the future.

### Elasticsearch Nodes `-e`/`--es-node` (required)
The address of at least one Elasticsearch node must be provided via this option. Include the port if it isn't the standard `9200`. Provide the option multiple times to specify multiple nodes in a cluster.

### Elasticsearch Index `--es-index`
The Elasticsearch index to send logs to. [Elasticsearch index date math](https://www.elastic.co/guide/en/elasticsearch/reference/current/date-math-index-names.html) can be used. Defaults to `<kong-requests-{now/d}>`.
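
To make the default concrete: Elasticsearch resolves the date math expression on the server, so the service never computes the name itself. The sketch below only approximates the result, assuming Elasticsearch's documented defaults for date math index names (a `yyyy.MM.dd` date format in UTC); `resolve_default_index` is a hypothetical helper.

```python
from datetime import datetime, timezone


def resolve_default_index(now):
    """Approximate what '<kong-requests-{now/d}>' resolves to at time `now`.

    `now` should be a timezone-aware datetime; it is converted to UTC first.
    """
    return 'kong-requests-' + now.astimezone(timezone.utc).strftime('%Y.%m.%d')


print(resolve_default_index(datetime(2020, 3, 5, 23, 59, tzinfo=timezone.utc)))
# kong-requests-2020.03.05
```

The practical effect is one index per UTC day, which keeps old request logs easy to expire by deleting whole indices.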

### Elasticsearch Security
A number of options exist to support Elasticsearch server and client SSL, and basic authentication. See the `-h` output for details.

# Development
To run directly from the git repo, run the following in the root project directory:
```bash
> pip3 install -r requirements.txt
> python3 main.py [OPTIONS]
```
To run tests (as usual, from the root project directory), use:
```bash
> python3 -m unittest
```
Note that these tests currently only cover the log transformation functionality - there are no automated system tests as of yet.

To build a docker image directly from the git repo, run the following in the root project directory:
```bash
> sudo docker build -t <your repository name and tag> .
```

To develop in a docker container, first build the image, and then run the following in the root project directory:
```bash
> sudo docker run --rm -it --name kong-log-bridge --entrypoint bash -v $(pwd):/app <your repository name and tag>
```
This will mount all the files inside the container, so editing tests or application code will be synced live. You can run the tests with `python -m unittest`.

Send me a PR if you have a change you want to contribute!

kong_log_bridge/__init__.py

Lines changed: 47 additions & 0 deletions
@@ -0,0 +1,47 @@
import json
import simplejson

from bottle import Bottle, abort, request, response

from .transform import transform_log


def json_default_error_handler(http_error):
    response.content_type = 'application/json'
    return json.dumps({'error': http_error.body}, separators=(',', ':'))


def construct_app(es_client, es_index, **kwargs):
    app = Bottle()
    app.default_error_handler = json_default_error_handler

    @app.get('/status')
    def status():
        return 'OK'

    @app.post('/logs')
    def logs():
        if request.headers.get('Content-Type') != 'application/json':
            abort(415, 'Require "Content-Type: application/json"')

        try:
            log = request.json
        except simplejson.JSONDecodeError:
            abort(400, 'POST data is not valid JSON')

        if not isinstance(log, dict):
            abort(400, 'POST body must be a JSON object')

        log = transform_log(log,
                            do_convert_ts=kwargs['convert_ts'],
                            do_hash_ip=kwargs['hash_ip'],
                            do_hash_auth=kwargs['hash_auth'],
                            do_hash_cookie=kwargs['hash_cookie'],
                            hash_paths=kwargs['hash_path'],
                            null_paths=kwargs['null_path'])

        es_client.index(index=es_index, body=log, request_timeout=30)

        response.status = 204

    return app
