GitHub - WillTheFarmer/mysql-http-logs: used with files-to-MySQL this schema creates audit trail between Apache and NGINX http access and error log files and the access_log and error_log TABLES with a foreign key.

Database designed for HTTP log data analysis

Ingest years of Apache & NGINX log files into a fully-normalized database. IP Geolocation & User-Agent data integration.

DDL for tables, indexes, views, function, procedures & constraints

All SQL files and Python build file used in schema database application development is here. Individual Function and Procedure files make applying modifications much simpler in Visual Studio Code.

The Python build script assembles all SQL script files together into the single create_http_logs.sql file that is included in repository files-to-mysql repository.

1. NGINX formats and procedural code

From documentation read NGINX standard access logformat is same as Apache combined. I have not verified by examining NGINX data yet.

Apache log formats have been thoroughly researched and tested. NGINX log formats have not.

Repository NGINX files are standard access and error formats from new NGINX server

NGINX log files in /data/nginx_combined/ and /data/nginx_error/ are from new NGINX server.

Each log format has a Stored Procedure. More information will be added over new few days.

2. Database

Before running create_http_logs.sql if User Account root@localhost does not exist on installation server open file and perform a Find and Replace using a User Account with DBA Role on installation server. Copy below:

root`@`localhost`

Rename above ^user to a ^user on your server. For example - root@localhost to dbadmin@localhost

The easiest way to install is use database Command Line Client. Login as User with DBA Role and execute the following:

source path/create_http_logs.sql

Only MySQL server must be configured in my.ini, mysqld.cnf or my.cnf depending on platform with following:

[mysqld]
local-infile=1

3. Create database USER & GRANTS

To minimize data exposure and breach risks create a database USER for Python module with GRANTS to only schema objects and privileges required to execute import processes. Replace hostname from localhost to hostname of installed database if different. Use scripts/mysql_user_and_grants.sql in repository.

Database runs on MySQL & MariaDB

Python handles polling of log file folders and executing database LOAD DATA, Procedures, Functions and SQL Statements. Python drives the application but MySQL or MariaDB does all Data Manipulation & Processing.

This is a fast, reliable processing application with detailed logging and two stages of data parsing.

First stage is performed in LOAD DATA LOCAL INFILE statements with data-driven "log format" selection of LOAD settings and load_ staging tables.

Second stage is performed in MySQL modules: parse_access_apache, parse_access_nginx, parse_error_apache, parse_error_nginx

HTTP Access and Error data normalization and import is done in MySQL modules: import_access_apache, import_access_nginx, import_error_apache, import_error_nginx

Client IP GeoData is retrieved with Python. Data normalization & import is done in MySQL module: normalize_client

User Agent String is parsed with Python. Data normalization & import is done in MySQL module: normalize_useragent

Application determines what files have been processed using import_file TABLE.

Each imported file has record with name, path, size, created, modified attributes inserted during process_logs.

Application runs on Windows, Linux & MacOS

Python handles File Processing & Database handles Data Processing

The Python application repository is files-to-mysql which is a JSON data-driven Python application to automate importing Access & Error files, normalizing log data into database and generating a well-documented data lineage audit trail.

Multiple Access and Error logs and formats can be loaded, parsed and imported along with User Agent parsing and IP Address Geolocation retrieval processes within a single main:process_files execution.

Visual Interface App

in my development queue is mysql-to-echarts a Web interface for the database schema in Typescript.

mysql-to-echarts is a visualization tool built with Express web application frameworks and W2UI drill-down data grids for data point details while Apache ECharts frameworks is for Data Visualization.

Name		Name	Last commit message	Last commit date
Latest commit History 60 Commits
.github		.github
constraints		constraints
data		data
functions		functions
procedures		procedures
scripts		scripts
tables		tables
views		views
.gitattributes		.gitattributes
.gitignore		.gitignore
LICENSE		LICENSE
build_schema_ddl_script.py		build_schema_ddl_script.py
change_schema_name_here_before_build.sql		change_schema_name_here_before_build.sql
license_header.txt		license_header.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

Database designed for HTTP log data analysis

DDL for tables, indexes, views, function, procedures & constraints

1. NGINX formats and procedural code

2. Database

3. Create database USER & GRANTS

Database runs on MySQL & MariaDB

Application runs on Windows, Linux & MacOS

Python handles File Processing & Database handles Data Processing

Visual Interface App

Other Documents

About

Uh oh!

Releases 2

Sponsor this project

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Uh oh!

License

WillTheFarmer/mysql-http-logs

Folders and files

Latest commit

History

Repository files navigation

Database designed for HTTP log data analysis

DDL for tables, indexes, views, function, procedures & constraints

1. NGINX formats and procedural code

2. Database

3. Create database USER & GRANTS

Database runs on MySQL & MariaDB

Application runs on Windows, Linux & MacOS

Python handles File Processing & Database handles Data Processing

Visual Interface App

Other Documents

About

Topics

Resources

License

Code of conduct

Contributing

Security policy

Uh oh!

Stars

Watchers

Forks

Releases 2

Sponsor this project

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages