This project demonstrates an HTTP log pipeline in which separate services generate and analyze logs (a producer-consumer design).
- Separation of Concerns - The Generator and Analyzer services solve different problems and have no knowledge of each other's internals.
- Structured Log Generation - Each log record is a single JSON object on its own line (a producer sketch of this schema follows the list):

  ```json
  {
    "method": "GET",
    "endpoint": "/api/tasks/999",
    "status_code": "404",
    "ip": "142.250.72.198",
    "response_time": 34,
    "user_agent": "Safari",
    "timestamp": "2025-12-24T00:00:00.001000"
  }
  ```
- Log Streaming - Logs are streamed one at a time rather than stored in memory.
- Scale-Safe Aggregation - Because logs are never held in memory all at once, the analyzer can handle very large files without failing.
- Meaningful Metrics - The analyzer aggregates meaningful metrics while streaming logs.
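As a sketch of what the producer side might look like, here is a minimal record generator matching the sample schema above. This is illustrative only: the value pools (`ENDPOINTS`, `METHODS`, `AGENTS`) and the function names are assumptions, not the actual contents of `generator.py`.

```python
import json
import random
from datetime import datetime

# Illustrative value pools; the real generator.py may draw from different ones.
ENDPOINTS = ["/api/tasks", "/api/tasks/999", "/api/users"]
METHODS = ["GET", "POST", "PUT", "DELETE"]
AGENTS = ["Safari", "Chrome", "curl"]

def random_record() -> dict:
    """Build one log record matching the sample schema above."""
    return {
        "method": random.choice(METHODS),
        "endpoint": random.choice(ENDPOINTS),
        "status_code": str(random.choice([200, 201, 404, 500])),
        "ip": ".".join(str(random.randint(1, 254)) for _ in range(4)),
        "response_time": random.randint(5, 500),
        "user_agent": random.choice(AGENTS),
        "timestamp": datetime.now().isoformat(),
    }

def generate(path: str, count: int) -> None:
    """Append count records as NDJSON: one JSON object per line."""
    with open(path, "a", encoding="utf-8") as f:
        for _ in range(count):
            f.write(json.dumps(random_record()) + "\n")
```

Writing one object per line is what makes the downstream streaming possible: the consumer never needs to parse the file as a whole.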
The analyzer consumes structured HTTP request logs and aggregates operational metrics that surface traffic patterns, failures, and performance bottlenecks (a single-pass aggregation sketch follows the list):
1. Total Requests
2. Total Successful Requests
3. Total Failed Requests
4. Fastest Endpoint
5. Slowest Endpoint
6. Busiest Endpoint
7. Worst Window [start - end, total_reqs, 4xx errors, 5xx errors, suspicious_ips]
8. Overall Suspicious IPs
9. Error rates for each endpoint
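A minimal sketch of how several of these metrics can be folded together in a single pass is below. It assumes that "fastest" and "slowest" are ranked by average response time and that a status code below 400 counts as success; the actual `analyzer.py` may define these differently.

```python
from collections import defaultdict

def aggregate(records):
    """Single pass over an iterable of log dicts; per-endpoint state only."""
    total = succeeded = failed = 0
    # endpoint -> [request_count, error_count, total_response_time]
    per_endpoint = defaultdict(lambda: [0, 0, 0])

    for rec in records:
        total += 1
        ok = int(rec["status_code"]) < 400
        succeeded += ok
        failed += not ok
        stats = per_endpoint[rec["endpoint"]]
        stats[0] += 1
        stats[1] += not ok
        stats[2] += rec["response_time"]

    avg = {ep: s[2] / s[0] for ep, s in per_endpoint.items()}
    return {
        "total_requests": total,
        "successful": succeeded,
        "failed": failed,
        "fastest_endpoint": min(avg, key=avg.get) if avg else None,
        "slowest_endpoint": max(avg, key=avg.get) if avg else None,
        "busiest_endpoint": max(per_endpoint, key=lambda e: per_endpoint[e][0])
        if per_endpoint else None,
        "error_rates": {ep: s[1] / s[0] for ep, s in per_endpoint.items()},
    }
```

Because only running counters and sums are kept, memory grows with the number of distinct endpoints, not with the number of log lines.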
- No database or external libraries.
- Logs are processed in a single streaming pass.
- The analyzer does not assume the log file fits in memory (see the reader sketch below).
- Generator and analyzer are decoupled by schema, not implementation.
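The single-pass, bounded-memory constraints suggest a reader shaped like the following sketch: a generator that yields one parsed record at a time, so memory use stays flat regardless of file size. The name `stream_logs` is illustrative, not taken from the repository.

```python
import json
from typing import Iterator

def stream_logs(path: str) -> Iterator[dict]:
    """Yield one parsed log record per NDJSON line; never loads the whole file."""
    with open(path, "r", encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue  # tolerate blank lines
            yield json.loads(line)
```

Feeding this iterator into an aggregator like the `aggregate` sketch above keeps the whole pipeline lazy: `report = aggregate(stream_logs("logs.ndjson"))` processes the file in one pass.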
```
|- main.py          Entry point of the application
|- services/
|  |- generator.py  Generates structured logs
|  |- analyzer.py   Aggregates insights by streaming logs
|- utils.py         Utility functions
```
- Python version >= 3.12.0
```bash
git clone https://github.com/ayvkma/http-log-pipeline.git
cd http-log-pipeline
python main.py
```
- Designing a producer–consumer pipeline.
- Enforcing separation of concerns across services.
- Streaming large NDJSON datasets safely.
- Time-based aggregation and windowing (sketched below).
- Translating raw logs into operational metrics.
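One way the "worst window" metric could be implemented is sketched below. The 60-second window size, the error-count scoring, and the idea that "suspicious IPs" are those producing 4xx responses are all assumptions made for illustration; the actual analyzer may use different definitions.

```python
from collections import defaultdict
from datetime import datetime

WINDOW_SECONDS = 60  # assumed window size

def worst_window(records):
    """Bucket records into fixed time windows; pick the one with most errors."""
    windows = defaultdict(
        lambda: {"total": 0, "4xx": 0, "5xx": 0, "suspicious_ips": set()}
    )

    for rec in records:
        ts = datetime.fromisoformat(rec["timestamp"])
        bucket = int(ts.timestamp()) // WINDOW_SECONDS
        w = windows[bucket]
        w["total"] += 1
        status = int(rec["status_code"])
        if 400 <= status < 500:
            w["4xx"] += 1
            w["suspicious_ips"].add(rec["ip"])  # assumed suspicion heuristic
        elif status >= 500:
            w["5xx"] += 1

    if not windows:
        return None
    bucket, w = max(windows.items(), key=lambda kv: kv[1]["4xx"] + kv[1]["5xx"])
    start = datetime.fromtimestamp(bucket * WINDOW_SECONDS)
    end = datetime.fromtimestamp((bucket + 1) * WINDOW_SECONDS)
    return {"start": start, "end": end, **w}
```

Note that per-window state grows with the time span the logs cover, not with file size, so this still respects the streaming constraint for bounded-duration log files.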
This project prioritizes correctness and clarity over completeness, and intentionally avoids overengineering.
