Status: ✅ Completed
Focus: SOC Analyst Training & Threat Detection Logic
Tech Stack: Python 3.11, Regex, Log Parsing
This project simulates a critical SOC (Security Operations Center) responsibility: analyzing web server logs to identify suspicious behavior. The Python script scans Apache access logs to:
- Detect 401 Unauthorized login attempts
- Flag high-volume IP addresses
- Identify possible brute-force patterns
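As a rough illustration of the input, here is a minimal sketch of parsing one line in Apache's Common Log Format. The sample line, the `LOG_PATTERN` regex, and the `parse_line` helper are assumptions made for this example and are not taken from `src/log_parser.py`.

```python
import re

# Apache Common Log Format: host ident user [time] "request" status size
# (hypothetical pattern for illustration; the real script may differ)
LOG_PATTERN = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<timestamp>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) (?P<size>\S+)'
)


def parse_line(line):
    """Return a dict of named fields, or None if the line does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    entry = match.groupdict()
    entry["status"] = int(entry["status"])  # compare status codes as numbers
    return entry


# Made-up sample line using a documentation-reserved IP address
sample = '203.0.113.7 - - [10/Oct/2024:13:55:36 +0000] "POST /login HTTP/1.1" 401 512'
entry = parse_line(sample)
print(entry["ip"], entry["status"], entry["path"])
```

Returning `None` for lines that do not match lets the caller skip malformed entries instead of crashing partway through a large log.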
▶️ Click to view the animated demo
This short demo shows the full script in action — including regex-based parsing, failed login detection, and top IP extraction.
This project was designed to build practical cybersecurity skills, including:
- Extracting structured data from logs using regular expressions
- Building basic detection logic without relying on third-party security platforms
- Grouping and analyzing large volumes of requests
- Interpreting behavioral patterns in raw data
These are baseline tasks expected of entry-level SOC analysts and incident responders.
Log-File-Analysis/
├── data/
│ ├── sample_logs/
│ └── access.log
│
├── docs/
│ ├── screenshots/
│ ├── read-log-file-output.png
│ ├── regex-parse-output.png
│ ├── failed-login-detection-output.png
│ ├── top-ips-output.png
│ └── repeated-failed-logins-output.png
├── src/
│ └── log_parser.py
├── .gitignore
├── requirements.txt
└── README.md
Below is a step-by-step breakdown of the script's main logic flow and how it detects suspicious behavior.
⇨ Detection Flow
- Read each log entry from Apache access logs
- Parse lines using regex to extract IPs, timestamps, status codes
- Detect failed login attempts (HTTP 401 responses)
- Count total requests per IP address
- Identify top IPs by request volume
- Group failed logins by IP
- Flag IPs with multiple failures (e.g., two or more 401 responses)
- Output summaries in terminal for quick analysis
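The flow above can be sketched in a few lines with `collections.Counter`. This is a minimal sketch under stated assumptions, not the contents of `src/log_parser.py`: the `analyze` function, the `LOG_PATTERN` regex, the `FAILURE_THRESHOLD` value, and the sample lines are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical pattern: captures only the client IP and the status code
LOG_PATTERN = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] "[^"]*" (?P<status>\d{3})'
)
FAILURE_THRESHOLD = 2  # assumption, mirrors the "2+" example in the list above


def analyze(lines, threshold=FAILURE_THRESHOLD, top_n=3):
    """Return (top IPs by request volume, IPs with repeated 401s)."""
    requests_per_ip = Counter()
    failures_per_ip = Counter()
    for line in lines:
        match = LOG_PATTERN.match(line)
        if match is None:
            continue  # skip malformed lines
        ip = match["ip"]
        requests_per_ip[ip] += 1
        if match["status"] == "401":
            failures_per_ip[ip] += 1
    flagged = {ip: n for ip, n in failures_per_ip.items() if n >= threshold}
    return requests_per_ip.most_common(top_n), flagged


# Made-up sample data using documentation-reserved IP addresses
sample_lines = [
    '203.0.113.7 - - [10/Oct/2024:13:55:36 +0000] "POST /login HTTP/1.1" 401 512',
    '203.0.113.7 - - [10/Oct/2024:13:55:37 +0000] "POST /login HTTP/1.1" 401 512',
    '198.51.100.4 - - [10/Oct/2024:13:55:38 +0000] "GET /index.html HTTP/1.1" 200 1024',
    '203.0.113.7 - - [10/Oct/2024:13:55:39 +0000] "POST /login HTTP/1.1" 401 512',
    '198.51.100.4 - - [10/Oct/2024:13:55:40 +0000] "POST /login HTTP/1.1" 401 512',
    'not a log line',
]

top_ips, flagged = analyze(sample_lines)
print("Top IPs:", top_ips)
print("Flagged:", flagged)
```

Because `analyze` only iterates over its input, an open file handle can be passed in directly, so the whole log never needs to be held in memory.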
| Skill | Description |
|---|---|
| Log Parsing | Used regex to extract structured fields from raw Apache logs |
| Threat Pattern Recognition | Detected brute-force login behavior by analyzing frequency & error codes |
| Python Tooling | Used collections.Counter, file handling, and basic CLI logic |
| SOC Awareness | Focused on identifying indicators of suspicious access attempts |
These visuals illustrate key stages of the log analysis, including regex parsing, failed login detection, and suspicious IP grouping.
This project builds muscle memory for:
- Reading and interpreting real-world logs
- Spotting anomalies without SIEM platforms
- Thinking like a threat analyst
- Turning raw data into actionable insights
You’re not just scripting — you’re simulating the detection mindset.
| Pattern Detected | Real-World Risk | Mitigation Insight |
|---|---|---|
| Multiple 401s from 1 IP | Brute-force login attempt | Account lockout / rate limiting |
| High request volume from 1 IP | Scanning or enumeration | IP block or alerting in SIEM |
| Requests to /login only | Credential stuffing attempt | MFA or CAPTCHA recommendations |
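The third pattern in the table, an IP that requests nothing but `/login`, could be checked roughly as follows. The `login_only_ips` helper, the `min_requests` cutoff, and the sample lines are assumptions for illustration. The sketch also compares paths exactly, so a query string such as `/login?next=/` would not count as `/login`.

```python
import re
from collections import defaultdict

# Hypothetical pattern: captures the client IP and the requested path
LOG_PATTERN = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] "\S+ (?P<path>\S+)[^"]*" \d{3}'
)


def login_only_ips(lines, login_path="/login", min_requests=3):
    """Return sorted IPs whose requests all target login_path."""
    paths_by_ip = defaultdict(list)
    for line in lines:
        match = LOG_PATTERN.match(line)
        if match:
            paths_by_ip[match["ip"]].append(match["path"])
    return sorted(
        ip
        for ip, paths in paths_by_ip.items()
        # require some volume so a single login visit is not flagged
        if len(paths) >= min_requests and set(paths) == {login_path}
    )


# Made-up sample data using documentation-reserved IP addresses
sample_lines = [
    '203.0.113.7 - - [10/Oct/2024:13:55:36 +0000] "POST /login HTTP/1.1" 401 512',
    '203.0.113.7 - - [10/Oct/2024:13:55:37 +0000] "POST /login HTTP/1.1" 401 512',
    '203.0.113.7 - - [10/Oct/2024:13:55:38 +0000] "POST /login HTTP/1.1" 200 512',
    '198.51.100.4 - - [10/Oct/2024:13:56:00 +0000] "GET /index.html HTTP/1.1" 200 1024',
    '198.51.100.4 - - [10/Oct/2024:13:56:01 +0000] "POST /login HTTP/1.1" 401 512',
    '198.51.100.4 - - [10/Oct/2024:13:56:02 +0000] "GET /about HTTP/1.1" 200 2048',
    '192.0.2.9 - - [10/Oct/2024:13:57:00 +0000] "POST /login HTTP/1.1" 401 512',
]

suspects = login_only_ips(sample_lines)
print(suspects)
```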
| Feature | Value Add |
|---|---|
| 📊 Data Visualization | Graph failed logins/IP activity using matplotlib |
| 🌍 GeoIP Lookup | Enrich IP data with geolocation |
| ⏱️ Time-Based Filtering | Detect brute-force within short time windows |
| 📁 SIEM Output Format | Export results for further analysis or alerting |
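As one possible starting point for the time-based filtering idea, the sketch below flags an IP once it produces several 401 responses inside a sliding window. Nothing here exists in the project yet: the `find_bursts` name, the three-failures-in-60-seconds defaults, and the sample lines are assumptions. It also assumes log lines arrive in chronological order and that month abbreviations are English.

```python
import re
from collections import defaultdict, deque
from datetime import datetime, timedelta

# Hypothetical pattern: captures the client IP, timestamp, and status code
LOG_PATTERN = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\] "[^"]*" (?P<status>\d{3})'
)
TIME_FORMAT = "%d/%b/%Y:%H:%M:%S %z"  # e.g. 10/Oct/2024:13:55:36 +0000


def find_bursts(lines, max_failures=3, window=timedelta(seconds=60)):
    """Return IPs with at least max_failures 401s inside one window."""
    recent = defaultdict(deque)  # per-IP timestamps of recent failures
    flagged = set()
    for line in lines:
        match = LOG_PATTERN.match(line)
        if match is None or match["status"] != "401":
            continue
        when = datetime.strptime(match["ts"], TIME_FORMAT)
        failures = recent[match["ip"]]
        failures.append(when)
        # drop failures that have fallen out of the window
        while when - failures[0] > window:
            failures.popleft()
        if len(failures) >= max_failures:
            flagged.add(match["ip"])
    return flagged


# Made-up sample data: one fast burst, one slow trickle
sample_lines = [
    '198.51.100.4 - - [10/Oct/2024:13:00:00 +0000] "POST /login HTTP/1.1" 401 512',
    '198.51.100.4 - - [10/Oct/2024:13:10:00 +0000] "POST /login HTTP/1.1" 401 512',
    '198.51.100.4 - - [10/Oct/2024:13:20:00 +0000] "POST /login HTTP/1.1" 401 512',
    '203.0.113.7 - - [10/Oct/2024:13:55:00 +0000] "POST /login HTTP/1.1" 401 512',
    '203.0.113.7 - - [10/Oct/2024:13:55:10 +0000] "POST /login HTTP/1.1" 401 512',
    '203.0.113.7 - - [10/Oct/2024:13:55:20 +0000] "POST /login HTTP/1.1" 401 512',
]

bursts = find_bursts(sample_lines)
print(bursts)
```

Both IPs have three failures in total, but only the one whose failures fall within a single minute is flagged, which is the distinction a plain per-IP count cannot make.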
Hussien Kofi
Aspiring Cybersecurity Analyst
📧 Email
🔗 LinkedIn
💻 GitHub
This project wasn’t just about writing a script — it was about learning how to think like an analyst. I translated raw logs into actionable intelligence, practiced detection logic, and took a step closer to real-world SOC workflows.
- Language: Python 3.11
- Focus: Threat detection via log analysis
- Skills: Regex, log parsing, frequency analysis, brute-force identification
- Outcome: Reinforced key SOC-level capabilities with a clean, documented solution
- Demo: See script in action ↗





