This project analyzes real-world password patterns using the RockYou leaked password dataset.
The goal of this analysis is to understand how users create passwords and identify common weaknesses in password security.
Using Python, Pandas, and Matplotlib, the script extracts password features, performs statistical analysis, and generates visualizations to reveal password security trends.
The analysis extracts several security-related characteristics from each password:
- Password length
- Presence of uppercase letters
- Presence of numbers
- Presence of special characters
- Detection of year patterns (e.g., 1990, 2001)
These features help evaluate how complex or predictable passwords are.
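As a rough illustration of how these features could be computed with pandas, here is a minimal sketch. The function name `extract_features`, the column names, and the exact regular expressions are assumptions made for this example and are not taken from `analysis.py`. In particular, a "year" is assumed to be any four-digit run starting with 19 or 20, and a "special character" is assumed to be anything other than an ASCII letter or digit.

```python
import pandas as pd


def extract_features(passwords):
    """Return a DataFrame with one row per password and one column per feature."""
    s = pd.Series(list(passwords), dtype=object)
    return pd.DataFrame({
        "password": s,
        "length": s.str.len(),
        # Assumption: only ASCII uppercase letters count as uppercase.
        "has_upper": s.str.contains(r"[A-Z]"),
        "has_digit": s.str.contains(r"[0-9]"),
        # Assumption: "special" means anything that is not an ASCII letter or digit.
        "has_special": s.str.contains(r"[^A-Za-z0-9]"),
        # Assumption: a year is a four-digit run in the range 1900-2099.
        "has_year": s.str.contains(r"(?:19|20)[0-9]{2}"),
    })


# Made-up sample passwords, used only to show the output shape.
print(extract_features(["Password1", "abc", "summer2001!"]))
```

Keeping each feature as its own boolean column makes later aggregation simple, since the mean of a boolean column is the share of passwords with that property.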
This project uses the RockYou password dataset, one of the most widely studied leaked password datasets in cybersecurity research.
Dataset size:
- 14,316,879 passwords
| Metric | Result |
|---|---|
| Weak passwords (<8 characters) | 33.03% |
| Passwords with uppercase letters | 9.31% |
| Passwords with numbers | 68.1% |
| Passwords with special characters | 6.91% |
| Passwords containing years | 6.63% |
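Percentages like those in the table can be derived by averaging boolean feature columns. The sketch below assumes a feature table with hypothetical column names (`length`, `has_upper`, `has_digit`, `has_special`, `has_year`); the actual script may organize its data differently. The sample rows are invented and are not RockYou data.

```python
import pandas as pd


def summarize(features, weak_below=8):
    """Return each metric as a percentage of all rows, rounded to 2 decimals."""
    masks = {
        "weak": features["length"] < weak_below,  # shorter than 8 characters
        "uppercase": features["has_upper"],
        "numbers": features["has_digit"],
        "special": features["has_special"],
        "years": features["has_year"],
    }
    # The mean of a boolean column is the fraction of True values.
    return {name: round(float(mask.mean()) * 100, 2) for name, mask in masks.items()}


# Tiny hand-made feature table (hypothetical values).
example = pd.DataFrame({
    "length": [6, 8, 10, 3],
    "has_upper": [True, False, False, False],
    "has_digit": [True, True, True, False],
    "has_special": [False, False, False, False],
    "has_year": [False, True, False, False],
})
print(summarize(example))
```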
- About 1 in 3 passwords are shorter than 8 characters, the threshold this analysis uses for weak passwords.
- Only about 9% of passwords include uppercase letters, indicating low password complexity.
- Numbers appear frequently, but numbers alone do not significantly improve password strength.
- Special characters are rarely used, appearing in less than 7% of passwords.
- About 6.6% of passwords contain a year pattern, a predictable element that makes them easier to guess.
These findings highlight how predictable human password behavior can be.
- Python
- Pandas
- Matplotlib
Install the required Python libraries:

    pip install pandas matplotlib

Download the RockYou password dataset from Kaggle:
https://www.kaggle.com/datasets/wjburns/common-password-list-rockyoutxt
After downloading, extract the file:
rockyou.txt
Create a folder named Data inside the project directory and place the dataset inside it.
Your project folder should look like this:

    password-security-analysis
    │
    ├── analysis.py
    ├── README.md
    ├── password_length_distribution.png
    ├── character_type_usage.png
    │
    └── Data
        └── rockyou.txt

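With the file in place, loading it might look like the sketch below. The helper name `load_passwords` is hypothetical, and the choice of `latin-1` is an assumption: the dataset is commonly reported to contain bytes that are not valid UTF-8, and `latin-1` can decode any byte sequence without raising an error. The file is read line by line rather than with a CSV parser, since passwords may contain commas and quotes.

```python
import pandas as pd


def load_passwords(path="Data/rockyou.txt"):
    """Read one password per line and return them as a pandas Series."""
    # newline="\n" keeps a bare carriage return inside a password from
    # being treated as a line break.
    with open(path, encoding="latin-1", newline="\n") as handle:
        lines = [line.rstrip("\r\n") for line in handle]
    # Skip empty lines so they are not counted as zero-length passwords.
    return pd.Series([p for p in lines if p], dtype=object, name="password")
```

Calling `load_passwords()` from the project root would then return a Series with one entry per non-empty line.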
Run the Python script:

    python analysis.py

The program will:
- analyze password patterns
- print statistics in the terminal
- generate visualization charts
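One way the length chart could be produced with Matplotlib is sketched below. The function name `plot_length_distribution`, the labels, and the figure settings are assumptions for illustration; only the output filename comes from the project layout. The `Agg` backend is selected so the chart is written to a file without needing a display.

```python
import matplotlib

matplotlib.use("Agg")  # render to files without needing a display

import matplotlib.pyplot as plt
import pandas as pd


def plot_length_distribution(lengths, out_path="password_length_distribution.png"):
    """Save a bar chart of how many passwords have each length."""
    # Count passwords per length and order the bars by length.
    counts = pd.Series(list(lengths)).value_counts().sort_index()
    fig, ax = plt.subplots(figsize=(8, 4))
    ax.bar(counts.index, counts.values)
    ax.set_xlabel("Password length (characters)")
    ax.set_ylabel("Number of passwords")
    ax.set_title("Password length distribution")
    fig.tight_layout()
    fig.savefig(out_path, dpi=150)
    plt.close(fig)  # free the figure so repeated calls do not accumulate memory
    return counts
```

The function returns the counts it plotted, which makes it easy to print the same numbers in the terminal alongside the saved chart.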
This project is intended for educational and research purposes only.
The goal is to analyze password security patterns and understand weaknesses in real-world password behavior.

