Ema-Umar/password-security-analysis

Password Security Analysis


This project analyzes real-world password patterns using the RockYou leaked password dataset.
The goal of this analysis is to understand how users create passwords and identify common weaknesses in password security.

Using Python, Pandas, and Matplotlib, the script extracts password features, performs statistical analysis, and generates visualizations to reveal password security trends.


Features Analyzed

The analysis extracts several security-related characteristics from each password:

  • Password length
  • Presence of uppercase letters
  • Presence of numbers
  • Presence of special characters
  • Detection of year patterns (e.g., 1990, 2001)

These features help evaluate how complex or predictable passwords are.
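The five features above can be sketched as a small per-password function. This is an illustration only, not the code in analysis.py: the function name `extract_features`, the year regex (any four-digit run starting with 19 or 20), and the definition of "special character" (anything that is not a letter or digit) are all assumptions.

```python
import re

# Assumption: a "year" is any four-digit run starting with 19 or 20.
YEAR_PATTERN = re.compile(r"(?:19|20)\d{2}")


def extract_features(password: str) -> dict:
    """Return the five features listed above for a single password."""
    return {
        "length": len(password),
        "has_upper": any(ch.isupper() for ch in password),
        "has_digit": any(ch.isdigit() for ch in password),
        # Assumption: "special" means any character that is not a letter or digit.
        "has_special": any(not ch.isalnum() for ch in password),
        "has_year": YEAR_PATTERN.search(password) is not None,
    }


example = extract_features("Summer2001!")
print(example)
```

A stricter year rule (for example, only 1950 to 2029) would lower the share of passwords flagged as containing a year, so the exact pattern matters when comparing results.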


Dataset

This project uses the RockYou password dataset, one of the most widely studied leaked password datasets in cybersecurity research.

Dataset size:

  • 14,316,879 passwords

⚠️ The dataset is not included in this repository because of its large size.
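Because rockyou.txt contains bytes that are not valid UTF-8, reading it with the default encoding can fail. The sketch below shows one way to load it, assuming one password per line and using latin-1 so that every byte decodes. The helper name `load_passwords` and the choice to skip empty lines are assumptions, not details taken from analysis.py.

```python
import os
import tempfile


def load_passwords(path):
    """Read one password per line, skipping empty lines."""
    passwords = []
    # latin-1 maps every possible byte to a character, so decoding never fails.
    with open(path, encoding="latin-1") as handle:
        for line in handle:
            password = line.rstrip("\n")
            if password:
                passwords.append(password)
    return passwords


# Demo on a tiny stand-in file; the real path would be Data/rockyou.txt.
with tempfile.TemporaryDirectory() as tmp:
    sample_path = os.path.join(tmp, "sample.txt")
    with open(sample_path, "wb") as handle:
        handle.write(b"123456\npassword\n\ncaf\xe9\n")
    sample = load_passwords(sample_path)

print(sample)
```

Whether empty lines are dropped or kept changes the total count, so a different cleaning rule may give a dataset size that differs slightly from the figure quoted above.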


Key Findings

Metric | Result
--- | ---
Weak passwords (<8 characters) | 33.03%
Passwords with uppercase letters | 9.31%
Passwords with numbers | 68.1%
Passwords with special characters | 6.91%
Passwords containing years | 6.63%

Observations

  • About 1 in 3 passwords (33.03%) is shorter than 8 characters, the threshold this analysis uses for a weak password.
  • Only about 9% of passwords include uppercase letters, indicating low password complexity.
  • Numbers appear in 68.1% of passwords, but digits alone do not significantly improve password strength.
  • Special characters are rarely used, appearing in less than 7% of passwords.
  • About 6.6% of passwords contain a year pattern, which makes them easier to predict.

These findings highlight how predictable human password behavior can be.
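Percentages like those in the Key Findings table can be computed with vectorized Pandas string methods. The following is a minimal sketch on five placeholder passwords, not the project's actual script: the column names, the ASCII-only regular expressions, and the year pattern are assumptions.

```python
import pandas as pd

# Placeholder data; the real analysis would use the full RockYou list.
passwords = pd.Series(["123456", "Password1", "iloveyou", "summer2001!", "abc"])

features = pd.DataFrame({
    "length": passwords.str.len(),
    "has_upper": passwords.str.contains(r"[A-Z]"),
    "has_digit": passwords.str.contains(r"\d"),
    # Assumption: "special" means anything outside ASCII letters and digits.
    "has_special": passwords.str.contains(r"[^A-Za-z0-9]"),
    # Assumption: a year is a four-digit run starting with 19 or 20.
    "has_year": passwords.str.contains(r"(?:19|20)\d{2}"),
})

# The mean of a boolean column is the fraction of True values.
summary = {
    "weak (<8 characters)": (features["length"] < 8).mean() * 100,
    "uppercase letters": features["has_upper"].mean() * 100,
    "numbers": features["has_digit"].mean() * 100,
    "special characters": features["has_special"].mean() * 100,
    "years": features["has_year"].mean() * 100,
}

for name, pct in summary.items():
    print(f"{name}: {pct:.2f}%")
```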


Visualizations

Password Length Distribution

(Figure: password_length_distribution.png)

Character Type Usage

(Figure: character_type_usage.png)
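Charts of this kind can be produced with a few Matplotlib calls. The sketch below is an assumption about how such figures might be generated, not the project's plotting code. The length data is a placeholder, while the character-type percentages are the ones reported in Key Findings. The output file names match those shown in the project folder layout.

```python
from pathlib import Path

import matplotlib
matplotlib.use("Agg")  # render to files without opening a window
import matplotlib.pyplot as plt
import pandas as pd

# Placeholder lengths; the real analysis would use the lengths of all passwords.
lengths = pd.Series([6, 8, 8, 9, 6, 7, 10, 8, 6, 12])
length_counts = lengths.value_counts().sort_index()

length_path = Path("password_length_distribution.png")
fig, ax = plt.subplots(figsize=(8, 4))
ax.bar(length_counts.index, length_counts.values)
ax.set_xlabel("Password length (characters)")
ax.set_ylabel("Number of passwords")
ax.set_title("Password Length Distribution")
fig.tight_layout()
fig.savefig(length_path)
plt.close(fig)

# Percentages taken from the Key Findings table.
usage = {"Uppercase": 9.31, "Numbers": 68.1, "Special characters": 6.91}

usage_path = Path("character_type_usage.png")
fig, ax = plt.subplots(figsize=(6, 4))
ax.bar(list(usage.keys()), list(usage.values()))
ax.set_ylabel("Passwords (%)")
ax.set_title("Character Type Usage")
fig.tight_layout()
fig.savefig(usage_path)
plt.close(fig)
```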


Technologies Used

  • Python
  • Pandas
  • Matplotlib

How to Run the Project

1. Install Required Libraries

Install the required Python libraries:

pip install pandas matplotlib

2. Download the Dataset

Download the RockYou password dataset from Kaggle:

https://www.kaggle.com/datasets/wjburns/common-password-list-rockyoutxt

After downloading, extract the file:

rockyou.txt

3. Place the Dataset in the Correct Folder

Create a folder named Data inside the project directory and place the dataset inside it.

Your project folder should look like this:

password-security-analysis
│
├── analysis.py
├── README.md
├── password_length_distribution.png
├── character_type_usage.png
│
└── Data
    └── rockyou.txt
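A quick check that the dataset sits where this layout expects it can save a confusing error later. The helper below is hypothetical (the name `find_dataset` is not from analysis.py). It looks for Data/rockyou.txt under the project root. The folder name is case-sensitive on Linux and macOS, so `data` would not match `Data`.

```python
from pathlib import Path


def find_dataset(project_root="."):
    """Return the path to Data/rockyou.txt, or raise with a hint if it is missing."""
    dataset_path = Path(project_root) / "Data" / "rockyou.txt"
    if not dataset_path.is_file():
        raise FileNotFoundError(
            f"Expected the dataset at {dataset_path}; "
            "create the Data folder and place rockyou.txt inside it."
        )
    return dataset_path
```

Calling `find_dataset()` from the project directory before running the analysis returns the path when the file is in place and raises `FileNotFoundError` otherwise.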

4. Run the Script

Run the Python script:

python analysis.py

The program will:

  • analyze password patterns
  • print statistics in the terminal
  • generate visualization charts

Project Purpose

This project is intended for educational and research purposes only.
The goal is to analyze password security patterns and understand weaknesses in real-world password behavior.

