cis6930fa24 -- Project 3

Name: Pradhyumna Reddy Madhulapally

Project Description

This project extends the original Norman PD incident analysis tool with a web interface for visualizing incident data. Users can analyze incident reports from the Norman, Oklahoma Police Department through interactive visualizations, submitting PDF incident reports either by URL or by file upload.

Features

- Web-based interface for data submission
- Support for both URL and file upload inputs
- Interactive data visualizations:
  - Clustering analysis of incidents by location and nature
  - Top 15 most frequent incident types
  - Heatmap of the top 10 locations and incident types
- Real-time data processing
- SQLite database storage
- Error handling and user feedback

Project Structure

cis6930fa24-project3/
├── src/
│   ├── core/
│   │   └── project0.py
│   ├── visualization/
│   │   └── visualizations.py
│   └── web/
│       ├── app.py
│       └── templates/
│           └── index.html
├── tests/
│   └── test_project.py
└── README.md

How to install

# Install dependencies
pipenv install flask matplotlib scikit-learn numpy pandas seaborn pypdf

How to run

# Start the web server
pipenv run python src/web/app.py

Access the interface at http://localhost:5001

Testing

# Run all tests
pipenv run pytest

# Run with verbose output
pipenv run pytest -v

Video: https://youtu.be/m8eKV--7VoY

Functions

project0.py Functions:

fetchincidents(url)
- Parameters: URL string of the PDF to fetch
- Returns: String with extracted text data

extractincidents(data)
- Parameters: Raw text data from PDF
- Returns: List of parsed incidents

parse_incident(line)
- Parameters: Single line of incident data
- Returns: List of fields, or None if the line is invalid

createdb()
- Creates: SQLite database with incidents table
- Returns: Database connection object

populatedb(conn, incidents)
- Parameters: Database connection and incident list
- Action: Inserts incidents into database
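
The database functions above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the column names and the optional `path` argument are assumptions chosen for the example, and an in-memory database is used so the sketch leaves no file behind.

```python
import sqlite3


def createdb(path=":memory:"):
    """Create the incidents table and return the open connection.

    The `path` argument and the column names below are assumptions
    made for this sketch; the real schema may differ.
    """
    conn = sqlite3.connect(path)
    conn.execute("DROP TABLE IF EXISTS incidents")
    conn.execute(
        """
        CREATE TABLE incidents (
            incident_time TEXT,
            incident_number TEXT,
            incident_location TEXT,
            nature TEXT,
            incident_ori TEXT
        )
        """
    )
    conn.commit()
    return conn


def populatedb(conn, incidents):
    """Insert each five-field incident as one row."""
    conn.executemany("INSERT INTO incidents VALUES (?, ?, ?, ?, ?)", incidents)
    conn.commit()
```

Using `?` placeholders lets sqlite3 handle quoting, so location or nature strings containing apostrophes are inserted safely.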

visualizations.py Functions:

visualize_data(incidents)

Parameters:
- incidents: List of incidents, each containing [time, number, location, nature, ORI]
What it does:
- Creates DBSCAN clustering of incidents by location and nature
- Generates bar graph of top 15 incident types
- Produces heatmap of top 10 locations and incident types
Returns:
- List of matplotlib figure objects containing the three visualizations
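
As an illustration of the counting step behind the top 15 bar graph, here is a small standard-library sketch. The helper name `top_incident_types` is hypothetical and does not appear in the project; it only assumes that the nature field sits at index 3 of each incident, as in the field order listed above.

```python
from collections import Counter


def top_incident_types(incidents, n=15):
    """Return (nature, count) pairs for the n most frequent incident types.

    Hypothetical helper for illustration; each incident is expected to be
    [time, number, location, nature, ORI].
    """
    natures = [row[3] for row in incidents]  # index 3 is the nature field
    return Counter(natures).most_common(n)
```

The resulting pairs could then be split into labels and counts and handed to a matplotlib bar chart.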

app.py Functions:

convert_plot_to_base64(fig)

Parameters:
- fig: Matplotlib figure object
What it does:
- Converts matplotlib figure to base64 encoded string
- Sets high DPI and proper formatting
Returns:
- Base64 encoded string of the figure
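
A conversion along these lines might look like the sketch below. It assumes the figure is saved as PNG; the `dpi` default of 300 and the `bbox_inches="tight"` setting are guesses at what "high DPI and proper formatting" means here, not values taken from the project.

```python
import base64
import io


def convert_plot_to_base64(fig, dpi=300):
    """Render a figure to PNG in memory and return it as a base64 string.

    `fig` is expected to be a matplotlib Figure; dpi=300 is an assumed value.
    """
    buf = io.BytesIO()
    # Figure.savefig accepts a file-like object, so nothing touches disk.
    fig.savefig(buf, format="png", dpi=dpi, bbox_inches="tight")
    return base64.b64encode(buf.getvalue()).decode("utf-8")
```

On the page, the returned string can be embedded directly in an image tag with a `data:image/png;base64,` prefix, which avoids serving the plots as separate files.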

process_pdf_data(pdf_data)

Parameters:
- pdf_data: Binary PDF data from file upload or URL
What it does:
- Processes PDF data using PdfReader
- Extracts text content with proper formatting
Returns:
- String containing extracted text from PDF

process_incidents()

Parameters: None
What it does:
- Handles URL or file upload processing
- Validates and extracts incident data
- Creates database entries
- Generates visualizations
Returns:
- JSON response with status, message, and visualization data

index()

Parameters: None
What it does:
- Renders the main web interface
Returns:
- HTML template for the web interface

Test Cases

test_fetch_incidents()

Purpose: Tests the PDF fetching functionality from a URL
What it tests:
- Verifies successful data retrieval from a valid URL
- Checks if returned data is either string or bytes type
- Ensures non-null response

test_extract_incidents()

Purpose: Tests the incident extraction from sample data
What it tests:
- Proper parsing of incident data format
- Verification of all required fields
- Correct number of fields (5) in parsed output

test_database_creation()

Purpose: Tests SQLite database initialization
What it tests:
- Successful database connection establishment
- Proper table creation
- Correct table name ('incidents') existence

test_database_population()

Purpose: Tests database population with sample data
What it tests:
- Successful insertion of sample incidents
- Row count verification
- Database connection handling

test_visualization_generation()

Purpose: Tests the generation of visualizations
What it tests:
- Creation of all three required visualizations
- Proper figure object properties
- Visualization saving capabilities

test_invalid_url()

Purpose: Tests error handling for invalid URLs
What it tests:
- Exception raising for non-existent URLs
- Proper error handling behavior

test_empty_data()

Purpose: Tests handling of empty input data
What it tests:
- Proper handling of empty strings
- Returns empty list for empty input

test_malformed_incident()

Purpose: Tests handling of incorrect data format
What it tests:
- Processing of invalid incident data
- Returns empty list for malformed input

Assumptions and Limitations

- PDF reports follow consistent formatting
- Incident data contains standard fields
- File size limited to 16MB
- Requires stable internet connection for URL fetching

Known Issues

- Format changes in PDF structure may affect parsing
- Multiple simultaneous uploads not supported
- PDF text extraction depends on consistent spacing
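
To illustrate why consistent spacing matters, the sketch below shows one plausible way a line could be split into fields. The rule of splitting on runs of two or more spaces is an assumption for this example and may not match the actual parse_incident implementation; the sample line in the usage note is made up.

```python
import re


def parse_incident(line, expected_fields=5):
    """Split one report line on runs of 2+ whitespace characters.

    Returns the list of fields, or None if the line does not yield
    exactly `expected_fields` non-empty fields. The splitting rule is
    an assumption made for this sketch.
    """
    fields = re.split(r"\s{2,}", line.strip())
    if len(fields) != expected_fields or not all(fields):
        return None
    return fields
```

With this rule, a line such as `11/1/2024 0:04   2024-00079924   1200 W MAIN ST   Traffic Stop   OK0140200` yields five fields, while the same text with single spaces between columns collapses into one field and is rejected.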
