An engineering and analysis project for a fictional software development company.

kozmik-moore/pet-software-developer-pipeline


Project: Creating a Data Pipeline for HappyPaws

A DataCamp project

Table of Contents

Overview

This project uses Python to clean and analyze three datasets supplied by a software developer: animal activity, animal health, and owner information. The goal is to merge them into a single dataset that the development team can use, and to provide some basic insights from the data.

This is a portfolio project created to demonstrate my proficiency in data cleaning, analysis, and visualization, as well as creating functions to support that workflow using Python. It highlights my ability to work with real-world datasets, derive meaningful insights, and communicate results clearly through code and visualizations.
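As a rough illustration of the merge described in this overview, the sketch below joins activity rows to owner rows and attaches a per-pet count of health visits. It uses only the standard library so it runs without extra dependencies. The column names (`owner_id`, `pet_id`, `owner_age_group`, and so on) and the sample rows are assumptions made up for the example; the actual CSV files and the notebook's own approach may differ.

```python
import csv
import io
from collections import Counter

# Hypothetical sample data: the real files in /data may use different columns.
USERS_CSV = """owner_id,owner_age_group
1,18-25
2,26-35
"""
ACTIVITIES_CSV = """pet_id,owner_id,pet_type,activity_type
10,1,Dog,Walking
11,2,Cat,Playing
10,1,Dog,Playing
"""
HEALTH_CSV = """pet_id,visit_type
10,Annual Checkup
"""


def read_rows(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


def merge_datasets(users, activities, health):
    """Left-join activities to owners on owner_id and add a health-visit count per pet."""
    owners = {row["owner_id"]: row for row in users}
    visits = Counter(row["pet_id"] for row in health)
    merged = []
    for row in activities:
        owner = owners.get(row["owner_id"], {})
        merged.append({
            **row,
            "owner_age_group": owner.get("owner_age_group", ""),
            "health_visits": visits[row["pet_id"]],  # 0 when the pet has no visits
        })
    return merged


merged = merge_datasets(
    read_rows(USERS_CSV), read_rows(ACTIVITIES_CSV), read_rows(HEALTH_CSV)
)
for row in merged:
    print(row)
```

Keeping every activity row (a left join) means that pets without a matching owner or health record are still present in the merged output, which makes gaps in the source data visible rather than silently dropping them.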

Project Structure

└── 📁pet-software-developer-pipeline
    └── 📁assets
        ├── image.png
    └── 📁code
        └── 📁utilities
            ├── __init__.py
            ├── config.py
            ├── features.py
            ├── processes.py
            ├── visuals.py
        ├── notebook.ipynb
    └── 📁data
        ├── cleaned.csv
        ├── pet_activities.csv
        ├── pet_health.csv
        ├── users.csv
    └── 📁products
        └── 📁images
            ├── Activity counts (non-health).jpg
            ├── Average monthly activity counts by owner age group.jpg
            ├── Average monthly activity counts by pet type.jpg
            ├── Distribution of activity counts per month by pet type.jpg
            ├── Distribution of activity counts per month.jpg
            ├── Distribution of health visits by pet type.jpg
            ├── Distribution of health visits for all pet types.jpg
            ├── Distribution of monthly health visits by pet type.jpg
            ├── Distribution of monthly health visits for all pet types.jpg
            ├── Distribution of time between health visits - Annual Checkup (owner age group, pet type).jpg
            ├── Monthly average activity count by owner age group and pet type.jpg
            ├── Pet counts by owner age group.jpg
            ├── Pet proportions by owner age group.jpg
            ├── Unique pet counts.jpg
        ├── report.md
    ├── .gitignore
    ├── LICENSE
    ├── README.md
    └── requirements.txt

Data Source(s)

Installation

Prerequisites

  • Python 3.11+
  • pip (Python package manager)

Install dependencies

Clone the repository:

git clone https://github.com/kozmik-moore/pet-software-developer-pipeline.git
cd pet-software-developer-pipeline

Create and activate a virtual environment (optional but recommended):

python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

Install the required packages:

pip install -r requirements.txt

Usage

Run a Jupyter Notebook

Start the Jupyter server:

jupyter notebook

Open and run the notebook in the /code directory (notebook.ipynb) to explore the data and generate visualizations.
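Before running the notebook, it can help to confirm that the raw input files are where the project structure says they should be. The following is a minimal sketch of such a check. The file names come from the structure listing above, but the `missing_inputs` helper is hypothetical and not part of the repository.

```python
from pathlib import Path

# File names taken from the project structure listing; the check itself is illustrative.
EXPECTED_INPUTS = ("pet_activities.csv", "pet_health.csv", "users.csv")


def missing_inputs(repo_root):
    """Return the expected raw data files that are absent from <repo_root>/data."""
    data_dir = Path(repo_root) / "data"
    return [name for name in EXPECTED_INPUTS if not (data_dir / name).is_file()]


if __name__ == "__main__":
    # Run from the repository root.
    missing = missing_inputs(".")
    if missing:
        print("Missing input files:", ", ".join(missing))
    else:
        print("All input files found.")
```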

Conclusions

See the full visual report in /products/report.md.

Technologies Used

Contributing

Contributions are welcome. To contribute:

  1. Fork the repository
  2. Create a new branch (git checkout -b feature-branch)
  3. Make your changes
  4. Commit your changes (git commit -m "Add feature")
  5. Push to your branch (git push origin feature-branch)
  6. Open a pull request

License

This project is licensed under the MIT License. See the LICENSE file for details.

Contact

Kozmik Moore
Email: [email protected]
GitHub: @kozmik-moore
LinkedIn: @kozmik-moore
