gperdrizet/4Geeks_datascience_project

4Geeks data science project boilerplate

Minimal Python 3.11 repository for 4Geeks data science assignments. Several useful Python packages and VSCode extensions are installed on Codespace boot-up. Directories for models and data are created within the Codespace but excluded from tracking. The notebooks directory contains notebook.ipynb; run this notebook to verify the environment. It can then be deleted or renamed for use in your project.

1. Getting started

Option 1: GitHub Codespaces (Recommended)

  1. Fork the Repository

    • Click the "Fork" button on the top right of the GitHub repository page
    • 4Geeks students: set 4GeeksAcademy as the owner - 4Geeks pays for your codespace usage. All others, set yourself as the owner
    • Give the fork a descriptive name. 4Geeks students: I recommend including your GitHub username to help you find the fork if you lose the link
    • Click "Create fork"
    • 4Geeks students: bookmark or otherwise save the link to your fork
  2. Create a GitHub Codespace

    • On your forked repository, click the "Code" button
    • Select "Create codespace on main"
    • If the "Create codespace on main" option is grayed out, go to your codespaces list from the three-bar menu at the upper left and delete an old codespace
    • Wait for the environment to load (dependencies are pre-installed)
  3. Start Working

    • Open notebooks/notebook.ipynb in the Jupyter interface
    • Run the notebook to verify that your environment is working correctly - if there are no errors, you are all set!

Option 2: Local Development

  1. Prerequisites

    • Git
    • Python >= 3.10
  2. Fork the repository

    • Click the "Fork" button on the top right of the GitHub repository page
    • Optional: give the fork a new name and/or description
    • Click "Create fork"
  3. Clone the repository

    • From your fork of the repository, click the green "Code" button at the upper right
    • From the "Local" tab, select HTTPS and copy the link
    • Run the following commands on your machine, replacing <LINK> and <REPO_NAME>
    git clone <LINK>
    cd <REPO_NAME>
  4. Set Up Environment

    python -m venv venv
    source venv/bin/activate
    pip install -r requirements.txt
  5. Launch Jupyter & start the notebook

    jupyter notebook notebooks/notebook.ipynb

    Once the notebook opens in your web browser, run it once to verify that your environment is working correctly - if there are no errors, you are all set!
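If the local setup misbehaves, a quick check of the interpreter can narrow things down. The following is a minimal sketch, not part of the repository: it compares the running Python against the 3.10 minimum listed in the prerequisites and reports whether a virtual environment is active. The names `MIN_VERSION`, `python_ok` and `in_virtualenv` are made up for illustration.

```python
import sys

# Minimum version taken from the prerequisites above (Python >= 3.10).
MIN_VERSION = (3, 10)


def python_ok(version_info=sys.version_info, minimum=MIN_VERSION):
    """Return True if the (major, minor) version meets the minimum."""
    return tuple(version_info[:2]) >= minimum


def in_virtualenv():
    """Return True if the interpreter is running inside a venv.

    Inside a venv, sys.prefix points at the environment directory while
    sys.base_prefix still points at the base Python installation.
    """
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print(f"Python {sys.version_info.major}.{sys.version_info.minor} OK: {python_ok()}")
    print(f"Virtual environment active: {in_virtualenv()}")
```

Run it after `source venv/bin/activate`; if the second line reports `False`, the packages from requirements.txt were probably installed into a different interpreter.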

2. Environment

2.1. Repository structure

.
├── .devcontainer
│   └── devcontainer.json  # Codespace/devcontainer configuration
│
├── data/                  # Empty directory for data
├── models/                # Empty directory for models
├── notebooks              # Notebooks directory
│   └── notebook.ipynb     # Test notebook with library version checks
│
├── .gitignore             # Files and directories listed will be ignored by git
├── LICENSE                # Open source GNU license - copy, modify and distribute this repo freely
├── README.md              # This file
└── requirements.txt       # List of Python packages installed during Codespace creation
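To confirm that a clone or Codespace matches the layout above, the tree can be checked with a few lines of Python. This is an illustrative sketch only; `EXPECTED_PATHS` and `missing_paths` are hypothetical names, and the path list is copied from the tree as shown. Since data/ and models/ are created within the Codespace and excluded from tracking, they may legitimately be missing from a fresh local clone.

```python
from pathlib import Path

# Paths copied from the repository structure listed above.
EXPECTED_PATHS = [
    ".devcontainer/devcontainer.json",
    "data",
    "models",
    "notebooks/notebook.ipynb",
    ".gitignore",
    "LICENSE",
    "README.md",
    "requirements.txt",
]


def missing_paths(root, expected=EXPECTED_PATHS):
    """Return the expected paths that do not exist under root."""
    root = Path(root)
    return [p for p in expected if not (root / p).exists()]


if __name__ == "__main__":
    # Run from the repository root.
    missing = missing_paths(".")
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("Repository layout matches the README.")
```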

2.2. Python

Base image: Python 3.11

Packages installed via requirements.txt:

  1. Jupyter 1.1.1
  2. matplotlib 3.10.3
  3. numpy 2.3.2
  4. pandas 2.3.1
  5. pyarrow 21.0.0
  6. scipy 1.16.1
  7. scikit-learn 1.7.1
  8. seaborn 0.13.2

If you need to install additional Python packages, you can do so via the terminal with: pip install packagename.
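The bundled notebook.ipynb already performs library version checks; the sketch below shows one way a similar check could be written outside the notebook, using only the standard library. It assumes the versions listed above are exact pins and that the first entry corresponds to the `jupyter` distribution name; `EXPECTED` and `check_packages` are hypothetical names, not code from the repository.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions copied from the list above (assumed to be exact pins).
EXPECTED = {
    "jupyter": "1.1.1",
    "matplotlib": "3.10.3",
    "numpy": "2.3.2",
    "pandas": "2.3.1",
    "pyarrow": "21.0.0",
    "scipy": "1.16.1",
    "scikit-learn": "1.7.1",
    "seaborn": "0.13.2",
}


def check_packages(expected):
    """Return {name: (expected_version, installed_version_or_None)} for mismatches."""
    mismatches = {}
    for name, wanted in expected.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != wanted:
            mismatches[name] = (wanted, installed)
    return mismatches


if __name__ == "__main__":
    problems = check_packages(EXPECTED)
    for name, (wanted, installed) in problems.items():
        print(f"{name}: expected {wanted}, found {installed or 'not installed'}")
    if not problems:
        print("All listed packages match.")
```

A mismatch is not necessarily an error: packages added or upgraded later with pip will show up here as differences from the list.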

2.3. VSCode extensions

Specified via devcontainer.json.

  1. ms-python.python
  2. ms-toolsai.jupyter
  3. streetsidesoftware.code-spell-checker

Additional VSCode extensions can be added from the Extensions tab on the activity bar at the left once inside the Codespace.
