Follow this once (10–15 min). After this you can start any competition fast.
- Go to https://www.kaggle.com and Sign Up / Log In.
- Complete profile (photo, short bio, location) – it increases trust when others view your notebooks.
- (Optional but good) Enable 2FA under Settings > Account.
Before you can download data you MUST open the competition page and click: Join Competition / I Understand & Accept. Do this for every new competition (even “Getting Started” ones) or the API will return 403 errors. Make sure you understand what you're agreeing to as well.
To use the Kaggle API, you need your Kaggle API token file, kaggle.json.
- Click your profile avatar (top-right) → Settings.
- Scroll to “API” section → Create New Token.
- This downloads kaggle.json (contains username + key). Keep it private.
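Before moving the token anywhere, it can help to confirm the downloaded file really contains both fields. The sketch below assumes the file is plain JSON with `username` and `key` entries, as described above; `load_kaggle_token` is a hypothetical helper name, not part of the Kaggle package.

```python
import json
from pathlib import Path


def load_kaggle_token(path):
    """Read a kaggle.json file and return (username, key).

    Raises ValueError if either field is missing or empty.
    """
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    missing = [field for field in ("username", "key") if not data.get(field)]
    if missing:
        raise ValueError(f"kaggle.json is missing: {', '.join(missing)}")
    return data["username"], data["key"]
```

A call such as `load_kaggle_token("kaggle.json")` should return the pair without error. Avoid printing the key in a notebook you plan to share.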
You can work in Google Colab (recommended for beginners) or locally (VS Code). Both setups are described below.
Run these cells at the top of every new Colab notebook (the token upload is needed only once per session):
# 1. Install (usually fast)
!pip install -q kaggle
# 2. Upload kaggle.json (downloaded earlier)
from google.colab import files
files.upload() # choose kaggle.json
# 3. Move it into the hidden .kaggle folder with correct permissions
!mkdir -p ~/.kaggle
!mv kaggle.json ~/.kaggle/
!chmod 600 ~/.kaggle/kaggle.json
# 4. Quick test
!kaggle --version
# 5. (Optional) Install the package versions your project needs
!pip install <package>==<version>

💡 Tip: To avoid uploading every time, you can store the token in a private Google Drive folder and copy it programmatically, but manual upload is fine for now.
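The Drive-based tip above could look roughly like this. It is a minimal sketch: `install_token` is a hypothetical helper, and the Drive folder in the commented usage lines is an assumption that you would replace with wherever you actually keep the file.

```python
import os
import shutil
from pathlib import Path


def install_token(source, home=None):
    """Copy kaggle.json from `source` into <home>/.kaggle with owner-only permissions."""
    home = Path(home) if home is not None else Path.home()
    target_dir = home / ".kaggle"
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / "kaggle.json"
    shutil.copyfile(source, target)
    os.chmod(target, 0o600)  # same effect as: chmod 600 ~/.kaggle/kaggle.json
    return target


# Usage in Colab (the "secrets" folder name is hypothetical):
# from google.colab import drive
# drive.mount('/content/drive')
# install_token('/content/drive/MyDrive/secrets/kaggle.json')
```

This replaces the upload, mkdir, mv, and chmod steps with a single call, at the cost of granting the notebook access to your Drive.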
For a local setup, run the equivalent commands in a terminal:
pip install kaggle
mkdir -p ~/.kaggle
mv ~/Downloads/kaggle.json ~/.kaggle/  # some setups expect it in ~/.config/kaggle/ instead
chmod 600 ~/.kaggle/kaggle.json
kaggle --version

If you use a Python virtual environment in VS Code, install kaggle inside that environment.
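If the CLI complains about the token after a local install, a quick check of where the file is and who can read it may narrow things down. The sketch below only looks in the two directories mentioned above (`~/.kaggle` and `~/.config/kaggle`). Both function names are hypothetical, and the permission test is meaningful on Linux/macOS only.

```python
import os
import stat
from pathlib import Path


def find_token(home=None):
    """Return the first kaggle.json found in ~/.kaggle or ~/.config/kaggle, else None."""
    home = Path(home) if home is not None else Path.home()
    for directory in (home / ".kaggle", home / ".config" / "kaggle"):
        candidate = directory / "kaggle.json"
        if candidate.is_file():
            return candidate
    return None


def token_is_private(path):
    """Return True if no group/other permission bits are set (for example mode 600)."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return mode & 0o077 == 0
```

If `find_token()` returns a path and `token_is_private(path)` is `False`, running the chmod 600 command on that path should fix it.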
project_root/
  data/
    raw/          # untouched downloaded files
    processed/    # cleaned / engineered
  notebooks/
    01_eda.ipynb
    02_model_baseline.ipynb
  models/
  submissions/
  reports/
  README.md
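The layout above can be created in one step so that every new project starts the same way. This is a small sketch using only the standard library; `create_layout` and `PROJECT_DIRS` are hypothetical names, and the notebooks themselves are left for you to add.

```python
from pathlib import Path

# Directories from the layout above, relative to the project root.
PROJECT_DIRS = [
    "data/raw",
    "data/processed",
    "notebooks",
    "models",
    "submissions",
    "reports",
]


def create_layout(root):
    """Create the project folders and an empty README.md if they do not exist yet."""
    root = Path(root)
    for relative in PROJECT_DIRS:
        (root / relative).mkdir(parents=True, exist_ok=True)
    readme = root / "README.md"
    if not readme.exists():
        readme.touch()
    return root
```

Calling `create_layout("project_root")` twice is safe: existing folders and an existing README.md are left untouched.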
Replace <competition-name> with the Kaggle slug, e.g. titanic, house-prices-advanced-regression-techniques.
Note: The command line for downloading each dataset from Kaggle is usually available on the competition page, typically at the bottom. Always check there for the exact command.
Colab / local (same commands; in a local terminal, drop the leading !):
!kaggle competitions download -c <competition-name> -p data/raw --force

Then unzip (example):
!unzip -o data/raw/<competition-name>.zip -d data/raw/

Or, for datasets (not competitions):
!kaggle datasets download -d <owner>/<dataset-slug> -p data/raw

To check the download, list the files and look at the first rows. On Colab, Linux, or macOS you can stay on the command line (!ls data/raw and !head -n 1 data/raw/train.csv); the Python version below also works on Windows:
import os
import pandas as pd

print(os.listdir('data/raw'))
train = pd.read_csv('data/raw/train.csv')
train.head()

In Colab: Runtime → Change runtime type → Hardware accelerator: GPU (T4, or A100 if available). Re-run the setup cells after switching, since changing the runtime type starts a fresh session.
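The unzip step earlier in this section relies on the `unzip` binary, which is not always present (for example on Windows). A possible alternative is Python's standard `zipfile` module. The sketch below assumes the archive sits in `data/raw` as in the commands above, and `extract_archive` is a hypothetical helper name.

```python
import zipfile
from pathlib import Path


def extract_archive(zip_path, dest_dir):
    """Extract every member of zip_path into dest_dir and return the member names."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)  # existing files are overwritten, like unzip -o
        return sorted(archive.namelist())
```

For example, `extract_archive("data/raw/<competition-name>.zip", "data/raw")` with the real zip name filled in. The returned list shows which files are now available.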
Never commit kaggle.json to GitHub. Add it to .gitignore if working locally:
echo 'kaggle.json' >> .gitignore
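Running the echo command repeatedly appends a duplicate line each time. If you script your project setup, a variant that adds the entry only when it is missing may be preferable. This is a sketch, and `ignore_token` is a hypothetical helper name.

```python
from pathlib import Path


def ignore_token(gitignore_path=".gitignore", entry="kaggle.json"):
    """Add `entry` to the .gitignore file unless it is already listed.

    Returns True if the file was changed, False otherwise.
    """
    path = Path(gitignore_path)
    lines = path.read_text(encoding="utf-8").splitlines() if path.exists() else []
    if entry in (line.strip() for line in lines):
        return False
    lines.append(entry)
    path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return True
```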
| Issue | Symptom | Fix |
|---|---|---|
| 403 - Forbidden | Download blocked | You didn’t accept competition rules. Visit page & click Join. |
| 401 - Unauthorized | API key rejected | Re-create token in Settings, replace old kaggle.json. |
| File not found | No such file after unzip | Check the actual zip name in data/raw; it may have an extra version suffix. |
| Permission denied | kaggle.json ignored | Ensure chmod 600 ~/.kaggle/kaggle.json. |
| Slow / timeout | Large dataset | Download locally then upload subset; or use Kaggle Notebook with data attached. |
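For the "upload subset" fix in the last row, one simple option is to keep the header and the first N data rows of a large CSV. The sketch below uses only the standard library and streams the file, so it does not load the whole dataset into memory. `write_head_subset` is a hypothetical helper, and taking the first rows (rather than a random sample) is an assumption made for simplicity.

```python
import csv
from itertools import islice


def write_head_subset(src, dst, n_rows):
    """Copy the header plus the first n_rows data rows from src to dst.

    Returns the number of data rows written.
    """
    with open(src, newline="", encoding="utf-8") as fin, \
            open(dst, "w", newline="", encoding="utf-8") as fout:
        reader = csv.reader(fin)
        writer = csv.writer(fout)
        header = next(reader, None)
        if header is None:  # empty source file
            return 0
        writer.writerow(header)
        count = 0
        for row in islice(reader, n_rows):
            writer.writerow(row)
            count += 1
        return count
```

If the rows are ordered (for example by date), the first N rows may not be representative, so treat the subset as a smoke test rather than a training sample.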
- Start in Colab for experimentation.
- When stable, export the notebook (File → Download .ipynb) and store it in notebooks/ in your GitHub repo.
- Convert to a script if needed: jupyter nbconvert --to script 01_eda.ipynb (local).
- Automate submissions with a small shell or Python helper (later stage).
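The submission helper mentioned in the last step could start as small as the sketch below. It wraps the `kaggle competitions submit` command with its `-c`, `-f`, and `-m` options. The function names are hypothetical, and it assumes the kaggle CLI is installed and your token is already configured.

```python
import subprocess


def build_submit_command(competition, csv_path, message):
    """Return the kaggle CLI argument list for submitting csv_path to a competition."""
    return [
        "kaggle", "competitions", "submit",
        "-c", competition,
        "-f", str(csv_path),
        "-m", message,
    ]


def submit(competition, csv_path, message):
    """Run the submit command; raises CalledProcessError on a non-zero exit code."""
    return subprocess.run(build_submit_command(competition, csv_path, message), check=True)
```

Keeping the command construction separate from the call makes it easy to print the command first and confirm it before spending one of your daily submissions.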
Once this section is done you are ready to proceed to competition selection.
Kaggle Notebooks are an in-browser Jupyter environment provided by Kaggle for running code, sharing analyses, and making submissions directly on the platform.
- Create a Notebook:
  - Go to any competition or dataset page.
  - Click "Code" → "New Notebook".
- Attach Data:
  - Use the "Add Data" button to attach competition datasets or public datasets.
- Write & Run Code:
  - Use Python or R.
  - Install additional packages with !pip install <package> if needed.
- Save & Share:
  - Save your work frequently.
  - Share notebooks publicly or keep them private.
- Submit to Competition:
  - For competitions, generate your submission file (e.g., submission.csv).
  - Click "Submit to Competition" from the notebook interface.
- GPU/TPU resources are available for deep learning tasks.
- Notebooks are versioned; you can revert to previous versions.
- Use "Copy & Edit" to fork public notebooks for your own experiments.
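After the Attach Data step, the attached files appear under `/kaggle/input` inside a Kaggle Notebook. A short sketch for listing them is shown below; `list_input_files` is a hypothetical helper name, and the root path can be overridden when running elsewhere.

```python
import os


def list_input_files(root="/kaggle/input"):
    """Return a sorted list of every file path found under root."""
    paths = []
    for dirname, _, filenames in os.walk(root):
        for filename in filenames:
            paths.append(os.path.join(dirname, filename))
    return sorted(paths)
```

Printing the result once at the top of a notebook is a quick way to get the exact paths to pass to `pd.read_csv`.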
# Example: Save predictions for submission
submission.to_csv('submissions/submission.csv', index=False)  # saved to our submissions folder

Then use the notebook's "Submit" button to upload your file.
Or submit through the Kaggle API, for example:
!kaggle competitions submit -c <competition-name> -f submissions/submission.csv -m "<Your message for submission>"
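A malformed file still counts against your submission quota on many competitions, so a quick local check before submitting can save attempts. The sketch below compares the header and row count of the CSV with what you expect. `check_submission` is a hypothetical helper, and the expected columns and row count must come from the competition's own sample submission file.

```python
import csv


def check_submission(path, expected_columns, expected_rows=None):
    """Return a list of problems found in the submission CSV (empty list means OK)."""
    problems = []
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        header = next(reader, [])
        n_rows = sum(1 for _ in reader)  # data rows, header excluded
    if header != list(expected_columns):
        problems.append(f"header is {header}, expected {list(expected_columns)}")
    if expected_rows is not None and n_rows != expected_rows:
        problems.append(f"found {n_rows} rows, expected {expected_rows}")
    return problems
```

For the titanic competition, for instance, the call would be `check_submission("submissions/submission.csv", ["PassengerId", "Survived"], 418)`. Confirm these values against the sample submission before relying on them.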
- Browse competitions at Kaggle Competitions (https://www.kaggle.com/competitions).
- Click on a competition, read the description and rules.
- Click "Join Competition" and accept the rules.
- Download data or use Kaggle Notebooks as described above.
- Make submissions and track your leaderboard position.