Commit 58f6a0c

Update README to reflect new Kaggle setup
1 parent 7e7509a commit 58f6a0c

1 file changed: +10 -7 lines changed

README.md

Lines changed: 10 additions & 7 deletions
@@ -19,27 +19,30 @@ any data.
 - [02 State Hourly Electricity Demand](https://www.kaggle.com/code/catalystcooperative/02-state-hourly-electricity-demand)
 - [03 EIA-930 Sanity Checks](https://www.kaggle.com/code/catalystcooperative/03-eia-930-sanity-checks)
 - [04 Renewable Generation Profiles](https://www.kaggle.com/code/catalystcooperative/04-renewable-generation-profiles)
+- [05 FERC-714 Electricity Demand Forecast Biases](https://www.kaggle.com/code/catalystcooperative/05-ferc-714-electricity-demand-forecast-biases)
 
 You'll find the [PUDL data dictionary](https://catalystcoop-pudl.readthedocs.io/en/latest/data_dictionaries/pudl_db.html)
 helpful for interpreting the data.
 
 ## Running Jupyter locally
 
 If you're already familiar with git, Python environments, filesystem paths, and running
-upyter notebooks locally, you can also work with these notebooks and the PUDL data locally:
+Jupyter notebooks locally, you can also work with these notebooks and the PUDL data locally:
 
 - Create a Python environment that includes common data science packages. We like to use
 the [mamba](https://github.com/mamba-org/mamba) package manager and the
 [conda-forge](https://conda-forge.org/#about) channel.
 - Clone this repository.
-- [Download the PUDL dataset from Kaggle](https://www.kaggle.com/datasets/catalystcooperative/pudl-project/download)
-(it's ~20GB!) and unzip it somewhere conveniently accessible from the notebooks in the
-cloned repo.
 - Start your JupyterLab or Jupyter Notebook server and navigate to the notebooks in
 the cloned repo.
-- You'll need to adjust the file paths in the notebooks to point at the directory where
-you put the PUDL data, and might need to adjust the packages installed in your Python
-environment to work with the notebooks.
+- If all the necessary packages are installed, you should be able to run the notebooks
+without worrying about where the data is, since it is read directly from our public
+AWS S3 bucket.
+- If you would rather work with the data locally, you can [Download the PUDL dataset from Kaggle](https://www.kaggle.com/datasets/catalystcooperative/pudl-project/download)
+(it's ~20GB!) and unzip it somewhere conveniently accessible from the notebooks in the
+cloned repo.
+- In this case you'll need to adjust the file paths in the notebooks to point at the
+directory where you put the PUDL data.
 
 ## Other Data Access Methods
 

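The updated README lines describe two ways a notebook can find the data: reading from the public S3 bucket by default, or reading from a local directory after adjusting file paths. A minimal sketch of that switch is below. It is not taken from the notebooks themselves: the `REMOTE_PREFIX` value, the `table_location` helper, the `.parquet` file extension, and the table name in the usage note are all hypothetical placeholders, since the README does not state the bucket URL or the file layout.

```python
from pathlib import Path
from typing import Optional

# Hypothetical placeholder: the README does not give the real bucket URL.
REMOTE_PREFIX = "s3://example-bucket/pudl"


def table_location(table_name: str, local_dir: Optional[str] = None) -> str:
    """Return where a table should be read from.

    If local_dir is given, point at a file inside that directory
    (the "adjust the file paths" case). Otherwise fall back to the
    remote prefix (the default "read directly from S3" case).
    The one-file-per-table ".parquet" layout is an assumption.
    """
    filename = f"{table_name}.parquet"
    if local_dir:
        # as_posix() keeps forward slashes regardless of platform.
        return (Path(local_dir) / filename).as_posix()
    return f"{REMOTE_PREFIX}/{filename}"
```

With a helper like this, a notebook cell could call `table_location("some_table")` for the remote copy, or `table_location("some_table", "/path/to/unzipped/kaggle/download")` for a local one, and pass the result to a reader such as `pandas.read_parquet`, which can read `s3://` URLs when the optional `s3fs` package is installed.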