```bash
python -c "import wilds; print(wilds.__version__)"
# This should print "1.1.0". If it doesn't, update by running:
pip install -U wilds
```
### Requirements
- numpy>=1.19.1
- ogb>=1.2.6
- outdated>=0.2.0
- pandas>=1.1.0
- pillow>=7.2.0
- pytz>=2020.4
- torch>=1.7.0
- torch-scatter>=2.0.5
- torch-geometric>=1.6.1
- tqdm>=4.53.0

Running `pip install wilds` or `pip install -e .` will automatically check for and install all of these requirements
except for the `torch-scatter` and `torch-geometric` packages, which require a [quick manual install](https://pytorch-geometric.readthedocs.io/en/latest/notes/installation.html#installation-via-binaries).
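For reference, here is a rough sketch of that manual install for the versions used in the paper's baselines (torch 1.7.0 and CUDA 10.1); the exact wheel URL depends on your torch and CUDA versions, so please follow the linked instructions for your environment:

```bash
# Hypothetical example for torch 1.7.0 + CUDA 10.1; substitute your own versions
# per https://pytorch-geometric.readthedocs.io/en/latest/notes/installation.html
pip install torch-scatter -f https://pytorch-geometric.com/whl/torch-1.7.0+cu101.html
pip install torch-geometric
```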
All baseline experiments in the paper were run on Python 3.8.5 and CUDA 10.1.
## Using the example scripts
In the `examples/` folder, we provide a set of scripts that can be used to download WILDS datasets and train models on them.
These scripts are configured with the default models and hyperparameters that we used for all of the baselines described in our paper. All baseline results in the paper can be easily replicated with commands like:
```bash
python examples/run_expt.py --dataset iwildcam --algorithm ERM --root_dir data
python examples/run_expt.py --dataset civilcomments --algorithm groupDRO --root_dir data
```
The scripts are set up to facilitate general-purpose algorithm development: new algorithms can be added to `examples/algorithms` and then run on all of the WILDS datasets using the default models.
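As a hypothetical sketch of what a new algorithm might look like (the class and method names below are assumptions modeled on the ERM baseline in `examples/algorithms`, not a guaranteed interface):

```python
# examples/algorithms/my_algorithm.py -- hypothetical skeleton.
# Names follow the ERM baseline; check the actual base class for the
# current constructor signature before relying on this.
from algorithms.single_model_algorithm import SingleModelAlgorithm
from models.initializer import initialize_model

class MyAlgorithm(SingleModelAlgorithm):
    def __init__(self, config, d_out, grouper, loss, metric, n_train_steps):
        # Initialize the default model for the dataset, then let the
        # base class handle optimization, logging, and group bookkeeping.
        model = initialize_model(config, d_out).to(config.device)
        super().__init__(
            config=config, model=model, grouper=grouper,
            loss=loss, metric=metric, n_train_steps=n_train_steps)

    def objective(self, results):
        # Replace with your own training objective; `results` contains
        # predictions, labels, and group assignments for the batch.
        return self.loss.compute(
            results['y_pred'], results['y_true'], return_dict=False)
```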
The first time you run these scripts, you might need to download the datasets. You can do so with the `--download` argument, for example:
```bash
python examples/run_expt.py --dataset civilcomments --algorithm groupDRO --root_dir data --download
```
Alternatively, you can use the standalone `wilds/download_datasets.py` script to download the datasets, for example:
```bash
python wilds/download_datasets.py --root_dir data
```
This will download all datasets to the specified `data` folder. You can also use the `--datasets` argument to download particular datasets.
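For example, a command along the following lines should fetch a single dataset (the exact argument format is an assumption; check the script's `--help` output):

```bash
# Download only the civilcomments dataset into the data/ folder.
python wilds/download_datasets.py --root_dir data --datasets civilcomments
```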
These are the sizes of each of our datasets, as well as the approximate time taken to train and evaluate the default model for a single ERM run on an NVIDIA V100 GPU.

| Dataset command | Modality | Download size (GB) | Size on disk (GB) | Train+eval time (Hours) |
| --- | --- | --- | --- | --- |

While the `camelyon17` dataset is small and fast to train on, we advise against using it as the only dataset to prototype methods on, as the test performance of models trained on this dataset tends to exhibit a large degree of variability over random seeds.
The image datasets (`iwildcam`, `camelyon17`, `fmow`, and `poverty`) tend to have high disk I/O usage. If training time is much slower for you than the approximate times listed above, consider checking if I/O is a bottleneck (e.g., by moving to a local disk if you are using a network drive, or by increasing the number of data loader workers). To speed up training, you could also disable evaluation at each epoch or for all splits by toggling `--evaluate_all_splits` and related arguments.
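As a sketch, disabling evaluation on all splits might look like this (the boolean argument syntax is an assumption; check `python examples/run_expt.py --help` for the supported form):

```bash
# Hypothetical: train with per-epoch evaluation on all splits disabled.
python examples/run_expt.py --dataset iwildcam --algorithm ERM --root_dir data \
    --evaluate_all_splits False
```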
We have an [executable version](https://wilds.stanford.edu/codalab) of our paper on CodaLab that contains the exact commands, code, and data used for the experiments reported in our paper. Trained model weights for all datasets can also be found there.
## Using the WILDS package
### Data loading
The WILDS package provides a simple, standardized interface for all datasets in the benchmark.
This short Python snippet covers all of the steps of getting started with a WILDS dataset, including dataset download and initialization, accessing various splits, and preparing a user-customizable data loader.
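Below is a minimal sketch of that snippet, assuming the `get_dataset` entry point and the `iwildcam` dataset; the transform and batch size are illustrative:

```python
import torchvision.transforms as transforms
from wilds import get_dataset
from wilds.common.data_loaders import get_train_loader

# Load the full dataset, downloading it if necessary.
dataset = get_dataset(dataset='iwildcam', download=True)

# Get the training split, with user-specified transforms.
train_data = dataset.get_subset(
    'train',
    transform=transforms.Compose([
        transforms.Resize((448, 448)),
        transforms.ToTensor(),
    ]))

# Prepare the standard (shuffled) training data loader.
train_loader = get_train_loader('standard', train_data, batch_size=16)

# Each batch yields inputs, labels, and group metadata.
for x, y_true, metadata in train_loader:
    ...  # train as usual
```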
### Evaluators

Most `eval` methods take in predicted labels for `all_y_pred` by default, but the default inputs vary across datasets and are documented in the `eval` docstrings of the corresponding dataset class.
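Continuing the sketch above, a typical evaluation loop might look like the following (the model call and the accumulation are schematic, and the exact inputs to `eval` depend on the dataset):

```python
import torch
from wilds.common.data_loaders import get_eval_loader

# Get the test split and the standard evaluation loader.
test_data = dataset.get_subset('test', transform=transforms.ToTensor())
test_loader = get_eval_loader('standard', test_data, batch_size=16)

# Accumulate predictions, labels, and metadata over the test set.
all_y_pred, all_y_true, all_metadata = [], [], []
for x, y_true, metadata in test_loader:
    y_pred = model(x).argmax(-1)  # schematic; `model` is your trained model
    all_y_pred.append(y_pred)
    all_y_true.append(y_true)
    all_metadata.append(metadata)

# The dataset's eval method computes the standardized metrics.
results, results_str = dataset.eval(
    torch.cat(all_y_pred), torch.cat(all_y_true), torch.cat(all_metadata))
print(results_str)
```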
## Citing WILDS
If you use WILDS datasets in your work, please cite [our paper](https://arxiv.org/abs/2012.07421) ([Bibtex](https://wilds.stanford.edu/assets/files/bibtex.md)):
**WILDS: A Benchmark of in-the-Wild Distribution Shifts** (2021). Pang Wei Koh*, Shiori Sagawa*, Henrik Marklund, Sang Michael Xie, Marvin Zhang, Akshay Balsubramani, Weihua Hu, Michihiro Yasunaga, Richard Lanas Phillips, Irena Gao, Tony Lee, Etienne David, Ian Stavness, Wei Guo, Berton A. Earnshaw, Imran S. Haque, Sara Beery, Jure Leskovec, Anshul Kundaje, Emma Pierson, Sergey Levine, Chelsea Finn, and Percy Liang.
Please also cite the original papers that introduce the datasets, as listed on the [datasets page](https://wilds.stanford.edu/datasets/).