
Commit a1113bb

Reproducibility for DOI 10.5281/zenodo.6853067 (#8)

* DOI 10.5281/zenodo.6853067
* Template files and gitignore updates
* Add EOL NL in metadata json

1 parent b201870

16 files changed (+758, -3 lines)

.gitignore

Lines changed: 1 addition & 0 deletions

```diff
@@ -10,3 +10,4 @@ _env
 
 # Ignore caches
 __pycache__
+.jammies.toml
```
Lines changed: 8 additions & 0 deletions

```
**/clean
**/env
**/src
**/.dockerignore
**/Dockerfile*
**/README.md
**/instructions.md
**/issues.md
```

10-5281_zenodo-6853067/.gitignore

Lines changed: 13 additions & 0 deletions

```
# Ignore clean and src directories
/clean
/src

# Ignore environments
/env

# Ignore IDEs
.vscode

# Ignore caches
__pycache__
.jammies.toml
```

10-5281_zenodo-6853067/Dockerfile

Lines changed: 80 additions & 0 deletions

```dockerfile
# Set global arguments
ARG JAMMIES_VER=0.4.3

# Get and patch project for working directory
FROM python:3.11.2-alpine3.17 as projects

## Set local arguments
ARG JAMMIES_VER

## Keeps Python from generating .pyc files in the container
ENV PYTHONDONTWRITEBYTECODE=1

## Turns off buffering for easier container logging
ENV PYTHONUNBUFFERED=1

## Copy files to directory
COPY . ./

## Add git to alpine to pull necessary repositories
RUN apk update
RUN apk add git

## Install jammies and run
RUN python3 -m pip install "jammies[all]==${JAMMIES_VER}"
RUN jammies patch src

# Setup project specific info
FROM nvidia/cuda:11.8.0-runtime-ubuntu22.04

## Disable interactions with package configurations
ARG DEBIAN_FRONTEND=noninteractive

## Setup necessary libraries
RUN apt-get update && apt-get install -y --no-install-recommends \
    make \
    build-essential \
    libssl-dev \
    zlib1g-dev \
    libbz2-dev \
    libreadline-dev \
    libsqlite3-dev \
    wget \
    ca-certificates \
    curl \
    llvm \
    libncurses5-dev \
    xz-utils \
    tk-dev \
    libxml2-dev \
    libxmlsec1-dev \
    libffi-dev \
    liblzma-dev \
    mecab-ipadic-utf8 \
    git

## Install pyenv and chosen Python version
ENV PYENV_ROOT /.pyenv
ENV PATH $PYENV_ROOT/shims:$PYENV_ROOT/bin:$PATH

RUN curl https://pyenv.run | bash \
    && pyenv install 3.11.4 \
    && pyenv global 3.11.4 \
    && pyenv rehash

## Keeps Python from generating .pyc files in the container
ENV PYTHONDONTWRITEBYTECODE=1

## Turns off buffering for easier container logging
ENV PYTHONUNBUFFERED=1

## Copy project files from previous stage here
RUN mkdir /src
COPY --from=projects /src /src

## Setup python
RUN python3 -m pip install -r /src/requirements.txt

## Setup script run
WORKDIR /src
CMD [ "python3", "synthetic_experiment_Qlearning.py" ]
```

10-5281_zenodo-6853067/README.md

Lines changed: 163 additions & 0 deletions

# [Sparse Factor Autoencoders for Item Response Theory](https://doi.org/10.5281/zenodo.6853067)

![Essentially Reproducible](https://img.shields.io/badge/Status-Essentially%20Reproducible-green)

This is a project constructor for the paper [*Sparse Factor Autoencoders for Item Response Theory*](https://doi.org/10.5281/zenodo.6853067) by [Benjamin Paassen](https://orcid.org/0000-0002-3899-2450), Malwina Dywel, Melanie Fleckenstein, and [Niels Pinkwart](https://orcid.org/0000-0001-7076-9737).

### Associated Metadata

#### Tested Systems

![Debian (GPU): bullseye (11) | bookworm (12)](https://img.shields.io/badge/Debian%20%28GPU%29-bullseye%20%2811%29%20%7C%20bookworm%20%2812%29-informational)
![Docker NVIDIA (GPU): 20.10 | 23.0](https://img.shields.io/badge/Docker%20NVIDIA%20%28GPU%29-20.10%20%7C%2023.0-informational)

#### Languages
![Python: 3.11.2 | 3.11.4](https://img.shields.io/badge/Python-3.11.2%20%7C%203.11.4-informational)

#### Resources

* [Sparse Factor Autoencoders for Item Response Theory](https://doi.org/10.5281/zenodo.6853067) (Public)
    * Contains paper under [CC-BY-4.0](https://creativecommons.org/licenses/by/4.0/)
* [GitHub](https://github.com/bpaassen/sparfae) (Public)
    * Contains data under [GPL-3.0-or-later](https://spdx.org/licenses/GPL-3.0-or-later.html)
    * Contains materials under [GPL-3.0-or-later](https://spdx.org/licenses/GPL-3.0-or-later.html)
* [NeurIPS 2020 Education Challenge Dataset](https://eedi.com/projects/neurips-education-challenge) (Public)
    * Contains data under [CC-BY-NC-ND-4.0](https://creativecommons.org/licenses/by-nc-nd/4.0//)

## Project Files

The constructor downloads the following files:

* [Cloned GitHub](https://github.com/ahaim5357/sparfae) under [GPL-3.0-or-later](https://spdx.org/licenses/GPL-3.0-or-later.html)
* [NeurIPS 2020 Education Challenge Dataset](https://eedi.com/projects/neurips-education-challenge) under [CC-BY-NC-ND-4.0](https://creativecommons.org/licenses/by-nc-nd/4.0//)
## Setup Instructions

An NVIDIA graphics card is necessary to make full use of the models provided in this codebase. Additionally, this project makes use of the [CUDA Toolkit][cuda] 11.8 and [cuDNN][cudnn] 8.8.0; you can follow the setup instructions on [their website][cuda_docs]. It is highly recommended, and sometimes required, to use a Linux distribution when working with software that depends on the CUDA Toolkit.
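Before trying either method, it can be worth confirming that the GPU is actually visible from Python. A minimal sketch, assuming the project's `requirements.txt` installs PyTorch (the check simply reports `False` if it is not installed):

```python
def cuda_available() -> bool:
    """Return True only if PyTorch is installed and sees a CUDA device."""
    try:
        import torch  # assumed to be pulled in by requirements.txt
    except ImportError:
        return False
    return torch.cuda.is_available()

print("CUDA available:", cuda_available())
```

If this prints `False` inside the container, check the NVIDIA runtime installation before running the experiments.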
### Method 1: Docker

This project contains the files needed to set up a [Docker container][docker] with the [NVIDIA Container Toolkit runtime][nvidia_docker]. Make sure you have both Docker and the NVIDIA runtime installed before attempting anything below.

To build the Docker image, navigate to this directory and run:

```sh
docker build -t <image_name> .
```

`image_name` should be replaced with whatever name you would like to refer to the image by. Building the image will take roughly 30 minutes to an hour.

From there, you can load into a terminal in the container via:

```sh
docker run --rm --runtime=nvidia --gpus all -itv <local_directory>:/volume <image_name> sh
```

A `volume` directory will be created within the container and linked to the specified `local_directory`. You can pass the current working directory via `${PWD}`.

> We load into the terminal instead of directly into Python so that generated figures can be copied onto the local machine, as they cannot otherwise be easily viewed.
Once in the container terminal, you can run the Python scripts via:

```sh
python3 synthetic_experiment_Qlearning.py
python3 eedi_experiment-fixedQ.py
python3 eedi_experiment-Qlearning.py
```

You can look through the terminal output and compare the numbers against those in the paper. To view the figures on the local machine, copy them to the volume via:

```sh
cp -R ./images /volume
```
### Method 2: Local Setup

This project uses the Python package `jammies[all]` to set up the codebase and fix any issues in it. For instructions on how to download and generate the project from this directory, see the [`jammies`][jammies] repository.

The following instructions have been reproduced using [Python][python] 3.11.4. This project does not make any guarantees outside of the specified version. Make sure you have Python, along with gcc for Cython, before attempting anything below.

First, navigate to the generated `src` directory. Then install the required dependencies into the global Python instance or a virtual environment via:

```sh
python3 -m pip install -r requirements.txt
```

> `python3` is replaced with `py` on Windows machines. Additionally, the `python3 -m` prefix is unnecessary if `pip` is properly added to the path.
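If you prefer not to touch the global Python instance, a virtual environment can be created with the standard `venv` module before running the install command; a minimal sketch (the directory name `env` is arbitrary):

```shell
# create an isolated environment in ./env (the name is arbitrary)
python3 -m venv env
# activate it; python3 and pip now resolve to the isolated interpreter
. env/bin/activate
# confirm the active interpreter lives inside ./env
python3 -c "import sys; print(sys.prefix)"
```

On Windows, activation is `env\Scripts\activate` instead. Run the `pip install -r requirements.txt` step from inside the activated environment.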
After installing the required dependencies, run the Python scripts via:

```sh
python3 synthetic_experiment_Qlearning.py
python3 eedi_experiment-fixedQ.py
python3 eedi_experiment-Qlearning.py
```

You can look through the terminal output and compare the numbers against those in the paper. The figures are generated in the `images` directory within `src`.

Alternatively, you can run the notebook versions instead; the images are generated as part of their output.
[cuda]: https://developer.nvidia.com/cuda-toolkit
[cudnn]: https://developer.nvidia.com/cudnn
[cuda_docs]: https://docs.nvidia.com/cuda/
[docker]: https://www.docker.com/
[nvidia_docker]: https://github.com/NVIDIA/nvidia-docker
[jammies]: https://github.com/ahaim5357/jammies
[python]: https://www.python.org/
## Issues

### Figures 2-5

Based on some analysis of the graphs, this report assumes that condition A is condition 1 and condition B is condition 3.

* The graphs reported for Figure 2 roughly follow similar trends across multiple runs of the codebase; however, the individual shapes differ quite a bit, likely due to randomness.
* Aside from AUC, the graphs reported for Figure 3 can vary greatly, having different trends across multiple runs.
* Aside from AUC and rθ, the graphs reported for Figure 4 can vary greatly, having different trends across multiple runs.
* The training time for hyperparameters in Figure 5 is similar except for the SPARFA model, which had an odd training time on the reproduction machine.
### Reported Results in Table 2

* After running the fixed methods multiple times, there are likely a few typos in the results:
    * SparFAE_f (AUC)
        * 0.88 +- 0.04 -> 0.88 +- 0.05
* The following are likely due to differences in the machines used:
    * VIBO_f (Training Time)
        * 8.01 +- 5.59 -> 10.62 +- 6.97
    * SparFAE_f (Training Time)
        * 0.05 +- 0.03 -> 0.05 +- 0.02
    * VIBO_f (Prediction Time)
        * 1.31 +- 2.76 -> 2.01 +- 2.56
    * SparFAE_f (Prediction Time)
        * 0.15 +- 0.13 -> 0.18 +- 0.00
* After running the experiment methods multiple times, there are likely a few typos in the results:
    * SPARFA (Sparsity)
        * 0.16 +- 0.06 -> 0.16 +- 0.07
    * VIBO (Sparsity)
        * 0.00 +- 0.00 -> 0.00 +- 0.01
    * SparFAE2 (Sparsity)
        * 0.33 +- 0.10 -> 0.33 +- 0.11
* The following are likely due to differences in the machines used; however, the results are quite similar to those reported, so they could also be interpreted as potential typos:
    * Training Time
        * SPARFA: 31.0 +- 20.9 -> 31.0 +- 21.2
        * VIBO: 7.83 +- 5.12 -> 7.83 +- 5.21
        * SparFAE1: 1.94 +- 1.78 -> 1.94 +- 1.95
        * SparFAE2: 15.7 +- 15.9 -> 15.7 +- 18.8
    * Prediction Time
        * SPARFA: 633 +- 444 -> 634 +- 458
        * VIBO: 0.31 +- 0.18 -> 0.31 +- 0.50
        * SparFAE1: 0.19 +- 0.12 -> 0.19 +- 0.35
        * SparFAE2: 0.20 +- 0.13 -> 0.20 +- 0.36
* The Wilcoxon signed-rank test result is reported backwards, meaning that the sentence should read, "Method 1 has lower AUC than Method 2".
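For context when reading the cells above, each `X +- Y` entry pairs a central value with a spread over repeated runs; a minimal sketch of how one such cell could be computed, assuming the spread is the sample standard deviation (the run values here are illustrative, not the paper's data):

```python
from statistics import mean, stdev

# hypothetical AUC values from five repeated runs (illustrative only)
runs = [0.84, 0.92, 0.87, 0.91, 0.86]

# format the cell the same way the tables above do: mean +- sample std
cell = f"{mean(runs):.2f} +- {stdev(runs):.2f}"
print(cell)  # -> 0.88 +- 0.03
```

Small shifts in the second decimal, like those listed above, are exactly what reruns with different random seeds tend to produce.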
### Figures 6-8

* Figures 7 and 8 are not generated by the codebase.
* Only the number-of-skills plot in Figure 6 is generated, and without the lines of best fit. However, the points are essentially accurate to those in the paper.

*[GPL-3.0-or-later]: GNU General Public License v3.0 or later
*[Cloned GitHub]: Cloned GitHub Repository
*[CC-BY-4.0]: Creative Commons Attribution 4.0 International
*[GitHub]: GitHub Repository
*[CC-BY-NC-ND-4.0]: Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International
Lines changed: 73 additions & 0 deletions

## Setup Instructions

An NVIDIA graphics card is necessary to make full use of the models provided in this codebase. Additionally, this project makes use of the [CUDA Toolkit][cuda] 11.8 and [cuDNN][cudnn] 8.8.0; you can follow the setup instructions on [their website][cuda_docs]. It is highly recommended, and sometimes required, to use a Linux distribution when working with software that depends on the CUDA Toolkit.

### Method 1: Docker

This project contains the files needed to set up a [Docker container][docker] with the [NVIDIA Container Toolkit runtime][nvidia_docker]. Make sure you have both Docker and the NVIDIA runtime installed before attempting anything below.

To build the Docker image, navigate to this directory and run:

```sh
docker build -t <image_name> .
```

`image_name` should be replaced with whatever name you would like to refer to the image by. Building the image will take roughly 30 minutes to an hour.

From there, you can load into a terminal in the container via:

```sh
docker run --rm --runtime=nvidia --gpus all -itv <local_directory>:/volume <image_name> sh
```

A `volume` directory will be created within the container and linked to the specified `local_directory`. You can pass the current working directory via `${PWD}`.

> We load into the terminal instead of directly into Python so that generated figures can be copied onto the local machine, as they cannot otherwise be easily viewed.

Once in the container terminal, you can run the Python scripts via:

```sh
python3 synthetic_experiment_Qlearning.py
python3 eedi_experiment-fixedQ.py
python3 eedi_experiment-Qlearning.py
```

You can look through the terminal output and compare the numbers against those in the paper. To view the figures on the local machine, copy them to the volume via:

```sh
cp -R ./images /volume
```

### Method 2: Local Setup

This project uses the Python package `jammies[all]` to set up the codebase and fix any issues in it. For instructions on how to download and generate the project from this directory, see the [`jammies`][jammies] repository.

The following instructions have been reproduced using [Python][python] 3.11.4. This project does not make any guarantees outside of the specified version. Make sure you have Python, along with gcc for Cython, before attempting anything below.

First, navigate to the generated `src` directory. Then install the required dependencies into the global Python instance or a virtual environment via:

```sh
python3 -m pip install -r requirements.txt
```

> `python3` is replaced with `py` on Windows machines. Additionally, the `python3 -m` prefix is unnecessary if `pip` is properly added to the path.

After installing the required dependencies, run the Python scripts via:

```sh
python3 synthetic_experiment_Qlearning.py
python3 eedi_experiment-fixedQ.py
python3 eedi_experiment-Qlearning.py
```

You can look through the terminal output and compare the numbers against those in the paper. The figures are generated in the `images` directory within `src`.

Alternatively, you can run the notebook versions instead; the images are generated as part of their output.

[cuda]: https://developer.nvidia.com/cuda-toolkit
[cudnn]: https://developer.nvidia.com/cudnn
[cuda_docs]: https://docs.nvidia.com/cuda/
[docker]: https://www.docker.com/
[nvidia_docker]: https://github.com/NVIDIA/nvidia-docker
[jammies]: https://github.com/ahaim5357/jammies
[python]: https://www.python.org/

10-5281_zenodo-6853067/issues.md

Lines changed: 51 additions & 0 deletions

## Issues

### Figures 2-5

Based on some analysis of the graphs, this report assumes that condition A is condition 1 and condition B is condition 3.

* The graphs reported for Figure 2 roughly follow similar trends across multiple runs of the codebase; however, the individual shapes differ quite a bit, likely due to randomness.
* Aside from AUC, the graphs reported for Figure 3 can vary greatly, having different trends across multiple runs.
* Aside from AUC and rθ, the graphs reported for Figure 4 can vary greatly, having different trends across multiple runs.
* The training time for hyperparameters in Figure 5 is similar except for the SPARFA model, which had an odd training time on the reproduction machine.

### Reported Results in Table 2

* After running the fixed methods multiple times, there are likely a few typos in the results:
    * SparFAE_f (AUC)
        * 0.88 +- 0.04 -> 0.88 +- 0.05
* The following are likely due to differences in the machines used:
    * VIBO_f (Training Time)
        * 8.01 +- 5.59 -> 10.62 +- 6.97
    * SparFAE_f (Training Time)
        * 0.05 +- 0.03 -> 0.05 +- 0.02
    * VIBO_f (Prediction Time)
        * 1.31 +- 2.76 -> 2.01 +- 2.56
    * SparFAE_f (Prediction Time)
        * 0.15 +- 0.13 -> 0.18 +- 0.00
* After running the experiment methods multiple times, there are likely a few typos in the results:
    * SPARFA (Sparsity)
        * 0.16 +- 0.06 -> 0.16 +- 0.07
    * VIBO (Sparsity)
        * 0.00 +- 0.00 -> 0.00 +- 0.01
    * SparFAE2 (Sparsity)
        * 0.33 +- 0.10 -> 0.33 +- 0.11
* The following are likely due to differences in the machines used; however, the results are quite similar to those reported, so they could also be interpreted as potential typos:
    * Training Time
        * SPARFA: 31.0 +- 20.9 -> 31.0 +- 21.2
        * VIBO: 7.83 +- 5.12 -> 7.83 +- 5.21
        * SparFAE1: 1.94 +- 1.78 -> 1.94 +- 1.95
        * SparFAE2: 15.7 +- 15.9 -> 15.7 +- 18.8
    * Prediction Time
        * SPARFA: 633 +- 444 -> 634 +- 458
        * VIBO: 0.31 +- 0.18 -> 0.31 +- 0.50
        * SparFAE1: 0.19 +- 0.12 -> 0.19 +- 0.35
        * SparFAE2: 0.20 +- 0.13 -> 0.20 +- 0.36
* The Wilcoxon signed-rank test result is reported backwards, meaning that the sentence should read, "Method 1 has lower AUC than Method 2".

### Figures 6-8

* Figures 7 and 8 are not generated by the codebase.
* Only the number-of-skills plot in Figure 6 is generated, and without the lines of best fit. However, the points are essentially accurate to those in the paper.