# [Sparse Factor Autoencoders for Item Response Theory](https://doi.org/10.5281/zenodo.6853067)

This is a project constructor for the paper [*Sparse Factor Autoencoders for Item Response Theory*](https://doi.org/10.5281/zenodo.6853067) by [Benjamin Paassen](https://orcid.org/0000-0002-3899-2450), Malwina Dywel, Melanie Fleckenstein, and [Niels Pinkwart](https://orcid.org/0000-0001-7076-9737).

### Associated Metadata

#### Tested Systems

#### Languages

#### Resources

* [Sparse Factor Autoencoders for Item Response Theory](https://doi.org/10.5281/zenodo.6853067) (Public)
    * Contains paper under [CC-BY-4.0](https://creativecommons.org/licenses/by/4.0/)
* [GitHub](https://github.com/bpaassen/sparfae) (Public)
    * Contains data under [GPL-3.0-or-later](https://spdx.org/licenses/GPL-3.0-or-later.html)
    * Contains materials under [GPL-3.0-or-later](https://spdx.org/licenses/GPL-3.0-or-later.html)
* [NeurIPS 2020 Education Challenge Dataset](https://eedi.com/projects/neurips-education-challenge) (Public)
    * Contains data under [CC-BY-NC-ND-4.0](https://creativecommons.org/licenses/by-nc-nd/4.0/)

## Project Files

The constructor downloads the following files:
* [Cloned GitHub](https://github.com/ahaim5357/sparfae) under [GPL-3.0-or-later](https://spdx.org/licenses/GPL-3.0-or-later.html)
* [NeurIPS 2020 Education Challenge Dataset](https://eedi.com/projects/neurips-education-challenge) under [CC-BY-NC-ND-4.0](https://creativecommons.org/licenses/by-nc-nd/4.0/)

## Setup Instructions

An NVIDIA graphics card is necessary to make full use of the models provided in this codebase. Additionally, this project makes use of the [CUDA Toolkit][cuda] 11.8 and [cuDNN][cudnn] 8.8.0; you can follow the setup instructions on [their website][cuda_docs]. It is highly recommended, and sometimes required, to use a Linux distribution when working with any software that implements the CUDA Toolkit.

### Method 1: Docker

This project contains the files needed to set up a [Docker container][docker] with the [NVIDIA Container Toolkit runtime][nvidia_docker]. Make sure you have both Docker and the NVIDIA runtime installed before attempting anything below.

To build the Docker image, navigate to this directory and run the following command:

```sh
docker build -t <image_name> .
```

`image_name` should be replaced with whatever name you would like to give the Docker image. Building the image takes around 30 minutes to an hour.

From there, you can load into the terminal via:

```sh
docker run --rm --runtime=nvidia --gpus all -itv <local_directory>:/volume <image_name> sh
```

A `volume` directory will be created within the container, linked to the `local_directory` specified. You can specify the current directory of execution via `${PWD}`.

> We load into the terminal instead of directly into Python so that any generated figures can be copied onto the local machine, as they cannot otherwise be easily viewed.
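For example, assuming the image was tagged `sparfae` during the build step (a hypothetical name, not mandated by the project), mounting the current directory as the container's `/volume` looks like:

```sh
# 'sparfae' is a hypothetical image name from an earlier `docker build -t sparfae .`;
# ${PWD} mounts the current working directory as /volume inside the container.
docker run --rm --runtime=nvidia --gpus all -itv ${PWD}:/volume sparfae sh
```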
Once in the docker terminal, you can run the Python scripts via:

```sh
python3 synthetic_experiment_Qlearning.py
python3 eedi_experiment-fixedQ.py
python3 eedi_experiment-Qlearning.py
```

You can look through the terminal output and compare the numbers with those in the paper. To view the figures on the local machine, copy them to the volume via:

```sh
cp -R ./images /volume
```

### Method 2: Local Setup

This project uses the Python package `jammies[all]` to set up and fix any issues in the codebase. For instructions on how to download and generate the project from this directory, see the [`jammies`][jammies] repository.

The following instructions have been reproduced using [Python][python] 3.11.4; this project makes no guarantees that anything will work outside of the specified version. Make sure you have Python, along with gcc for Cython, before attempting anything below.

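As a convenience check before installing anything (this snippet is not part of the original codebase), you can confirm the interpreter matches the tested series:

```python
import sys

TESTED = (3, 11)  # the instructions above were reproduced with Python 3.11.4

def version_note(info=None):
    """Return a note on whether the interpreter matches the tested series.

    A reproduction convenience only, not part of the project's codebase.
    """
    info = sys.version_info if info is None else info
    if tuple(info[:2]) == TESTED:
        return "Python version matches the tested 3.11 series"
    return "Warning: instructions were tested with Python 3.11.4"

print(version_note())
```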
First, navigate to the generated `src` directory. Then install the required dependencies into the global Python instance or a virtual environment via:

```sh
python3 -m pip install -r requirements.txt
```

> On Windows machines, replace `python3` with `py`. Additionally, the `python3 -m` prefix is unnecessary if `pip` is properly added to the path.

After installing the required dependencies, run the Python scripts via:

```sh
python3 synthetic_experiment_Qlearning.py
python3 eedi_experiment-fixedQ.py
python3 eedi_experiment-Qlearning.py
```

You can look through the terminal output and compare the numbers with those in the paper. The figures are generated in the `images` directory within `src`.

Alternatively, you can run the notebook versions instead; the images will be generated as part of their output.

[cuda]: https://developer.nvidia.com/cuda-toolkit
[cudnn]: https://developer.nvidia.com/cudnn
[cuda_docs]: https://docs.nvidia.com/cuda/
[docker]: https://www.docker.com/
[nvidia_docker]: https://github.com/NVIDIA/nvidia-docker
[jammies]: https://github.com/ahaim5357/jammies
[python]: https://www.python.org/

## Issues

### Figures 2-5

Based on analysis of the graphs, this reproduction assumes that condition A corresponds to condition 1 and condition B to condition 3.

* The graphs reproduced for Figure 2 roughly follow similar trends across multiple runs of the codebase; however, the individual shapes differ quite a bit, likely due to randomness.
* Aside from AUC, the graphs reproduced for Figure 3 can vary greatly, showing different trends across multiple runs.
* Aside from AUC and rθ, the graphs reproduced for Figure 4 can vary greatly, showing different trends across multiple runs.
* The training times for the hyperparameters in Figure 5 are similar, except for the SPARFA model, which had an unusual training time on the reproduction machine.

### Reported Results in Table 2

* After running the fixed methods multiple times, there are likely a few typos in the results:
    * SparFAE_f (AUC)
        * 0.88 +- 0.04 -> 0.88 +- 0.05
* The following differences are likely due to the machines used:
    * VIBO_f (Training Time)
        * 8.01 +- 5.59 -> 10.62 +- 6.97
    * SparFAE_f (Training Time)
        * 0.05 +- 0.03 -> 0.05 +- 0.02
    * VIBO_f (Prediction Time)
        * 1.31 +- 2.76 -> 2.01 +- 2.56
    * SparFAE_f (Prediction Time)
        * 0.15 +- 0.13 -> 0.18 +- 0.00

* After running the experiment methods multiple times, there are likely a few typos in the results:
    * SPARFA (Sparsity)
        * 0.16 +- 0.06 -> 0.16 +- 0.07
    * VIBO (Sparsity)
        * 0.00 +- 0.00 -> 0.00 +- 0.01
    * SparFAE2 (Sparsity)
        * 0.33 +- 0.10 -> 0.33 +- 0.11
* The following differences are likely due to the machines used; however, the results are quite similar to those reported, so they could also be interpreted as potential typos:
    * Training Time
        * SPARFA: 31.0 +- 20.9 -> 31.0 +- 21.2
        * VIBO: 7.83 +- 5.12 -> 7.83 +- 5.21
        * SparFAE1: 1.94 +- 1.78 -> 1.94 +- 1.95
        * SparFAE2: 15.7 +- 15.9 -> 15.7 +- 18.8
    * Prediction Time
        * SPARFA: 633 +- 444 -> 634 +- 458
        * VIBO: 0.31 +- 0.18 -> 0.31 +- 0.50
        * SparFAE1: 0.19 +- 0.12 -> 0.19 +- 0.35
        * SparFAE2: 0.20 +- 0.13 -> 0.20 +- 0.36

* The Wilcoxon signed-rank test is reported backwards, meaning that the sentence should read, "Method 1 has lower AUC than Method 2".
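Since the direction of the comparison matters here, the following minimal sketch shows the signed-rank sums underlying the test (an illustration with hypothetical per-fold AUC values, not the authors' code; a full implementation would also average tied ranks):

```python
def signed_rank_sums(x, y):
    """Wilcoxon signed-rank sums for paired samples.

    Illustrative sketch only: zero differences are dropped and ties in
    absolute value are not rank-averaged, unlike a full implementation.
    """
    diffs = [a - b for a, b in zip(x, y) if a != b]
    ordered = sorted(diffs, key=abs)  # rank differences by magnitude
    w_plus = sum(r for r, d in enumerate(ordered, start=1) if d > 0)
    w_minus = sum(r for r, d in enumerate(ordered, start=1) if d < 0)
    return w_plus, w_minus

# Hypothetical per-fold AUC values: method 1 is consistently lower, so the
# positive-rank sum is 0, supporting "Method 1 has lower AUC than Method 2".
auc_method1 = [0.70, 0.72, 0.68, 0.71]
auc_method2 = [0.80, 0.79, 0.76, 0.82]
print(signed_rank_sums(auc_method1, auc_method2))  # (0, 10)
```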

### Figures 6-8

* Figures 7 and 8 are not generated by the codebase.
* Only the number-of-skills plot in Figure 6 is generated, and without the lines of best fit; however, the points essentially match those in the paper.

*[GPL-3.0-or-later]: GNU General Public License v3.0 or later
*[Cloned GitHub]: Cloned GitHub Repository
*[CC-BY-4.0]: Creative Commons Attribution 4.0 International
*[GitHub]: GitHub Repository
*[CC-BY-NC-ND-4.0]: Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International