
Reproducibility Bugs

Hackalog edited this page Mar 5, 2020 · 7 revisions

Every time we encounter a reproducibility-related bug or issue while attempting to reproduce others' work, we document it here to the best of our abilities.

Documentation/Process bugs

  • (WHERE-DO-I-START) README doesn't tell me where to start.
  • (NOTEBOOK-ORDER) Ran notebooks out of order. No indication of where to start, or where to go next.
  • (VARIABLE-SCARCITY) "We'll overwrite our variable (which is generally bad) in order to save on code reuse from the last notebook."
  • (COPIED-NOTEBOOK) Copied notebooks for code reuse, instead of generalizing the shared code into functions or a module.
  • (STALE-COMMENT, COPIED-NOTEBOOK) Markdown cell(s) copied and now wrong (comments in copied notebooks that were never updated).
  • (EYEBALL-TEST, PRNG-FAIL) The only way to check whether I got the same results was to compare by eye against the outputs and images in the original notebook (and the images didn't match because of randomness).
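
One way to avoid the EYEBALL-TEST situation is to store a few reference numbers next to the notebook and compare against them in code. The following is a minimal sketch under stated assumptions: the function name `results_match`, the metric names, and the reference values are all hypothetical and not taken from any project described here.

```python
import json
import math


def results_match(computed, reference, rel_tol=1e-9, abs_tol=0.0):
    """Return True if both dicts have the same keys and numerically close values."""
    if computed.keys() != reference.keys():
        return False
    return all(
        math.isclose(computed[key], reference[key], rel_tol=rel_tol, abs_tol=abs_tol)
        for key in reference
    )


# Hypothetical reference values, as they might be saved by the original author.
reference = json.loads('{"accuracy": 0.9125, "loss": 0.3071}')

# Hypothetical values produced by a fresh run of the notebook.
computed = {"accuracy": 0.9125, "loss": 0.3071}

assert results_match(computed, reference)
```

A tolerance is used rather than exact equality because floating-point results can differ slightly between runs and machines. Choosing that tolerance is a judgment call for each project.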

Licenses

  • (NO-DATA-LICENSE) No data license.
  • (NO-CODE-LICENSE) No repo license. Continued with express permission.

Environment Reproducibility

  • (NO-ENVIRONMENT-INSTRUCTIONS) Chicken-and-egg issue with environments: no environment.yml file or the like, so there is no way to set up the environment before running anything (even if some instructions appear inside a notebook).
  • (NO-VERSION-PIN) Versions not pinned. E.g. uses a dev branch, with no clear indication of which release it later became part of.
  • (HARDCODED-PATH) A file contains a hardcoded path, so the project will not run elsewhere without manual editing.
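
For HARDCODED-PATH, one common alternative is to resolve paths relative to a project root, with an optional override. This is only a sketch: the function `data_dir`, the `data` subdirectory, and the environment variable `PROJECT_DATA_DIR` are hypothetical names chosen for illustration.

```python
import os
from pathlib import Path


def data_dir(project_root=None):
    """Resolve the data directory without hardcoding an absolute path.

    PROJECT_DATA_DIR is a hypothetical environment variable; if it is set,
    it takes precedence over the default location.
    """
    override = os.environ.get("PROJECT_DATA_DIR")
    if override:
        return Path(override).expanduser().resolve()
    # Fall back to <project_root>/data, defaulting to the current directory.
    root = Path(project_root) if project_root is not None else Path.cwd()
    return (root / "data").resolve()


# Usage: build file paths from the resolved directory instead of a literal
# absolute path that only exists on one machine.
raw_file = data_dir() / "raw.csv"  # "raw.csv" is a placeholder file name
```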

Hidden State/Notebook non-linearity

  • (HIDDEN-STATE) Variable undefined. (Hidden-state error, cells run out of sequence, or a copied-notebook error?)
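
A rough check for HIDDEN-STATE is to look at the execution counts saved in the notebook file. The sketch below assumes the notebook is stored in the Jupyter JSON format, where each code cell carries an `execution_count` field. The function name `executed_in_order` is hypothetical, and the check is deliberately strict: a code cell that was never run counts as a failure.

```python
import json


def executed_in_order(notebook):
    """Return True if code cells were run top to bottom in one pass, starting at 1."""
    counts = [
        cell.get("execution_count")
        for cell in notebook.get("cells", [])
        if cell.get("cell_type") == "code"
    ]
    return counts == list(range(1, len(counts) + 1))


# A tiny in-memory example in place of a real .ipynb file.
example = json.loads(
    '{"cells": ['
    '{"cell_type": "markdown", "source": []},'
    '{"cell_type": "code", "execution_count": 1, "source": []},'
    '{"cell_type": "code", "execution_count": 2, "source": []}'
    ']}'
)
assert executed_in_order(example)
```

Passing this check does not prove the notebook is free of hidden state, but failing it is a strong hint that the saved outputs came from an out-of-order session.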

Randomness

  • (PRNG-ARCH-FAIL) Fixed random seed, but still a different result: this time, the results differed across architectures.
  • (NO-PRNG-SEED) No fixed random seed
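
For NO-PRNG-SEED, a minimal sketch of fixing the seed with Python's standard library is shown below. The seed value and the function name `sample_indices` are arbitrary illustrations. As PRNG-ARCH-FAIL notes, a fixed seed is necessary but may not be sufficient for identical results everywhere.

```python
import random

SEED = 20200305  # arbitrary illustrative value; record it with the results


def sample_indices(n_items, k, seed=SEED):
    """Draw k distinct indices from range(n_items) using a locally seeded generator."""
    # A dedicated generator avoids depending on the hidden global random state.
    rng = random.Random(seed)
    return rng.sample(range(n_items), k)


# Two calls with the same seed give the same draw within one environment.
assert sample_indices(100, 5) == sample_indices(100, 5)
```

Passing the generator or seed explicitly, rather than calling `random.seed` once at the top of a notebook, keeps the result independent of the order in which cells are run.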

Data

  • (NO-DATA-HASH) No data versioning/hash. No way to tell if the data changed since it was originally accessed.
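
For NO-DATA-HASH, one simple approach is to publish a checksum of each data file and verify it after download. The sketch below uses SHA-256 from the standard library. The function name `sha256_of_file` is hypothetical, and the choice of SHA-256 is an assumption rather than something the original projects used.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

The digest recorded when the data was first accessed can then be compared with the digest of a fresh download. A mismatch means the data has changed since then.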
