Commit f716ff3

jchiquet, mathurinm and gdurif authored
Adding several entries in FAQ (#18)
* advances in faq
* archiving post describing obsolete submission process
* updating post for gh-page activation
* post for other languages
* reproductibility + lon-running code
* reproductibility + lon-running code
* split questions in FAQ, partially fixing #1
* Update _posts/2023-03-24-others-languages.md
* Update _posts/2023-03-24-what-reproducibility.md
* Update _posts/2023-06-21-data.md
* add comments about other data repositories

---------

Co-authored-by: mathurinm <[email protected]>
Co-authored-by: gdurif <[email protected]>
1 parent b55191f commit f716ff3

9 files changed

+64
-9
lines changed

_config.yml

Lines changed: 1 addition & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -238,8 +238,7 @@ jekyll-archives:
238238
tag: '/blog/tag/:name/'
239239
category: '/blog/category/:name/'
240240

241-
display_tags: ['formatting', 'reproducibility', 'data', 'code'] # this tags will be dispalyed on the front page of your blog
242-
241+
display_tags: ['formatting', 'reproducibility', 'data', 'code'] # this tags will be displayed on the front page of your blog
243242
# -----------------------------------------------------------------------------
244243
# Jekyll Scholar
245244
# -----------------------------------------------------------------------------

_pages/submit.md

Lines changed: 0 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -70,10 +70,6 @@ if you are attached to Jupyter book or do not prefer to use Quarto, you are of c
7070

7171
</div>
7272

73-
## Data and large files
74-
75-
If your submission materials contain files larger than 50MB, **especially data files**, they won’t fit on a git repository as is. For this reason, we encourage you to put your data or any materials you deem necessary on an external “open data” centered repository hub such a [Zenodo](https://zenodo.org/) or [OSF](https://osf.io/).
76-
7773
## Submit your work
7874

7975
Once your are happy with your notebook AND the continuous integration (Github action or Gitlab CI) is successful, you must submit your PDF with [OpenReview, our platform for peer-reviewing](https://openreview.net/group?id=Computo).

_posts/2021-04-23-submission-process.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,7 +1,7 @@
11
---
22
layout: post
33
title: How does Computo work?
4-
date: 2021-04-23 00:00:00
4+
date: 3021-04-23 00:00:00
55
description: Diagrams that describe the submission process
66
---
77

_posts/2023-03-17-HTML-to-website.md

Lines changed: 0 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -15,8 +15,6 @@ We review here the full process for more clarity.
1515

1616
If you used one of our template repository, the build action (in `.github/workflows/build.yml`) should look like this:
1717

18-
19-
2018
{% highlight yaml linenos %}
2119
name: build
2220

Lines changed: 10 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,10 @@
1+
---
2+
layout: post
3+
title: 'I use gitlab instead of github: what should I do?'
4+
date: 2030-03-24 00:00:00
5+
tags: reproducibility
6+
description: Discuss integration of Computo's contribution in Gitlab instances
7+
---
8+
9+
_Under Construction_
10+
Lines changed: 15 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,15 @@
1+
---
2+
layout: post
3+
title: 'I use a different language than Python, R or Julia: would Computo accept my contributions?'
4+
date: 2023-03-24 00:00:00
5+
tags: [reproducibility, code]
6+
description: Describe how to handle other languages than R, Julia or Python
7+
---
8+
9+
In principle, we are open to any kind of language.
10+
11+
In practice, we need to integrate reproducible and compilable code into our Quarto template. We natively support `R`, `Python` and `Julia` and provide dedicated templates. For other languages, if a Jupyter kernel is available ([there are kernels for many languages](https://gist.github.com/chronitis/682c4e0d9f663e85e3d87e97cd7d1624)), [Quarto can execute the code through it](https://quarto.org/docs/computations/execution-options.html#engine-binding).
12+
13+
When writing your contribution, though, keep in mind that some languages are not designed for interactivity, and that presenting your code in the manuscript will take some formatting effort (possibly as much as interfacing the code with Python or R via `pybind11`, `Rcpp` or equivalent). It's your choice.
14+
15+
On our side, we will do our best to help with the technical aspects of integrating any language, but the editorial board and reviewers will also check that the contribution is scientifically sound and in the spirit of reproducibility.
Lines changed: 11 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,11 @@
1+
---
2+
layout: post
3+
title: What is expected exactly in terms of reproducibility?
4+
date: 2023-04-24 00:00:00
5+
tags: reproducibility
6+
description: Discuss the different kinds of reproducibility at play in Computo, and what is expected from the authors.
7+
---
8+
9+
Computo is not just about publishing a notebook and proving that it can be compiled with CI! This part of the process is what we call _"Editorial Reproducibility"_. _"Scientific"_ or _"numerical"_ reproducibility of the analyses is also mandatory, on top of classical peer-review evaluation.
10+
11+
We don't ask authors to reproduce their data... yet! We also don't ask for "bit-wise computational" reproducibility (i.e. obtaining exactly the same results bit-by-bit) but rather for "statistical" reproducibility, i.e. obtaining results that lead to the same conclusion, up to statistically non-significant variability.
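
To make the distinction concrete, here is a minimal sketch in Python. The analysis (`run_analysis`), its sample size and the tolerance are all hypothetical choices made for illustration, not a Computo requirement: two runs with different random seeds give different numbers bit-by-bit, yet agree within a tolerance that is large compared with the Monte Carlo error.

```python
import random
import statistics


def run_analysis(seed: int, n: int = 2000) -> float:
    """Hypothetical analysis: estimate the mean of a noisy sample."""
    rng = random.Random(seed)
    sample = [rng.gauss(1.0, 0.5) for _ in range(n)]
    return statistics.fmean(sample)


def statistically_close(a: float, b: float, tol: float) -> bool:
    """Return True when two estimates agree up to the chosen tolerance."""
    return abs(a - b) <= tol


# Two runs with different seeds: not bit-wise identical...
est_a = run_analysis(seed=1)
est_b = run_analysis(seed=2)

# ...but they support the same conclusion. The tolerance (0.1) is an
# assumption: it is several times the standard error 0.5 / sqrt(2000).
same_conclusion = statistically_close(est_a, est_b, tol=0.1)
```

Fixing the seed (as in `run_analysis(seed=1)`) additionally makes a single run repeatable on the same platform, which helps reviewers even though it is not what is being asked for here.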

_posts/2023-06-21-data.md

Lines changed: 17 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,17 @@
1+
---
2+
layout: post
3+
title: I have large or sensitive data. How should I proceed?
4+
date: 2023-06-21 00:00:00
5+
tags: reproducibility
6+
description: Describe how to handle large or sensitive data files when submitting to Computo
7+
---
8+
9+
## Large data sets
10+
11+
If your submission materials contain files larger than 50MB, **especially data files**, they won’t fit in a git repository as is. For this reason, we encourage you to put your data, or any materials you deem necessary, on an external “open data” repository hub such as [Zenodo](https://zenodo.org/) or [OSF](https://osf.io/).
12+
13+
You could also use any long-term (emphasis on long-term) data repository that is standard in your scientific community (or for a specific type of data or scientific application), provided that it is straightforward to retrieve the data from a script or from notebook code. We strongly encourage using open platforms, ideally institutionally hosted.
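
As an illustration of retrieving data from notebook code, here is a minimal Python sketch using only the standard library. The function names (`sha256_of`, `fetch_data`) are hypothetical, and pinning a SHA-256 checksum is our suggestion for this example rather than a stated Computo rule: the file is downloaded once, then verified so that every build works on the same bytes.

```python
import hashlib
import shutil
import urllib.request
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def fetch_data(url: str, dest: Path, expected_sha256: str) -> Path:
    """Download `url` to `dest` unless already present, then verify it."""
    if not dest.exists():
        dest.parent.mkdir(parents=True, exist_ok=True)
        with urllib.request.urlopen(url) as response, dest.open("wb") as out:
            shutil.copyfileobj(response, out)
    actual = sha256_of(dest)
    if actual != expected_sha256:
        raise ValueError(f"checksum mismatch for {dest}: got {actual}")
    return dest
```

In a notebook, you would call `fetch_data` with the download URL of your archived file and the checksum you recorded when depositing it; both values are specific to your own deposit and are deliberately left out here.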
14+
15+
## Sensitive data sets
16+
17+
Since the reproducibility of numerical results is a necessary condition for publication in *Computo*, your submission must include all necessary data (e.g. via Zenodo repositories). However, if you have sensitive data (for example, biomedical data that needs to be anonymized), you are invited to contact the editorial committee to explain and justify your position. In any case, we will ask you to make public at least a sample of the original data, along with a demonstration of its use in your article for Computo. The results of the analyses carried out on the full data set should be made available in the form of a binary file, so that the statistical summaries needed to support your claims can be produced.
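
One possible way to prepare such a public sample is sketched below in Python. The function `public_sample` and the column names in the usage note are hypothetical, and dropping direct identifiers is only a first step: it does not by itself guarantee anonymization, which remains to be discussed with the editorial committee.

```python
import random


def public_sample(rows, drop_columns, k, seed=0):
    """Draw a reproducible random sample of `k` rows without the given columns.

    `rows` is a list of dicts (one per record); `drop_columns` lists the
    keys to remove, for example direct identifiers.
    """
    rng = random.Random(seed)  # fixed seed so the sample can be regenerated
    chosen = rng.sample(rows, k)
    return [
        {key: value for key, value in row.items() if key not in drop_columns}
        for row in chosen
    ]
```

For instance, `public_sample(records, {"patient_id"}, k=100)` would return 100 records stripped of a `patient_id` field, assuming `records` holds at least 100 rows.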
Lines changed: 9 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,9 @@
1+
---
2+
layout: post
3+
title: My data analysis takes several hours/days/weeks... How to address the issue of reproducibility?
4+
date: 2023-06-21 00:00:00
5+
tags: reproducibility
6+
description: Discuss the reproducibility for long-running code
7+
---
8+
9+
If your analyses, model tuning or training phase take a prohibitively long time to compile and integrate, you can include the results of the trained methods in the form of a binary file. However, you must provide the code enabling the user to fully reproduce the training phase, and illustrate your code on a small, toy-sized example.
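
A minimal Python sketch of this pattern follows. The toy `train` function (a least-squares line fit), the `load_or_train` helper and the use of `pickle` as the binary format are assumptions made for illustration: the full training code stays in the notebook, but the build loads stored results when they exist.

```python
import pickle
from pathlib import Path


def train(xs, ys):
    """Toy stand-in for an expensive training phase: fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return {"slope": slope, "intercept": mean_y - slope * mean_x}


def load_or_train(cache: Path, xs, ys, retrain: bool = False):
    """Load stored results if present; otherwise train and store them."""
    if cache.exists() and not retrain:
        with cache.open("rb") as fh:
            return pickle.load(fh)
    model = train(xs, ys)
    with cache.open("wb") as fh:
        pickle.dump(model, fh)
    return model
```

Readers who want to redo the full computation can pass `retrain=True`. Note that pickle files should only be loaded from sources you trust, so a plain data format may be preferable when the results allow it.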
