Commit 3089cf9

joss paper for submission
1 parent 33e3230 commit 3089cf9

6 files changed

+254
-9
lines changed

.github/workflows/draft-pdf.yaml

Lines changed: 1 addition & 1 deletion
Original file line number | Diff line number | Diff line change
@@ -3,7 +3,7 @@
33
on:
44
push:
55
paths:
6-
- joss-paper/**
6+
- joss-paper/*
77
- .github/workflows/draft-pdf.yaml
88

99
jobs:

joss-paper/figures/figure-1.png

194 KB

joss-paper/figures/figure-2.png

850 KB

joss-paper/figures/figure-3.png

770 KB

joss-paper/paper.bib

Lines changed: 32 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1 +1,33 @@
11
# Bibliographic references
2+
3+
@article{chaabane2023,
4+
author = {Chaabane, Sonia and {de Garidel-Thoron}, Thibault and Giraud, Xavier and Schiebel, Ralf and Beaugrand, Gregory and Brummer, Geert-Jan and Casajus, Nicolas and Greco, Mattia and Grigoratou, Maria and Howa, Hélène and Jonkers, Lukas and Kucera, Michal and Kuroyanagi, Azumi and Meilland, Julie and Monteiro, Fanny and Mortyn, Graham and Almogi-Labin, Ahuva and Asahi, Hirofumi and Avnaim-Katav, Simona and Bassinot, Franck and Davis, Catherine V. and Field, David B. and Hernández-Almeida, Iván and Herut, Barak and Hosie, Graham and Howard, Will and Jentzen, Anna and Johns, David G. and Keigwin, Lloyd and Kitchener, John and Kohfeld, Karen E. and Lessa, Douglas V. O. and Manno, Clara and Marchant, Margarita and Ofstad, Siri and Ortiz, Joseph D. and Post, Alexandra and Rigual-Hernandez, Andres and Rillo, Marina C. and Robinson, Karen and Sagawa, Takuya and Sierro, Francisco and Takahashi, Kunio T. and Torfstein, Adi and Venancio, Igor and Yamasaki, Makoto and Ziveri, Patrizia},
5+
year = {2023},
6+
title = {The {FORCIS} database: {A} global census of planktonic foraminifera from ocean waters},
7+
journal = {Scientific Data},
8+
volume = {10},
9+
pages = {354},
10+
doi = {10.1038/s41597-023-02264-2}
11+
}
12+
13+
@article{chaabane2024,
14+
title = {Migrating is not enough for modern planktonic foraminifera in a changing ocean},
15+
url = {https://www.nature.com/articles/s41586-024-08191-5},
16+
doi = {10.1038/s41586-024-08191-5},
17+
volume = {636},
18+
pages = {390--396},
19+
journal = {Nature},
20+
author = {Chaabane, Sonia and {de Garidel-Thoron}, Thibault and Meilland, Julie and Sulpis, Olivier and Chalk, Thomas B. and Brummer, Geert-Jan A. and Mortyn, P. Graham and Giraud, Xavier and Howa, Hélène and Casajus, Nicolas and Kuroyanagi, Azumi and Beaugrand, Gregory and Schiebel, Ralf},
21+
year = {2024}
22+
}
23+
24+
@article{degaridel2022,
25+
author={{de Garidel-Thoron}, Thibault and Chaabane, Sonia and Giraud, Xavier and Meilland, Julie and Jonkers, Lukas and Kucera, Michal and Brummer, Geert-Jan A. and Grigoratou, Maria and Monteiro, Fanny M. and Greco, Mattia and Mortyn, P. Graham and Kuroyanagi, Azumi and Howa, Hélène and Beaugrand, Gregory and Schiebel, Ralf},
26+
title={The foraminiferal response to climate stressors project: Tracking the community response of planktonic foraminifera to historical climate change},
27+
journal={Frontiers in Marine Science},
28+
volume={9},
29+
pages={827962},
30+
year={2022},
31+
url={https://www.frontiersin.org/journals/marine-science/articles/10.3389/fmars.2022.827962},
32+
doi={10.3389/fmars.2022.827962}
33+
}

joss-paper/paper.md

Lines changed: 221 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -1,13 +1,13 @@
11
---
2-
title: 'forcis: An R package for accessing, handling and analysing the FORCIS Foraminifera database'
2+
title: 'forcis: An R package for accessing, handling and analysing the FORCIS database'
33
tags:
44
- r
55
- database
66
- planktonic foraminifera
77
- biodiversity
88
- species abundance
99
- data visualisation
10-
date: "9 June 2025"
10+
date: "19 September 2025"
1111
output: pdf_document
1212
authors:
1313
- name:
@@ -60,25 +60,238 @@ affiliations:
6060

6161
# Summary
6262

63-
...
63+
`forcis` is an R package designed to streamline access to the recently published
64+
FORCIS (Foraminifera Response to Climatic Stress) database [@chaabane2023]. This
65+
package enables users to easily download the database directly into an R
66+
environment, filter and select relevant data, convert species counts across
67+
formats, and visualise the results.
6468

6569

6670
# Statement of need
6771

68-
...
72+
The recently developed FORCIS (Foraminifera Response to Climatic Stress)
73+
database provides one of the most comprehensive collections of global planktonic
74+
foraminifera living census data, comprising over 163,000 samples collected via
75+
various sampling devices (Continuous Plankton Recorder [CPR], plankton nets,
76+
pumps, and sediment traps). These samples cover a wide temporal range (1910 to
77+
2018), a broad depth range (surface to 5,000 m), and a broad spatial extent
78+
[@chaabane2023; @degaridel2022]. FORCIS data are crucial
79+
for advancing insights into potential spatial and vertical migrations and
80+
understanding the impacts of global climate change on planktonic foraminifera
81+
biogeography and their seasonal and vertical distribution patterns observed in
82+
recent decades. Additionally, FORCIS’s long temporal scope offers a valuable
83+
resource for investigating the influence of anthropogenic changes on planktonic
84+
foraminifera distribution and ecology [@chaabane2024].
85+
86+
However, working with the FORCIS database presents significant challenges due
87+
to the heterogeneity of the data, which has been compiled from 140 sources,
88+
each using its own taxonomic framework and reporting formats (\autoref{fig:fig1}). This
89+
results in variability in data units, such as concentrations, frequencies, and
90+
raw counts, requiring extensive standardisation for meaningful comparison.
91+
Furthermore, the metadata associated with each sample — such as location,
92+
sampling depth, time, and environmental parameters — adds another layer of
93+
complexity, making data extraction and analysis challenging for users.
94+
95+
![Heterogeneity of data within the FORCIS database. a) Number of taxa from different taxonomic frameworks present in the FORCIS database (net data). b) Different count formats from net samples included in the FORCIS database.\label{fig:fig1}](figures/figure-1.png){ width=100% }
96+
97+
98+
To overcome these obstacles, we developed the `forcis` package, an easy-to-use
99+
tool for accessing, filtering, harmonising, and visualising the FORCIS data
100+
within the R programming environment. The `forcis` package enables users to
101+
download the latest version of the FORCIS database directly from
102+
[Zenodo](https://doi.org/10.5281/zenodo.7390791),
103+
filter and select data according to user-specified criteria, harmonise
104+
taxonomic resolution, convert species counts into uniform units, and visualise
105+
patterns in diversity and abundance. By combining these features, the package
106+
enables researchers to access and analyse the data within the FORCIS database
107+
efficiently, streamlining their investigative efforts.
108+
69109

70110

71111
# Main features
72112

73-
...
113+
To facilitate efficient management and analysis of the FORCIS database, the
114+
`forcis` R package provides a comprehensive set of features fully described in
115+
the [package vignettes](https://docs.ropensci.org/forcis/articles/), where users can find extensive documentation and
116+
tutorials on the major features of the package. The recommended workflow and
117+
the relevant main functions are illustrated in \autoref{fig:fig2}.
118+
119+
![Recommended workflow and main features of the `forcis` R package.\label{fig:fig2}](figures/figure-2.png){ width=100% }
120+
121+
122+
## Download and import the FORCIS database in R
123+
124+
The `forcis` R package contains functions that simplify downloading and
125+
importing FORCIS datasets from [Zenodo](https://doi.org/10.5281/zenodo.7390791).
126+
The most recent version of the FORCIS database can be retrieved using the function
127+
`download_forcis_db()`.
128+
129+
130+
```r
131+
# Create a data/ directory in the current directory ----
132+
dir.create("data")
133+
134+
# Download the latest version of the FORCIS database ----
135+
download_forcis_db(path = "data", timeout = 300)
136+
```
137+
138+
139+
The `read_*_data()` function family helps users import the dataset specific
140+
to a particular sampling device, enabling focused analyses.
141+
142+
143+
```r
144+
# Import plankton nets data (previously downloaded) ----
145+
net_data <- read_plankton_nets_data(path = "data")
146+
```
147+
148+
149+
Once the data is imported into R, users can reduce the dataset to include only
150+
the metadata they are interested in by using the function
151+
`select_forcis_columns()`.
152+
153+
154+
## Harmonising taxonomy
155+
156+
To utilise most features of the `forcis` R package, users need to specify the
157+
taxonomic framework they wish to apply (\autoref{fig:fig2}). The FORCIS database includes
158+
counts at three different taxonomic levels: Original Taxonomy (OT), Lumped
159+
Taxonomy (LT), and Validated Taxonomy (VT). For a detailed explanation of the
160+
differences between these three taxonomic levels, we refer the reader to the
161+
FORCIS data descriptor [@chaabane2023]. To select the taxonomic framework of
162+
their choice, users can call the function `select_taxonomy()`, as in the
163+
example below:
164+
165+
166+
```r
167+
# Select a taxonomic framework ----
168+
net_data_vt <- net_data |>
169+
select_taxonomy(taxonomy = "VT")
170+
```
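
As a rough illustration of what selecting a taxonomic framework amounts to, the base R sketch below keeps the metadata columns and only the taxon columns of one framework. It does not call `forcis`: the toy table, its values, and the assumption that taxon columns carry an `_OT`, `_LT`, or `_VT` suffix (suggested by names such as `n_pachyderma_VT`) are illustrative only and do not describe the package's implementation.

```r
# Toy table with one metadata column and taxon columns from several
# frameworks (all names and values are invented for illustration)
net_toy <- data.frame(
  sample_id       = c("S1", "S2"),
  n_pachyderma_VT = c(4, 0),
  n_pachyderma_OT = c(4, 0),
  g_ruber_LT      = c(1, 7)
)

# Assumption: taxon columns end with a framework suffix
is_taxon <- grepl("_(OT|LT|VT)$", names(net_toy))

# Keep metadata columns plus the taxon columns of the chosen framework
keep       <- !is_taxon | grepl("_VT$", names(net_toy))
net_toy_vt <- net_toy[, keep, drop = FALSE]

names(net_toy_vt)
```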
171+
172+
## Filter data
173+
174+
After selecting the taxonomic framework, the `forcis` R package offers multiple
175+
functions to efficiently subset the FORCIS datasets. Users may be interested in
176+
analysing community structure at a specific time, or location, or even
177+
examining the counts of species of interest. Given the wide range of potential
178+
research questions, we have implemented six filtering functions within the
179+
`filter_by_*()` function family, allowing users to customise data extraction
180+
according to their investigation needs (\autoref{fig:fig2}).
181+
182+
183+
```r
184+
# Filter data by year(s) ----
185+
net_data_sub <- net_data_vt |>
186+
filter_by_year(years = 1992)
187+
188+
# Filter data by spatial bounding box ----
189+
net_data_sub <- net_data_vt |>
190+
filter_by_bbox(bbox = c(45, -61, 82, -24))
191+
192+
# Filter data by ocean name ----
193+
net_data_sub <- net_data_vt |>
194+
filter_by_ocean(ocean = "Indian Ocean")
195+
196+
# Filter data by species ----
197+
net_data_sub <- net_data_vt |>
198+
filter_by_species(species = "n_pachyderma_VT")
199+
```
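
To make the bounding-box filter concrete, here is a minimal base R sketch of the underlying logic on a toy table. It does not use `forcis`; the column names `longitude` and `latitude`, the coordinates, and the assumed order of the four bounding-box values (xmin, ymin, xmax, ymax, in decimal degrees) are assumptions made for illustration.

```r
# Toy sample locations (invented coordinates)
samples <- data.frame(
  sample_id = c("A", "B", "C"),
  longitude = c(60, 10, 75),
  latitude  = c(-30, -40, 5)
)

# Bounding box, assumed order: xmin, ymin, xmax, ymax
bbox <- c(45, -61, 82, -24)

# A sample is kept when both coordinates fall inside the box
inside <- samples$longitude >= bbox[1] & samples$longitude <= bbox[3] &
  samples$latitude >= bbox[2] & samples$latitude <= bbox[4]

samples_sub <- samples[inside, ]
samples_sub
```

In this toy case only sample `A` satisfies both the longitude and the latitude conditions.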
200+
201+
202+
## Transform data
203+
204+
The `compute_*()` function family allows users to convert FORCIS data between
205+
raw abundance, number concentration, and relative abundance, enabling them to
206+
use the units that best suit their analyses and facilitating comparison between
207+
the FORCIS data and their own.
208+
These functions utilise sample metadata to perform unit conversions.
209+
Specifically, number concentration and relative abundance are derived from
210+
raw abundance in the `forcis` R package, for each taxon, using the following
211+
equations:
212+
213+
$$C_{number} = \frac{N_{raw}}{V_{filtered}}$$
214+
215+
where $C_{number}$ is the number concentration, $N_{raw}$ is the raw abundance
216+
(count of individuals), and $V_{filtered}$ is the volume of water filtered
217+
(in $m^3$ or L, depending on the dataset).
218+
219+
$$Frequency = 100 \cdot \frac{N_{raw}}{N_{total}}$$
220+
221+
where $Frequency$ is the relative abundance (in percentage), $N_{raw}$ is the
222+
raw abundance (count of individuals) of a given taxon, and $N_{total}$ is the
223+
total raw abundance (sum of all individuals in the sample or subsample).
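
The two equations can be checked on a small worked example. The base R sketch below applies them to invented counts for a single sample; the taxon names, counts, and filtered volume are illustrative values, and the code does not call the `compute_*()` functions.

```r
# Raw abundance per taxon for one sample (invented values)
counts <- c(n_pachyderma = 12, g_bulloides = 30, g_ruber = 18)

# Assumed volume of filtered water, in m^3
volume_filtered <- 150

# Number concentration: C_number = N_raw / V_filtered
concentration <- counts / volume_filtered

# Relative abundance: Frequency = 100 * N_raw / N_total
frequency <- 100 * counts / sum(counts)

round(concentration, 3)
round(frequency, 1)
```

With a total of 60 individuals, the three taxa account for 20%, 50%, and 30% of the sample, and the frequencies sum to 100.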
224+
225+
Users can choose whether to convert counts at the sample or subsample level
226+
(see @chaabane2023) through the `aggregate` argument of the `compute_*()`
227+
functions. If `aggregate = TRUE`, the function will return the
228+
transformed counts of each species using the sample as the unit. If
229+
`aggregate = FALSE`, it will re-calculate the species' abundance by subsample.
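
The difference between the two settings can be sketched in base R on a toy sample made of two subsamples. This is one plausible reading of the description above, not the package's implementation: the identifiers, counts, and the choice to sum subsample counts before computing percentages are assumptions.

```r
# Two subsamples of the same sample (invented values)
subsamples <- data.frame(
  sample_id    = c("S1", "S1"),
  subsample_id = c("S1_a", "S1_b"),
  n_pachyderma = c(10, 30),
  g_bulloides  = c(30, 30)
)
taxa <- c("n_pachyderma", "g_bulloides")

# Relative abundance per subsample (analogous to aggregate = FALSE)
by_subsample <- 100 * subsamples[, taxa] / rowSums(subsamples[, taxa])

# Relative abundance per sample (analogous to aggregate = TRUE):
# counts are summed over subsamples before computing percentages
totals    <- colSums(subsamples[, taxa])
by_sample <- 100 * totals / sum(totals)

by_subsample
by_sample
```

Here the per-subsample frequencies of `n_pachyderma` are 25% and 50%, whereas the sample-level frequency is 40%, which shows why the choice of unit matters when comparing datasets.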
230+
231+
232+
233+
## Visualisation
234+
235+
The `forcis` package also includes multiple functions to visualise the spatial
236+
distribution of samples selected by users. The `ggmap_data()` function
237+
generates publication-ready maps, displaying sample locations at a global scale
238+
(\autoref{fig:fig3}a). Additionally, users can visualise sample records by various time
239+
units (season, month, year) and by depth, using the functions from the
240+
`plot_record_by_*()` function family (\autoref{fig:fig3}b-d).
241+
These functions can be seamlessly combined with the `filter_by_*()` family of
242+
functions, allowing users to customise their sample selections according to
243+
their specific research needs.
244+
245+
![Overview of visualisations available in the `forcis` R package. a) World map produced by the function `ggmap_data()` to show the location of the data. b) Barplot of number of samples per month produced by the function `plot_record_by_month()`. c) Barplot of number of samples per depth class produced by the function `plot_record_by_depth()`. d) Barplot of number of samples per year produced by the function `plot_record_by_year()`.\label{fig:fig3}](figures/figure-3.png){ width=100% }
246+
247+
```r
248+
# Map raw net data ----
249+
ggmap_data(net_data)
250+
251+
# Plot number of records by year of sampling ----
252+
plot_record_by_year(net_data)
253+
254+
# Plot number of records by month of sampling ----
255+
plot_record_by_month(net_data)
256+
257+
# Plot number of records by depth of sampling ----
258+
plot_record_by_depth(net_data)
259+
```
260+
261+
262+
`forcis` provides five vignettes to learn more about the package:
263+
264+
- the [Get started](https://docs.ropensci.org/forcis/articles/forcis.html)
265+
vignette describes the core features of the package
266+
- the [Database versions](https://docs.ropensci.org/forcis/articles/database-versions.html)
267+
vignette provides information on how to deal with the versioning of the database
268+
- the [Select and filter data](https://docs.ropensci.org/forcis/articles/select-and-filter-data.html) vignette shows examples to handle the FORCIS data
269+
- the [Data conversion](https://docs.ropensci.org/forcis/articles/data-conversion.html)
270+
vignette describes the conversion functions available in `forcis` to compute abundances, concentrations, and frequencies
271+
- the [Data visualization](https://docs.ropensci.org/forcis/articles/data-visualization.html)
272+
vignette describes the plotting functions available in `forcis`
74273

75274

76275
# Acknowledgements
77276

78-
...
277+
The FORCIS project is supported by the French Foundation for Biodiversity
278+
Research ([FRB](https://www.fondationbiodiversite.fr)) through its Centre for
279+
the Synthesis and Analysis of Biodiversity
280+
([CESAB](https://www.fondationbiodiversite.fr/en/about-the-foundation/le-cesab/))
281+
and co-funded by the INSU LEFE programme and the Max Planck Institute for Chemistry
282+
(MPIC) in Mainz. M.G. was supported by a Juan de la Cierva-formacion 2021
283+
fellowship (FJC2021–047494-I/MCIN/AEI/10.13039/501100011033) from the European
284+
Union “NextGenerationEU”/PRTR and by the Beatriu de Pinós programme
285+
(2022 BP 00209) funded by the Direcció General de Recerca (DGR) del Departament
286+
de Recerca i Universitats (REU) of the Government of Catalonia. In addition,
287+
his work received support from the French government under the France 2030
288+
investment plan, as part of the Initiative d’Excellence d’Aix-Marseille
289+
Université (A*MIDEX AMX-20-TRA-029). The authors would like to thank Beatriz
290+
Milz, Scott Chamberlain and Air Forbes for their valuable comments during the
291+
peer review process in
292+
[rOpenSci](https://github.com/ropensci/software-review/issues/660).
79293

80294

81295

82-
# References
83296

84-
...
297+
# References
