README.Rmd: 47 additions & 3 deletions
@@ -22,13 +22,13 @@ knitr::opts_chunk$set(
 
 A *data reporting template* is a standardized spreadsheet file (in either xls or xlsx format) used for reporting and processing experimental data. These templates significantly reduce the time required for data analysis and encourage users to present their data in a structured format, minimizing errors and misinterpretations.
 
-The **excelDataGuide** package eliminates the need for users to write and maintain complex code for reading data from intricate spreadsheet DRTs. Additionally, it offers a robust framework for validating data, ensuring the correct data types are utilized, and facilitating data wrangling when necessary. This functionality supports *Interoperability* for DRTs, a key aspect of the [FAIR](https://www.go-fair.org/fair-principles/) principles.
+The **excelDataGuide** package eliminates the need for data analysts to write and maintain complex code for reading data from various complex spreadsheet DRTs. Additionally, it offers a robust framework for validating data, ensuring that the correct data types are utilized, and facilitating data wrangling when necessary. This functionality supports *Interoperability* for DRTs, a key aspect of the [FAIR](https://www.go-fair.org/fair-principles/) principles.
 
 The package features a user-friendly interface for extracting data from Excel files and converting it into R objects. It accommodates three types of data structures: key-value pairs, tabular data, and microplate-formatted data. The locations of these structures within the Excel template are specified by a **data guide**, which is a YAML file — a structured format that is both human- and machine-readable.
 
 ## Installation
 
-You can install the development version of excelDataGuide from [GitHub](https://github.com/) with:
+You can install the development version of excelDataGuide in a recent version of R from GitHub with:
 
 ```r
 # install.packages("pak")
@@ -48,6 +48,50 @@ data <- read_data(datafile, guidefile)
 
 The output of the `read_data()` function is a list object the format of which is determined for a large part by the design of the data guide.
 
+## Details
+
+### Data guide
+
+The *data guide* is a human readable and editable file in [YAML](https://yaml.org/spec/1.2.2/) format that specifies the structure and location of the data in the Excel file. It contains a list of data types, each of which is defined by a name and a set of parameters. As the name suggests, the *data guide* is used by the **excelDataGuide** package as a guide to extract all indexed data from the Excel file and convert it into proper R objects. An example of part of a *data guide* is shown below:
+
+```
+guide.version: '1.0'
+template.name: competition
+template.min.version: '9.3'
+template.max.version: ~
+plate.format: 96
+locations:
+  - sheet: description
+    type: cells
+    varname: .template
+    translate: false
+    variables:
+      - name: version
+        cell: B2
+  - sheet: description
+    type: keyvalue
+    translate: true
+    atomicclass:
+      - character
+      - character
+      - character
+      - character
+      - character
+      - date
+      - character
+      - numeric
+      - character
+      - numeric
+      - character
+      - numeric
+      - character
+      - character
+    varname: metadata
+    ranges:
+      - A10:B21
+      - A24:B25
+```
+
 ## Future work
 
-We want to provide guide and template structures for data types without upper size limit, like time series with no pre-determined length.
+We want to provide guide and template structures for data types without upper size limit, typically time series with no pre-determined length.
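
To make the "Data guide" section added in this diff more concrete, the following is a minimal sketch of how such a guide could be inspected from R. It assumes the `yaml` package is installed and uses a shortened copy of the example guide (the `atomicclass` list is left out for brevity). The helper `summarise_location()` is hypothetical and written only for this illustration; it is not part of **excelDataGuide**, and the package's own parsing may work differently.

```r
# Sketch only: parse a data guide excerpt and list what each location asks for.
# This is not excelDataGuide's own code.
library(yaml)

guide_text <- "
guide.version: '1.0'
template.name: competition
template.min.version: '9.3'
template.max.version: ~
plate.format: 96
locations:
  - sheet: description
    type: cells
    varname: .template
    translate: false
    variables:
      - name: version
        cell: B2
  - sheet: description
    type: keyvalue
    translate: true
    varname: metadata
    ranges:
      - A10:B21
      - A24:B25
"

guide <- yaml.load(guide_text)

# Hypothetical helper: summarise one location entry as a one-row data frame.
# 'keyvalue' entries list cell ranges; 'cells' entries list named single cells.
summarise_location <- function(loc) {
  cells <- if (!is.null(loc$ranges)) {
    unlist(loc$ranges)
  } else {
    vapply(loc$variables, function(v) v$cell, character(1))
  }
  data.frame(
    sheet = loc$sheet,
    type = loc$type,
    varname = loc$varname,
    cells = paste(cells, collapse = ", "),
    stringsAsFactors = FALSE
  )
}

location_summary <- do.call(rbind, lapply(guide$locations, summarise_location))
print(location_summary)
```

Each row of `location_summary` corresponds to one entry under `locations`, showing the sheet, the structure type, the variable name under which the data would be stored, and the cells or ranges to read. For a guide stored on disk, `yaml::read_yaml()` with the path to the guide file could replace the inline string.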