@@ -7,10 +7,163 @@ parent: User Guide
77grand_parent : CORA.RDR
88---
99
10- ## Explanation for each dataset files
10+ # File structure in CORA.RDR / Dataverse
1111
12+ The main goal of CORA.RDR is to store and preserve digital data.
1213
14+ Digital data is stored in **Datasets**. You can think of each dataset in CORA.RDR as a folder that
15+ can contain files or other folders, just like any
16+ folder on your
17+ computer.
1318
19+ Digital data are stored in software-specific file formats. The choice of software and format depends on the intended
20+ use. For example:
1421
22+ * Spreadsheets support formulas, sorting, and filtering, features that word processors do not offer.
23+ * Word processors support complex formatting such as page numbers and tables of contents.
1524
25+ However, saving a file in a program’s default format does **not** guarantee long-term usability. Risks include:
26+
27+ * Dependency on specific software or software versions
28+ * Obsolescence of proprietary formats
29+ * Exclusive or expensive software required to access files
30+ * Loss of significant characteristics if the format does not support them
31+
32+ To minimize these risks and improve long-term accessibility, use file formats with a high likelihood of remaining usable for many years.
33+
34+ The following guidelines describe which formats are accepted and the recommended folder structure for a dataset.
35+ We encourage you to follow these guidelines for the datasets that you upload to CORA.RDR. They are not enforced as a
36+ technical limitation, so in some cases they can be bypassed when there is a justified reason.
37+
38+ For example, we recommend CSV for tabular data, but users often store their tabular data in Excel files.
39+ Converting such an Excel file to CSV may discard
40+ important parts of the original file, such as formatting or formulas, which can affect the quality of the data. In this case
41+ and similar ones, it may be reasonable to bypass the guidelines on recommended file formats.
42+
43+
44+ ## Preferred and Non-Preferred File Formats (DANS)
45+
46+ Preferred formats are file formats that DANS – based on international agreements – is confident offer the best long-term
47+ guarantees in terms of usability, accessibility, and sustainability. Deposits of research data in preferred formats
48+ will **always** be accepted by DANS.
49+
50+ Non-preferred formats are widely used in addition to preferred formats and are expected to remain moderately to
51+ reasonably usable, accessible, and robust in the long term.
52+
53+
54+ ## General Guidelines
55+
56+ DANS believes that file formats best suited for long-term sustainability and accessibility:
57+
58+ * Are frequently used
59+ * Have open specifications
60+ * Are independent of specific software, developers, or vendors
61+
62+ In practice, it is not always possible to use formats that satisfy all criteria.
63+
64+ It may be desirable to deposit certain original data in **non-preferred** formats because these formats are in common use
65+ (e.g., Esri Shapefiles, Microsoft Access databases, SPSS `.sav` files).
66+ In those cases, DANS asks you to deposit both:
67+
68+ * The original format **and**
69+ * A preferred format for long-term sustainability.
70+
71+
72+ ### File Format Overview
73+
74+ This information is based on [this article](https://dans.knaw.nl/en/file-formats/) and has been
75+ modified in places.
76+
77+ #### Text Documents
78+
79+ | Type | Preferred Formats | Non-Preferred Formats |
80+ | ---------------------------| -------------------------------------------------------------------------------------------------| -----------------------------|
81+ | **Text documents** | PDF/A (.pdf), ODT (.odt), Microsoft Word (.doc), Office Open XML (.docx), Rich Text Format (.rtf) | PDF other than PDF/A (.pdf) |
82+ | **Plain text** | Unicode text (.txt) | Non-Unicode text (.txt) |
83+ | **Markup languages** | XML (.xml), HTML (.html) + related (.css, .xslt, .js, .es), Markdown (.md) | — |
84+ | **Programming languages** | MATLAB, NetCDF, Text-Fabric, Python | — |
85+
86+ #### Spreadsheets
87+
88+ | Preferred | Non-Preferred |
89+ | ------------------------| ---------------------------------------------------------------|
90+ | ODS (.ods), CSV (.csv) | Microsoft Excel (.xls), Office Open XML (.xlsx), PDF/A (.pdf) |
91+
92+ #### Databases
93+
94+ | Preferred | Non-Preferred |
95+ | ----------------------------------------| ------------------------------------------------------------------------|
96+ | SQL (.sql), SIARD (.siard), CSV (.csv) | Microsoft Access (.mdb, .accdb), dBase (.dbf), HDF5 (.hdf5, .he5, .h5) |
97+
98+ #### Statistical Data
99+
100+ | Preferred | Non-Preferred |
101+ | ----------------------------------------------------------| ----------------------------------------------------------------------------------------|
102+ | SPSS (.dat/.sps), STATA (.dat/.do), JASP (.csv/.html), R | SPSS Portable (.por), SPSS (.sav), STATA (.dta), SAS (.7dat, .sd2, .tpt), JASP (.jasp) |
103+
104+ #### Raster Images
105+
106+ | Preferred | Non-Preferred |
107+ | ------------------------------------------------------------------------------------| ---------------|
108+ | JPEG (.jpg, .jpeg), TIFF (.tif, .tiff), PNG (.png), JPEG 2000 (.jp2), DICOM (.dcm) | — |
109+
110+ #### Vector Images
111+
112+ | Preferred | Non-Preferred |
113+ | ------------| -----------------------------------------------------------------------|
114+ | SVG (.svg) | Adobe Illustrator (.ai), EPS (.eps), WMF/EMF (.wmf, .emf), CDR (.cdr) |
115+
116+ #### Audio
117+
118+ | Preferred | Non-Preferred |
119+ | --------------------------------------------------------------------------| --------------------------------------------------------------|
120+ | BWF (.bwf), MXF (.mxf), Matroska (.mka), FLAC (.flac), OPUS, WAVE (.wav) | MP3 (.mp3), AAC (.aac, .m4a), AIFF (.aif, .aiff), OGG (.ogg) |
121+
122+ #### Video
123+
124+ | Preferred | Non-Preferred |
125+ | -------------------------------------------------------------------------------------------------| -----------------------------------|
126+ | MXF (.mxf), Matroska (.mkv), MPEG-4 (.mp4, .m4a, .m4v, …), MPEG-2 (.mpg, .mpeg, .m2v, .mpg2, …) | AVI (.avi), QuickTime (.mov, .qt) |
127+
128+ #### CAD (Computer-Aided Design)
129+
130+ | Preferred | Non-Preferred |
131+ | --------------------------------------------| ------------------------------------------------------|
132+ | AutoCAD DXF R12 (ASCII) (.dxf), SVG (.svg) | AutoCAD DXF (other versions), DWG (.dwg), DGN (.dgn) |
133+
134+ #### GIS (Geographical Information Systems)
135+
136+ | Preferred | Non-Preferred |
137+ | ----------------------------------------------------------------------| -----------------------------------------------------------------------------------------------------------------------------------------|
138+ | GML (.gml), MIF/MID (.mif/.mid), GeoJSON (.json), GeoPackage (.gpkg) | Esri Shapefiles (.shp + related), MapInfo (.tab + related), KML (.kml, .kmz), Esri Geodatabase (.gdb), Project files (.mxd, .wor, .qgs) |
139+
140+ #### Georeferenced Images
141+
142+ | Preferred | Non-Preferred |
143+ | -------------------------------------------------------------------------------------| ----------------------|
144+ | GeoTIFF (.tif, .tiff), TIFF World File (.tfw + .tif), JPEG World File (.jgw + .jpg) | ERDAS IMAGINE (.img) |
145+
146+ #### Raster GIS
147+
148+ | Preferred | Non-Preferred |
149+ | -------------------------| ------------------------------------------------------------------|
150+ | ASCII GRID (.asc, .txt) | Esri GRID (.grd), Surfer Grid (.grd, .srf), ERDAS IMAGINE (.img) |
151+
152+ #### 3D
153+
154+ | Preferred | Non-Preferred |
155+ | ----------------------------------------------------------------------------------------------------------| -------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
156+ | OBJ (.obj), PLY (.ply), X3D (.x3d), glTF 2.0 (.gltf, .glb), COLLADA (.dae), LAS (.las, .laz), IFC (.ifc) | Autodesk FBX (.fbx), Blender (.blend), glTF 1.0, 3D PDF (.pdf), Google Draco (.drc), Artec (.a3d), Agisoft Metashape (.psx, .psz), STL (.stl), VRML (.wrl, .wrz, .vrml) |
157+
158+ #### RDF
159+
160+ | Preferred | Non-Preferred |
161+ | ---------------------------------------------------------------| ---------------|
162+ | RDF/XML, Trig (.trig), Turtle (.ttl), NTriples (.nt), JSON-LD | — |
163+
164+ #### CAQDAS (Qualitative Data)
165+
166+ | Preferred | Non-Preferred |
167+ | -----------| ------------------------------------------|
168+ | REFI-QDA | ATLAS.TI copy bundle, NVivo project file |
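
Before uploading, it can help to check which files in a dataset fall outside the preferred formats. The following is a minimal sketch, not part of CORA.RDR or DANS tooling: the names `PREFERRED`, `NON_PREFERRED`, `classify`, and `review` are hypothetical, and the extension sets cover only a small, unambiguous subset of the tables above. Some cases cannot be decided by extension alone (for example, `.pdf` is preferred only when the file is PDF/A), so treat the result as a hint rather than a verdict.

```python
from pathlib import PurePath

# Illustrative subset of the tables above (spreadsheets, vector images, audio).
# This is an assumption for the example, not the complete DANS list.
PREFERRED = {".ods", ".csv", ".svg", ".flac", ".wav"}
NON_PREFERRED = {".xls", ".xlsx", ".ai", ".eps", ".mp3"}


def classify(filename: str) -> str:
    """Return 'preferred', 'non-preferred', or 'unknown' from the file extension."""
    ext = PurePath(filename).suffix.lower()  # case-insensitive comparison
    if ext in PREFERRED:
        return "preferred"
    if ext in NON_PREFERRED:
        return "non-preferred"
    return "unknown"


def review(filenames):
    """Group file names by their classification."""
    report = {"preferred": [], "non-preferred": [], "unknown": []}
    for name in filenames:
        report[classify(name)].append(name)
    return report
```

Calling `review(["survey.csv", "survey.xlsx"])` would list the Excel file under `"non-preferred"`, which is a prompt to deposit a preferred copy alongside the original, as described in the general guidelines.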
16169