`content/_index.md` (42 additions, 6 deletions)

---
title: "The CodeMeta Project"
layout: landing
---
## Introducing CodeMeta

Software development creates change, while scientific research requires reproducibility. Since software is routinely used in research projects, this tension can cause problems.

_**CodeMeta acts as a vital bridge between these disciplines.**_

When used and maintained in code repositories and software distribution systems, CodeMeta enables the exact version of a library or application that has been used and cited in scientific and other research to be reliably identified and reused.

Created in 2016 by a consortium of researchers using existing technologies such as Schema.org terms and JSON-LD, CodeMeta is now a recognised framework employed by a worldwide community of developers, researchers, and archivists.
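
To make this concrete, here is a rough sketch of what a `codemeta.json` file (typically kept at the top level of a repository) might look like. The package name, repository URL, and DOI below are invented for illustration; the terms themselves (`name`, `version`, `identifier`, `codeRepository`, `author`, and so on) come from Schema.org and the CodeMeta vocabulary, expressed as JSON-LD.

```json
{
  "@context": "https://doi.org/10.5063/schema/codemeta-2.0",
  "@type": "SoftwareSourceCode",
  "name": "exampletool",
  "description": "A hypothetical analysis package, used here only for illustration.",
  "version": "1.4.2",
  "identifier": "https://doi.org/10.5281/zenodo.0000000",
  "codeRepository": "https://github.com/example/exampletool",
  "programmingLanguage": "Python",
  "license": "https://spdx.org/licenses/MIT",
  "author": [
    {
      "@type": "Person",
      "givenName": "Ada",
      "familyName": "Example"
    }
  ]
}
```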
Research relies heavily on scientific software, and a large and growing fraction of researchers are engaged in developing software as part of their own research ([Hannay et al 2009](https://doi.org/10.1109/SECSE.2009.5069155 "How do scientists develop and use scientific software?")). Despite this, _infrastructure to support the preservation, discovery, reuse, and attribution of software_ lags substantially behind that of other research products such as journal articles and research data. This lag is driven not so much by a lack of technology as by a lack of unity: existing mechanisms to archive, document, index, share, discover, and cite software contributions are heterogeneous across both disciplines and archives and rarely meet best practices ([Howison 2015](https://doi.org/10.1002/asi.23538 "Software in the scientific literature: Problems with seeing, finding, and using software mentioned in the biology literature")). Fortunately, a rapidly growing movement to improve preservation, discovery, reuse and attribution of academic software is now underway: a recent [NIH report](http://softwarediscoveryindex.org), conferences and working groups of [FORCE11](https://www.force11.org/), [WSSSPE](http://wssspe.researchcomputing.org.uk/) & the [Software Sustainability Institute](http://www.software.ac.uk/), and the rising adoption of repositories like [GitHub](https://github.com), [Zenodo](https://zenodo.org), [figshare](https://figshare.com) & [DataONE](https://www.dataone.org) by academic software developers. Now is the time to improve how these resources can talk to each other.

## What can software metadata do for you?

What metadata you want from software is determined by your use case. If your primary concern is credit for academic software, then you're most interested in _citation_ metadata. If you're trying to replicate some analysis, you worry more about versions and dependencies than about authors and titles. And if you seek to discover software you don't already know about that is suitable for a particular task, then you are more interested in keywords and descriptions. Frequently, developers of scientific software, repositories that host that software, and users themselves are interested in more than one of these objectives, and others besides.
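
As a rough illustration, one possible way to slice the same CodeMeta properties by audience might be the following; this grouping is illustrative rather than an official classification:

```json
{
  "citation": ["name", "author", "identifier", "version", "datePublished"],
  "reproducibility": ["version", "programmingLanguage", "softwareRequirements", "operatingSystem"],
  "discovery": ["description", "keywords", "applicationCategory"]
}
```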
Different software repositories, software languages and scientific domains denote this information in different ways, which makes it difficult or impossible for tools to work across these different sources without losing valuable information along the way. For instance, a fantastic collaboration between GitHub and figshare gives researchers a way to import software from the former into the persistent archive of the latter, obtaining a permanent identifier, a DOI, in the process. To assign a DOI, figshare must then pass metadata about the object to DataCite, the central DOI provider for all repositories. While this makes DataCite a powerful aggregator, the lack of a crosswalk table means that much valuable metadata is currently lost along the way, such as the original software license, platform, and so forth. Any tool or approach working across software repositories faces similar challenges without a crosswalk table to translate between these sources.
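
The crosswalk idea can be sketched as a mapping from a CodeMeta term to the field that carries the same information in other ecosystems. The fragment below is written for illustration here, not quoted from the official crosswalk table:

```json
{
  "license": {
    "npm package.json": "license",
    "R DESCRIPTION": "License",
    "Python (pyproject/setup)": "license",
    "DataCite": "rights / rightsURI"
  },
  "version": {
    "npm package.json": "version",
    "R DESCRIPTION": "Version",
    "Python (pyproject/setup)": "version",
    "DataCite": "version"
  }
}
```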
For more detail, [visit the project on GitHub](https://github.com/codemeta/codemeta) or check back here soon.