|
| 1 | +# About Pacifica |
| 2 | + |
| 3 | +## History |
| 4 | + |
| 5 | +Pacifica was created from a collaboration in the |
| 6 | +[United States National Laboratory](https://en.wikipedia.org/wiki/United_States_Department_of_Energy_national_laboratories) |
| 7 | +system. The Environmental Molecular Sciences Laboratory ([EMSL](https://www.emsl.pnnl.gov)) |
| 8 | +collaborated with its host institution Pacific Northwest National Laboratory |
| 9 | +([PNNL](https://www.pnnl.gov)) to help manage the scientific data generated |
| 10 | +by researchers. |
| 11 | + |
| 12 | +EMSL is a [user facility](https://www.energy.gov/science/science-innovation/office-science-user-facilities) |
| 13 | +sponsored by the Department of Energy ([DOE](https://www.energy.gov)) |
| 14 | +[Office of Science](https://www.energy.gov/science) Biological and
| 15 | +Environmental Research ([BER](https://www.energy.gov/science/ber/biological-and-environmental-research)) program.
| 16 | + |
| 17 | +Pacifica was created by EMSL and PNNL in response to the
| 18 | +[DOE Data Management Policy](https://www.energy.gov/datamanagement/doe-policy-digital-research-data-management)
| 19 | +to manage open research data and make it more transparent. EMSL and
| 20 | +PNNL face similar administrative challenges: a single
| 21 | +project cannot shoulder the burden of meeting the policy on its
| 22 | +own. This brought EMSL and PNNL to the table to collaborate on
| 23 | +developing an institutional data management system for open
| 24 | +scientific research.
| 25 | + |
| 26 | +## The Data Management Challenge |
| 27 | + |
| 28 | +The challenge lies at the intersection of a large institution's
| 29 | +operating model and the data policies the institution is required
| 30 | +to follow.
| 31 | + |
| 32 | +### DOE Data Management Policy |
| 33 | + |
| 34 | +The DOE Data Management Policy covers a lot of ground, so we highlight
| 35 | +only the guiding principles here.
| 36 | + |
| 37 | +#### Effective Data Management |
| 38 | + |
| 39 | +``` |
| 40 | + 1. Effective data management has the potential to increase the pace of |
| 41 | + scientific discovery and promote more efficient and effective use of |
| 42 | + government funding and resources. Data management planning should be |
| 43 | + an integral part of research planning. |
| 44 | +``` |
| 45 | + |
| 46 | +Pacifica addresses these challenges by offering an effective mechanism |
| 47 | +for data sharing, thus maximizing the impact of research across all |
| 48 | +participating organizations. |
| 49 | + |
| 50 | +#### Sharing Proves Integrity |
| 51 | + |
| 52 | +``` |
| 53 | + 2. Sharing and preserving data are central to protecting the integrity of |
| 54 | + science by facilitating validation of results and to advancing science |
| 55 | + by broadening the value of research data to disciplines other than the |
| 56 | + originating one and to society at large. To the greatest extent, with |
| 57 | + the fewest constraints possible, and consistent with the requirements |
| 58 | + and other principles stated in this document, data sharing should make |
| 59 | + digital research data available to and useful for the scientific |
| 60 | + community, industry, and the public. |
| 61 | +``` |
| 62 | + |
| 63 | +Sharing scientific data with the public demonstrates the integrity of the
| 64 | +scientific work being done. This depends on also sharing the software
| 65 | +system that manages the data under an open source model. Without an open
| 66 | +source software model, the integrity of the data and the science could be suspect.
| 67 | +This is why Pacifica is an open source software project.
| 68 | + |
| 69 | +#### Preserve What You Can |
| 70 | + |
| 71 | +``` |
| 72 | + 3. Not all data need to be shared or preserved. The costs and benefits of |
| 73 | + doing so should be considered in data management planning. |
| 74 | +``` |
| 75 | + |
| 76 | +The task of sharing and preserving all data generated by open science is
| 77 | +costly, especially for small projects. The overhead of adhering to
| 78 | +the policy must be borne by the supporting institution.
| 79 | + |
| 80 | +### The Operating Model |
| 81 | + |
| 82 | +Research institutions allocate and spend the funds they receive in different
| 83 | +ways. Whatever that model is, data are created at every level of the
| 84 | +model. Hence, it is important to capture the funding model and its
| 85 | +relationships to the data internally. Pacifica can fill that requirement.
| 86 | + |
| 87 | +However, the operating model does not reflect the evolution of scientific
| 88 | +data. To understand what data were created within the spending
| 89 | +scope, including related projects and sub-tasks, a system needs to be
| 90 | +integrated into the project from start to finish.
| 91 | + |
| 92 | +### The Scientific Data Life-Cycle |
| 93 | + |
| 94 | +The scientific data life-cycle maps more closely to a tree or bush. The
| 95 | +start of the tree extends back in time. Each branch of the tree is a piece
| 96 | +of scientific data critical to supporting the science being done today. The
| 97 | +leaves of a branch represent the cutting edge of science. Some leaves
| 98 | +prove to be sturdier than others and go on to grow into their own
| 99 | +branches. The most important part of the analogy is that the tree did
| 100 | +start from somewhere, and we know a little about the immediate future.
| 101 | +However, the sky is the limit and we do not know how big the tree will
| 102 | +grow. So, we need to preserve the knowledge and data generated in the past
| 103 | +to keep the tree strong and healthy.