
FederatedMethods/five-safes-mapping


Five safes and RO-Crates in DataSHIELD

Background

Five Safes Framework

The five safes framework is a conceptual framework for data access and sharing built on five principles: Safe Projects, Safe People, Safe Settings, Safe Data, and Safe Outputs. It is designed to ensure that data is used responsibly and ethically while maximizing its utility for research and analysis. By ensuring each safe is appropriately managed, pragmatic decisions can be made about mitigating the risks associated with data sharing and access. The goal isn't to maximise the controls in each safe, but to ensure that the controls are appropriate for the risks associated with the data and the intended use. This may mean, for example, that very strict controls in one safe allow lighter controls in another.

RO-Crates

RO-Crates are a way to package and share research data and metadata in a standardized format. They provide a structured way to organize data, code, and documentation, making it easier to share and reuse research outputs. RO-Crates are designed to be machine-readable and human-readable, ensuring that the data can be easily understood and used by others.
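To make the "machine-readable" part concrete, here is a minimal sketch of the metadata document that sits at the root of an RO-Crate, written in Python purely for illustration. It follows the RO-Crate 1.1 layout (a metadata descriptor plus a root dataset in a JSON-LD `@graph`); the function name `minimal_ro_crate` and all example values are hypothetical and are not part of this repository.

```python
import json


def minimal_ro_crate(name: str, description: str, date_published: str, license_id: str) -> dict:
    """Build a minimal RO-Crate 1.1 metadata document as a plain dict."""
    return {
        "@context": "https://w3id.org/ro/crate/1.1/context",
        "@graph": [
            {
                # Metadata descriptor: says which entity is the crate root.
                "@id": "ro-crate-metadata.json",
                "@type": "CreativeWork",
                "conformsTo": {"@id": "https://w3id.org/ro/crate/1.1"},
                "about": {"@id": "./"},
            },
            {
                # Root dataset: describes the crate as a whole.
                "@id": "./",
                "@type": "Dataset",
                "name": name,
                "description": description,
                "datePublished": date_published,
                "license": {"@id": license_id},
                "hasPart": [],
            },
        ],
    }


# Hypothetical example values, for illustration only.
crate = minimal_ro_crate(
    name="Example DataSHIELD analysis",
    description="Illustrative crate describing one analysis run.",
    date_published="2024-01-01",
    license_id="https://creativecommons.org/licenses/by/4.0/",
)
crate_json = json.dumps(crate, indent=2)
```

Writing `crate_json` to a file named `ro-crate-metadata.json` in a directory is what turns that directory into a crate. Profiles such as the five safes one add further entities and requirements on top of this base.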

DataSHIELD

DataSHIELD has multiple components which contribute controls to the five safes framework, but these controls are not coordinated, nor are they expressed in the language of the five safes framework.

5 Safes RO-Crates

There is a five safes RO-Crates profile which was partly developed as part of TRE-FX. It brings together both five safes and workflows into one entity.

Use cases

To assess the fit of 5S RO-Crates with DataSHIELD, and to decide whether we need to modify the profile or develop a new DataSHIELD profile which inherits from it (or the inverse), we need to specify our use cases. These are some examples of what RO-Crates could be used for in DataSHIELD:

Intra-TRE audit/reporting

A TRE acting in isolation, or as part of a federated network, may want to audit or report on its own use of DataSHIELD. This could include information about the projects, people, settings, data, and outputs derived from DataSHIELD within the TRE. This could be presented in an easy-to-understand dashboard that summarises the TRE's DataSHIELD activities.

Inter-TRE audit

Where a TRE is in a federated network, there is an agreed degree of trust in each TRE to ensure that data sent between them is as expected. In the context of DataSHIELD the assumption is that correct statistical disclosure control (SDC) has been applied before data is sent to another TRE. This is difficult to verify, e.g. how would TRE 1 know that TRE 2 has applied the correct SDC? We could package information about the SDC applied, e.g. the disclosure thresholds, so that each TRE has a record of what has happened to the data before it receives it, allowing post hoc audit.
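As a rough illustration of what such a packaged SDC record might contain, the sketch below serialises the sending TRE, the function called, and the thresholds in force into a single JSON line that the receiving TRE could store. The class `SdcRecord` and its fields are hypothetical. `nfilter.tab` and `nfilter.subset` are existing DataSHIELD option names, but the values shown are illustrative only and not a statement of any TRE's configuration.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class SdcRecord:
    """Hypothetical record of the SDC settings applied before data left a TRE."""

    source_tre: str
    function_name: str
    thresholds: dict = field(default_factory=dict)

    def to_json(self) -> str:
        # sort_keys gives a stable representation that is easy to diff or hash.
        return json.dumps(asdict(self), sort_keys=True)


# Illustrative values only.
sdc_record = SdcRecord(
    source_tre="TRE 2",
    function_name="ds.mean",
    thresholds={"nfilter.tab": 3, "nfilter.subset": 3},
)
audit_line = sdc_record.to_json()
```

Appending one such line per transfer to a log gives the receiving TRE something to audit after the fact, without changing what is accepted at the time.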

Inter-TRE actionable decision making

This is the same scenario as above, except that instead of post hoc auditing, the information about the five safes is used to make real-time decisions about whether to accept data from another TRE. This could be used to ensure that the data meets the required standards for disclosure control before it is accepted into the TRE.
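One way such a real-time decision could work is to compare the thresholds a sending TRE declares against the receiving TRE's own minimums. The sketch below is an assumption about how that check might look, not an existing DataSHIELD feature. The function `check_thresholds` is hypothetical, and it assumes that a higher value always means a stricter control, which will not hold for every disclosure setting.

```python
def check_thresholds(declared: dict, required_minimums: dict) -> list:
    """Return the reasons to reject incoming data; an empty list means accept."""
    problems = []
    for option, minimum in required_minimums.items():
        if option not in declared:
            # A missing declaration is treated as a failure, not as a pass.
            problems.append(f"{option}: not declared")
        elif declared[option] < minimum:
            problems.append(
                f"{option}: declared {declared[option]} is below required {minimum}"
            )
    return problems


# Illustrative policy and incoming metadata.
local_policy = {"nfilter.tab": 5}
incoming = {"nfilter.tab": 3}

problems = check_thresholds(incoming, local_policy)
accept = not problems
```

Returning the list of reasons rather than a bare true/false means the same function can feed both the automated decision and the audit record.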

SDC output documentation

DataSHIELD may be set up in an environment where the results of analyses by the client software are required to have manual SDC carried out on them before they can leave the network. We could package the information about the methods used in the analysis and the relevant thresholds for SDC along with the result requested out of the network. This would give the manual SDC process an audit trail of what was done and would act as a decision support tool to help understand the risks associated with the output.

Reproducibility

To enable an analysis to be reproduced at a later date, it is important to have a record of the data, code, and methods used in the analysis. We could package this up in an RO-Crate.
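A small part of that record could be a manifest of checksums for the files that go into the crate, so that a later reproduction can confirm it is using the same code and inputs. The sketch below is one possible approach, not something this repository implements. The helper names `sha256_of` and `build_manifest` and the file `analysis.R` are hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict:
    """Map each file under root (as a relative POSIX path) to its SHA-256."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


# Demonstration on a temporary directory holding one hypothetical script.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "analysis.R").write_bytes(b"# hypothetical analysis script\n")
    manifest = build_manifest(root)
```

In a federated setting the data itself stays inside each TRE, so a manifest like this would most naturally cover the client-side code and configuration rather than the underlying datasets.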

Development plan

Five safes mapping

In most of the use cases above there is a requirement to have information mapped to the five safes framework. This is where we should start. Assuming we use Opal, we need to understand how we can populate the five safes information from existing library and API calls. five_safes_mapping.R is a first attempt at this. Using opalr, DSI, and the Opal API we can begin this mapping. We won't worry about formatting it as an RO-Crate for now.
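Before any RO-Crate formatting, the mapping can be thought of as a record with one slot per safe, filled in from whatever the server and client can report. The sketch below shows that shape in Python for illustration only. It does not reflect the contents of five_safes_mapping.R, and the class `FiveSafesRecord`, its field contents, and the example values are all hypothetical.

```python
from dataclasses import dataclass, field, fields


@dataclass
class FiveSafesRecord:
    """Hypothetical container with one slot per safe."""

    safe_projects: dict = field(default_factory=dict)
    safe_people: dict = field(default_factory=dict)
    safe_settings: dict = field(default_factory=dict)
    safe_data: dict = field(default_factory=dict)
    safe_outputs: dict = field(default_factory=dict)

    def missing_safes(self) -> list:
        """List the safes for which nothing has been collected yet."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]


# Illustrative values: only two of the five safes are populated so far.
mapping = FiveSafesRecord(
    safe_people={"user": "analyst1"},
    safe_data={"tables": ["project.table"]},
)
gaps = mapping.missing_safes()
```

A gap list like this is a simple way to track which safes still have no source of information as the mapping is extended.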

ACTION ALL: update five_safes_mapping.R to include more information relevant to the five safes framework.

There is likely other information which we would REQUIRE to include.

ACTION ALL: Think about other information which we would REQUIRE to include in the five safes mapping.

Cre8or outputs an RO-Crate with lots of upstream information which would likely be useful in our five safes mapping. We should work through an example to see how it maps.

ACTION RVD: Get an example output from cre8tor from Mike.

RO-Crate engine location

Something is going to have to collate the information for the RO-Crate. Where this sits and how it is invoked needs to be decided. It might be that it sits next to DSI. It might be invoked once at the end of an analysis or it might be invoked on every iteration of an analysis.
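The two invocation options can be compared with a small sketch: a collector that either emits a bundle on every recorded call, or holds everything until it is flushed at the end of the analysis. This is only a way of framing the design question, not a proposed implementation. The class `CrateCollector` and the call details are hypothetical.

```python
class CrateCollector:
    """Hypothetical collector that gathers call records for later packaging."""

    def __init__(self, emit_every_call: bool = False):
        self.emit_every_call = emit_every_call
        self.pending = []   # records not yet packaged
        self.emitted = []   # each item stands in for one packaged crate

    def record(self, call_name: str, details: dict) -> None:
        self.pending.append({"call": call_name, **details})
        if self.emit_every_call:
            self.flush()

    def flush(self) -> None:
        """Package everything pending into one bundle, if there is anything."""
        if self.pending:
            self.emitted.append(list(self.pending))
            self.pending.clear()


# Option 1: invoked once at the end of an analysis.
per_analysis = CrateCollector()
per_analysis.record("ds.mean", {"variable": "D$age"})
per_analysis.record("ds.mean", {"variable": "D$bmi"})
per_analysis.flush()

# Option 2: invoked on every iteration.
per_call = CrateCollector(emit_every_call=True)
per_call.record("ds.mean", {"variable": "D$age"})
per_call.record("ds.mean", {"variable": "D$bmi"})
```

The first option yields one bundle describing the whole analysis. The second yields one bundle per call, which is closer to what real-time decision making would need but produces many more crates.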

ACTION ALL: Think about where the RO-Crate engine should sit and how it is invoked.

Scope of work

Let's start with a simple DataSHIELD function, ds.mean, and work outwards from there.
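To show the kind of information a single call could contribute, here is a generic threshold-guarded mean written in Python. It is not DataSHIELD's implementation of ds.mean. It is only an analogue that returns both a result and a small record of the check applied, which is the sort of detail the five safes mapping would want to capture. The function `guarded_mean` and the parameter `min_valid` are hypothetical.

```python
import math


def guarded_mean(values, min_valid: int):
    """Return (mean, record); the mean is withheld if too few valid values exist."""
    valid = [v for v in values if v is not None and not math.isnan(v)]
    record = {"function": "mean", "n_valid": len(valid), "min_valid": min_valid}
    if len(valid) < min_valid:
        # Too few observations: release nothing, but still record why.
        record["released"] = False
        return None, record
    record["released"] = True
    return sum(valid) / len(valid), record


# Enough valid observations: the mean is released.
result, result_record = guarded_mean([1.0, 2.0, None, 3.0], min_valid=3)

# Too few valid observations: the mean is withheld.
blocked, blocked_record = guarded_mean([1.0, None], min_valid=3)
```

Even for a blocked call the record is still produced, so an audit trail can show what was attempted as well as what was released.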

Relevant R Packages

This package contains functions to pack DataSHIELD analyses into RO-Crates.

This package contains functions to generate RO-Crates, including the 5s-crates profile.

Opal R Client for the Opal data warehouse. Most of the web services of Opal can be reached by an opalr function: import/export, data dictionaries, projects, tables, resources, permissions, users, DataSHIELD profiles etc.

Armadillo implementation of DSI to be DataSHIELD ready, part of the MOLGENIS suite.
