FAIReSheets converts the FAIR eDNA (FAIRe) data checklist to the Ocean DNA Explorer (ODE) input format. This code generates the metadata templates, which, once filled with data, are ready for ODE submission. Additionally, once your data is in ODE format you can use edna2obis, for GBIF and OBIS submission.
TLDR: Email bayden.willms@noaa.gov to be added to the user list and receive the link to the credentials file, create a blank Google Sheet, configure the .env file with your Google Sheet ID and the Git Gist URL, specify your parameters in config.yaml and NOAA_config.yaml, run FAIReSheets and follow the authentication workflow.
Before using FAIReSheets, you'll need to request access. This only needs to happen once:
- Request Access:
- Email bayden.willms@noaa.gov with the subject "FAIReSheets Access Request".
- Include the email address associated with your Google account.
- Receive Credentials:
- Once approved, you'll receive an email with a link to a private Git Gist. This Gist contains the
client_secrets.jsonandtoken.jsonfiles needed for authentication.
- Once approved, you'll receive an email with a link to a private Git Gist. This Gist contains the
- Clone the Repository:
git clone https://github.com/aomlomics/FAIRe2ODE.git cd FAIRe2ODE - Set up the Environment:
- Install Conda if you don't have it already.
- Create and activate the Conda environment:
conda env create -f environment.yml conda activate FAIRe
- Download Credentials:
- Download the
client_secrets.jsonandtoken.jsonfiles from the Git Gist and place them in the root of the FAIReSheets directory.
- Download the
- Create
.envfile:- In the FAIReSheets directory, create a
.envfile. You can do this by renaming theexample.envfile to.env.
- In the FAIReSheets directory, create a
- Configure
.envfile:- Open the
.envfile and add the following, replacing the placeholder text with your actual information:SPREADSHEET_ID=your_spreadsheet_id_here GIST_URL=your_gist_url_here SPREADSHEET_ID: This is the ID of the Google Sheet you want to populate. You can find it in the URL of your Google Sheet:https://docs.google.com/spreadsheets/d/SPREADSHEET_ID/edit.GIST_URL: The GIST_URL will be sent to you when you've been granted access to FAIReSheets (see first section).
- Open the
- Configure
config.yamlandNOAA_config.yaml:- Open
config.yamlandNOAA_config.yamlto set your project-specific parameters. - Comments in the files explain what each parameter does.
- Open
Run FAIReSheets from the root project directory using:
python run.pyThis will:
- Generate the FAIRe eDNA data checklist in your specified Google Sheet.
- If
run_noaa_formattingistrueinNOAA_config.yaml, it will then format the sheet for the Ocean DNA Explorer.
When you run FAIReSheets for the first time, the following will happen:
- A browser window will open, prompting you to log in to your Google account.
- Important: Use the same Google account you provided when requesting access.
- After logging in, you'll be asked to grant permission to FAIReSheets to access your Google Sheets
- Once you grant permission, a
token.jsonfile will be created in the project directory. This file stores your authentication token, so you won't have to log in every time you run the tool.
- Problem: "User is not on the approved list"
- Solution: You need to be on the approved users list to run this tool. Email bayden.willms@noaa.gov to request access.
- Problem: Authentication errors (e.g., "invalid_grant")
- Solution: Delete the
token.jsonfile and run the tool again. This will re-trigger the authentication process. - Solution: Make sure you're using the Google account email you provided when requesting access, and that you checked the boxes to allow FAIReSheets to edit Google Sheets in your Google Drive.
- Solution: Delete the
- Problem: Errors when running FAIReSheets
- Solution: Make sure the Google Sheet that FAIReSheets is editing is empty. You can use Google Drive's built in Restore History button before running FAIReSheets again, or, make a new Google Sheet and replace the Spreadsheet ID in the
.envfile.
- Solution: Make sure the Google Sheet that FAIReSheets is editing is empty. You can use Google Drive's built in Restore History button before running FAIReSheets again, or, make a new Google Sheet and replace the Spreadsheet ID in the
- Problem: Missing
client_secrets.jsonortoken.json- Solution: Make sure you've downloaded these files from the Git Gist and placed them in the root of the project directory.
- Problem:
config.yamlorNOAA_config.yamlerrors
To automatically track modifications in your Google Sheet, you can use the following Apps Script. This script updates the modification timestamp and user email in the README sheet whenever an edit is made to any sheet except "README" and "Drop-down values".
- Open your Google Sheet.
- Click on
Extensionsin the menu, then selectApps Script. - Delete any code in the script editor and copy-paste the following code:
function onEdit(e) {
// Get the edited range and sheet
var range = e.range;
var sheet = range.getSheet();
var spreadsheet = sheet.getParent();
var sheetName = sheet.getName();
// First handle the timestamp update (original functionality)
// Skip README and Drop-down values sheets for timestamp update
if (sheetName !== "README" && sheetName !== "Drop-down values") {
var readmeSheet = spreadsheet.getSheetByName("README");
if (readmeSheet) {
// Find the Sheets in this Google Sheet section
var data = readmeSheet.getDataRange().getValues();
var timestampRowStart = -1;
for (var i = 0; i < data.length; i++) {
if (data[i][0] === "Sheets in this Google Sheet:") {
timestampRowStart = i + 2; // +2 to skip the header row
break;
}
}
if (timestampRowStart !== -1) {
// Find the row for the current sheet
var sheetRow = -1;
for (var i = timestampRowStart; i < data.length; i++) {
if (data[i][0] === sheetName) {
sheetRow = i + 1; // +1 because arrays are 0-indexed but sheets are 1-indexed
break;
}
}
if (sheetRow !== -1) {
// Get current time in ISO format
var now = new Date();
var timestamp = now.toISOString();
// Update the timestamp and email
readmeSheet.getRange(sheetRow, 2).setValue(timestamp);
readmeSheet.getRange(sheetRow, 3).setValue(Session.getActiveUser().getEmail());
}
}
}
}
// Then handle the validation checks (new functionality)
// Skip validation for README and Drop-down values sheets
if (sheetName === "README" || sheetName === "Drop-down values") {
return;
}
// Get all sheets
var projectMetadataSheet = spreadsheet.getSheetByName("projectMetadata");
var analysisMetadataSheets = spreadsheet.getSheets().filter(s =>
s.getName().startsWith("analysisMetadata_"));
if (!projectMetadataSheet || analysisMetadataSheets.length === 0) {
return; // Exit if required sheets don't exist
}
// Get project_id from projectMetadata
var projectIdCell = findCellByValue(projectMetadataSheet, "project_id");
if (!projectIdCell) return;
var projectId = projectMetadataSheet.getRange(projectIdCell.row, projectIdCell.col + 1).getValue();
// Get assay_name values from projectMetadata
var assayNameCell = findCellByValue(projectMetadataSheet, "assay_name");
if (!assayNameCell) return;
var assayNames = projectMetadataSheet.getRange(assayNameCell.row, assayNameCell.col + 1)
.getValue()
.split("|")
.map(name => name.trim());
// Clear only error-related formatting
clearErrorFormatting(spreadsheet);
// Verify project_id in all analysisMetadata sheets
var hasProjectIdError = false;
analysisMetadataSheets.forEach(analysisSheet => {
var analysisProjectIdCell = findCellByValue(analysisSheet, "project_id");
if (analysisProjectIdCell) {
var analysisProjectId = analysisSheet.getRange(analysisProjectIdCell.row, analysisProjectIdCell.col + 1).getValue();
if (analysisProjectId !== projectId) {
addErrorFormatting(analysisSheet, analysisProjectIdCell.row, analysisProjectIdCell.col + 1,
"Project ID must match the one in projectMetadata sheet");
hasProjectIdError = true;
}
}
});
// Verify assay_name values
var hasAssayNameError = false;
var foundAssayNames = new Set();
analysisMetadataSheets.forEach(analysisSheet => {
var analysisAssayNameCell = findCellByValue(analysisSheet, "assay_name");
if (analysisAssayNameCell) {
var analysisAssayName = analysisSheet.getRange(analysisAssayNameCell.row, analysisAssayNameCell.col + 1).getValue();
if (!assayNames.includes(analysisAssayName)) {
addErrorFormatting(analysisSheet, analysisAssayNameCell.row, analysisAssayNameCell.col + 1,
"Assay name must match one of the values in projectMetadata sheet");
hasAssayNameError = true;
} else {
foundAssayNames.add(analysisAssayName);
}
}
});
// Check if all assay names from projectMetadata are used
assayNames.forEach(assayName => {
if (!foundAssayNames.has(assayName)) {
addErrorFormatting(projectMetadataSheet, assayNameCell.row, assayNameCell.col + 1,
"Each assay name must be used in at least one analysisMetadata sheet");
hasAssayNameError = true;
}
});
// Verify analysis_run_name uniqueness within each sheet
analysisMetadataSheets.forEach(analysisSheet => {
var analysisRunNameCell = findCellByValue(analysisSheet, "analysis_run_name");
if (analysisRunNameCell) {
// Get all values in the column
var data = analysisSheet.getDataRange().getValues();
var analysisRunNameCol = analysisRunNameCell.col - 1; // Convert to 0-based index
var analysisRunNames = new Set();
// Check each row in the column
for (var i = 0; i < data.length; i++) {
var value = data[i][analysisRunNameCol];
if (value && value.trim() !== "") { // Only check non-empty values
if (analysisRunNames.has(value)) {
// Found a duplicate
addErrorFormatting(analysisSheet, i + 1, analysisRunNameCell.col,
"Duplicate analysis_run_name found in this sheet");
} else {
analysisRunNames.add(value);
}
}
}
}
});
var font = "Arial"; // Set your desired font
var fontSize = 10; // Set your desired font size
range.setFontFamily(font);
range.setFontSize(fontSize);
}
// Helper function to find a cell containing a specific value
function findCellByValue(sheet, searchValue) {
var data = sheet.getDataRange().getValues();
for (var i = 0; i < data.length; i++) {
for (var j = 0; j < data[i].length; j++) {
if (data[i][j] === searchValue) {
return {row: i + 1, col: j + 1};
}
}
}
return null;
}
// Helper function to add error formatting
function addErrorFormatting(sheet, row, col, message) {
var cell = sheet.getRange(row, col);
// Store the current note if it exists
var currentNote = cell.getNote();
// Only add the error message if it's not already there
if (!currentNote.includes(message)) {
cell.setNote(currentNote + (currentNote ? "\n" : "") + "ERROR: " + message);
}
// Add red background only if it's not already an error
if (cell.getBackground() !== "#ff0000") {
cell.setBackground("#ff0000"); // Light red background
}
}
// Helper function to clear only error-related formatting
function clearErrorFormatting(spreadsheet) {
var sheets = spreadsheet.getSheets();
sheets.forEach(sheet => {
var dataRange = sheet.getDataRange();
var backgrounds = dataRange.getBackgrounds();
var notes = dataRange.getNotes();
// Clear only error-related formatting
for (var i = 0; i < backgrounds.length; i++) {
for (var j = 0; j < backgrounds[i].length; j++) {
if (backgrounds[i][j] === "#ff0000") { // If it's our error background color
var cell = sheet.getRange(i + 1, j + 1);
// Remove only the error message from the note, keep other notes
var note = notes[i][j];
if (note) {
var errorLines = note.split("\n").filter(line => !line.startsWith("ERROR:"));
cell.setNote(errorLines.join("\n"));
}
cell.setBackground(null); // Remove the red background
}
}
}
});
}- Click the disk icon or
File > Saveto save the script. - Close the Apps Script editor.
Now, whenever you make an edit to any sheet (except "README" and "Drop-down values"), the modification timestamp and your email will be automatically updated in the README sheet.
This repository is a scientific product and is not official communication of the National Oceanic and Atmospheric Administration, or the United States Department of Commerce. All NOAA GitHub project code is provided on an 'as is' basis and the user assumes responsibility for its use. Any claims against the Department of Commerce or Department of Commerce bureaus stemming from the use of this GitHub project will be governed by all applicable Federal law. Any reference to specific commercial products, processes, or services by service mark, trademark, manufacturer, or otherwise, does not constitute or imply their endorsement, recommendation or favoring by the Department of Commerce. The Department of Commerce seal and logo, or the seal and logo of a DOC bureau, shall not be used in any manner to imply endorsement of any commercial product or activity by DOC or the United States Government.
