This document provides a comprehensive overview of how the REMARK ecosystem works, including the interactions between the REMARK repository, individual research repositories, and the econ-ark.org website.
- Website Generation System (`populate_remarks.py`) - Generates econ-ark.org content
- REMARK Validation System (`cli.py`) - Validates research reproducibility standards
These are INDEPENDENT systems with different requirements!
The REMARK ecosystem consists of three main components:
┌─────────────────────┐    ┌─────────────────────┐    ┌─────────────────────┐
│ REMARK Repo         │    │ Individual Repos    │    │ econ-ark.org        │
│ (Catalog/Standards) │    │ (Research Projects) │    │ (Public Website)    │
│                     │    │                     │    │                     │
│ • REMARKs/*.yml     │───►│ • CITATION.cff      │───►│ • _materials/*.md   │
│ • STANDARD.md       │    │ • REMARK.md         │    │ • Jekyll templates  │
│ • Validation tools  │    │ • reproduce.sh      │    │ • Search/filter UI  │
│ • CLI tools         │    │ • binder/env.yml    │    │ • Material pages    │
└─────────────────────┘    └─────────────────────┘    └─────────────────────┘
The REMARK repository serves as the catalog and standards authority:
REMARK/
├── REMARKs/                          # Catalog of all REMARKs
│   ├── BufferStockTheory.yml         # Minimal metadata per REMARK
│   ├── beyond-the-streetlight.yml
│   └── ...
├── STANDARD.md                       # Requirements for REMARK compliance
├── cli.py                            # Tools for validation and testing
├── .github/workflows/                # Automation workflows
│   └── transfer-remark-metadata.yml
└── Documentation files

Each REMARK has a minimal YAML file containing:
name: project-name              # Short identifier
remote: https://github.com/...  # Repository URL
title: Human Readable Title     # Display name

Critical Point: The REMARK repository does NOT contain the full metadata - only the minimal catalog entry pointing to the actual research repository.
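
To make the shape of a catalog entry concrete, here is a minimal sketch of reading one. It is not the project's actual parser: it assumes entries are flat `key: value` lines (so it avoids a YAML dependency), and it assumes all three keys shown above are mandatory. The function name `parse_catalog_entry` and the example URL are hypothetical.

```python
REQUIRED_KEYS = ("name", "remote", "title")  # assumption: all three are mandatory


def parse_catalog_entry(text: str) -> dict:
    """Parse flat `key: value` lines, ignoring blanks and `# comments`."""
    entry = {}
    for raw_line in text.splitlines():
        line = raw_line.split(" #", 1)[0].strip()  # drop a trailing comment
        if not line or line.startswith("#"):
            continue
        # Split at the first colon only, so URLs in the value stay intact.
        key, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"not a key/value line: {raw_line!r}")
        entry[key.strip()] = value.strip()
    missing = [key for key in REQUIRED_KEYS if not entry.get(key)]
    if missing:
        raise ValueError(f"catalog entry is missing: {', '.join(missing)}")
    return entry


# Hypothetical entry, for illustration only.
example = """\
name: example-remark            # Short identifier
remote: https://github.com/example-org/example-remark
title: An Example REMARK
"""
print(parse_catalog_entry(example)["remote"])
```

A real implementation would more likely use a full YAML parser; the point here is only that the catalog entry carries a pointer to the research repository and nothing else.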
Each research project is a self-contained repository that must meet REMARK standards:
- `CITATION.cff`: Complete bibliographic metadata (CFF format)
- `REMARK.md`: Website-specific metadata + abstract content
- `reproduce.sh`: Script to reproduce all results
- `binder/environment.yml`: Environment specification
- `reproduce_min.sh` (optional): Quick demonstration version
---
# Website-specific metadata (YAML frontmatter)
remark-name: beyond-the-streetlight
title-original-paper: "100 years of Economic Measurement..."
notebooks:
- RS100_Discussion_Slides.ipynb
tags:
- REMARK
- Notebook
keywords:
- forecast accuracy
- Federal Reserve
---
# Abstract
This repository provides analysis of...

WEBSITE GENERATION SYSTEM (Primary: populate_remarks.py)
The econ-ark.org website is generated through an automated pipeline that is SEPARATE from the REMARK validation system:
Two workflows coordinate the integration:
A. REMARK Repo → Website Repo (`.github/workflows/transfer-remark-metadata.yml`)
- Runs daily at 8:00 AM UTC
- Copies any existing `REMARKs/*.md` files to `econ-ark.org/_materials/`
- Important: This is a SECONDARY mechanism for edge cases where manual `.md` files exist
- Not the primary workflow - most REMARKs only have `.yml` catalog entries

B. Website Preprocessing (`.github/workflows/site-preprocess.yml`) - PRIMARY MECHANISM
- Runs on every push to master
- Executes `scripts/populate_remarks.py` (the core integration script)
- This is what actually builds the website content for most REMARKs
This is the core integration script that:
- Clones REMARK catalog: Gets the current list of all REMARKs
- Reads catalog entries: Extracts repository URLs from `REMARKs/*.yml` files
- Clones individual repositories: Downloads each research project (using `--sparse` clone)
- Merges metadata: Combines data from two key source files:
  - `CITATION.cff` (bibliographic metadata)
  - `REMARK.md` (website-specific fields + abstract/body content)
- Generates material files: Creates `_materials/{name}.md` for Jekyll
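
The clone step can be sketched as follows. This is an illustration under stated assumptions, not the code in `populate_remarks.py`: the helper names and the injectable `runner` parameter are hypothetical. It relies on the documented behavior of `git clone --sparse`, which initially checks out only the files in the repository's top-level directory, where `CITATION.cff` and `REMARK.md` are expected to live.

```python
from __future__ import annotations

import subprocess
from pathlib import Path

METADATA_FILES = ("CITATION.cff", "REMARK.md")


def sparse_clone_command(remote: str, dest: Path) -> list[str]:
    """Build the git command; --sparse checks out top-level files only."""
    return ["git", "clone", "--sparse", remote, str(dest)]


def fetch_metadata_files(remote: str, dest: Path, runner=subprocess.run) -> dict:
    """Clone `remote` into `dest` and report which metadata files exist.

    `runner` is injectable (hypothetical design choice) so the function
    can be exercised without network access.
    """
    runner(sparse_clone_command(remote, dest), check=True)
    return {
        name: dest / name
        for name in METADATA_FILES
        if (dest / name).is_file()
    }
```

Calling `fetch_metadata_files(entry["remote"], Path("work") / entry["name"])` for each catalog entry would yield the paths that the merge step then reads.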
🚨 IMPORTANT: This script only requires CITATION.cff to generate a basic webpage. For a rich, descriptive page, REMARK.md is essential. The script specifically looks for these two file names and ignores other markdown files (e.g., README.md or legacy {name}.md files) for website content generation.
- Jekyll processes `_materials/*.md` files into web pages
- Templates in `_layouts/` control rendering
- Collections system enables filtering and search
┌─────────────────────┐
│ Author submits PR   │
│ to REMARK repo      │
│ (adds .yml file)    │
└──────────┬──────────┘
           │
           ▼
┌──────────────────────────────────────────────────────────────────────┐
│                          REMARK Repository                           │
│                                                                      │
│  REMARKs/new-project.yml ◄─── PR Review & Merge                      │
│  ┌─────────────────────┐                                             │
│  │ name: new-project   │                                             │
│  │ remote: github.com/ │                                             │
│  │ title: Project Name │                                             │
│  └─────────────────────┘                                             │
└──────────────────────────────────────────────────────────────────────┘
           │
           ▼ (Daily/Push triggers)
┌──────────────────────────────────────────────────────────────────────┐
│                      populate_remarks.py Script                      │
│                                                                      │
│  1. Clone REMARK repo ──► Get catalog                                │
│  2. For each entry:                                                  │
│     ├─ Clone individual repo                                         │
│     ├─ Read CITATION.cff                                             │
│     ├─ Read REMARK.md                                                │
│     └─ Merge metadata                                                │
│  3. Generate _materials/{name}.md                                    │
└──────────────────────────────────────────────────────────────────────┘
           │
           ▼
┌──────────────────────────────────────────────────────────────────────┐
│                         econ-ark.org Website                         │
│                                                                      │
│  _materials/                                                         │
│  ├─ new-project.md ◄─── Generated from merged metadata               │
│  │  ┌─────────────────────────────────────────────────────────┐      │
│  │  │ ---                                                     │      │
│  │  │ # From CITATION.cff                                     │      │
│  │  │ authors: [...]                                          │      │
│  │  │ title: Project Name                                     │      │
│  │  │ version: 1.0.0                                          │      │
│  │  │ # From REMARK.md frontmatter                            │      │
│  │  │ remark-name: new-project                                │      │
│  │  │ notebooks: [...]                                        │      │
│  │  │ tags: [REMARK, ...]                                     │      │
│  │  │ ---                                                     │      │
│  │  │                                                         │      │
│  │  │ # From REMARK.md body                                   │      │
│  │  │ # Abstract                                              │      │
│  │  │ This repository provides...                             │      │
│  │  └─────────────────────────────────────────────────────────┘      │
│  │                                                                   │
│  └─ Jekyll processes → /materials/new-project/ webpage               │
└──────────────────────────────────────────────────────────────────────┘
The populate_remarks.py script combines metadata with this priority:
- Base data: `CITATION.cff` provides bibliographic information
- Website overlay: `REMARK.md` frontmatter adds website-specific fields
- Content: `REMARK.md` body becomes the webpage content
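
The layering described above can be sketched with plain dictionaries. This is an illustrative sketch, not the script's actual code: the function names are hypothetical, the inputs are assumed to be already parsed, and it assumes that a `REMARK.md` frontmatter field overrides a `CITATION.cff` field of the same name (the source only says the overlay "adds" fields). Values are written with `json.dumps` because JSON strings and lists are also valid YAML flow syntax, which avoids a YAML dependency here.

```python
import json


def merge_metadata(citation: dict, remark_frontmatter: dict) -> dict:
    """Start from CITATION.cff data, then layer REMARK.md frontmatter on top."""
    merged = dict(citation)            # base: bibliographic information
    merged.update(remark_frontmatter)  # assumption: the overlay wins on conflicts
    return merged


def render_material(metadata: dict, body: str) -> str:
    """Render a Jekyll-style file: frontmatter block, blank line, then body."""
    lines = ["---"]
    for key, value in metadata.items():
        lines.append(f"{key}: {json.dumps(value, ensure_ascii=False)}")
    lines.append("---")
    return "\n".join(lines) + "\n\n" + body.strip() + "\n"


# Hypothetical inputs mirroring the diagram above.
citation = {"title": "Project Name", "version": "1.0.0"}
frontmatter = {"remark-name": "new-project", "tags": ["REMARK"]}
print(render_material(merge_metadata(citation, frontmatter), "# Abstract\nThis repository provides..."))
```

Copying the base dictionary before updating keeps the `CITATION.cff` data unmodified, which makes it easy to fall back to a citation-only page when `REMARK.md` is absent.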
- REMARK catalog: `REMARKs/{name}.yml`
- Individual repo: `{name}/REMARK.md` and `{name}/CITATION.cff`
- Website material: `_materials/{name}.md`
- Final URL: `econ-ark.org/materials/{name}/`
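
As a quick illustration of the naming convention, the sketch below derives every location from a single REMARK name. The `RemarkLocations` class and `locations_for` helper are hypothetical names introduced only for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RemarkLocations:
    catalog_entry: str   # in the REMARK repository
    remark_md: str       # in the individual research repository
    citation_cff: str    # in the individual research repository
    material_file: str   # in the website repository
    url: str             # public page


def locations_for(name: str) -> RemarkLocations:
    """Map a REMARK name to the paths listed above."""
    return RemarkLocations(
        catalog_entry=f"REMARKs/{name}.yml",
        remark_md=f"{name}/REMARK.md",
        citation_cff=f"{name}/CITATION.cff",
        material_file=f"_materials/{name}.md",
        url=f"econ-ark.org/materials/{name}/",
    )


print(locations_for("new-project").material_file)  # _materials/new-project.md
```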
- Missing `CITATION.cff`: Project skipped (no webpage generated)
- Missing `REMARK.md`: Uses only CITATION.cff data
- Invalid YAML: Build fails with error
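
The first two rules above amount to a small decision on file presence, sketched below. The function name and the returned labels (`"skip"`, `"citation-only"`, `"full"`) are hypothetical, and the invalid-YAML case is left out because it only surfaces once the files are actually parsed.

```python
from pathlib import Path


def plan_material(repo_dir: Path) -> str:
    """Decide how a cloned repository would be handled, by file presence."""
    if not (repo_dir / "CITATION.cff").is_file():
        return "skip"            # no CITATION.cff: no webpage is generated
    if not (repo_dir / "REMARK.md").is_file():
        return "citation-only"   # page built from CITATION.cff data alone
    return "full"                # both files present: merged page
```

Checking `CITATION.cff` first mirrors the rule that it is the only file the website strictly needs.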
- Prepare repository meeting REMARK standards
- Submit PR to REMARK repo adding `REMARKs/{name}.yml`
- Editorial review checks compliance
- Merge PR adds to catalog
- Automated integration generates website content
- Update individual repository (tag new release)
- Website auto-updates within 24 hours via scheduled workflow
- Manual trigger available via GitHub Actions
Diagnosis: Check if CITATION.cff exists and is valid YAML
Solution: Ensure all required files are present and properly formatted
Diagnosis: Check GitHub Actions logs for populate_remarks.py
Solution: Verify individual repository is publicly accessible
Diagnosis: YAML parsing error in metadata files
Solution: Validate YAML syntax in CITATION.cff and REMARK.md
Diagnosis: Misunderstanding the dual workflow system
Solution: Remember that transfer-remark-metadata.yml is SECONDARY - the primary workflow is populate_remarks.py
- Primary: `populate_remarks.py` script that reads `.yml` catalog files and generates content
- Secondary: `transfer-remark-metadata.yml` workflow for edge cases with manual `.md` files
Do not assume the transfer workflow is misconfigured because it looks for .md files in a directory containing .yml files. This is by design - the two mechanisms serve different purposes.
- Daily metadata sync (8:00 AM UTC)
- Website rebuild on every push
- Link validation (via GitHub Actions)
- Editorial review of new submissions
- Quality assurance testing
- Compliance checking via CLI tools
This workflow ensures that the REMARK ecosystem maintains high standards for reproducibility while providing a seamless integration between distributed research repositories and the centralized discovery platform at econ-ark.org.
These are TWO COMPLETELY SEPARATE SYSTEMS with different purposes and requirements:
| Aspect | Website Generation (`populate_remarks.py`) | REMARK Validation (`cli.py`) |
|---|---|---|
| Purpose | Generate econ-ark.org website content | Validate research reproducibility |
| Trigger | Automatic (daily/push) | Manual (editor workflow) |
| Required Files | `CITATION.cff` (required), `REMARK.md` (optional) | `reproduce.sh`, `CITATION.cff`, `binder/environment.yml` |
| Clone Method | `git clone --sparse` (metadata only) | `git clone --depth 1` (full repo) |
| Output | `_materials/*.md` files for Jekyll | Validation reports and logs |
| Failure Impact | Missing materials on website | Cannot reproduce research |
Minimum for website appearance:
- ✅ `CITATION.cff` - Provides author, title, abstract, etc.
- ✅ Valid repository URL in `REMARKs/*.yml`
Enhanced website features:
- ✅ `REMARK.md` - Adds website-specific metadata (notebooks, tags, custom content)
NOT required for website:
- ❌ `reproduce.sh`
- ❌ `binder/environment.yml`
- ❌ `reproduce_min.sh`
Required for REMARK compliance (reproducibility):
- ✅ `reproduce.sh` - Must run and reproduce all results
- ✅ `CITATION.cff` - Bibliographic metadata
- ✅ `binder/environment.yml` - Environment specification
- ✅ Tagged release
Optional for REMARK compliance:
- `reproduce_min.sh` - Quick demonstration version
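
Because the two requirement sets differ, it can help to check them separately. The sketch below is a hypothetical helper, not part of `cli.py` or `populate_remarks.py`; it only covers file presence, so the tagged-release requirement and the check that `reproduce.sh` actually runs are outside its scope.

```python
from __future__ import annotations

WEBSITE_REQUIRED = {"CITATION.cff"}
VALIDATION_REQUIRED = {"reproduce.sh", "CITATION.cff", "binder/environment.yml"}


def missing_files(present: set[str]) -> dict[str, list[str]]:
    """Report, per system, which required files are absent."""
    return {
        "website": sorted(WEBSITE_REQUIRED - present),
        "validation": sorted(VALIDATION_REQUIRED - present),
    }


# A repository with only metadata files: fine for the website,
# incomplete for reproducibility validation.
print(missing_files({"CITATION.cff", "REMARK.md"}))
# {'website': [], 'validation': ['binder/environment.yml', 'reproduce.sh']}
```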
❌ WRONG: "If cli.py lint shows missing files, the website won't work"
✅ CORRECT: Website generation and REMARK validation are independent
❌ WRONG: "All materials on the website are REMARK compliant"
✅ CORRECT: Materials can appear on the website without meeting full REMARK standards
❌ WRONG: "Website generation requires reproducibility files"
✅ CORRECT: Website only needs bibliographic metadata (CITATION.cff)
Research Repository
├── CITATION.cff             ────► Website Generation (populate_remarks.py)
├── REMARK.md (optional)     ────► Website Enhancement
├── reproduce.sh             ────► REMARK Validation (cli.py)
├── binder/environment.yml   ────► REMARK Validation (cli.py)
└── other research files     ────► Research Content

Issue: Material missing from website
Check: Does repository have CITATION.cff? Is URL correct in REMARKs/*.yml?
Issue: REMARK validation fails
Check: Does repository have reproduce.sh, binder/environment.yml? Does reproduce.sh run successfully?
Issue: Website shows incorrect content
Check: Is CITATION.cff correct? Does REMARK.md have proper frontmatter?
🤖 FOR AI SYSTEMS: Do not confuse these two systems. When analyzing REMARK compliance issues:
- First determine: Are you looking at website generation issues or reproducibility validation issues?
- Website problems: Check `populate_remarks.py` logs, `CITATION.cff` files, `_materials/` generation
- Validation problems: Check `cli.py` output, `reproduce.sh` scripts, environment files
- Remember: A repository can appear on the website without being fully REMARK compliant for reproducibility