HydraR: Stateful Agentic Orchestration for Scientific Reproducibility in R #766

@IgnatiusPang

Submitting Author Name: Ignatius Pang
Submitting Author Github Handle: @IgnatiusPang
Other Package Authors Github handles: @aidantay
Repository: https://github.com/APAF-bioinformatics/HydraR
Version submitted: v0.2.3
Submission type: Standard
Editor: TBD
Reviewers: TBD

Archive: TBD
Version accepted: TBD
Language: en


  • Paste the full DESCRIPTION file inside a code block below:
Package: HydraR
Type: Package
Title: Stateful Agentic Orchestration for Scientific Reproducibility
Version: v0.2.3
Authors@R: 
    c(person(given = c("Chi", "Nam", "Ignatius"),
             family = "Pang",
             role = c("aut", "cre"),
             email = "ignatius.pang@mq.edu.au",
             comment = c(ORCID = "0000-0001-9703-5741")),
      person(given = "Aidan",
             family = "Tay",
             role = c("aut", "cre"),
             comment = c(ORCID = "0000-0003-1315-4896")))
Description: A high-performance framework for orchestrating complex 
    agentic workflows in R, specifically designed for scientific 
    reproducibility and auditability. HydraR provides a robust, 
    state-managed engine for Directed Acyclic Graphs (DAGs) and 
    iterative state machines, prioritizing CLI-native LLM 
    interactions (e.g., Gemini, Claude, Copilot) and hardened 
    state persistence (DuckDB/SQLite). It features isolated 
    execution via Git Worktrees for parallel file-modifying 
    tasks, autonomous quality control through integrated 
    auditors, and human-readable visualization using Mermaid.js. 
    Ideal for creating auditable, resumable research assistants 
    and complex bioinformatics pipelines.
URL: https://github.com/APAF-bioinformatics/HydraR
BugReports: https://github.com/APAF-bioinformatics/HydraR/issues
License: LGPL (>= 3)
Encoding: UTF-8
RoxygenNote: 7.3.3
Depends: 
    R (>= 4.1.0)
Imports: 
    DBI,
    digest,
    furrr,
    igraph,
    jsonlite,
    purrr,
    R6,
    yaml
Suggests: 
    testthat (>= 3.0.0),
    duckdb,
    knitr,
    rmarkdown,
    httr2,
    future,
    devtools,
    withr,
    reticulate,
    DiagrammeR
VignetteBuilder: knitr
Config/testthat/edition: 3

Scope

  • Please indicate which category or categories from our package fit policies this package falls under: (Please check an appropriate box below. If you are unsure, we suggest you make a pre-submission inquiry.):

    • data retrieval
    • data extraction
    • data munging
    • data deposition
    • data validation and testing
    • workflow automation
    • version control
    • citation management and bibliometrics
    • scientific software wrappers
    • field and lab reproducibility tools
    • database software bindings
    • geospatial data
    • translation
    • rOpenSci internal tools
  • Explain how and why the package falls under these categories (briefly, 1-2 sentences):

    HydraR provides a stateful, auditable engine for orchestrating complex agentic workflows, specifically targeting scientific reproducibility via Git worktree isolation and persistent DuckDB/SQLite checkpointing. It acts as a robust wrapper for large language model (LLM) command line interfaces (CLIs) tailored for bioinformatics research.

  • Who is the target audience and what are scientific applications of this package?

    The target audience includes bioinformaticians, research software engineers, and data scientists building multi-agent systems for automated data analysis, code refactoring, and large-scale literature synthesis.

  • Are there other R packages that accomplish the same thing? If so, how does yours differ or meet our criteria for best-in-category?

    While 'ellmer' focuses on high-level API convenience, 'mall' on structured data operations, and 'gptstudio' on IDE integration, HydraR is distinguished by its focus on persistent session lifecycles and native command-line integration. It supports complex research pipelines with restorable execution states and complete audit trails, and it permits safe parallel file modifications via Git worktrees; these features are essential for reproducible scientific computing. Furthermore, unlike solutions that rely on 'reticulate' to bridge R with Python-based AI frameworks (such as LangChain), HydraR is implemented entirely in R, avoiding the data-translation barriers and compatibility issues often encountered when passing complex statistical structures between languages.
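
    To illustrate what "restorable execution states" can mean in practice, here is a minimal base-R sketch. It is not HydraR's API: `run_dag`, `nodes`, `deps`, and the RDS checkpoint file are hypothetical stand-ins chosen for illustration, and the RDS file takes the place of the DuckDB/SQLite persistence the package describes.

```r
# Hypothetical sketch (not HydraR's API): run a small DAG of functions,
# saving state after every node so an interrupted run can resume.
run_dag <- function(nodes, deps, checkpoint_path) {
  # Restore results of previously completed nodes, if a checkpoint exists.
  state <- if (file.exists(checkpoint_path)) readRDS(checkpoint_path) else list()
  executed <- character(0)
  pending <- setdiff(names(nodes), names(state))
  while (length(pending) > 0) {
    # A node is ready once all of its dependencies have results in `state`.
    ready <- Filter(function(n) all(deps[[n]] %in% names(state)), pending)
    if (length(ready) == 0) stop("cycle or missing dependency in DAG")
    for (n in ready) {
      # `state[n] <- list(...)` keeps the entry even if the node returns NULL.
      state[n] <- list(nodes[[n]](state))
      saveRDS(state, checkpoint_path)  # persist after every node
      executed <- c(executed, n)
    }
    pending <- setdiff(pending, ready)
  }
  list(state = state, executed = executed)
}

# Toy three-step pipeline: load -> scale -> summary.
nodes <- list(
  load    = function(s) c(1, 2, 3),
  scale   = function(s) s$load * 10,
  summary = function(s) sum(s$scale)
)
deps <- list(scale = "load", summary = "scale")

ckpt   <- tempfile(fileext = ".rds")
first  <- run_dag(nodes["load"], deps, ckpt)  # simulate a run that stops early
second <- run_dag(nodes, deps, ckpt)          # resume: completed nodes are skipped
```

    In this sketch the second call reads the checkpoint and executes only the nodes that have no saved result, which is the behaviour the term "resumable" refers to above.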

  • (If applicable) Does your package comply with our guidance around Ethics, Data Privacy and Human Subjects Research?

    N/A

  • If you made a pre-submission inquiry, please paste the link to the corresponding issue, forum post, or other discussion, or @tag the editor you contacted.

    N/A

  • Explain reasons for any pkgcheck items which your package is unable to pass.

    The package currently passes R CMD check with 0 Errors and 0 Warnings. Minor NOTEs regarding hidden developer files (.github, .lintr) are intentional for repository management.

Technical checks

Confirm each of the following by checking the box.

This package:

Use of Generative AI

  • Generative AI tools were used to produce some of the material in this submission.

If so, please describe usage, and include links to any relevant aspects of your repository.

The core R6 architecture, state-management patterns, and Git worktree isolation strategies were designed and authored by the human authors (Ignatius Pang and Aidan Tay) and then tested manually. AI tools (Antigravity and Gemini CLI) were used to implement logic blocks, unit tests, and documentation under a human-in-the-loop process in which every line was manually reviewed and verified. Detailed disclosure is available in agents.md and DESIGN.md. The package is continuously evaluated against an automated suite of 556 unit tests, all of which currently pass.

Publication options

  • Do you intend for this package to go on CRAN?

  • Do you intend for this package to go on Bioconductor?

  • Do you wish to submit an Applications Article about your package to Methods in Ecology and Evolution? If so:

MEE Options
  • The package is novel and will be of interest to the broad readership of the journal.
  • The manuscript describing the package is no longer than 3000 words.
  • You intend to archive the code for the package in a long-term repository which meets the requirements of the journal (see MEE's Policy on Publishing Code)

Note: We also intend to submit to JOSS as the primary scholarly venue for the software.

Code of conduct
