feat: add plot-batch manifest and shared plot data loader #123

Open
claytonlin1110 wants to merge 2 commits into llmsresearch:main from claytonlin1110:feat/plot-batch-manifest
@claytonlin1110
Contributor

Summary

Adds paperbanana plot-batch so that multiple statistical plots (CSV/JSON data plus an intent) can run in one batch, using the same outputs/batch_* layout and batch_report.json as diagram batches. The Studio Batch tab can now run plot manifests. Diagram batches now set batch_kind: methodology; plot batches set batch_kind: statistical_plot. The batch-report Markdown/HTML output shows the kind when present.

Closes #122

Motivation

Authors often need many data figures per paper; running plot in a manual loop loses the unified report and output tree. This change aligns plots with the existing batch workflow.

How to test

  • pytest tests/test_batch.py tests/test_core/test_plot_data.py
  • paperbanana plot-batch --manifest examples/plot_batch_manifest.yaml (requires API keys)
  • paperbanana batch-report --batch-dir <batch_dir> --format markdown

@claytonlin1110
Contributor Author

@dippatel1994 Any update about this feature?

@claytonlin1110
Contributor Author

@dippatel1994 Please review

Member

@dippatel1994 left a comment

CI passes, clean extraction of shared data loader. One behavioral concern:

  1. JSON data unwrapping regression — Old Studio run_plot had raw_data = raw if isinstance(raw, list) else raw.get("data", raw) which unwrapped a {"data": [...]} envelope. New load_statistical_plot_payload() returns raw JSON as-is, then callers wrap as raw_data={"data": payload}. If a user passes {"data": [{"x":1}]}, old code produces raw_data={"data": [...]} (unwrapped), new code produces raw_data={"data": {"data": [...]}} (double-wrapped). Decide which behavior is correct and be consistent.
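To make the concern concrete, here is a minimal sketch of the two behaviors. The helper names `unwrap_envelope` and `wrap_for_pipeline` are hypothetical and exist only for illustration; the unwrap expression is the one quoted from the old `run_plot`, and the wrapping mirrors the `raw_data={"data": payload}` call-site pattern described above. It is not the actual implementation.

```python
from typing import Any


def unwrap_envelope(raw: Any) -> Any:
    # Expression quoted from the old Studio run_plot: lists pass through,
    # dicts are unwrapped if they carry a "data" key (assumes list or dict input).
    return raw if isinstance(raw, list) else raw.get("data", raw)


def wrap_for_pipeline(payload: Any) -> dict:
    # Call-site wrapping described in the review: raw_data={"data": payload}.
    return {"data": payload}


user_json = {"data": [{"x": 1}]}

# Loader returns the JSON as-is and the caller wraps it: the envelope nests twice.
double_wrapped = wrap_for_pipeline(user_json)

# Unwrapping before wrapping keeps a single envelope.
single_wrapped = wrap_for_pipeline(unwrap_envelope(user_json))

print(double_wrapped)  # {'data': {'data': [{'x': 1}]}}
print(single_wrapped)  # {'data': [{'x': 1}]}
```

Whichever shape is chosen, applying the unwrap in one place (for example inside the shared loader) would keep the CLI and Studio paths consistent.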

Non-blocking: total_seconds missing from Studio run_plot_batch report (always shows 0.0s). Pre-existing pattern but easy to fix here. Also no test for empty manifest handling.
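A rough sketch of how the timing could be recorded, and how an empty manifest could be covered by a test. The function `run_batch_with_timing` and the `items` key are hypothetical; only the `total_seconds` field name comes from the report discussed above.

```python
import time
from typing import Any, Callable, Iterable


def run_batch_with_timing(
    items: Iterable[Any], run_one: Callable[[Any], Any]
) -> dict:
    # Measure wall-clock time around the whole batch with a monotonic clock.
    start = time.perf_counter()
    results = [run_one(item) for item in items]
    elapsed = time.perf_counter() - start
    # "items" is an assumed key; "total_seconds" is the field named in the review.
    return {"items": results, "total_seconds": round(elapsed, 3)}


# An empty manifest should still yield a well-formed report.
empty_report = run_batch_with_timing([], lambda item: item)
print(empty_report["items"])
```

Using `time.perf_counter()` rather than `time.time()` avoids jumps from system clock adjustments during long batches.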

@claytonlin1110
Contributor Author

@dippatel1994 Just updated.

Member

@dippatel1994 left a comment

JSON unwrapping regression fixed, total_seconds added to Studio report, empty manifest test added. CI green. LGTM.

@claytonlin1110
Contributor Author

Thanks @dippatel1994. Is this ready to be merged?

Development

Successfully merging this pull request may close these issues.

[Feature]: Batch statistical plot generation (plot-batch)