feat: update of landing page by m-jahn · Pull Request #60 · snakemake/snakemake-workflow-catalog

m-jahn · 2025-10-22T12:49:54Z

slight change to buttons to look more subtle
added workflow stats to the front page
these are updated every time the catalog is rendered
closes #58 (new feature: add statistics or summary figures on catalog landing page)

Summary by CodeRabbit

New Features
- Added interactive visualization dashboard displaying workflow statistics, trends, and metrics
- Updated homepage with new About section, Workflows in numbers with charts, and GitHub issue reporting link

coderabbitai · 2025-10-22T12:50:13Z

📝 Walkthrough

The PR implements a feature to display workflow statistics on the catalog landing page. It adds Altair-based chart generation, creates a new module for building visualizations from workflow data, updates the Sphinx build process to generate charts, and extends the landing page with new sections for statistics display.

Changes

| Cohort / File(s) | Change Summary |
|---|---|
| Build Configuration `.gitignore` | Added ignore patterns for generated documentation assets (`source/_static/`) and shell scripts (`*.sh`) under Sphinx documentation section. |
| Dependencies `environment.yml` | Added Altair 5.5.0 and sphinxext-altair packages to enable charting and Sphinx integration. |
| Chart Generation `source/build_wf_charts.py` | New module with two public functions: `bar_chart()` for rendering horizontal bar charts with Altair, and `build_wf_charts()` orchestrating data loading from `data.json` and `topics_stats.json`, registering a transparent theme, and generating six statistics visualizations (standardized vs. other workflows, ages since updates, stargazer distribution, subscriber distribution, health status, and top topics). |
| Data Processing `source/build_wf_tables.py` | Added safety guard for empty repos list before metric assignment, persisted topics statistics to `_static/topics_stats.json`, and added explicit `return None` statement. |
| Sphinx Configuration `source/conf.py` | Registered `sphinxext_altair.altairplot` extension and imported `build_wf_charts` module to execute chart generation during documentation build. |
| Landing Page `source/index.md` | Added About section describing catalog purpose, updated Shortcuts section with revised button styles and labels, introduced Report Issues on Github block, and added Workflows in numbers section with embedded chart iframes for dynamic statistics display. |

Sequence Diagram

sequenceDiagram
    participant Build as Sphinx Build
    participant Conf as conf.py
    participant WFTables as build_wf_tables()
    participant WFCharts as build_wf_charts()
    participant DataFile as data.json
    participant TopicStats as topics_stats.json
    participant ChartHTML as Chart HTMLs

    Build->>Conf: Initialize build
    Conf->>WFTables: Execute table building
    WFTables->>DataFile: Load workflow data
    WFTables->>TopicStats: Save topics statistics
    Conf->>WFCharts: Execute chart building
    WFCharts->>DataFile: Load workflow data
    WFCharts->>TopicStats: Load topics statistics
    WFCharts->>WFCharts: Register transparent theme
    WFCharts->>WFCharts: Generate 6 chart types
    WFCharts->>ChartHTML: Save chart_{metric}.html files
    Build->>Build: Render HTML with embedded charts

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

The review involves evaluating new chart-generation logic with multiple data transformations, Altair configuration, and theme registration in build_wf_charts.py, along with integration points in conf.py and build_wf_tables.py. The landing page markdown updates add content complexity. While changes span multiple files, the patterns are generally straightforward and follow a coherent charting workflow.

Suggested reviewers

johanneskoester
cmeesters
tedil

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | You can run `@coderabbitai generate docstrings` to improve docstring coverage. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title Check | ✅ Passed | The title "feat: update of landing page" is directly related to the main change in the changeset. The pull request's primary objective is to update the landing page by adding workflow statistics and visualization, as confirmed by the PR description and the substantial changes to source/index.md which introduces a new "Workflows in numbers" section with dynamic charts. While the title could be more specific in mentioning the statistics feature, it accurately describes the main aspect of the change and is sufficiently clear for a teammate scanning history to understand the primary update being made. |
| Linked Issues Check | ✅ Passed | The pull request substantially addresses the objectives from issue #58. The implementation adds statistics to the landing page through a new "Workflows in numbers" section with iframe-embedded charts [#58]. The code creates multiple visualizations including workflow standardization metrics, age-since-update distributions, stargazer counts, subscriber counts, health status, and top topics [build_wf_charts.py]. The PR description confirms these stats are updated with each build cycle, matching the weekly cadence requirement [#58]. The visualizations are created using Altair with custom theming for visual appeal [build_wf_charts.py, conf.py]. While not all example metrics mentioned in the issue (such as "new workflows this week" in that exact form) are explicitly implemented, the core requirement of displaying main statistics in a visually pleasing manner on the landing page is fulfilled. |
| Out of Scope Changes Check | ✅ Passed | All code changes in the pull request are directly related to or supporting the objective of adding workflow statistics to the landing page. The .gitignore and environment.yml updates provide necessary infrastructure (documentation patterns and visualization dependencies) for the feature [.gitignore, environment.yml]. The new build_wf_charts.py module implements the core chart generation functionality [build_wf_charts.py]. The build_wf_tables.py modifications add support for persisting topics statistics needed by the new charts [build_wf_tables.py]. The conf.py changes integrate the chart-building step into the Sphinx build process [conf.py]. The index.md changes introduce the landing page display of these statistics along with minor button styling improvements mentioned in the PR description [index.md]. No out-of-scope changes are detected. |

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (3)

.gitignore (1)

74-75: Consider narrowing the shell script ignore pattern.

The *.sh pattern is very broad and will ignore all shell scripts in the repository, including potentially legitimate scripts that should be version-controlled. If the intent is to ignore only build-generated scripts, consider a more specific pattern like source/*.sh or build*.sh.

source/build_wf_charts.py (2)

65-71: Consider handling workflows updated more than 365 days ago.

The current binning logic excludes workflows that haven't been updated in over a year, as values beyond the last bin edge (365) become NaN and are dropped by value_counts(). Depending on whether this is intentional, consider one of the following:

- If it is not intentional: adding a final label like ">1 year" with a larger upper bin edge (e.g., bins=[0, 1, 7, 31, 365, 999999])
- If it is intentional: documenting that only recently updated workflows are shown

Also, the label "last_year" is ambiguous—it represents workflows updated between 31-365 days ago, so ">1 month" or "1-12 months" might be clearer.
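
The binning behavior described above can be reproduced with a minimal sketch. The bin edges are the ones quoted in the comment; the sample ages, the label names other than `last_year`, and the use of `float("inf")` in place of the suggested `999999` sentinel are assumptions for illustration, not code from the PR.

```python
import pandas as pd

# Hypothetical ages in days since the last update (not real catalog data)
days = pd.Series([0.5, 3, 20, 200, 400, 1500])

# Closed upper edge at 365: anything older falls outside every bin and becomes NaN
closed = pd.cut(
    days,
    bins=[0, 1, 7, 31, 365],
    labels=["last_day", "last_week", "last_month", "last_year"],
)

# Open-ended final bin keeps workflows that are older than one year
open_ended = pd.cut(
    days,
    bins=[0, 1, 7, 31, 365, float("inf")],
    labels=["<1 day", "1-7 days", "1-4 weeks", "1-12 months", ">1 year"],
)

# value_counts() drops NaN by default, so the two old workflows vanish from `closed`
closed_counts = closed.value_counts()
open_counts = open_ended.value_counts()

print(closed_counts.sum(), open_counts.sum())
```

With the closed bins, two of the six sample workflows silently disappear from the counts; with the open-ended bin they land in ">1 year".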

99-117: Consider explicit column naming for robustness.

The eval_status logic correctly categorizes linting/formatting status. However, Line 115's rename(columns={"index": "health"}) relies on pandas' default column naming from value_counts().reset_index(), which can vary between versions. For clarity and robustness, explicitly assign column names:
df_health = (
    df[["linting", "formatting"]]
    .apply(lambda row: eval_status(row), axis=1)
    .value_counts()
    .reset_index()
)
df_health.columns = ["health", "count"]
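
For readers who want to try the suggestion in isolation, here is a self-contained sketch. The `eval_status` body, its category names, and the sample data are hypothetical stand-ins, since the real function is not shown in this thread; only the final column assignment mirrors the suggestion above.

```python
import pandas as pd


def eval_status(row):
    # Hypothetical stand-in for the PR's eval_status(); the real categories may differ
    if row["linting"] and row["formatting"]:
        return "passing"
    if row["linting"] or row["formatting"]:
        return "partial"
    return "failing"


# Made-up linting/formatting results for five workflows
df = pd.DataFrame(
    {
        "linting": [True, True, False, False, True],
        "formatting": [True, False, False, True, True],
    }
)

df_health = (
    df[["linting", "formatting"]]
    .apply(eval_status, axis=1)
    .value_counts()
    .reset_index()
)
# Assign names explicitly instead of relying on pandas' default column names
df_health.columns = ["health", "count"]

print(df_health)
```

Because the names are assigned positionally, the result does not depend on what `reset_index()` happens to call the two columns in a given pandas version.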

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6c9b3f8 and f2f08ad.

📒 Files selected for processing (6)

.gitignore (1 hunks)
environment.yml (1 hunks)
source/build_wf_charts.py (1 hunks)
source/build_wf_tables.py (2 hunks)
source/conf.py (2 hunks)
source/index.md (2 hunks)

🧰 Additional context used

📓 Path-based instructions (1)

**/*.py

⚙️ CodeRabbit configuration file

**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of the self argument of methods.
Do not suggest type annotation of the cls argument of classmethods.
Do not suggest return type annotation if a function or method does not contain a return statement.

Files:

source/build_wf_charts.py
source/build_wf_tables.py
source/conf.py

🧬 Code graph analysis (1)

source/conf.py (2)

source/build_wf_charts.py (1)

build_wf_charts (33-129)

source/build_wf_tables.py (1)

build_wf_tables (68-156)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: generate-catalog

🔇 Additional comments (11)

environment.yml (1)

18-18: LGTM! Dependencies support new charting feature.

The addition of Altair 5.5.0 and sphinxext-altair appropriately enables the chart generation functionality introduced in this PR.

Also applies to: 23-23

source/build_wf_tables.py (2)

59-60: Good defensive programming!

The guard prevents a potential KeyError when attempting to assign the metric to an empty repository list. This ensures the function handles edge cases gracefully.

151-152: LGTM! Topics statistics correctly persisted for chart generation.

This output is appropriately consumed by build_wf_charts.py to generate the topic-based visualizations.

source/conf.py (2)

20-20: LGTM! Chart generation correctly integrated into build sequence.

The import and invocation of build_wf_charts() is placed after build_wf_tables(), which ensures that the required topics_stats.json dependency is generated before chart creation.

Also applies to: 25-25
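
The ordering dependency can be illustrated with a small stdlib-only sketch. The two helper functions are hypothetical stand-ins for the table and chart steps, and the topic names and counts are invented; only the file name `topics_stats.json` and the `_static` directory come from the PR.

```python
import json
import tempfile
from pathlib import Path


def write_topic_stats(static_dir, stats):
    """Stand-in for the table step: persist topic statistics as JSON."""
    static_dir.mkdir(parents=True, exist_ok=True)
    path = static_dir / "topics_stats.json"
    path.write_text(json.dumps(stats))
    return path


def read_topic_stats(static_dir):
    """Stand-in for the chart step: requires the JSON file to exist already."""
    return json.loads((static_dir / "topics_stats.json").read_text())


with tempfile.TemporaryDirectory() as tmp:
    static_dir = Path(tmp) / "_static"

    # Running the chart step first fails because the file has not been written yet
    try:
        read_topic_stats(static_dir)
        charts_first_ok = True
    except FileNotFoundError:
        charts_first_ok = False

    # Tables first, then charts: the dependency is satisfied
    write_topic_stats(static_dir, {"rna-seq": 12, "variant-calling": 7})
    loaded = read_topic_stats(static_dir)

print(charts_first_ok, loaded)
```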

35-35: LGTM! Altair extension properly registered.

The sphinxext_altair.altairplot extension enables rendering of Altair visualizations in the Sphinx documentation, supporting the new chart features.

source/index.md (1)

73-92: LGTM! Chart iframes correctly reference generated assets.

The six iframe elements properly reference the chart HTML files generated by build_wf_charts.py: chart_standardized.html, chart_age_group.html, chart_stargazers_count.html, chart_subscribers_count.html, chart_health.html, and chart_topic.html.

source/build_wf_charts.py (5)

7-30: LGTM! Chart rendering function is well-structured.

The bar_chart function correctly generates horizontal bar charts with custom styling and saves them as standalone HTML files. The function signature and implementation are appropriate for the use cases in this module.

35-56: LGTM! Data loading and theme configuration are correct.

The function properly loads workflow data and topics statistics, and registers a custom Altair theme with appropriate styling for the documentation context.

86-97: Same sorting concern as stargazers chart.

This plot has the same categorical sorting issue as the stargazers chart (lines 73-84). The categories may not appear in the intended numerical order.

119-125: LGTM! Topic chart generation is correct.

The data transformation properly renames columns, sorts by workflow count, and limits to the top 6 topics for visualization.
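
A rough sketch of the rename, sort, and top-6 steps described above follows. The structure of `topics_stats.json` is not shown in this thread, so the topic-to-count mapping, the intermediate column names, and all values are assumptions.

```python
import pandas as pd

# Hypothetical contents of topics_stats.json: topic name -> number of workflows
topics_stats = {
    "rna-seq": 42,
    "variant-calling": 31,
    "metagenomics": 18,
    "single-cell": 15,
    "chip-seq": 9,
    "proteomics": 7,
    "assembly": 4,
    "phylogenetics": 2,
}

df_topics = (
    pd.DataFrame(list(topics_stats.items()), columns=["name", "n"])
    .rename(columns={"name": "topic", "n": "workflows"})  # readable axis labels
    .sort_values("workflows", ascending=False)            # most common topics first
    .head(6)                                              # keep only the top 6
    .reset_index(drop=True)
)

print(df_topics)
```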

73-84: The review concern is incorrect; the code works as intended.

Based on web search results, value_counts on a categorical Series preserves category information and ordering, and reset_index keeps the categorical dtype (including the ordered flag) when the counts' index is a CategoricalIndex. Since pd.cut() creates an ordered Categorical by default, and sort_values sorts by the order defined by categories—not lexical order—the bars will appear in the correct numerical sequence: 0, 1-5, 6-10, 11-20, 21-50, >50.

The categorical ordering is maintained throughout the pipeline (pd.cut → value_counts → reset_index → sort_values), so no changes are needed.

Likely an incorrect or invalid review comment.
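
The ordering claim can be checked with a short sketch. The category labels are the ones listed above; the bin edges, column names, and sample star counts are assumptions, and `rename_axis`/`reset_index(name=...)` is used here only to get stable column names, not because the PR does so.

```python
import pandas as pd

# Made-up stargazer counts for nine workflows
stars = pd.Series([0, 0, 3, 8, 15, 40, 120, 2, 0])

labels = ["0", "1-5", "6-10", "11-20", "21-50", ">50"]
# Assumed bin edges; pd.cut returns an ordered categorical by default
binned = pd.cut(stars, bins=[-1, 0, 5, 10, 20, 50, float("inf")], labels=labels)

counts = (
    binned.value_counts()            # sorted by frequency at this point
    .rename_axis("stargazers")
    .reset_index(name="count")       # "stargazers" stays categorical
    .sort_values("stargazers")       # sorts by category order, not lexically
)

ordered_labels = counts["stargazers"].astype(str).tolist()
print(ordered_labels)
```

A plain string sort would place "11-20" before "6-10"; sorting the categorical column keeps the numeric bin order.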

source/build_wf_charts.py

cademirch

awesome work @m-jahn as usual. looks great.

cmeesters

looks great! And the code is clean.

cmeesters · 2025-10-24T07:55:05Z

@m-jahn I still need to test before merging. This might take a few hours as I am busy.

m-jahn · 2025-10-24T08:00:06Z

Oops sorry @cmeesters I was too quick now. Please have a go at it. We can make improvements with the next PR if necessary.

m-jahn · 2025-10-24T08:38:33Z

PS: the catalog building and deploying are tested automatically in the GitHub Action, so that at least is nothing we need to worry about.

cmeesters · 2025-10-26T11:07:00Z

@m-jahn it looks just great! Wasn't able to test, anyway. I am particularly busy ...

m-jahn added 2 commits October 21, 2025 11:22

fix: precaution against empty lists

fedfb37

feat: update to front page, now with wf stats, closes snakemake#58

f2f08ad

m-jahn requested review from cademirch, cmeesters and johanneskoester October 22, 2025 12:50

coderabbitai bot reviewed Oct 22, 2025

cademirch approved these changes Oct 22, 2025

cmeesters approved these changes Oct 24, 2025

m-jahn merged commit dd5b0c5 into snakemake:main Oct 24, 2025
3 checks passed

coderabbitai bot mentioned this pull request Jan 7, 2026

feat: improved look of bar charts on landing page #62

Merged

m-jahn merged 2 commits into snakemake:main from m-jahn:landing_page