diff --git a/.github/workflows/gpu_test.yaml b/.github/workflows/gpu_test.yaml
Lines changed: 1 addition & 1 deletion
diff --git a/.github/workflows/unit_test.yaml b/.github/workflows/unit_test.yaml
Lines changed: 4 additions & 2 deletions
diff --git a/README.md b/README.md
Lines changed: 2 additions & 1 deletion
diff --git a/apps/grpo/main.py b/apps/grpo/main.py
Lines changed: 0 additions & 16 deletions
diff --git a/docs/Tutorials/ReadMe.MD b/docs/Tutorials/ReadMe.MD
Lines changed: 0 additions & 19 deletions
diff --git a/docs/source/_static/logo-icon.svg b/docs/source/_static/logo-icon.svg
Lines changed: 12 additions & 0 deletions
diff --git a/docs/source/conf.py b/docs/source/conf.py
Lines changed: 51 additions & 2 deletions
diff --git a/docs/source/getting_started.md b/docs/source/getting_started.md
Lines changed: 1 addition & 1 deletion
diff --git a/docs/source/index.md b/docs/source/index.md
Lines changed: 1 addition & 1 deletion
diff --git a/docs/Tutorials/1_RL_and_Forge_Fundamentals.MD b/docs/source/tutorial_sources/zero-to-forge/1_RL_and_Forge_Fundamentals.md
Lines changed: 3 additions & 1 deletion
@@ -1,4 +1,4 @@
-name: GPU tests
+name: GPU Tests
 
 on:
   schedule:
 
@@ -1,8 +1,10 @@
-name: Unit Test
+name: Unit Tests
 
 on:
   pull_request:
-
+  push:
+    branches: [ main ]
+  workflow_dispatch:
 
 jobs:
   unit_tests:
 
@@ -1,7 +1,8 @@
 # <img width="35" height="35" alt="image" src="https://github.com/user-attachments/assets/2700a971-e5d6-4036-b03f-2f89c9791609" /> Forge
 
-
 #### A PyTorch-native agentic RL library that lets you focus on algorithms—not infra.
+[![Unit Tests](https://github.com/meta-pytorch/forge/actions/workflows/unit_test.yaml/badge.svg?branch=main)](https://github.com/meta-pytorch/forge/actions/workflows/unit_test.yaml?query=branch%3Amain)
+[![GPU Tests](https://github.com/meta-pytorch/forge/actions/workflows/gpu_test.yaml/badge.svg?branch=main)](https://github.com/meta-pytorch/forge/actions/workflows/gpu_test.yaml?query=branch%3Amain)
 
 ## Overview
 The primary purpose of the Forge ecosystem is to delineate infra concerns from model concerns thereby making RL experimentation easier. Forge delivers this by providing clear RL abstractions and one scalable implementation of these abstractions. When you need fine-grained control over placement, fault handling/redirecting training loads during a run, or communication patterns, the primitives are there. When you don’t, you can focus purely on your RL algorithm.
 
@@ -496,22 +496,6 @@ async def continuous_training():
 
         training_task.cancel()
 
-        # give mlogger time to shutdown backends, otherwise they can stay running.
-        # TODO (felipemello) find more elegant solution
-        await mlogger.shutdown.call_one()
-        await asyncio.sleep(2)
-
-        await asyncio.gather(
-            DatasetActor.shutdown(dataloader),
-            policy.shutdown(),
-            RLTrainer.shutdown(trainer),
-            ReplayBuffer.shutdown(replay_buffer),
-            ComputeAdvantages.shutdown(compute_advantages),
-            ref_model.shutdown(),
-            reward_actor.shutdown(),
-        )
-        # TODO - add a global shutdown that implicitly shuts down all services
-        # and remote allocations
         await shutdown()
 
 
 
@@ -65,6 +65,8 @@ def get_version_path():
     "sphinx_gallery.gen_gallery",
 ]
 
+html_favicon = "_static/logo-icon.svg"
+
 html_baseurl = (
     f"https://meta-pytorch.org/forge/{version_path}"  # needed for sphinx-sitemap
 )
@@ -82,8 +84,14 @@ def get_version_path():
     "_templates",
     os.path.join(os.path.dirname(pytorch_sphinx_theme2.__file__), "templates"),
 ]
-exclude_patterns = ["tutorials/index.rst", "tutorials/template_tutorial.rst"]
 
+exclude_patterns = [
+    "tutorials/index.rst",
+    "tutorials/template_tutorial.rst",
+    "tutorials/**/index.rst",
+    "tutorial_sources/**/*.md",  # Exclude all markdown files from tutorial_sources
+    "tutorial_sources/**/*.MD",  # Also exclude uppercase .MD files
+]
 html_static_path = ["_static"]
 html_css_files = ["custom.css"]
 html_js_files = ["custom.js"]
@@ -167,6 +175,9 @@ def get_version_path():
     "html_image",
 ]
 
+# Configure MyST parser to treat mermaid code blocks as mermaid directives
+myst_fence_as_directive = ["mermaid"]
+
 autodoc_default_options = {
     "members": True,
     "undoc-members": True,
@@ -204,14 +215,15 @@ def get_version_path():
 sphinx_gallery_conf = {
     "examples_dirs": "tutorial_sources",  # Path to examples directory
     "gallery_dirs": "tutorials",  # Path to generate gallery
-    "filename_pattern": ".*",  # Include all files
+    "filename_pattern": r".*\.py$",  # Only process .py files, not .md files
     "download_all_examples": False,
     "first_notebook_cell": "%matplotlib inline",
     "plot_gallery": "True",
     "promote_jupyter_magic": True,
     "backreferences_dir": None,
     "show_signature": False,
     "write_computation_times": False,
+    "ignore_pattern": r".*\.md$|.*\.MD$",  # Explicitly ignore markdown files
 }
 
 
@@ -222,5 +234,42 @@ def clean_docstring_indentation(app, what, name, obj, options, lines):
             lines.append("")
 
 
+def copy_markdown_tutorials(app):
+    """Copy markdown files from tutorial_sources to tutorials directory.
+
+    This runs after the builder is initialized but before sphinx-gallery processes files,
+    ensuring markdown files are available alongside generated .py tutorials.
+    """
+    import shutil
+    from pathlib import Path
+
+    source_dir = Path(app.srcdir) / "tutorial_sources"
+    target_dir = Path(app.srcdir) / "tutorials"
+
+    # Ensure target directory exists
+    target_dir.mkdir(parents=True, exist_ok=True)
+
+    # Walk through tutorial_sources and copy all .md files
+    for md_file in source_dir.rglob("*.md"):
+        # Skip README files
+        if md_file.name.lower() in ["readme.md", "readme.txt"]:
+            continue
+
+        # Calculate relative path from tutorial_sources
+        rel_path = md_file.relative_to(source_dir)
+
+        # Create target path in tutorials directory
+        target_path = target_dir / rel_path
+        target_path.parent.mkdir(parents=True, exist_ok=True)
+
+        # Copy the file
+        shutil.copy2(md_file, target_path)
+        print(
+            f"[Forge Docs] Copied {md_file.name} to {target_path.relative_to(app.srcdir)}"
+        )
+
+
 def setup(app):
     app.connect("autodoc-process-docstring", clean_docstring_indentation)
+    # Use builder-inited to ensure it runs before source files are read
+    app.connect("builder-inited", copy_markdown_tutorials)
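
The copy loop in `copy_markdown_tutorials` above can be tried outside a Sphinx build. The sketch below is a minimal stand-alone variant, not code from this PR: `copy_markdown_tree` is a hypothetical helper name, and the case-insensitive suffix check is an assumption about the intended behaviour. It is added because `Path.rglob("*.md")` does not match `.MD` files on a case-sensitive file system, whereas `exclude_patterns` and `ignore_pattern` in the same file handle both spellings.

```python
import shutil
from pathlib import Path


def copy_markdown_tree(source_dir: Path, target_dir: Path) -> list[Path]:
    """Copy Markdown files under source_dir into target_dir, keeping the layout.

    Hypothetical helper (not part of the PR). It mirrors the loop in
    copy_markdown_tutorials but compares the suffix case-insensitively,
    so both .md and .MD files are picked up.
    """
    copied = []
    for path in sorted(source_dir.rglob("*")):
        # Keep only regular files whose extension is .md in any letter case.
        if not path.is_file() or path.suffix.lower() != ".md":
            continue
        # Skip README files, as the original hook does.
        if path.name.lower() == "readme.md":
            continue
        # Rebuild the same relative path below the target directory.
        target = target_dir / path.relative_to(source_dir)
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(path, target)
        copied.append(target)
    return copied
```

Calling `copy_markdown_tree(Path(app.srcdir) / "tutorial_sources", Path(app.srcdir) / "tutorials")` from a `builder-inited` handler would reproduce the layout the hook creates. Whether uppercase `.MD` sources should be copied at all is a judgment call for the PR author.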
@@ -5,5 +5,5 @@ Welcome to TorchForge! This guide will help you get up and running with TorchFor
 TorchForge specializes in post-training techniques for large language models, including:
 
 - **Supervised Fine-Tuning (SFT)**: Adapt pre-trained models to specific tasks using labeled data
-- **Generalized Reward Policy Optimization (GRPO)**: Advanced reinforcement learning for model alignment
+- **Group Relative Policy Optimization (GRPO)**: Advanced reinforcement learning for model alignment
 - **Multi-GPU Distributed Training**: Efficient scaling across multiple GPUs and nodes
@@ -7,7 +7,7 @@ Key Features
 ------------
 
 * **Post-Training Focus**: Specializes in techniques
-  like Supervised Fine-Tuning (SFT) and Generalized Reward Policy Optimization (GRPO)
+  like Supervised Fine-Tuning (SFT) and Group Relative Policy Optimization (GRPO)
 * **PyTorch Integration**: Built natively on PyTorch with
   dependencies on [PyTorch nightly](https://pytorch.org/get-started/locally/),
   [Monarch](https://meta-pytorch.org/monarch), [vLLM](https://docs.vllm.ai/en/latest/),
 
@@ -76,6 +76,7 @@ Here's the key insight: **Each RL component becomes a Forge service**. The toy e
 ```mermaid
 graph LR
     subgraph Concepts["RL Concepts"]
+        direction TB
         C1["Dataset"]
         C2["Policy"]
         C3["Reward Model"]
@@ -85,6 +86,7 @@ graph LR
     end
 
     subgraph Services["Forge Services (Real Classes)"]
+        direction TB
         S1["DatasetActor"]
         S2["Policy"]
         S3["RewardActor"]
@@ -392,4 +394,4 @@ score = await reward_actor.evaluate_response.route(
 
 This is fundamentally different from monolithic RL implementations where any component failure stops everything!
 
-In the next Section, we will go a layer deeper and learn how ForgeServices work. Continue to [Part 2 here](./2_Forge_Internals.MD)
+In the next Section, we will go a layer deeper and learn how ForgeServices work. Continue to [Part 2 here](./2_Forge_Internals.md)