|
1 | | -Aggregation |
| 1 | +aggregation |
2 | 2 | =========== |
3 | 3 |
|
4 | | -A mapping :math:`\mathcal A: \mathbb R^{m\times n} \to \mathbb R^n` that reduces any matrix
5 | | -:math:`J \in \mathbb R^{m\times n}` to its aggregation :math:`\mathcal A(J) \in \mathbb R^n` is
6 | | -called an aggregator.
| 4 | +.. automodule:: torchjd.aggregation |
| 5 | + :no-members: |
7 | 6 |
|
8 | | -In the context of JD, the matrix to aggregate is a Jacobian whose rows are the gradients of the
9 | | -individual objectives. The aggregator is used to reduce this matrix into an update vector for the
10 | | -parameters of the model.
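| | -
| | -For instance, aggregating a small Jacobian with :doc:`UPGrad <upgrad>` looks as
| | -follows (a minimal sketch of the intended usage, matching the example in the
| | -TorchJD README; it assumes that aggregators are callable on a matrix):
| | -
| | -.. code-block:: python
| | -
| | -    import torch
| | -
| | -    from torchjd.aggregation import UPGrad
| | -
| | -    # Jacobian with one row per objective; the two gradients conflict in
| | -    # their first coordinate (-4 vs. 6).
| | -    J = torch.tensor([[-4.0, 1.0, 1.0],
| | -                      [ 6.0, 1.0, 1.0]])
| | -
| | -    # Aggregators are used as callables: matrix in, update vector out.
| | -    A = UPGrad()
| | -    update = A(J)  # a single update vector for the parameters, in R^3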
| 7 | +Abstract base classes |
| 8 | +--------------------- |
11 | 9 |
|
12 | | -In TorchJD, an aggregator is a class that inherits from the abstract class |
13 | | -:doc:`Aggregator <bases>`. We provide the following list of aggregators from the literature: |
14 | | - |
15 | | -.. role:: raw-html(raw) |
16 | | - :format: html |
17 | | - |
18 | | -.. |yes| replace:: :raw-html:`<center><font color="#28b528">✔</font></center>` |
19 | | -.. |no| replace:: :raw-html:`<center><font color="#e63232">✘</font></center>` |
20 | | - |
21 | | -.. list-table:: |
22 | | - :widths: 25 15 15 15 |
23 | | - :header-rows: 1 |
24 | | - |
25 | | - * - :doc:`Aggregator <bases>` |
26 | | - - :ref:`Non-conflicting <Non-conflicting>` |
27 | | - - :ref:`Linear under scaling <Linear under scaling>` |
28 | | - - :ref:`Weighted <Weighted>` |
29 | | - * - :doc:`UPGrad <upgrad>` (recommended) |
30 | | - - |yes| |
31 | | - - |yes| |
32 | | - - |yes| |
33 | | - * - :doc:`Aligned-MTL <aligned_mtl>` |
34 | | - - |no| |
35 | | - - |no| |
36 | | - - |yes| |
37 | | - * - :doc:`CAGrad <cagrad>` |
38 | | - - |no| |
39 | | - - |no| |
40 | | - - |yes| |
41 | | - * - :doc:`ConFIG <config>` |
42 | | - - |no| |
43 | | - - |yes| |
44 | | - - |yes| |
45 | | - * - :doc:`Constant <constant>` |
46 | | - - |no| |
47 | | - - |yes| |
48 | | - - |yes| |
49 | | - * - :doc:`DualProj <dualproj>` |
50 | | - - |yes| |
51 | | - - |no| |
52 | | - - |yes| |
53 | | - * - :doc:`GradDrop <graddrop>` |
54 | | - - |no| |
55 | | - - |no| |
56 | | - - |no| |
57 | | - * - :doc:`IMTL-G <imtl_g>` |
58 | | - - |no| |
59 | | - - |no| |
60 | | - - |yes| |
61 | | - * - :doc:`Krum <krum>` |
62 | | - - |no| |
63 | | - - |no| |
64 | | - - |yes| |
65 | | - * - :doc:`Mean <mean>` |
66 | | - - |no| |
67 | | - - |yes| |
68 | | - - |yes| |
69 | | - * - :doc:`MGDA <mgda>` |
70 | | - - |yes| |
71 | | - - |no| |
72 | | - - |yes| |
73 | | - * - :doc:`Nash-MTL <nash_mtl>` |
74 | | - - |yes| |
75 | | - - |no| |
76 | | - - |yes| |
77 | | - * - :doc:`PCGrad <pcgrad>` |
78 | | - - |no| |
79 | | - - |yes| |
80 | | - - |yes| |
81 | | - * - :doc:`Random <random>` |
82 | | - - |no| |
83 | | - - |yes| |
84 | | - - |yes| |
85 | | - * - :doc:`Sum <sum>` |
86 | | - - |no| |
87 | | - - |yes| |
88 | | - - |yes| |
89 | | - * - :doc:`Trimmed Mean <trimmed_mean>` |
90 | | - - |no| |
91 | | - - |no| |
92 | | - - |no| |
93 | | - |
94 | | -.. hint:: |
95 | | - This table is an adaptation of the one available in `Jacobian Descent For Multi-Objective |
96 | | - Optimization <https://arxiv.org/pdf/2406.16232>`_. The paper gives a precise justification of
97 | | - these properties in Section 2.2, as well as proofs in Appendix B.
98 | | - |
99 | | -.. _Non-conflicting: |
100 | | -.. admonition:: |
101 | | - Non-conflicting |
102 | | - |
103 | | - An aggregator :math:`\mathcal A: \mathbb R^{m\times n} \to \mathbb R^n` is said to be |
104 | | - *non-conflicting* if for any :math:`J\in\mathbb R^{m\times n}`, :math:`J\cdot\mathcal A(J)` is a |
105 | | - vector with only non-negative elements. |
106 | | - |
107 | | - In other words, :math:`\mathcal A` is non-conflicting whenever the aggregation of any matrix has |
108 | | - non-negative inner product with all rows of that matrix. In the context of JD, this ensures that |
109 | | - no objective locally increases. |
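| | -
| | -This property can be checked numerically on an example (a sketch reusing the
| | -``UPGrad`` interface assumed above; the tolerance only absorbs floating-point
| | -error):
| | -
| | -.. code-block:: python
| | -
| | -    import torch
| | -
| | -    from torchjd.aggregation import UPGrad
| | -
| | -    J = torch.tensor([[-4.0, 1.0, 1.0],
| | -                      [ 6.0, 1.0, 1.0]])
| | -
| | -    # Each entry of J @ A(J) is the inner product of one gradient with the
| | -    # update; non-conflicting means that no entry is negative.
| | -    inner_products = J @ UPGrad()(J)
| | -    assert (inner_products >= -1e-6).all()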
110 | | - |
111 | | -.. _Linear under scaling: |
112 | | -.. admonition:: |
113 | | - Linear under scaling |
114 | | - |
115 | | - An aggregator :math:`\mathcal A: \mathbb R^{m\times n} \to \mathbb R^n` is said to be |
116 | | - *linear under scaling* if for any :math:`J\in\mathbb R^{m\times n}`, the mapping from any |
117 | | - positive :math:`c\in\mathbb R^{m}` to :math:`\mathcal A(\operatorname{diag}(c)\cdot J)` is
118 | | - linear in :math:`c`. |
119 | | - |
120 | | - In other words, :math:`\mathcal A` is linear under scaling whenever scaling a row of the matrix |
121 | | - to aggregate scales its influence proportionally. In the context of JD, this ensures that even |
122 | | - when the gradient norms are imbalanced, each gradient will contribute to the update |
123 | | - proportionally to its norm. |
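| | -
| | -As a concrete check, :doc:`Mean <mean>` is linear under scaling, so aggregating
| | -with a sum of positive row-scalings must equal the sum of the individually
| | -scaled aggregations (a sketch, under the same interface assumptions as above):
| | -
| | -.. code-block:: python
| | -
| | -    import torch
| | -
| | -    from torchjd.aggregation import Mean
| | -
| | -    J = torch.tensor([[-4.0, 1.0, 1.0],
| | -                      [ 6.0, 1.0, 1.0]])
| | -    A = Mean()
| | -
| | -    c1 = torch.tensor([1.0, 2.0])  # one positive scale per row of J
| | -    c2 = torch.tensor([0.5, 3.0])
| | -
| | -    # Linearity in c: scaling rows by c1 + c2 equals the sum of the two
| | -    # individually scaled aggregations.
| | -    lhs = A(torch.diag(c1 + c2) @ J)
| | -    rhs = A(torch.diag(c1) @ J) + A(torch.diag(c2) @ J)
| | -    assert torch.allclose(lhs, rhs)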
124 | | - |
125 | | -.. _Weighted: |
126 | | -.. admonition:: |
127 | | - Weighted |
128 | | - |
129 | | - An aggregator :math:`\mathcal A: \mathbb R^{m\times n} \to \mathbb R^n` is said to be *weighted* |
130 | | - if for any :math:`J\in\mathbb R^{m\times n}`, there exists a weight vector |
131 | | - :math:`w\in\mathbb R^m` such that :math:`\mathcal A(J)=J^\top w`. |
132 | | - |
133 | | - In other words, :math:`\mathcal A` is weighted whenever the aggregation of any matrix lies in
134 | | - the span of the rows of that matrix. In the context of JD, this improves the precision of the
135 | | - Taylor approximation that the method relies on.
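| | -
| | -Empirically, such a weight vector can be recovered by least squares: if the
| | -aggregator is weighted, the residual of :math:`\min_w \|J^\top w - \mathcal A(J)\|`
| | -vanishes up to numerical error (a sketch, same interface assumptions as above):
| | -
| | -.. code-block:: python
| | -
| | -    import torch
| | -
| | -    from torchjd.aggregation import UPGrad
| | -
| | -    J = torch.tensor([[-4.0, 1.0, 1.0],
| | -                      [ 6.0, 1.0, 1.0]])
| | -    a = UPGrad()(J)
| | -
| | -    # Solve J^T w = a in the least-squares sense and check the residual.
| | -    w = torch.linalg.lstsq(J.T, a.unsqueeze(-1)).solution.squeeze(-1)
| | -    assert torch.allclose(J.T @ w, a, atol=1e-5)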
| 10 | +.. autoclass:: torchjd.aggregation.Aggregator |
| 11 | + :members: |
| 12 | + :undoc-members: |
| 13 | + :exclude-members: forward |
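| | +
| | +A custom aggregator can be sketched by subclassing this base class. The snippet
| | +below is only illustrative: it assumes the usual ``torch.nn.Module`` convention
| | +of implementing ``forward`` (hinted at by ``:exclude-members: forward`` above);
| | +the exact contract is the one documented on the class itself.
| | +
| | +.. code-block:: python
| | +
| | +    import torch
| | +
| | +    from torchjd.aggregation import Aggregator
| | +
| | +    class FirstRow(Aggregator):
| | +        """Toy aggregator that returns the first row of the matrix."""
| | +
| | +        # NOTE: illustrative only; see the class documentation for the
| | +        # actual interface expected of subclasses.
| | +        def forward(self, matrix: torch.Tensor) -> torch.Tensor:
| | +            return matrix[0]
| | +
| | +    A = FirstRow()
| | +    A(torch.tensor([[-4.0, 1.0, 1.0], [6.0, 1.0, 1.0]]))  # tensor([-4., 1., 1.])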
136 | 14 |
|
| 15 | +.. autoclass:: torchjd.aggregation.Weighting |
| 16 | + :members: |
| 17 | + :undoc-members: |
| 18 | + :exclude-members: forward |
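| | +
| | +Similarly, a weighting maps a matrix to one weight per row, the aggregation
| | +being :math:`J^\top w`. A toy subclass might look as follows (again a sketch
| | +assuming a ``forward`` method that returns the weight vector):
| | +
| | +.. code-block:: python
| | +
| | +    import torch
| | +
| | +    from torchjd.aggregation import Weighting
| | +
| | +    class Uniform(Weighting):
| | +        """Toy weighting giving the same weight 1/m to each of the m rows."""
| | +
| | +        # NOTE: illustrative only; the documented Weighting interface takes
| | +        # precedence over this sketch.
| | +        def forward(self, matrix: torch.Tensor) -> torch.Tensor:
| | +            m = matrix.shape[0]
| | +            return torch.full((m,), 1.0 / m)
| | +
| | +    Uniform()(torch.tensor([[-4.0, 1.0, 1.0], [6.0, 1.0, 1.0]]))  # tensor([0.5000, 0.5000])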
137 | 19 |
|
138 | 20 |
|
139 | 21 | .. toctree:: |
140 | 22 | :hidden: |
141 | 23 | :maxdepth: 1 |
142 | 24 |
|
143 | | - bases.rst |
144 | 25 | upgrad.rst |
145 | 26 | aligned_mtl.rst |
146 | 27 | cagrad.rst |
|