Commit ffedd01

Update docs/guides/sequential_onloading.md
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
1 parent 87ade45 commit ffedd01

File tree

1 file changed: +1, -1 lines changed


docs/guides/sequential_onloading.md

Lines changed: 1 addition & 1 deletion
@@ -22,7 +22,7 @@ for layer in model.layers:
22 22
23 23  ## Implementation ##
24 24
25    -Before a model can be sequentially onloaded, it must first be broken up into disjoint parts which can be individually onloaded. This is achived through the [torch.fx.Tracer](https://github.com/pytorch/pytorch/blob/main/torch/fx/README.md#tracing) module, which allows a model to represented as a graph operations (nodes) and data inputs (edges). Once the model has been traced into a valid graph representation, the graph is cut (partitioned) into disjoint subgraphs, each of which is onloaded individually as a layer. This implementation can be found [here](/src/llmcompressor/pipelines/sequential/helpers.py).
   25 +Before a model can be sequentially onloaded, it must first be broken up into disjoint parts which can be individually onloaded. This is achieved through the [torch.fx.Tracer](https://github.com/pytorch/pytorch/blob/main/torch/fx/README.md#tracing) module, which allows a model to be represented as a graph of operations (nodes) and data inputs (edges). Once the model has been traced into a valid graph representation, the graph is cut (partitioned) into disjoint subgraphs, each of which is onloaded individually as a layer. This implementation can be found [here](/src/llmcompressor/pipelines/sequential/helpers.py).
26 26
27 27  ![sequential_onloading](../assets/model_graph.jpg)
28 28  *This image depicts some of the operations performed when executing the Llama3.2-Vision model*
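The paragraph being corrected describes tracing a model into a graph of operations with torch.fx before cutting it into disjoint subgraphs. A minimal sketch of that first step, using a hypothetical `TinyModel` rather than llm-compressor's actual helpers (the real partitioning logic lives in `src/llmcompressor/pipelines/sequential/helpers.py`):

```python
# Hypothetical sketch: trace a toy model with torch.fx and inspect the
# resulting graph of operations (nodes). Not llm-compressor's code.
import torch
from torch import nn
from torch.fx import symbolic_trace

class TinyModel(nn.Module):
    def __init__(self):
        super().__init__()
        # stand-in for a stack of transformer layers
        self.layers = nn.Sequential(nn.Linear(4, 4), nn.ReLU(), nn.Linear(4, 2))

    def forward(self, x):
        return self.layers(x)

# symbolic_trace produces a GraphModule whose graph lists every operation
graph_module = symbolic_trace(TinyModel())

# each call_module node corresponds to a submodule; a sequential pipeline
# would cut the graph between such nodes into disjoint subgraphs
module_nodes = [n for n in graph_module.graph.nodes if n.op == "call_module"]
for node in module_nodes:
    print(node.target)  # qualified submodule names such as layers.0
```

Iterating over `graph_module.graph.nodes` exposes the placeholder (input), `call_module` operations, and output edges that the partitioning step operates on.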
