Commit 6ea2644

Refactor Pruna documentation for clarity and consistency
- Removed outdated references to FLUX-juiced and streamlined the explanation of benchmarking.
- Enhanced the description of evaluating standalone `diffusers` models.
- Cleaned up code examples by removing unnecessary imports and comments for better readability.
1 parent 2f05b6e commit 6ea2644

File tree

1 file changed: +3 -34 lines changed


docs/source/en/optimization/pruna.md

Lines changed: 3 additions & 34 deletions
@@ -95,8 +95,6 @@ The resulting generated image and inference per optimization configuration are s
 <img src="https://huggingface.co/datasets/PrunaAI/documentation-images/resolve/main/diffusers/flux_smashed_comparison.png">
 </div>
 
-Besides the results shown above, we have also used Pruna to create [FLUX-juiced, the fastest image generation endpoint alive](https://www.pruna.ai/blog/flux-juiced-the-fastest-image-generation-endpoint). We benchmarked our model against, FLUX.1-dev versions provided by different inference frameworks and surpassed them all. Full results of this benchmark can be found in [our blog post](https://huggingface.co/blog/PrunaAI/flux-fastest-image-generation-endpoint) and [our InferBench space](https://huggingface.co/spaces/PrunaAI/InferBench).
-
 As you can see, Pruna is a very simple and easy to use framework that allows you to optimize your models with minimal effort. We already saw that the results look good to the naked eye but the cool thing is that you can also use Pruna to benchmark and evaluate your optimized models.
 
 ## Evaluate and benchmark diffusers models
@@ -157,9 +155,11 @@ smashed_model_results = eval_agent.evaluate(smashed_pipe)
 smashed_pipe.move_to_device("cpu")
 ```
 
+Besides the results we can get from the `EvaluationAgent` above, we have also used a similar approach to create and benchmark [FLUX-juiced, the fastest image generation endpoint alive](https://www.pruna.ai/blog/flux-juiced-the-fastest-image-generation-endpoint). We benchmarked our model against FLUX.1-dev versions provided by different inference frameworks and surpassed them all. Full results of this benchmark can be found in [our blog post](https://huggingface.co/blog/PrunaAI/flux-fastest-image-generation-endpoint) and [our InferBench space](https://huggingface.co/spaces/PrunaAI/InferBench).
+
 ### Evaluate and benchmark standalone diffusers models
 
-Instead of comparing the optimized model to the base model, you can also evaluate the standalone `diffusers` model. This is useful if you want to evaluate the performance of the model without the optimization. We can do so by using the `PrunaModel` wrapper.
+Instead of comparing the optimized model to the base model, you can also evaluate the standalone `diffusers` model. This is useful if you want to evaluate the performance of the model without the optimization. We can do so by using the `PrunaModel` wrapper and running the `EvaluationAgent` on it.
 
 Let's take a look at an example of how to evaluate and benchmark a standalone `diffusers` model.
 
@@ -168,17 +168,6 @@ import torch
 from diffusers import FluxPipeline
 
 from pruna import PrunaModel
-from pruna.data.pruna_datamodule import PrunaDataModule
-from pruna.evaluation.evaluation_agent import EvaluationAgent
-from pruna.evaluation.metrics import (
-    ThroughputMetric,
-    TorchMetricWrapper,
-    TotalTimeMetric,
-)
-from pruna.evaluation.task import Task
-
-# define the device
-device = "cuda" if torch.cuda.is_available() else "mps" if torch.backends.mps.is_available() else "cpu"
 
 # load the model
 # Try PrunaAI/Segmind-Vega-smashed or PrunaAI/FLUX.1-dev-smashed with a small GPU memory
@@ -187,26 +176,6 @@ pipe = FluxPipeline.from_pretrained(
     torch_dtype=torch.bfloat16
 ).to("cpu")
 wrapped_pipe = PrunaModel(model=pipe)
-
-# Define the metrics
-metrics = [
-    TotalTimeMetric(n_iterations=20, n_warmup_iterations=5),
-    ThroughputMetric(n_iterations=20, n_warmup_iterations=5),
-    TorchMetricWrapper("clip"),
-]
-
-# Define the datamodule
-datamodule = PrunaDataModule.from_string("LAION256")
-datamodule.limit_datasets(10)
-
-# Define the task and evaluation agent
-task = Task(metrics, datamodule=datamodule, device=device)
-eval_agent = EvaluationAgent(task)
-
-# Evaluate base model and offload it to CPU
-wrapped_pipe.move_to_device(device)
-base_model_results = eval_agent.evaluate(wrapped_pipe)
-wrapped_pipe.move_to_device("cpu")
 ```
 
 Now that you have seen how to optimize and evaluate your models, you can start using Pruna to optimize your own models. Luckily, we have many examples to help you get started.
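
Note: the deleted hunks above still add up to one complete standalone evaluation example. Reassembled for reference, it reads as the sketch below; every `pruna` import and call is taken verbatim from the removed lines, while the checkpoint id is an assumption for illustration, since the hunk boundaries elide the exact model id used in the docs.

```python
import torch
from diffusers import FluxPipeline

from pruna import PrunaModel
from pruna.data.pruna_datamodule import PrunaDataModule
from pruna.evaluation.evaluation_agent import EvaluationAgent
from pruna.evaluation.metrics import (
    ThroughputMetric,
    TorchMetricWrapper,
    TotalTimeMetric,
)
from pruna.evaluation.task import Task

# define the device
device = "cuda" if torch.cuda.is_available() else "mps" if torch.backends.mps.is_available() else "cpu"

# load the model on CPU and wrap it for Pruna
# checkpoint id assumed for illustration; the hunks above elide the exact model id
pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev",
    torch_dtype=torch.bfloat16,
).to("cpu")
wrapped_pipe = PrunaModel(model=pipe)

# define the metrics: wall-clock time, throughput, and CLIP score
metrics = [
    TotalTimeMetric(n_iterations=20, n_warmup_iterations=5),
    ThroughputMetric(n_iterations=20, n_warmup_iterations=5),
    TorchMetricWrapper("clip"),
]

# define the datamodule, limited to 10 samples for a quick run
datamodule = PrunaDataModule.from_string("LAION256")
datamodule.limit_datasets(10)

# define the task and evaluation agent
task = Task(metrics, datamodule=datamodule, device=device)
eval_agent = EvaluationAgent(task)

# evaluate the base model, then offload it back to CPU
wrapped_pipe.move_to_device(device)
base_model_results = eval_agent.evaluate(wrapped_pipe)
wrapped_pipe.move_to_device("cpu")
```

Loading on CPU and only calling `move_to_device` around `evaluate` mirrors the offloading pattern used for `smashed_pipe` earlier in the diff, which keeps GPU memory free outside the measured run.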
