Commit 8383275

docs: moving three tutorials from Pruna Pro to Pruna (#539)

* adding tutorials from pruna pro
* small changes to ring_attn tutorial
* make tutorials toctree explicit
* fixing colab link
* fixed grid cards

1 parent dd44897

File tree

4 files changed: +590 −0 lines changed
docs/tutorials/computer_vision.ipynb

Lines changed: 177 additions & 0 deletions
{
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "# Blazingly Fast Computer Vision Models"
      ]
    },
    {
      "cell_type": "raw",
      "metadata": {
        "vscode": {
          "languageId": "raw"
        }
      },
      "source": [
        "<a target=\"_blank\" href=\"https://colab.research.google.com/github/PrunaAI/pruna/blob/v|version|/docs/tutorials/computer_vision.ipynb\">\n",
        "  <img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/>\n",
        "</a>"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "This tutorial demonstrates how to use the `pruna` package to optimize any custom computer vision model. We use the `vit_b_16` model as an example; all execution times given below were measured on a T4 GPU."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "### 1. Loading the CV Model\n",
        "\n",
        "First, load your ViT model."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {},
      "source": [
        "import torchvision\n",
        "\n",
        "model = torchvision.models.vit_b_16(weights=\"ViT_B_16_Weights.DEFAULT\").cuda()"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "### 2. Initializing the Smash Config\n",
        "\n",
        "Next, initialize the `smash_config`."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {},
      "source": [
        "from pruna import SmashConfig\n",
        "\n",
        "# Initialize the SmashConfig with the x_fast compiler\n",
        "smash_config = SmashConfig([\"x_fast\"])"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "### 3. Smashing the Model\n",
        "\n",
        "Now, you can smash the model, which takes around 5 seconds."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {},
      "source": [
        "from pruna import smash\n",
        "\n",
        "# Smash the model\n",
        "smashed_model = smash(\n",
        "    model=model,\n",
        "    smash_config=smash_config,\n",
        ")"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "### 4. Preparing the Input"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {},
      "source": [
        "import numpy as np\n",
        "from torchvision import transforms\n",
        "\n",
        "# Generate a random image and move it to the GPU\n",
        "image = np.random.randint(0, 256, size=(224, 224, 3), dtype=np.uint8)\n",
        "input_tensor = transforms.ToTensor()(image).unsqueeze(0).cuda()"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "### 5. Running the Model\n",
        "\n",
        "After the model has been compiled, run inference for a few iterations as a warm-up. This takes around 8 seconds."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {},
      "source": [
        "# Run some warm-up iterations to trigger compilation\n",
        "for _ in range(5):\n",
        "    smashed_model(input_tensor)"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Finally, run the model with accelerated inference."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {},
      "source": [
        "# Run the smashed model and display the result\n",
        "smashed_model(input_tensor)"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "### Wrap Up\n",
        "\n",
        "Congratulations! You have successfully smashed a CV model and can now use the `pruna` package to optimize any custom CV model. The only parts you should modify to fit your use case are steps 1, 4, and 5."
      ]
    }
  ],
  "metadata": {
    "kernelspec": {
      "display_name": "pruna",
      "language": "python",
      "name": "python3"
    },
    "language_info": {
      "name": "python",
      "version": "3.10.15"
    }
  },
  "nbformat": 4,
  "nbformat_minor": 2
}
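The notebook reports rough wall-clock estimates ("around 5 seconds", "around 8 seconds") but never shows how to quantify the speedup itself. Below is a minimal, library-agnostic sketch of a latency benchmark; the `measure_latency` helper is our own illustration, not part of the `pruna` API, and for CUDA models you would additionally call `torch.cuda.synchronize()` around the timed region so that asynchronously launched kernels are fully counted.

```python
import time


def measure_latency(fn, inputs, warmup=5, runs=20):
    """Call fn(inputs) repeatedly and return the median latency in milliseconds.

    Warm-up iterations are excluded from timing: they absorb one-off costs
    such as compilation and cache population. For CUDA models, synchronize
    the device before reading each timestamp.
    """
    for _ in range(warmup):
        fn(inputs)
    times = []
    for _ in range(runs):
        start = time.perf_counter()
        fn(inputs)
        times.append((time.perf_counter() - start) * 1000.0)
    times.sort()
    return times[len(times) // 2]  # median is robust to stragglers
```

With `smashed_model` and `input_tensor` from the tutorial, `measure_latency(smashed_model, input_tensor)` returns the median forward-pass latency in milliseconds; running the same call on the original `model` gives a baseline for computing the actual speedup on your hardware.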

docs/tutorials/index.rst

Lines changed: 32 additions & 0 deletions

@@ -75,6 +75,7 @@ These tutorials will guide you through the process of using |pruna| to optimize
       :link: ./sd_deepcache.ipynb
 
       Optimize your ``diffusion`` model with ``deepcache`` ``caching``.
+
    .. grid-item-card:: Optimize and Deploy Sana diffusers with Pruna and Hugging Face
       :text-align: center
       :link: ./deploying_sana_tutorial.ipynb
@@ -87,10 +88,41 @@ These tutorials will guide you through the process of using |pruna| to optimize
 
       Learn how to use the ``target_modules`` parameter to target specific modules in your model.
 
+   .. grid-item-card:: Blazingly Fast Computer Vision
+      :text-align: center
+      :link: ./computer_vision.ipynb
+
+      Optimize any ``computer vision`` model with ``x_fast`` ``compilation``.
+
+   .. grid-item-card:: Recover Quality after Quantization
+      :text-align: center
+      :link: ./recovery.ipynb
+
+      Recover quality using ``text_to_image_perp`` after ``diffusers_int8`` ``quantization``.
+
+   .. grid-item-card:: Distribute across GPUs with Ring Attention
+      :text-align: center
+      :link: ./ring_attn.ipynb
+
+      Distribute your ``Flux`` model across multiple GPUs with ``ring_attn`` and ``torch_compile``.
+
+   .. grid-item-card:: Reducing Warm-up Time for Compilation
+      :text-align: center
+      :link: ./portable_compilation.ipynb
+
+      Reduce warm-up time significantly when re-loading a ``torch_compile`` compiled model on a new machine.
+
+   .. grid-item-card:: Quantize and Speedup any LLM
+      :text-align: center
+      :link: ./llm_quantization_compilation_acceleration.ipynb
+
+      Optimize latency and memory footprint of any LLM with ``hqq`` ``quantization`` and ``torch_compile`` ``compilation``.
+
 .. toctree::
    :hidden:
    :maxdepth: 1
    :caption: Pruna
    :glob:
 
    ./*
+