
Commit 412f4ff

contentis, w-e-w, MorkTheOrk, eltociear, and Milly authored
Hot fix (#43)

* trt profile as one markdown
* remove todo and comment code
* updated install.py to catch old packages
* Fix typo in utilities.py speciailized -> specialized
* move available profiles into a separate column
* Correct "Export Default Engines"
* remove yield in favour of print
* fix: installed check for onnx-graphsurgeon. Apply the package import name to `launch.is_installed()` instead of the pip package name.
* remove ui config mod
* extend readme
* fix bug when deleting engine manually
* adding resolution constraint and nvidia support guide
* fix type

---------

Co-authored-by: w-e-w <[email protected]>
Co-authored-by: Cem Moluluo <[email protected]>
Co-authored-by: Ikko Eltociear Ashimine <[email protected]>
Co-authored-by: Cem Moluluo <[email protected]>
Co-authored-by: Milly <[email protected]>
1 parent 82d52e1 commit 412f4ff

File tree

7 files changed: +81 additions, -98 deletions

README.md

Lines changed: 25 additions & 2 deletions
@@ -28,8 +28,31 @@ Happy prompting!
 
 TensorRT uses optimized engines for specific resolutions and batch sizes. You can generate as many optimized engines as desired. Types:
 
-- The “Generate Default Engines” selection adds support for resolutions between 512x512 and 768x768 for Stable Diffusion 1.5 and 768x768 to 1024x1024 for SDXL with batch sizes 1 to 4.
+- The “Export Default Engines” selection adds support for resolutions between 512x512 and 768x768 for Stable Diffusion 1.5 and 768x768 to 1024x1024 for SDXL with batch sizes 1 to 4.
 - Static engines support a single specific output resolution and batch size.
 - Dynamic engines support a range of resolutions and batch sizes, at a small cost in performance. Wider ranges will use more VRAM.
 
-Each preset can be adjusted with the “Advanced Settings” option.
+Each preset can be adjusted with the “Advanced Settings” option. More detailed instructions can be found [here](https://nvidia.custhelp.com/app/answers/detail/a_id/5487/~/tensorrt-extension-for-stable-diffusion-web-ui).
+
+### Common Issues/Limitations
+
+**HIRES FIX:** If using the hires.fix option in Automatic1111, you must build engines that match both the starting and ending resolutions. For instance, if the initial size is `512 x 512` and hires.fix upscales to `1024 x 1024`, you must either generate two engines, one at 512 and one at 1024, or generate a single dynamic engine that covers the whole range.
+Having two separate engines will heavily impact performance at the moment. Stay tuned for updates.
+
+**Resolution:** When generating images, the resolution needs to be a multiple of 64. This applies to hires.fix as well, requiring both the low-res and high-res passes to be divisible by 64.
+
+**Failing CMD arguments:**
+
+- `medvram` and `lowvram` have caused issues when compiling the engine and running it.
+- `api` has caused the `model.json` to not be updated, resulting in SD Unets not appearing after compilation.
+
+**Failing installation or TensorRT tab not appearing in UI:** This is most likely due to a failed install. To resolve it manually, use this [guide](https://github.com/NVIDIA/Stable-Diffusion-WebUI-TensorRT/issues/27#issuecomment-1767570566).
+
+## Requirements
+
+**Driver**:
+
+- Linux: >= 450.80.02
+- Windows: >= 452.39
+
+We always recommend keeping the driver up to date for system-wide performance improvements.
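The resolution rules added above are easy to get wrong in practice, so here is a minimal sketch of the two checks the README describes: dimensions must be multiples of 64, and an engine only serves requests inside its profile range. This is illustrative only, not part of the commit; the function names and ranges are made up.

```python
# Illustrative only -- not part of this commit. A minimal sketch of the
# resolution rules the README describes.

def is_valid_resolution(width: int, height: int) -> bool:
    # The extension requires both dimensions to be divisible by 64.
    return width % 64 == 0 and height % 64 == 0

def engine_covers(profile_min: int, profile_max: int, size: int) -> bool:
    # Hypothetical range check: a dynamic profile built for 512..768
    # cannot serve a 1024 hires.fix pass.
    return profile_min <= size <= profile_max

assert is_valid_resolution(512, 768)
assert not is_valid_resolution(500, 768)   # 500 is not a multiple of 64
assert not engine_covers(512, 768, 1024)   # needs a second engine or a wider range
```

This is exactly the hires.fix situation described above: a 512-to-1024 upscale needs either two engines or one dynamic engine whose range spans both sizes.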

info.md

Lines changed: 1 addition & 1 deletion
@@ -15,7 +15,7 @@ Happy prompting!
 
 TensorRT uses optimized engines for specific resolutions and batch sizes. You can generate as many optimized engines as desired. Types:
 
-- The "Generate Default Engines" selection adds support for resolutions between 512x512 and 768x768 for Stable Diffusion 1.5 and 768x768 to 1024x1024 for SDXL with batch sizes 1 to 4.
+- The "Export Default Engines" selection adds support for resolutions between 512x512 and 768x768 for Stable Diffusion 1.5 and 768x768 to 1024x1024 for SDXL with batch sizes 1 to 4.
 - Static engines support a single specific output resolution and batch size.
 - Dynamic engines support a range of resolutions and batch sizes, at a small cost in performance. Wider ranges will use more VRAM.

install.py

Lines changed: 12 additions & 11 deletions
@@ -1,30 +1,31 @@
 import launch
-from modules import shared
+from importlib_metadata import version
 
 
 def install():
+    if launch.is_installed("tensorrt"):
+        if not version("tensorrt") == "9.0.1.post11.dev4":
+            launch.run(["python","-m","pip","uninstall","-y","tensorrt"], "removing old version of tensorrt")
+
     if not launch.is_installed("tensorrt"):
         print("TensorRT is not installed! Installing...")
         launch.run_pip("install nvidia-cudnn-cu11==8.9.4.25", "nvidia-cudnn-cu11")
         launch.run_pip("install --pre --extra-index-url https://pypi.nvidia.com tensorrt==9.0.1.post11.dev4", "tensorrt", live=True)
-        launch.run(["python","-m","pip","uninstall","-y","nvidia-cudnn-cu11"],"removing nvidia-cudnn-cu11")
+        launch.run(["python","-m","pip","uninstall","-y","nvidia-cudnn-cu11"], "removing nvidia-cudnn-cu11")
+
+    if launch.is_installed("nvidia-cudnn-cu11"):
+        if version("nvidia-cudnn-cu11") == "8.9.4.25":
+            launch.run(["python","-m","pip","uninstall","-y","nvidia-cudnn-cu11"], "removing nvidia-cudnn-cu11")
 
     # Polygraphy
     if not launch.is_installed("polygraphy"):
         print("Polygraphy is not installed! Installing...")
         launch.run_pip("install polygraphy --extra-index-url https://pypi.ngc.nvidia.com", "polygraphy", live=True)
 
     # ONNX GS
-    if not launch.is_installed("onnx-graphsurgeon"):
+    if not launch.is_installed("onnx_graphsurgeon"):
         print("GS is not installed! Installing...")
         launch.run_pip("install protobuf==3.20.2", "protobuf", live=True)
         launch.run_pip('install onnx-graphsurgeon --extra-index-url https://pypi.ngc.nvidia.com', "onnx-graphsurgeon", live=True)
-
-    if shared.opts is None:
-        print("UI Config not initialized")
-        return
-
-    if "sd_unet" not in shared.opts["quicksettings_list"]:
-        shared.opts["quicksettings_list"].append("sd_unet")
-        shared.opts.save(shared.config_filename)
 
 
 install()
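Two details in this hunk are worth spelling out. First, the "catch old packages" logic pins an exact version and uninstalls stale installs before the install step runs. Second, the onnx-graphsurgeon check now passes the *import* name (`onnx_graphsurgeon`) because `launch.is_installed()` probes importability, while version metadata lookups use the pip distribution name. Below is a rough standalone sketch of the version-pinning pattern using only the standard library; the extension itself goes through A1111's `launch` helpers instead, and the `PINNED` table is illustrative.

```python
# A standalone sketch of the version-pinning pattern, assuming only the
# standard library. Not the extension's actual code.
import subprocess
import sys
from importlib.metadata import version, PackageNotFoundError

PINNED = {"tensorrt": "9.0.1.post11.dev4"}  # distribution name -> required version

def ensure_pinned(package: str, required: str) -> None:
    try:
        installed = version(package)  # metadata lookup uses the distribution name
    except PackageNotFoundError:
        installed = None
    if installed is not None and installed != required:
        # Remove the stale version so a clean pinned install can follow.
        subprocess.run(
            [sys.executable, "-m", "pip", "uninstall", "-y", package],
            check=True,
        )

for pkg, ver in PINNED.items():
    ensure_pinned(pkg, ver)
```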

model_manager.py

Lines changed: 3 additions & 1 deletion
@@ -63,7 +63,9 @@ def update(self):
             for trt_file in os.listdir(TRT_MODEL_DIR)
             if trt_file.endswith(".trt")
         ]
-        for cc, base_models in self.all_models.items():
+
+        tmp_all_models = self.all_models.copy()
+        for cc, base_models in tmp_all_models.items():
             for base_model, models in base_models.items():
                 tmp_config_list = {}
                 for model_config in models:
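The copy here is the substance of the "fix bug when deleting engine manually" item: when an engine file has been removed by hand, entries get deleted from `all_models` while it is being iterated, and mutating a dict during iteration raises a `RuntimeError` in Python. A minimal reproduction of the pattern, with hypothetical data:

```python
# Why iterate over a copy: deleting from a dict while iterating over it
# raises RuntimeError. Hypothetical data, illustrating the fix above.
models = {"cc89": ["engine_a.trt"], "cc86": []}

# for cc, engines in models.items():      # RuntimeError: dictionary changed
#     if not engines:                     # size during iteration
#         del models[cc]

for cc, engines in models.copy().items():  # snapshot the items first
    if not engines:
        del models[cc]                     # safe: mutates only the original

print(models)  # {'cc89': ['engine_a.trt']}
```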

scripts/trt.py

Lines changed: 1 addition & 1 deletion
@@ -106,7 +106,7 @@ def switch_engine(self, feed_dict):
         )
         if len(valid_models) == 0:
             raise ValueError(
-                "No valid profile found. Please go to the TensorRT tab and generate an engine with the necessary profile. Or use the default (torch) U-Net."
+                "No valid profile found. Please go to the TensorRT tab and generate an engine with the necessary profile. If using hires.fix, you need an engine for both the base and upscaled resolutions. Otherwise, use the default (torch) U-Net."
             )
 
         best = valid_models[np.argmin(distances)]
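For context on that last line, here is a toy sketch of how the closest valid profile might be picked with `np.argmin`. The ranges and distance metric are illustrative, not the extension's actual data structures.

```python
# Toy sketch of nearest-profile selection; names and metric are made up.
import numpy as np

requested = 768
profiles = [(256, 512), (512, 1024), (768, 2048)]  # hypothetical (min, max) ranges

valid = [(lo, hi) for lo, hi in profiles if lo <= requested <= hi]
if not valid:
    raise ValueError("No valid profile found. Build an engine covering this resolution.")

# e.g. distance = how far the request sits from the centre of each range
distances = [abs(requested - (lo + hi) / 2) for lo, hi in valid]
best = valid[np.argmin(distances)]
print(best)  # (512, 1024)
```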

ui_trt.py

Lines changed: 38 additions & 81 deletions
@@ -31,18 +31,6 @@ def get_version_from_model(sd_model):
     return "xl-1.0"
 
 
-class LogLevel:
-    Debug = 0
-    Info = 1
-    Warning = 2
-    Error = 3
-
-
-def log_md(logging_history, message, prefix="**[INFO]:**"):
-    logging_history += f"{prefix} {message} \n"
-    return logging_history
-
-
 def export_unet_to_trt(
     batch_min,
     batch_opt,
@@ -61,7 +49,6 @@ def export_unet_to_trt(
     preset,
     controlnet=None,
 ):
-    logging_history = ""
 
     if preset == "Default":
         (
@@ -82,10 +69,7 @@ def export_unet_to_trt(
     use_fp32 = False
     if cc_major < 7:
         use_fp32 = True
-        logging_history = log_md(
-            logging_history, "FP16 has been disabled because your GPU does not support it."
-        )
-        yield logging_history
+        print("FP16 has been disabled because your GPU does not support it.")
 
     unet_hidden_dim = shared.sd_model.model.diffusion_model.in_channels
     if unet_hidden_dim == 9:
@@ -95,10 +79,7 @@ def export_unet_to_trt(
     model_name = shared.sd_model.sd_checkpoint_info.model_name
     onnx_filename, onnx_path = modelmanager.get_onnx_path(model_name, model_hash)
 
-    logging_history = log_md(
-        logging_history, f"Exporting {model_name} to TensorRT", prefix="###"
-    )
-    yield logging_history
+    print(f"Exporting {model_name} to TensorRT")
 
     timing_cache = modelmanager.get_timing_cache()
 
@@ -149,27 +130,23 @@ def export_unet_to_trt(
     print(profile)
 
     if not os.path.exists(onnx_path):
-        logging_history = log_md(logging_history, "No ONNX file found. Exporting ONNX")
-        yield logging_history
+        print("No ONNX file found. Exporting ONNX...")
+        gr.Info("No ONNX file found. Exporting ONNX... Please check the progress in the terminal.")
        export_onnx(
             onnx_path,
             modelobj,
             profile=profile,
             diable_optimizations=diable_optimizations,
         )
-        logging_history = log_md(logging_history, "Exported to ONNX.")
-        yield logging_history
+        print("Exported to ONNX.")
 
     trt_engine_filename, trt_path = modelmanager.get_trt_path(
         model_name, model_hash, profile, static_shapes
     )
 
     if not os.path.exists(trt_path) or force_export:
-        logging_history = log_md(
-            logging_history,
-            "Building TensorRT engine... This can take a while, please check the progress in the terminal.",
-        )
-        yield logging_history
+        print("Building TensorRT engine... This can take a while, please check the progress in the terminal.")
+        gr.Info("Building TensorRT engine... This can take a while, please check the progress in the terminal.")
         gc.collect()
         torch.cuda.empty_cache()
         ret = export_trt(
@@ -180,12 +157,9 @@ def export_unet_to_trt(
             use_fp16=not use_fp32,
         )
         if ret:
-            yield logging_history + "\n --- \n ## Export Failed due to unknown reason. See shell for more information. \n"
-            return
-        logging_history = log_md(
-            logging_history, "TensorRT engines has been saved to disk."
-        )
-        yield logging_history
+            return "## Export Failed due to unknown reason. See shell for more information. \n"
+
+        print("TensorRT engines has been saved to disk.")
         modelmanager.add_entry(
             model_name,
             model_hash,
@@ -199,25 +173,17 @@ def export_unet_to_trt(
             lora=False,
         )
     else:
-        logging_history = log_md(
-            logging_history,
-            "TensorRT engine found. Skipping build. You can enable Force Export in the Advanced Settings to force a rebuild if needed.",
-        )
-        yield logging_history
+        print("TensorRT engine found. Skipping build. You can enable Force Export in the Advanced Settings to force a rebuild if needed.")
 
-    yield logging_history + "\n --- \n ## Exported Successfully \n"
+    return "## Exported Successfully \n"
 
 
 def export_lora_to_trt(lora_name, force_export):
-    logging_history = ""
     is_inpaint = False
     use_fp32 = False
     if cc_major < 7:
         use_fp32 = True
-        logging_history = log_md(
-            logging_history, "FP16 has been disabled because your GPU does not support it."
-        )
-        yield logging_history
+        print("FP16 has been disabled because your GPU does not support it.")
     unet_hidden_dim = shared.sd_model.model.diffusion_model.in_channels
     if unet_hidden_dim == 9:
         is_inpaint = True
@@ -261,8 +227,8 @@ def export_lora_to_trt(lora_name, force_export):
         diable_optimizations = False
 
     if not os.path.exists(onnx_lora_path):
-        logging_history = log_md(logging_history, "No ONNX file found. Exporting ONNX")
-        yield logging_history
+        print("No ONNX file found. Exporting ONNX...")
+        gr.Info("No ONNX file found. Exporting ONNX... Please check the progress in the terminal.")
         export_onnx(
             onnx_lora_path,
             modelobj,
@@ -272,33 +238,29 @@ def export_lora_to_trt(lora_name, force_export):
             diable_optimizations=diable_optimizations,
             lora_path=lora_model["filename"],
         )
-        logging_history = log_md(logging_history, "Exported to ONNX.")
-        yield logging_history
+        print("Exported to ONNX.")
 
     trt_lora_name = onnx_lora_filename.replace(".onnx", ".trt")
     trt_lora_path = os.path.join(TRT_MODEL_DIR, trt_lora_name)
 
     available_trt_unet = modelmanager.available_models()
     if len(available_trt_unet[base_name]) == 0:
-        logging_history = log_md(logging_history, "Please export the base model first.")
-        yield logging_history
+        return "## Please export the base model first."
     trt_base_path = os.path.join(
         TRT_MODEL_DIR, available_trt_unet[base_name][0]["filepath"]
     )
 
     if not os.path.exists(onnx_base_path):
-        raise ValueError("Please export the base model first.")
+        return "## Please export the base model first."
 
     if not os.path.exists(trt_lora_path) or force_export:
-        logging_history = log_md(
-            logging_history, "No TensorRT engine found. Building..."
-        )
-        yield logging_history
+        print("No TensorRT engine found. Building...")
+        gr.Info("No TensorRT engine found. Building...")
+
         engine = Engine(trt_base_path)
         engine.load()
         engine.refit(onnx_base_path, onnx_lora_path, dump_refit_path=trt_lora_path)
-        logging_history = log_md(logging_history, "Built TensorRT engine.")
-        yield logging_history
+        print("Built TensorRT engine.")
 
     modelmanager.add_lora_entry(
         base_name,
@@ -309,7 +271,7 @@ def export_lora_to_trt(lora_name, force_export):
         0,
         unet_hidden_dim,
     )
-    yield logging_history + "\n --- \n ## Exported Successfully \n"
+    return "## Exported Successfully \n"
 
 
 def export_default_unet_to_trt():
@@ -827,23 +789,27 @@ def on_ui_tabs():
             with gr.Accordion("Output", open=True):
                 trt_result = gr.Markdown(elem_id="trt_result", value="")
 
+            def get_trt_profiles_markdown():
+                profiles_md_string = ""
+                for model, profiles in engine_profile_card().items():
+                    profiles_md_string += f"<details><summary>{model} ({len(profiles)} Profiles)</summary>\n\n"
+                    for i, profile in enumerate(profiles):
+                        profiles_md_string += f"#### Profile {i} \n{profile}\n\n"
+                    profiles_md_string += "</details>\n"
+                profiles_md_string += "</details>\n"
+                return profiles_md_string
+
             with gr.Column(variant="panel"):
                 with gr.Row(equal_height=True, variant="compact"):
                     button_refresh_profiles = ToolButton(value=refresh_symbol, elem_id="trt_refresh_profiles", visible=True)
                     profile_header_md = gr.Markdown(
                         value=f"## Available TensorRT Engine Profiles"
                     )
-                engines_md = engine_profile_card()
-                for model, profiles in engines_md.items():
-                    with gr.Row(equal_height=False):
-                        row_name = model + " ({} Profiles)".format(len(profiles))
-                        with gr.Accordion(row_name, open=False):
-                            out_string = ""
-                            for i, profile in enumerate(profiles):
-                                out_string += f"#### Profile {i} \n"
-                                out_string += profile
-                                out_string += "\n\n"
-                            gr.Markdown(elem_id=f"trt_{model}_{i}", value=out_string)
+                with gr.Row(equal_height=True):
+                    trt_profiles_markdown = gr.Markdown(elem_id=f"trt_profiles_markdown", value=get_trt_profiles_markdown())
+
+            button_refresh_profiles.click(lambda: gr.Markdown.update(value=get_trt_profiles_markdown()), outputs=[trt_profiles_markdown])
 
             button_export_unet.click(
                 export_unet_to_trt,
@@ -895,13 +861,4 @@ def on_ui_tabs():
         outputs=[trt_result],
     )
 
-
-    # TODO Dynamically update available profiles. Not possible with gradio?!
-    button_refresh_profiles.click(
-        fn=shared.state.request_restart,
-        _js='restart_reload',
-        inputs=[],
-        outputs=[],
-    )
-
     return [(trt_interface, "TensorRT", "tensorrt")]
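The net effect of the last two hunks is that the refresh button now rebuilds the profile markdown in place instead of restarting the whole WebUI (the removed `shared.state.request_restart` handler). Below is a minimal, self-contained sketch of that pattern, assuming Gradio 3.x (where `gr.Markdown.update` exists, matching its use above) and a stand-in data source in place of `engine_profile_card()`.

```python
# Minimal sketch of the in-place refresh pattern this commit adopts.
# Assumes Gradio 3.x; build_profiles_markdown() is a stand-in for
# get_trt_profiles_markdown() in the extension.
import gradio as gr

def build_profiles_markdown() -> str:
    # Real data would come from engine_profile_card(); this is dummy output
    # in the same <details>/<summary> shape the new UI code produces.
    return (
        "<details><summary>model_a (1 Profiles)</summary>\n\n"
        "#### Profile 0 \n512x512 .. 768x768, batch 1-4\n\n"
        "</details>\n"
    )

with gr.Blocks() as demo:
    refresh = gr.Button("Refresh profiles")
    md = gr.Markdown(value=build_profiles_markdown())
    # The click handler returns fresh markdown for the existing component,
    # so no UI restart is needed.
    refresh.click(lambda: gr.Markdown.update(value=build_profiles_markdown()), outputs=[md])

if __name__ == "__main__":
    demo.launch()
```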

utilities.py

Lines changed: 1 addition & 1 deletion
@@ -228,7 +228,7 @@ def map_name(name):
         refitter = trt.Refitter(self.engine, TRT_LOGGER)
         all_weights = refitter.get_all()
         for layer_name, role in zip(all_weights[0], all_weights[1]):
-            # for speciailized roles, use a unique name in the map:
+            # for specialized roles, use a unique name in the map:
             if role == trt.WeightsRole.KERNEL:
                 name = layer_name + "_TRTKERNEL"
             elif role == trt.WeightsRole.BIAS:
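The comment being fixed describes a real pattern: the refitter can report the same layer under several weight roles, so each role gets a unique key in the weight map. A small sketch with a stand-in enum; the real code uses `tensorrt.WeightsRole` and TensorRT's refit API as shown in the hunk above.

```python
# Sketch of the role-suffix naming scheme; the enum is a stand-in for
# tensorrt.WeightsRole, and the suffixes match the code above.
from enum import Enum, auto

class WeightsRole(Enum):
    KERNEL = auto()
    BIAS = auto()
    CONSTANT = auto()

def map_name(layer_name: str, role: WeightsRole) -> str:
    # for specialized roles, use a unique name in the map:
    if role is WeightsRole.KERNEL:
        return layer_name + "_TRTKERNEL"
    if role is WeightsRole.BIAS:
        return layer_name + "_TRTBIAS"
    return layer_name  # other roles keep the raw layer name

assert map_name("conv1", WeightsRole.KERNEL) == "conv1_TRTKERNEL"
assert map_name("conv1", WeightsRole.BIAS) == "conv1_TRTBIAS"
```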
