Commit b9aa90d ("fix"), 1 parent ca1e6c0

1 file changed: 13 additions, 2 deletions


README.md

````diff
@@ -67,10 +67,21 @@ For hardware, we used a 96GB 700W H100 GPU. Some of the optimizations applied (B
 
 ## Run the optimized pipeline
 
-TODO
+```sh
+python gen_image.py --prompt "An astronaut standing next to a giant lemon" --output-file output.png --use-cached-model
+```
+
+This will include all optimizations and will attempt to use pre-cached binary models
+generated via `torch.export` + AOTI. To generate these binaries for subsequent runs, run
+the above command without the `--use-cached-model` flag.
 
 > [!IMPORTANT]
-> The binaries won't work for hardware that are different from the ones they were obtained on. For example, if the binaries were obtained on an H100, they won't work on A100.
+> The binaries won't work for hardware that is sufficiently different from the hardware they were
+> obtained on. For example, if the binaries were obtained on an H100, they won't work on an A100.
+> Further, the binaries are currently Linux-only and include dependencies on specific versions
+> of system libs such as libstdc++; they will not work if they were generated in a sufficiently
+> different environment than the one present at runtime. The PyTorch Compiler team is working on
+> solutions for more portable binaries / artifact caching.
 
 ## Benchmarking
 [`run_benchmark.py`](./run_benchmark.py) is the main script for benchmarking the different optimization techniques.
````
