|
| 1 | +--- |
| 2 | +icon: material/lightbulb-on |
| 3 | +--- |
| 4 | + |
| 5 | +# Updating SDXL Golden Outputs for IREE CI |
| 6 | + |
| 7 | +Golden outputs are reference results generated from a known-good version of the |
| 8 | +SDXL pipeline. They serve as the “ground truth” for CI quality tests in IREE, |
| 9 | +ensuring that future changes do not silently alter accuracy. When a change |
| 10 | +affects the numerics (e.g., by modifying the order of floating-point |
| 11 | +operations), outputs can differ from the goldens. In such cases, you must |
| 12 | +regenerate the golden outputs so that CI reflects the new expected results. This |
| 13 | +page describes the end-to-end process: verifying accuracy, generating new |
| 14 | +outputs, uploading them to storage, bumping the version in configuration, and |
| 15 | +re-running CI. |
| 16 | + |
| 17 | +## Verify accuracy before updating goldens |
| 18 | + |
| 19 | +Before updating golden outputs, first confirm your change maintains acceptable |
| 20 | +accuracy by following the |
| 21 | +[test-accuracy-only steps](https://github.com/nod-ai/SHARK-MLPERF/blob/dev/code/stable-diffusion-xl/development.md#test-accuracy-only) in the SHARK-MLPERF development guide. |
| 22 | + |
| 23 | +A straightforward way to test your change is by editing |
| 24 | +`sdxl_harness_rocm_shortfin_from_source_iree.dockerfile` so that it builds your |
| 25 | +IREE and exposes the right tooling: |
| 26 | + |
| 27 | +- Build your IREE commit and add the build’s tools to `PATH`. |
| 28 | +- Add your IREE Python bindings to `PYTHONPATH`. |
| 29 | +- Remove the prebuilt wheels for `iree-base-compiler` and `iree-base-runtime` so |
| 30 | + you’re testing your own build. |
| 31 | + |
| 32 | +Run the accuracy script (`run_accuracy_mi325x.sh`) and be mindful of |
| 33 | +platform-specific settings. If you are running in SPX mode, update the available |
| 34 | +device IDs accordingly. On MI300X, set `CPD=1` and use `BATCH_SIZE=32`. Accuracy |
| 35 | +is considered acceptable if FID and CLIP scores fall within the advertised |
| 36 | +ranges. |
| 37 | + |
| 38 | +## Generate new outputs with your IREE build |
| 39 | + |
| 40 | +Once accuracy is confirmed, generate new outputs using the same inputs that CI |
| 41 | +consumes. Both inputs and outputs live in the `sharkpublic` Azure container. If |
| 42 | +you do not already have the desired inputs, locate and download the input files |
| 43 | +for your model revision and place them in a local directory. You can find the |
| 44 | +exact paths in the relevant JSON file in |
| 45 | +`tests/external/iree-test-suites/sharktank_models/quality_tests/sdxl/`. |
| 46 | + |
| 47 | +Next, compile the relevant model using your IREE build. The exact flags should |
| 48 | +mirror what CI uses for the target you're validating. You can find them in the |
| 49 | +failing CI logs or in the same JSON file mentioned above. |
| 50 | +The example below shows a representative invocation; replace paths and flags |
| 51 | +with your local equivalents as needed. |
| 52 | + |
| 53 | +```bash |
| 54 | +iree-build/tools/iree-compile \ |
| 55 | + -o model.rocm_gfx942.vmfb \ |
| 56 | + punet_fp16.mlir \ |
| 57 | + --mlir-timing \ |
| 58 | + --mlir-timing-display=list \ |
| 59 | + --iree-consteval-jit-debug \ |
| 60 | + --iree-hal-target-device=hip \ |
| 61 | + --iree-opt-const-eval=false \ |
| 62 | + --iree-opt-level=O3 \ |
| 63 | + --iree-dispatch-creation-enable-fuse-horizontal-contractions=true \ |
| 64 | + --iree-vm-target-truncate-unsupported-floats \ |
| 65 | + --iree-llvmgpu-enable-prefetch=true \ |
| 66 | + --iree-opt-data-tiling=false \ |
| 67 | + --iree-codegen-gpu-native-math-precision=true \ |
| 68 | + --iree-codegen-llvmgpu-use-vector-distribution \ |
| 69 | + --iree-hip-waves-per-eu=2 \ |
| 70 | + --iree-execution-model=async-external \ |
| 71 | + --iree-scheduling-dump-statistics-format=json \ |
| 72 | + --iree-scheduling-dump-statistics-file=compilation_info.json \ |
| 73 | + --iree-preprocessing-pass-pipeline="builtin.module(util.func(iree-flow-canonicalize), iree-preprocessing-transpose-convolution-pipeline, iree-preprocessing-pad-to-intrinsics)" \ |
| 74 | + --iree-codegen-transform-dialect-library=/path/to/attention_and_matmul_spec_punet_mi300.mlir \ |
| 75 | + --iree-hip-target=gfx942 |
| 76 | +``` |
| 77 | + |
| 78 | +After compilation, run the module to produce the outputs that will become the |
| 79 | +new goldens, replacing `{n+1}` in the output filename with the next version: |
| 80 | + |
| 81 | +```bash |
| 82 | +iree-build/tools/iree-run-module \ |
| 83 | + --device=hip \ |
| 84 | + --module=model.rocm_gfx942.vmfb \ |
| 85 | + --function=main \ |
| 86 | + --input=1x4x128x128xf16=@${CACHE_DIR}/punet_input0.bin \ |
| 87 | + --input=1xf16=@${CACHE_DIR}/punet_input1.bin \ |
| 88 | + --input=2x64x2048xf16=@${CACHE_DIR}/punet_input2.bin \ |
| 89 | + --input=2x1280xf16=@${CACHE_DIR}/punet_input3.bin \ |
| 90 | + --input=2x6xf16=@${CACHE_DIR}/punet_input4.bin \ |
| 91 | + --input=1xf16=@${CACHE_DIR}/punet_input5.bin \ |
| 92 | + --parameters=model=/path/to/punet_weights.irpa \ |
| 93 | + --output=@punet_fp16_out_v{n+1}.0.bin |
| 94 | +``` |
| 95 | + |
| 96 | +## Upload new outputs to Azure |
| 97 | + |
| 98 | +With outputs generated, upload the new `v{n+1}` outputs to the same location in |
| 99 | +the `sharkpublic` Azure container as the previous outputs. |
| 100 | + |
| 101 | +```bash |
| 102 | +az storage blob upload \ |
| 103 | + --account-name sharkpublic \ |
| 104 | + --container-name sharkpublic \ |
| 105 | + --name <path/in/blob/container> \ |
| 106 | + --file <local/file/path> |
| 107 | +``` |
| 108 | + |
| 109 | +After uploading, update the configuration that tells CI which golden version to |
| 110 | +use. This is typically a JSON key whose value encodes the version (for example, |
| 111 | +`punet_output_v{n}`). Increment it to `punet_output_v{n+1}` and commit this |
| 112 | +change along with any related edits. |
| 113 | + |
| 114 | +Finally, re-run the CI pipeline and confirm the quality tests pass against the |
| 115 | +newly uploaded outputs. |