Add test_all_subset_models mode #2428

kwasd · 2024-10-07T11:52:16Z

kwasd · 2024-10-07T13:05:01Z

Test runs (still pending):

.github/workflows/e2e-performance.yml

.github/workflows/e2e-accuracy.yml

pbchekin · 2024-10-07T16:35:33Z

.github/workflows/e2e-reusable.yml

          elif [[ "${{ inputs.models }}" == "subset" ]]; then
            while read model; do
-              bash -e $GITHUB_WORKSPACE/scripts/inductor_xpu_test.sh ${{ inputs.suite }} ${{ inputs.dtype }} ${{ inputs.mode }} ${{ inputs.test_mode }} xpu 0 static 1 0 $model
+              bash -e $GITHUB_WORKSPACE/scripts/inductor_xpu_test.sh ${{ inputs.suite }} ${{ inputs.dtype }} ${{ inputs.mode }} ${{ inputs.test_mode }} xpu 0 static 1 0 $model || ${{ inputs.test_all_subset_models }}


This won't work as expected: the script returns 0 even if some accuracy tests or performance runs failed. You need to check the report (CSV file) and verify that it contains every item from the corresponding subset and that every item has passed. See https://github.com/intel/intel-xpu-backend-for-triton/blob/main/.github/workflows/build-test-reusable.yml#L179-L181 for the idea.
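
A minimal sketch of the report check described above. It assumes the report CSV has `name` and `accuracy` columns and that only the exact value `pass` counts as success; the function name `models_not_passed` and the sample rows are hypothetical, not taken from this PR.

```python
import csv
import io


def models_not_passed(report_file, subset):
    """Return the subset models that are missing from the report or did not pass."""
    # Assumption: the report has "name" and "accuracy" columns.
    status = {row["name"]: row["accuracy"] for row in csv.DictReader(report_file)}
    # A model absent from the report maps to None, which also counts as a failure.
    return [model for model in subset if status.get(model) != "pass"]


if __name__ == "__main__":
    # Hypothetical report content, for illustration only.
    sample = io.StringIO(
        "dev,name,batch_size,accuracy\n"
        "xpu,model_a,8,pass\n"
        "xpu,model_b,8,fail_accuracy\n"
    )
    print(models_not_passed(sample, ["model_a", "model_b", "model_c"]))
```

With the sample above, `model_b` is reported because it failed and `model_c` because it never made it into the report, which is the case the exit code of the test script alone cannot detect.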


kwasd · 2024-10-08T11:04:00Z

Test runs:
true: https://github.com/intel/intel-xpu-backend-for-triton/actions/runs/11234055787
false: https://github.com/intel/intel-xpu-backend-for-triton/actions/runs/11234070592

pbchekin · 2024-10-08T15:45:16Z

.github/workflows/e2e-reusable.yml

          elif [[ "${{ inputs.models }}" == "subset" ]]; then
            while read model; do
              bash -e $GITHUB_WORKSPACE/scripts/inductor_xpu_test.sh ${{ inputs.suite }} ${{ inputs.dtype }} ${{ inputs.mode }} ${{ inputs.test_mode }} xpu 0 static 1 0 $model
+              grep ,$model, $WORKSPACE/inductor_log/*/*/*.csv | grep -q ,pass, || ${{ inputs.test_all_subset_models }}


This will work only for accuracy. Also, for accuracy it is possible that the CSV file does not exist (a major failure with E2E); in this case the one-liner above won't work. I think you need a more sophisticated script to handle accuracy/performance and all corner cases. The idea is that you need to check that every model from the subset was successfully executed.
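
One way such a script could cover the corner cases listed above, as a sketch only. The column names `name`, `accuracy` and `speedup`, the rule that a positive `speedup` means a successful performance run, and the function names are all assumptions for illustration; the script that was eventually merged may differ.

```python
import csv
from pathlib import Path


def _positive(value):
    """True if value parses as a number greater than zero."""
    try:
        return float(value) > 0
    except (TypeError, ValueError):
        return False


def check_subset(report_path, subset, test_mode):
    """Return a list of problems; an empty list means every model succeeded."""
    report_path = Path(report_path)
    if not report_path.is_file():
        # Major E2E failure: no report was produced at all.
        return [f"report not found: {report_path}"]

    with open(report_path, newline="", encoding="utf-8") as f:
        rows = {row["name"]: row for row in csv.DictReader(f)}

    problems = []
    for model in subset:
        row = rows.get(model)
        if row is None:
            problems.append(f"{model}: missing from report")
        elif test_mode == "accuracy" and row.get("accuracy") != "pass":
            problems.append(f"{model}: accuracy is {row.get('accuracy')!r}")
        elif test_mode == "performance" and not _positive(row.get("speedup")):
            # Assumption: a performance run succeeded if it recorded a positive speedup.
            problems.append(f"{model}: no valid speedup recorded")
    return problems
```

A caller would print the returned problems and exit with a non-zero status when the list is not empty, so that the workflow step fails for a missing report, a missing model, or a failed model alike.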

pbchekin · 2024-10-21T14:37:21Z

scripts/check_inductor_report.py

+
+
+def check_report(suite, dtype, mode, test_mode, device, models_file):
+    inductor_log_dir = Path("torch_compile_debug") / suite / dtype


I think the right location is inductor_log. In any case, it is better to pass it as a parameter.
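
A sketch of what passing the location as a parameter could look like. The flag name `--inductor-log-dir`, its default, and the helper names are hypothetical; only `--suite`, `--dtype` and the `inductor_log` directory name come from this conversation.

```python
import argparse
from pathlib import Path


def build_parser():
    parser = argparse.ArgumentParser()
    parser.add_argument("--suite", required=True)
    parser.add_argument("--dtype", required=True)
    # Hypothetical flag: lets the workflow choose the log location
    # instead of hard-coding it inside check_report().
    parser.add_argument("--inductor-log-dir", type=Path, default=Path("inductor_log"))
    return parser


def report_dir(args):
    """Directory expected to hold the CSV reports for one suite/dtype pair."""
    return args.inductor_log_dir / args.suite / args.dtype


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(report_dir(args))
```

With a default in place the script still works when the flag is omitted, while the workflow can point it at another directory without a code change.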

pbchekin · 2024-10-21T14:38:58Z

scripts/check_inductor_report.py

+    argparser = argparse.ArgumentParser()
+    argparser.add_argument("--suite", required=True)
+    argparser.add_argument("--dtype", required=True)
+    argparser.add_argument("--mode", required=True, choices=("training", "inference"))


The mode string can be inference-no-freezing (see the workflow inputs). The location needs to be inference in this case.

The location needs to be inference in this case.

What do you mean? Here is an example of a CSV file name: timm_models/amp_fp16/inductor_timm_models_amp_fp16_inference-no-freezing_xpu_accuracy.csv

Oh, you're probably right.
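
To make the outcome of this thread concrete, here is a sketch that accepts inference-no-freezing as a mode and keeps the full mode string in the file name, following the example path quoted above. The `MODES` tuple lists only the values mentioned in this conversation and may be incomplete; `report_file` is a hypothetical helper.

```python
from pathlib import Path

# Assumption: only the modes mentioned in this conversation; the workflow may accept more.
MODES = ("training", "inference", "inference-no-freezing")


def report_file(log_dir, suite, dtype, mode, device, test_mode):
    """Build the report path using the naming pattern from the example above."""
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode}")
    name = f"inductor_{suite}_{dtype}_{mode}_{device}_{test_mode}.csv"
    return Path(log_dir) / suite / dtype / name


if __name__ == "__main__":
    print(report_file("inductor_log", "timm_models", "amp_fp16",
                      "inference-no-freezing", "xpu", "accuracy"))
```

The same tuple could be passed as `choices=MODES` to the `--mode` argument so that argparse rejects anything else.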

pbchekin · 2024-10-21T14:42:01Z

scripts/check_inductor_report.py

+    exitcode = 0
+
+    with open(models_file, encoding="utf-8") as f:
+        subset = f.read().strip().split("\n")


Nit: use readlines().

This leaves newlines.
f.read().splitlines() works
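
The difference between the three variants discussed here, shown on an in-memory file. The model names are made up for illustration.

```python
import io

text = "model_a\nmodel_b\n"

# readlines() keeps the trailing newline on every item.
with_readlines = io.StringIO(text).readlines()

# read().splitlines() drops the newlines.
with_splitlines = io.StringIO(text).read().splitlines()

# The original expression gives the same result for a non-empty file...
with_strip_split = io.StringIO(text).read().strip().split("\n")

# ...but for an empty file it yields one empty string instead of an empty list.
empty_strip_split = io.StringIO("").read().strip().split("\n")
empty_splitlines = io.StringIO("").read().splitlines()

if __name__ == "__main__":
    print(with_readlines, with_splitlines, with_strip_split)
    print(empty_strip_split, empty_splitlines)
```

So `splitlines()` avoids both the stray newlines and the phantom empty model name when the models file is empty.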

kwasd · 2024-10-21T19:22:37Z

Accuracy run: https://github.com/intel/intel-xpu-backend-for-triton/actions/runs/11445587582
Performance run: https://github.com/intel/intel-xpu-backend-for-triton/actions/runs/11445592099

kwasd added 2 commits October 7, 2024 13:51

Add test_all_subset_models mode

1fc325d

Add test_all_subset_models mode

7eb6e85

kwasd requested a review from pbchekin October 7, 2024 13:03

vlad-penkin linked an issue Oct 7, 2024 that may be closed by this pull request

Subset of E2E models for accuracy tests #2246

Closed

pbchekin reviewed Oct 7, 2024


kwasd and others added 5 commits October 7, 2024 18:59

Update .github/workflows/e2e-performance.yml

284bf46

Co-authored-by: Pavel Chekin <[email protected]>

Update .github/workflows/e2e-accuracy.yml

c47a9f8

Co-authored-by: Pavel Chekin <[email protected]>

Merge https://github.com/intel/intel-xpu-backend-for-triton into feat…

151e282

…ure/2246-e2e-accuracy-subset

Merge branch 'feature/2246-e2e-accuracy-subset' of https://github.com…

312d793

…/intel/intel-xpu-backend-for-triton into feature/2246-e2e-accuracy-subset

Check models

1818779

kwasd requested a review from pbchekin October 8, 2024 10:33

fix whitespace

3122341

pbchekin reviewed Oct 8, 2024


kwasd marked this pull request as draft October 9, 2024 11:03

kwasd added 8 commits October 21, 2024 13:32

Merge branch 'main' into feature/2246-e2e-accuracy-subset

da4183b

check-inductor-report.py

c43be77

check-inductor-report.py

78a156f

check-inductor-report.py

bbfd22b

check-inductor-report.py

4a96f07

check-inductor-report.py

eea69a3

check-inductor-report.py

c749d14

check-inductor-report.py

92f5d63

kwasd marked this pull request as ready for review October 21, 2024 13:48

kwasd added 3 commits October 21, 2024 15:49

Merge branch 'main' into feature/2246-e2e-accuracy-subset

4ef86fe

check-inductor-report.py

da66237

check-inductor-report.py

d644e75

kwasd requested a review from pbchekin October 21, 2024 14:15

check-inductor-report.py

06db6d3

check-inductor-report.py

e5c6af7

pbchekin reviewed Oct 21, 2024


kwasd added 11 commits October 21, 2024 16:46

check-inductor-report.py

5192b85

check-inductor-report.py

b148c40

check-inductor-report.py

3679715

check-inductor-report.py

543cdd1

check-inductor-report.py

76b3ece

check-inductor-report.py

8794950

check-inductor-report.py

fe5f9f1

check-inductor-report.py

0d19d9d

check-inductor-report.py

8285df5

check-inductor-report.py

cd3dacd

check-inductor-report.py

6dfbaee

pbchekin approved these changes Oct 21, 2024


kwasd merged commit b6cdccd into main Oct 21, 2024
96 of 98 checks passed

kwasd deleted the feature/2246-e2e-accuracy-subset branch October 21, 2024 21:20


