Skip to content

Conversation

@leshikus
Copy link
Contributor

@leshikus leshikus commented Oct 10, 2024

No-basekit requires more work

@leshikus leshikus requested a review from pbchekin October 21, 2024 15:15
@leshikus
Copy link
Contributor Author

@pbchekin Regarding your last offline comments that reusable action becomes too big with this patch, how do you think it is better to approach the problem?

Should I modify the existing actions instead of installing pre-built wheels?

@pbchekin
Copy link
Contributor

@pbchekin Regarding your last offline comments that reusable action becomes too big with this patch, how do you think it is better to approach the problem?

Should I modify the existing actions instead of installing pre-built wheels?

Roughly 50% of steps in build-test-reusable have a check for env_manager. IMHO, it is too much. At this point it is much cleaner just to create something like conda-build-test-reusable. I would expect some code duplication, but the code will be much simpler.

@leshikus
Copy link
Contributor Author

@pbchekin the separation is completed, please take a look

btw, I wonder why conda workflow starts automatically on this PR - how it is implemented? is it some gh magic in some other workflow?

@pbchekin
Copy link
Contributor

pbchekin commented Oct 24, 2024

It starts because the workflow has

  pull_request:
    branches:
      - main

Copy link
Contributor

@pbchekin pbchekin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need the new files in scripts such as run-{base,conda}.sh?

@leshikus
Copy link
Contributor Author

you are correct, we no longer need run-base; we need run-conda, because I have also run-no-basekit in other branch

@leshikus
Copy link
Contributor Author

leshikus commented Oct 24, 2024

It starts because the workflow has

@pbchekin this has to be reverted to the scheduled run, is it correct?

@leshikus leshikus requested a review from pbchekin October 25, 2024 00:03
@pbchekin
Copy link
Contributor

It starts because the workflow has

@pbchekin this has to be reverted to the scheduled run, is it correct?

Yes.


ZE_AFFINITY_MASK=0 python pytorch/benchmarks/dynamo/huggingface.py --accuracy --float32 -dxpu -n10 --no-skip --dashboard --inference --freezing --total-partitions 1 --partition-id 0 --only AlbertForMaskedLM --backend=inductor --timeout=4800 --output=$(pwd -P)/inductor_log.csv

cat inductor_log.csv
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For some reason the original important comments are missing:

          # TODO: Find the fastest Hugging Face model
          # The script above always returns 0, so we need an additional check to see if the accuracy test passed

Copy link
Contributor Author

@leshikus leshikus Oct 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

# The script above always returns 0, so we need an additional check to see if the accuracy test passed

the script is no longer called, thus the note about its return status are no longer relevant

I will restore TODO

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

fi

INSTRUMENTATION_LIB_DIR=$(ls -1d $TRITON_PROJ/python/build/*lib*/triton/instrumentation) || err "Could not find $TRITON_PROJ/python/build/*lib*/triton/instrumentation, build Triton first"
INSTRUMENTATION_LIB_DIR=$(ls -1d $TRITON_PROJ/python/build/*lib*/triton/instrumentation) || SHARED_LIB_DIR=$(ls -1d $TRITON_PROJ/python/triton/_C) || err "Could not find $TRITON_PROJ/python/build/*lib*/triton/instrumentation, build Triton first"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
INSTRUMENTATION_LIB_DIR=$(ls -1d $TRITON_PROJ/python/build/*lib*/triton/instrumentation) || SHARED_LIB_DIR=$(ls -1d $TRITON_PROJ/python/triton/_C) || err "Could not find $TRITON_PROJ/python/build/*lib*/triton/instrumentation, build Triton first"
INSTRUMENTATION_LIB_DIR=$(ls -1d $TRITON_PROJ/python/build/*lib*/triton/instrumentation) || INSTRUMENTATION_LIB_DIR=$(ls -1d $TRITON_PROJ/python/triton/_C) || err "Could not find $TRITON_PROJ/python/build/*lib*/triton/instrumentation, build Triton first"

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also looks like the error message is no longer consistent with the verified locations. Also, conda workflow also builds triton, why $TRITON_PROJ/python/build/*lib*/triton/instrumentation does not work?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The location is different for installed wheels. From the other side the instrumentation test do not work for installed wheels, so this change is optional anyway.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The location is different for installed wheels. From the other side the instrumentation test do not work for installed wheels, so this change is optional anyway.

I see. Let's undo this change for now.

@leshikus
Copy link
Contributor Author

@leshikus leshikus requested a review from pbchekin October 29, 2024 16:26
@leshikus leshikus merged commit 18f70d0 into main Oct 29, 2024
4 checks passed
@leshikus leshikus deleted the lesh/conda-oct branch October 29, 2024 17:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[CI] Fix conda build

4 participants