
move failing multimodal lmeval tests to skipped folder #1273

Merged
dsikka merged 1 commit into main from bdellabe/skip-failing-lmeval-tests on Mar 21, 2025


@brian-dellabetta
Collaborator

SUMMARY:
Multi-modal lm-eval tests are failing due to a non-reproducibility issue that has not yet been resolved. Until it is, this PR moves those tests to a skipped folder.

Resolution can be tracked in #1260

TEST PLAN:
no new source code

@brian-dellabetta brian-dellabetta added the ready label (When a PR is ready for review) on Mar 20, 2025
@github-actions

👋 Hi! Thank you for contributing to llm-compressor. Please add the ready label when the PR is ready for review.

Note: This is required to complete the testing suite, please only add the label once the PR is code complete and local testing has been performed.

Signed-off-by: Brian Dellabetta <bdellabe@redhat.com>
@brian-dellabetta brian-dellabetta force-pushed the bdellabe/skip-failing-lmeval-tests branch from 135ab73 to cbd85cc on March 20, 2025 18:49
Collaborator

@dsikka dsikka left a comment

we expect failures from all 3? I thought it was just the 2.

@brian-dellabetta
Collaborator Author

brian-dellabetta commented Mar 20, 2025

> we expect failures from all 3? I thought it was just the 2.

@dsikka I have changes on #1260 that will move away from the 2B parameter model we are using with FP8, so all 3 configs will change. So I thought it'd be best to just disable them all for now. WDYT?

@dsikka dsikka merged commit 83d53e7 into main Mar 21, 2025
8 checks passed
@dsikka dsikka deleted the bdellabe/skip-failing-lmeval-tests branch March 21, 2025 01:45