[huggingface tracer] Add suite based off of results from tracer #20
Conversation
```
@@ -62,11 +79,17 @@ def cli(suite, backend, ops, llm_max_attempts):
        torch.bfloat16,
        filter=ops,
    ),
    "huggingface": lambda: HuggingFaceTracerTestSuite(
```
There's code duplication for HuggingFaceTracerTestSuite - also the file path is still linked to your personal repo
Yeah, I agree with that, though let's keep it for now and get rid of it in a separate PR. Thanks for the catch on the personal JSON (though an AWS bucket is probably better; it just doesn't make sense to run the tracer once per benchmark).
I'm not totally sure checking in sample_inputs.json makes sense in the repo, especially if we're making updates. Mind pushing it to some blob storage or HF datasets instead? Something we can visualize in the browser would be nice too, since it's too big to visualize in the GitHub UI.
@msaroufim Yeah, I agree. I stuck something on Hugging Face here.
This PR is an extension of #21, which is still being worked on since it's both pretty messy and needs more coverage (right now it only supports 20 models). In practice there are two parts to the tracer: 1) the actual tracer, which grabs a bunch of Hugging Face models and traces them, and 2) turning the output into a test suite. This PR does the latter.
A sample output of the tracer can be found in BackendBench/huggingface_tracer/tracer_ops_and_shapes/sample_inputs.json, with an explanation of the schema in BackendBench/huggingface_tracer/tracer_ops_and_shapes/README.md.
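For reviewers who don't want to open the full file, a single traced-input record might look roughly like the sketch below. The field names here are purely illustrative assumptions, not the actual schema; the README above is the authoritative reference.

```python
# Purely illustrative record, NOT the actual schema (see the README linked above).
# The idea it captures: which op was traced, the shape/dtype/device of each input,
# and how often that exact input signature showed up across the traced models.
example_record = {
    "op": "aten.add.Tensor",
    "args": [
        {"shape": [8, 128, 768], "dtype": "bfloat16", "device": "cuda"},
        {"shape": [8, 128, 768], "dtype": "bfloat16", "device": "cuda"},
    ],
    "count": 12,  # times this input signature appeared during tracing
}
```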
Effectively, the code here takes one of these JSON outputs and uses it to create test cases for a suite. Currently, I use the 5 most popular sets of inputs and the 5 largest sets of inputs to create up to 10 tests per op that we find. Generally this seems to work for correctness, though we still need to support performance.
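As a rough sketch of that selection step (hypothetical helper and field names; the real logic lives in BackendBench/huggingface_tracer/tracer_parser.py and suite.py):

```python
import json


def select_inputs_for_op(records, popular_k=5, largest_k=5):
    """Illustrative sketch: pick up to popular_k + largest_k unique input sets for one op.

    `records` is assumed to be a list of dicts with "count" (popularity) and
    "numel" (total input size) fields; the actual schema may differ.
    """
    by_popularity = sorted(records, key=lambda r: r.get("count", 0), reverse=True)
    by_size = sorted(records, key=lambda r: r.get("numel", 0), reverse=True)

    selected, seen = [], set()
    for record in by_popularity[:popular_k] + by_size[:largest_k]:
        key = json.dumps(record, sort_keys=True)  # de-duplicate identical input sets
        if key not in seen:
            seen.add(key)
            selected.append(record)
    return selected  # at most 10 test cases per op with the defaults
```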
In order to check dtype/device compatibility, I am using op_info; however, it is not fully comprehensive, so I add a few manual ops in BackendBench/huggingface_tracer/manual_ops_mapping.json. We probably need a better solution for this in the long run.
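A minimal sketch of that fallback, assuming op_info refers to PyTorch's OpInfo database (op_db) and simplifying the layout of manual_ops_mapping.json to {op_name: {"cpu": [...], "cuda": [...]}}; the real file and the op-name matching (aten names vs. OpInfo names) are glossed over here:

```python
import json

import torch
from torch.testing._internal.common_methods_invocations import op_db

# op_db can contain several OpInfo entries per name (variants);
# the last one wins here, which is good enough for a sketch.
OPINFO_BY_NAME = {info.name: info for info in op_db}

with open("BackendBench/huggingface_tracer/manual_ops_mapping.json") as f:
    MANUAL_OPS = json.load(f)  # assumed layout: {op_name: {"cpu": [...], "cuda": [...]}}

_DTYPE_NAMES = {"float16": torch.float16, "bfloat16": torch.bfloat16, "float32": torch.float32}


def op_supports(op_name, dtype, device):
    """Prefer OpInfo's dtype metadata; fall back to the manual JSON mapping."""
    info = OPINFO_BY_NAME.get(op_name)
    if info is not None:
        return dtype in info.supported_dtypes(device)
    manual = MANUAL_OPS.get(op_name, {})
    return dtype in {_DTYPE_NAMES[d] for d in manual.get(device, []) if d in _DTYPE_NAMES}
```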
Some weird corner cases
Some more todos
Copilot-generated summary to make reviewing this easier
This pull request introduces a comprehensive test suite for HuggingFace tracer data within the BackendBench module. The changes include the addition of new classes and methods for handling tracer operations, parsing JSON data, and generating test cases for PyTorch operations. The updates also include a schema definition for traced inputs and a manual mapping of unsupported operations.

Test Suite Implementation:
- BackendBench/huggingface_tracer/__init__.py: Added module-level documentation and exposed key classes and methods (HuggingFaceTracerTest, HuggingFaceTracerOpTest, HuggingFaceTracerTestSuite, build_huggingface_tracer_tests) for creating and running tracer tests.
- BackendBench/huggingface_tracer/suite.py: Implemented the HuggingFaceTracerTestSuite class and related functionality for generating tests based on tracer data, including handling unsupported operations and selecting unique inputs.

JSON Data Handling:
- BackendBench/huggingface_tracer/tracer_parser.py: Added utilities for loading JSON data, selecting relevant inputs based on popularity and size, and creating tensors and tensor lists from metadata. Special cases for certain operations requiring unique handling were also defined.

Manual Mapping of Unsupported Operations:
- BackendBench/huggingface_tracer/manual_ops_mapping.json: Introduced a JSON file mapping unsupported operations to their compatible data types on CPU and CUDA devices, enabling tests for these operations.

Documentation:
- BackendBench/huggingface_tracer/tracer_ops_and_shapes/README.md: Added a detailed schema for the structure of traced inputs, including field descriptions and examples, to guide developers in understanding and utilizing the tracer data.