
[Training] Unifying Preprocess + Postprocessing logic for Train/Oneshot#1212

Merged
dsikka merged 13 commits into main from processing
Mar 6, 2025

Conversation


@horheynm horheynm commented Feb 28, 2025

Order of reviews:
#1206
#1207
#1209
#1212 <-- Here
#1214

SUMMARY:

  • Move the preprocessing and postprocessing logic out of src/llmcompressor/transformers/finetune/text_generation.py and into
    src/llmcompressor/entrypoints/utils.py

TEST PLAN:
Pass tests
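The shared pre/post-processing flow described above can be sketched roughly as follows. This is a simplified illustration with hypothetical names (`pre_process`, `post_process`, and the dict-based arguments are illustrative, not the actual API in `src/llmcompressor/entrypoints/utils.py`):

```python
# Simplified sketch: helpers shared by oneshot and train, of the kind
# moved into src/llmcompressor/entrypoints/utils.py (names hypothetical).

def pre_process(args):
    """Setup shared by oneshot and train (e.g. load model and tokenizer)."""
    args = dict(args)
    args["model"] = f"loaded:{args['model']}"  # stand-in for model loading
    return args

def post_process(args):
    """Teardown shared by oneshot and train (e.g. save model and tokenizer)."""
    if args.get("output_dir"):
        args["saved_to"] = args["output_dir"]  # stand-in for saving artifacts
    return args

def oneshot(**kwargs):
    args = pre_process(kwargs)
    # ... apply one-shot compression here ...
    return post_process(args)

def train(**kwargs):
    args = pre_process(kwargs)
    # ... run the training loop here ...
    return post_process(args)
```

Both entrypoints call the same two helpers, so setup/teardown fixes land in one place.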

@github-actions

👋 Hi! Thank you for contributing to llm-compressor. Please add the ready label when the PR is ready for review.

Note: This is required to complete the testing suite, please only add the label once the PR is code complete and local testing has been performed.

George Ohashi added 5 commits February 28, 2025 10:25
Signed-off-by: George Ohashi <george@neuralmagic.com>
Signed-off-by: George Ohashi <george@neuralmagic.com>
…nto processing

Signed-off-by: George Ohashi <george@neuralmagic.com>
Signed-off-by: George Ohashi <george@neuralmagic.com>
dsikka pushed a commit that referenced this pull request Mar 3, 2025
Order of reviews:
#1206
#1207 <-- Here
#1209 
#1212
#1214 

SUMMARY:
* Decouple the argument parser so it can be used for both oneshot and train

TEST PLAN:
* Pass tests
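The decoupling above can be sketched as one parser feeding both entrypoints. This is a hypothetical, simplified version (the class and field names are illustrative; the real parser groups arguments into dataclass-style argument groups):

```python
# Hypothetical sketch: a single parser that routes flat keyword arguments
# into typed argument groups, reusable by both oneshot and train.
from dataclasses import dataclass, fields

@dataclass
class ModelArguments:
    model: str = ""
    output_dir: str = ""

@dataclass
class DatasetArguments:
    dataset: str = ""
    num_calibration_samples: int = 512

def parse_args(**kwargs):
    """Split flat keyword arguments into the typed argument groups."""
    parsed = []
    for group_cls in (ModelArguments, DatasetArguments):
        names = {f.name for f in fields(group_cls)}
        parsed.append(group_cls(**{k: v for k, v in kwargs.items() if k in names}))
    return tuple(parsed)
```

Because neither entrypoint owns the parser, oneshot and train accept the same argument surface.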
dsikka added a commit that referenced this pull request Mar 5, 2025
Order of reviews:
#1206  <-- Here
#1207
#1209 
#1212
#1214 

SUMMARY:
Rename data_args to dataset_args

TEST PLAN:
Pass tests
Find remaining `data_args` references using `grep`

---------

Signed-off-by: George Ohashi <george@neuralmagic.com>
Co-authored-by: Dipika Sikka <dipikasikka1@gmail.com>
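The grep check from the test plan can be run like this. The `demo/` tree below is a stand-in so the check is reproducible; in the repo you would point grep at `src/` instead:

```shell
# Build a tiny demo tree so the check is reproducible (hypothetical path)
mkdir -p demo/src && echo "dataset_args = None" > demo/src/example.py

# After the rename, searching for the old name should come up empty
grep -rn "data_args" demo/src && echo "stale references found" || echo "rename complete"
```

Note that `data_args` is not a substring of `dataset_args`, so a plain grep does not false-positive on the new name here.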
dsikka pushed a commit that referenced this pull request Mar 5, 2025
Order of reviews:
#1206
#1207
#1209 <-- Here
#1212
#1214 

SUMMARY:
* Move dataset logic out of the transformers module
(`src/llmcompressor/transformers/finetune/data/data_helpers.py`) and into
`src/llmcompressor/datasets/utils.py`


TEST PLAN:
Pass tests
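A helper of the kind moved into `src/llmcompressor/datasets/utils.py` might look like this. The function name and signature are hypothetical, shown only to illustrate why such logic is dataset-generic rather than transformers-specific:

```python
# Hypothetical, simplified sketch of a dataset utility (illustrative name):
# trims, and optionally shuffles, raw samples into a calibration set.
import random

def format_calibration_data(dataset, num_samples=None, shuffle=False, seed=42):
    """Return up to num_samples items, optionally shuffled deterministically."""
    samples = list(dataset)
    if shuffle:
        random.Random(seed).shuffle(samples)  # seeded for reproducibility
    if num_samples is not None:
        samples = samples[:num_samples]
    return samples
```

Nothing here depends on transformers, which is the point of relocating it under `datasets/`.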
@dsikka dsikka enabled auto-merge (squash) March 6, 2025 17:31
@dsikka dsikka merged commit 9d82f35 into main Mar 6, 2025
8 checks passed
@dsikka dsikka deleted the processing branch March 6, 2025 19:03
brian-dellabetta pushed a commit that referenced this pull request Mar 10, 2025
…ot (#1212)

Order of reviews:
#1206
#1207
#1209
#1212  <-- Here
#1214

SUMMARY:
* Move the preprocessing and postprocessing logic out of
`src/llmcompressor/transformers/finetune/text_generation.py` and into
`src/llmcompressor/entrypoints/utils.py`

TEST PLAN:
Pass tests

Signed-off-by: Brian Dellabetta <bdellabe@redhat.com>
kylesayrs pushed a commit that referenced this pull request Mar 13, 2025
Order of reviews:
#1206
#1207
#1209
#1212
#1214 <-- Here

SUMMARY:
* Refactor the training pipeline
* Remove initialize and finalize from the session functions
* Add documentation to entrypoints/readme.md on the different types of
training that can be carried out with llm-compressor
* Decouple training from text_generation.py::main. The new logic lives
in llmcompressor/entrypoints/train.py, which follows the flow of
pre-process, run the training loop, then post-process
* Delete outdated info from transformers/finetune/readme.md
* Update session_mixin.py to use session().initialize and
session().finalize
* Deprecate train.py in text_generation.py, raising a deprecation message
if used
* Update tests to use llmcompressor's train, not
llmcompressor.transformers' train

TEST PLAN:
* Pass tests

---------

Co-authored-by: Dipika Sikka <dipikasikka1@gmail.com>
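The decoupled flow and the deprecation shim described in the bullets above can be sketched as follows. All names here are hypothetical stand-ins for the real code in `llmcompressor/entrypoints/train.py` and `text_generation.py`:

```python
# Hypothetical sketch: the new train entrypoint runs pre-process, the
# training loop, then post-process; the old path warns and forwards.
import warnings

def pre_process(args):
    """Shared setup (model, dataset, recipe) -- illustrative stand-in."""
    return dict(args, initialized=True)

def post_process(args):
    """Shared teardown (save artifacts, finalize session) -- stand-in."""
    args["finalized"] = True
    return args

def train(**kwargs):
    """New entrypoint: pre-process, run training, post-process."""
    args = pre_process(kwargs)
    args["trained"] = True  # stand-in for the actual training loop
    return post_process(args)

def legacy_train(**kwargs):
    """Deprecated path kept behind in text_generation.py."""
    warnings.warn(
        "use llmcompressor's train instead of llmcompressor.transformers'",
        DeprecationWarning,
    )
    return train(**kwargs)
```

Keeping the old name as a thin warning shim lets existing callers migrate without an immediate break.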

Labels

ready When a PR is ready for review


3 participants