[Oneshot] Add validation for empty dataset and enhance oneshot function parameters (Supersedes PR #1957) #2378

Draft
ArkaSanka wants to merge 1 commit into vllm-project:main from ArkaSanka:oneshot-dataset-params

@ArkaSanka ArkaSanka commented Feb 17, 2026

Issue Description

The oneshot function signature in oneshot.py was missing several parameters that exist in the underlying dataclasses (DatasetArguments, ModelArguments, RecipeArguments). This caused issues when users tried to use these parameters directly, particularly with:

  • sequential_targets: Conflicts occurred between recipe modifiers and direct parameters
  • preprocessing_func: Raised an error when the dataset was empty
  • pipeline: Not properly validated against sequential_targets
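
To illustrate the empty-dataset case, here is a minimal sketch of an early guard. It is not the code in this PR: `preprocess_samples` and the `Sample` alias are hypothetical names, and the dataset is modeled as a plain list of dicts rather than a real dataset object.

```python
from typing import Any, Callable, Dict, List, Optional

# Assumption: a sample is modeled as a plain dict for illustration only.
Sample = Dict[str, Any]


def preprocess_samples(
    samples: List[Sample],
    preprocessing_func: Optional[Callable[[Sample], Sample]] = None,
) -> List[Sample]:
    """Hypothetical helper: fail early on an empty dataset, then preprocess."""
    if len(samples) == 0:
        # Raise a descriptive error up front instead of failing later
        # inside the preprocessing step.
        raise ValueError(
            "Dataset is empty; provide at least one sample before preprocessing."
        )
    if preprocessing_func is None:
        # Nothing to apply, return a shallow copy of the input.
        return list(samples)
    return [preprocessing_func(sample) for sample in samples]
```

Checking for emptiness before calling `preprocessing_func` keeps the error message tied to the actual cause rather than to whatever the callable does with missing data.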

Changes Made

Parameter Alignment:

  • Updated the oneshot function signature to include all missing parameters from the argument dataclasses
  • Ensured type hints and default values match those defined in the dataclasses
  • Added missing parameters: preprocessing_func, data_collator, raw_kwargs, max_train_samples, pipeline, tracing_ignore, sequential_targets
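
One way to check this kind of alignment is to compare a function signature against the dataclass fields. The sketch below is an assumption about how such a check could look, not part of this PR: `ExampleDatasetArguments`, `ExampleRecipeArguments`, `example_entrypoint`, and `missing_parameters` are hypothetical stand-ins for the real classes and for oneshot.

```python
import inspect
from dataclasses import dataclass, fields
from typing import Callable, Optional, Set


@dataclass
class ExampleDatasetArguments:
    # Hypothetical stand-in; the real DatasetArguments has more fields.
    preprocessing_func: Optional[Callable] = None
    max_train_samples: Optional[int] = None


@dataclass
class ExampleRecipeArguments:
    # Hypothetical stand-in for RecipeArguments.
    recipe: Optional[str] = None


def example_entrypoint(
    recipe: Optional[str] = None,
    preprocessing_func: Optional[Callable] = None,
) -> None:
    """Hypothetical entrypoint that does not expose max_train_samples."""


def missing_parameters(func: Callable, *argument_classes: type) -> Set[str]:
    """Return dataclass field names that the function signature does not accept."""
    accepted = set(inspect.signature(func).parameters)
    declared = {f.name for cls in argument_classes for f in fields(cls)}
    return declared - accepted


missing = missing_parameters(
    example_entrypoint, ExampleDatasetArguments, ExampleRecipeArguments
)
```

A check like this could run in a unit test so that a new dataclass field without a matching function parameter is caught early.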

Validation Logic:

  • Added validation to detect conflicting sequential_targets between recipe modifiers and direct parameters
  • Added validation to prevent incompatible pipeline settings with sequential_targets
  • Fixed error message formatting to comply with style guidelines
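
The two checks can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in this PR: `validate_sequential_args` and `modifier_targets` are hypothetical names, a "conflict" is assumed to mean that a modifier declares a different set of targets than the one passed directly, and the error messages are made up. The 'independent' pipeline value is the one named in the review summary of this PR.

```python
from typing import List, Optional, Sequence


def validate_sequential_args(
    pipeline: Optional[str],
    sequential_targets: Optional[List[str]],
    modifier_targets: Sequence[Optional[List[str]]] = (),
) -> None:
    """Hypothetical validator; names and messages are illustrative only."""
    if sequential_targets is None:
        # Nothing was passed directly, so there is nothing to conflict with.
        return

    if pipeline == "independent":
        raise ValueError(
            "sequential_targets cannot be used with the 'independent' pipeline"
        )

    for targets in modifier_targets:
        # Assumption: a modifier conflicts when it declares a different
        # set of targets than the directly passed parameter.
        if targets is not None and set(targets) != set(sequential_targets):
            raise ValueError(
                "sequential_targets passed directly "
                f"({sorted(sequential_targets)}) conflict with recipe modifier "
                f"targets ({sorted(targets)})"
            )
```

Calling `validate_sequential_args("sequential", ["LlamaDecoderLayer"], [["LlamaDecoderLayer"], None])` passes silently, while a mismatched modifier list or `pipeline="independent"` raises a `ValueError`.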

Test Improvements:

  • Updated the test fixture in test_api_inputs.py to handle all parameters correctly
  • Added detection for potential parameter conflicts to make tests more robust

Impact

These changes ensure that every parameter defined in the argument dataclasses can be passed directly to the oneshot function. Users can now supply parameters such as sequential_targets and preprocessing_func without encountering cryptic errors or unexpected behavior. The API is also more consistent with its underlying implementation, which makes it more intuitive to use.

@github-actions

👋 Hi! Thank you for contributing to llm-compressor. Please add the ready label when the PR is ready for review.

Note: This is required to complete the testing suite. Please only add the label once the PR is code complete and local testing has been performed.

@gemini-code-assist
Contributor

Summary of Changes

Hello @ArkaSanka, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request refines the oneshot function's interface to align more closely with its internal argument structures, resolving issues where certain parameters were not directly accessible or caused conflicts. It introduces crucial validation checks for parameter combinations and updates the testing framework to reflect these changes, leading to a more robust and intuitive API for users.

Highlights

  • Function Signature Alignment: The oneshot function signature has been updated to include missing parameters from underlying dataclasses, such as preprocessing_func, raw_kwargs, and sequential_targets, ensuring better consistency and direct usability.
  • Validation Logic Added: New validation has been implemented to prevent conflicting configurations, specifically disallowing sequential_targets when the pipeline is set to 'independent'.
  • Test Suite Enhancement: The test fixture for API inputs has been updated to correctly handle all newly added and modified parameters, improving test robustness and coverage.

Changelog
  • src/llmcompressor/entrypoints/oneshot.py
    • Imported Any type for more flexible type hinting.
    • Modified the oneshot function signature to include raw_kwargs, preprocessing_func, remove_columns, and dvc_data_repository parameters.
    • Adjusted default values and order for pipeline, tracing_ignore, and sequential_targets parameters in the oneshot function.
    • Updated docstrings for the oneshot function to describe the newly added and modified parameters.
    • Added validation logic to raise a ValueError if sequential_targets is used with an 'independent' pipeline.
  • tests/llmcompressor/transformers/oneshot/test_api_inputs.py
    • Added logging import and basic configuration for improved test output.
    • Updated the get_oneshot_args helper function to pass additional parameters like pipeline, sequential_targets, tracing_ignore, raw_kwargs, preprocessing_func, remove_columns, dvc_data_repository, splits, and log_dir from the test configuration.
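
As a rough illustration of the helper change described above, a test helper could forward only the optional keys that a test configuration actually sets. This is a sketch, not the code in test_api_inputs.py: `build_oneshot_kwargs` and `OPTIONAL_KEYS` are hypothetical names, and the key list is copied from the changelog entry.

```python
from typing import Any, Dict

# Optional parameters named in the changelog entry above.
OPTIONAL_KEYS = (
    "pipeline",
    "sequential_targets",
    "tracing_ignore",
    "raw_kwargs",
    "preprocessing_func",
    "remove_columns",
    "dvc_data_repository",
    "splits",
    "log_dir",
)


def build_oneshot_kwargs(config: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper: forward only optional keys that are set."""
    # Keys that are absent or set to None are skipped so that the
    # defaults of the called function stay in effect.
    return {
        key: config[key] for key in OPTIONAL_KEYS if config.get(key) is not None
    }
```

Skipping unset keys avoids overriding function defaults with `None`, which matters when a default is not `None` itself.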

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request enhances the oneshot function by aligning its parameters with the underlying dataclasses, improving API consistency. The changes include adding several missing parameters, along with validation to prevent incompatible configurations between sequential_targets and the pipeline type. The tests have also been updated to cover these new parameters.

My review focuses on improving the documentation clarity and fixing a minor typo in an error message. The core logic of the changes appears sound and addresses the described issue effectively.

@ArkaSanka ArkaSanka force-pushed the oneshot-dataset-params branch 2 times, most recently from c959cc7 to 459d04a, February 18, 2026 16:04
Signed-off-by: Arka Sanka <arkasanka12@gmail.com>
@ArkaSanka ArkaSanka force-pushed the oneshot-dataset-params branch from 459d04a to 9b54980, February 18, 2026 16:06
@mergify
Contributor

mergify bot commented Feb 18, 2026

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @ArkaSanka.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

@mergify mergify bot added the needs-rebase label Feb 18, 2026