
Conversation

Contributor

Copilot AI commented Jan 10, 2026

LGBMEstimator does not expose the objective parameter in its search space, preventing users from specifying custom objectives like mape even when using non-default metrics.

Changes

  • flaml/automl/model.py: Added objective to LGBMEstimator.search_space() with domain: None, so it can be set via custom_hp without being included in hyperparameter tuning
  • test/automl/test_custom_hp.py: Added test_lgbm_objective() to verify that the objective parameter is passed through to the underlying LightGBM model (a sketch of such a test follows this list)
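
For illustration, a minimal sketch of what such a test could look like (the actual test_lgbm_objective() in the PR may differ; the dataset and time budget here are assumptions):

from flaml import AutoML
from sklearn.datasets import make_regression

def test_lgbm_objective():
    # Hypothetical sketch, not the PR's actual test body.
    X, y = make_regression(n_samples=200, n_features=5, random_state=0)
    automl = AutoML()
    custom_hp = {"lgbm": {"objective": {"domain": "mape"}}}
    automl.fit(
        X, y,
        task="regression",
        metric="mape",
        estimator_list=["lgbm"],
        time_budget=5,
        custom_hp=custom_hp,
    )
    # The fitted LGBMRegressor should carry the requested objective.
    assert automl.model.estimator.get_params()["objective"] == "mape"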

Usage

automl = AutoML()
custom_hp = {
    "lgbm": {
        "objective": {"domain": "mape"}  # or regression_l1, huber, etc.
    }
}
automl.fit(X_train, y_train, custom_hp=custom_hp, metric="mape")

The objective parameter remains optional—when unspecified, LightGBM defaults are used.
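
To confirm that the setting reached the model, the fitted estimator can be inspected after fit() (FLAML exposes the underlying LightGBM model via automl.model.estimator):

# The underlying LGBMRegressor should report the requested objective.
print(automl.model.estimator.get_params()["objective"])  # expected: "mape"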

Original prompt

This section details the original issue you should resolve

<issue_title>AutoML does not pass proper objective to estimator_class when metric is non-default.</issue_title>
<issue_description>Currently, neither search_space nor get_params in LGBMEstimator passes objective to the params.

FLAML/flaml/automl/model.py, lines 1266 to 1309 at commit a68d073:

def search_space(cls, data_size, **params):
    upper = max(5, min(32768, int(data_size[0])))  # upper must be larger than lower
    return {
        "n_estimators": {
            "domain": tune.lograndint(lower=4, upper=upper),
            "init_value": 4,
            "low_cost_init_value": 4,
        },
        "num_leaves": {
            "domain": tune.lograndint(lower=4, upper=upper),
            "init_value": 4,
            "low_cost_init_value": 4,
        },
        "min_child_samples": {
            "domain": tune.lograndint(lower=2, upper=2**7 + 1),
            "init_value": 20,
        },
        "learning_rate": {
            "domain": tune.loguniform(lower=1 / 1024, upper=1.0),
            "init_value": 0.1,
        },
        "log_max_bin": {  # log transformed with base 2
            "domain": tune.lograndint(lower=3, upper=11),
            "init_value": 8,
        },
        "colsample_bytree": {
            "domain": tune.uniform(lower=0.01, upper=1.0),
            "init_value": 1.0,
        },
        "reg_alpha": {
            "domain": tune.loguniform(lower=1 / 1024, upper=1024),
            "init_value": 1 / 1024,
        },
        "reg_lambda": {
            "domain": tune.loguniform(lower=1 / 1024, upper=1024),
            "init_value": 1.0,
        },
    }

def config2params(self, config: dict) -> dict:
    params = super().config2params(config)
    if "log_max_bin" in params:
        params["max_bin"] = (1 << params.pop("log_max_bin")) - 1
    return params

As a result, the objective of the underlying model is always the default value (regression, i.e. squared error, for LGBMRegressor), even when the metric is mape or another non-default value.

Ideally, objective should either be configurable in the search space or derived from the metric.

</issue_description>
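
For context, the first option matches what this PR's Changes section describes: adding objective to the returned search space with a fixed (non-tuned) domain. A minimal sketch of that shape, written here as a hypothetical custom learner rather than the actual merged diff:

from flaml.automl.model import LGBMEstimator

class MyLGBM(LGBMEstimator):
    # Hypothetical subclass for illustration; the merged change edits
    # LGBMEstimator.search_space() directly instead.
    @classmethod
    def search_space(cls, data_size, **params):
        space = super().search_space(data_size, **params)
        # domain=None marks a constant (not tuned); custom_hp can override it.
        space["objective"] = {"domain": None}
        return space

A learner like this could be registered with automl.add_learner("my_lgbm", MyLGBM) before calling fit().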

Comments on the Issue (you are @copilot in this section)

@thinkall: Thank you for the feedback, @Atry! Would you like to raise a PR for this?

@thinkall: Hi @dannycg1996, what do you think of this issue?

@dannycg1996: Hi @Atry and @thinkall, this is an interesting issue!

My initial thought is that I like this feature. A few points which came to mind:

  • I prefer deriving the objective from the search space, over the metric:
    • Not all metrics available within FLAML are in-built within LGBM (such as f1)
    • I'm worried we might end up with inconsistent behaviour, where sometimes we derive the objective from the metric, and sometimes we don't
    • Deriving the objective from the search space gives users explicit control over the objective i.e. 'The LGBM objective will be l2 for regression tasks, unless this is overridden with a custom search space'
    • There might be cases where the user wishes to have the LGBM objective differ from the objective metric used in the AutoML process.
  • I prefer that we don't allow users to pass custom metrics in as the objective metric:
    • Custom metrics used for the objective have to be written in a specific gradient/hessian form (shown as a screenshot in the original comment; see the sketch after this list), which I don't think aligns with the standard SKLearn implementation of a lot of these metrics.
    • A quick check of the LGBM code seems to suggest to me that objectives (e.g. MAPE) are written in C, presumably for speed. Even if we can pass Python methods in, we might see some performance issues.
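
As referenced in the list above, LightGBM's sklearn API accepts a callable objective, but it must return the gradient and hessian of the loss rather than a score, which is a different contract from sklearn-style metric functions. A minimal sketch (the MAPE-like gradient/hessian below is an illustrative approximation, not LightGBM's actual C implementation):

import numpy as np
from lightgbm import LGBMRegressor

def mape_like_objective(y_true, y_pred):
    # LightGBM custom-objective contract: return (grad, hess) per sample.
    # Illustrative approximation of a MAPE-style loss; MAPE's true second
    # derivative is zero, so a small constant hessian is used as a stand-in.
    denom = np.maximum(np.abs(y_true), 1e-15)
    grad = np.sign(y_pred - y_true) / denom
    hess = np.full_like(y_pred, 1e-6)
    return grad, hess

model = LGBMRegressor(objective=mape_like_objective)  # callable objectives are accepted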

I'm still unfamiliar with large parts of the FLAML and LGBM codebases, but I think implementing this is possible: if the user dictates a value for the objective within the LGBM search space (e.g. setting it to 'mape'), then we can pass that through to the LGBMRegressor.

If we do implement something along these lines, then updating documentation is important, and I'd also like to have some sort of benchmark comparison.

Please let me know what your thoughts are, @thinkall.



Copilot AI and others added 2 commits January 10, 2026 08:37
Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>
Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Fix AutoML not passing proper objective to estimator class" to "Add objective parameter to LGBMEstimator search space" on Jan 10, 2026
Copilot AI requested a review from thinkall January 10, 2026 08:41
@thinkall thinkall marked this pull request as ready for review January 19, 2026 12:38
@thinkall
Collaborator

automl = AutoML()
custom_hp = {
    "lgbm": {
        "objective": {"domain": "mape"}  # or regression_l1, huber, etc.
    }
}
automl.fit(X_train, y_train, custom_hp=custom_hp, metric="mape")

This will work without making any changes to the model. That is to say, we already support the requested feature.
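
As a further note, the custom_hp domain need not be a fixed value: it can also be a tune distribution, so the objective itself can be searched over. A sketch, assuming the standard LightGBM objective aliases:

from flaml import AutoML, tune

automl = AutoML()
custom_hp = {
    "lgbm": {
        "objective": {
            # Let the tuner pick among built-in LightGBM objectives.
            "domain": tune.choice(["mape", "huber", "regression_l1"]),
            "init_value": "mape",
        }
    }
}
automl.fit(X_train, y_train, custom_hp=custom_hp, metric="mape")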

@thinkall thinkall closed this Jan 19, 2026
@thinkall thinkall deleted the copilot/fix-objective-passing-issue branch January 19, 2026 12:59
@thinkall thinkall reopened this Jan 19, 2026
@thinkall thinkall merged commit 46a406e into main Jan 19, 2026
19 checks passed