Fix get_target_trial_index for LILO experiments (#5038) by ItsMrLin · Pull Request #5038 · facebook/Ax

ItsMrLin · 2026-03-16T20:28:52Z

Summary:

In LILO (LLM-In-the-Loop Optimization) experiments, the optimization
config objective is `pairwise_pref_query`, a derived metric that only
LILO labeling trials carry data for. As a result, `get_target_trial_index()`
selects these labeling trials (which have COMPLETE pairwise data) as
the relativization reference instead of non-LILO trials (which have base
metric data). The target trial's status quo (SQ) then lacks base metrics,
causing `TransformToNewSQ` and downstream model fitting to fail.

Fix:

1. Exclude LILO labeling trials (`trial_type == LILO_LABELING`) from
   the target trial candidate set.
2. For LILO experiments, accept INCOMPLETE metric availability so that
   non-LILO trials (which have base-metric data but lack the pairwise
   preference metric) can serve as relativization references.
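
The two rules above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code in this PR: `Trial`, `Availability`, `pick_target_trial_index`, the `NOT_OBSERVED` level, and the string value of `LILO_LABELING` are hypothetical stand-ins. Picking the highest-index candidate is an assumption made only to keep the example short; the actual ordering used by `get_target_trial_index()` is not described here.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Availability(Enum):
    """Hypothetical metric-availability levels for a trial."""

    NOT_OBSERVED = 0
    INCOMPLETE = 1  # some, but not all, opt-config metrics have data
    COMPLETE = 2  # every opt-config metric has data


# Hypothetical constant standing in for the LILO labeling trial type.
LILO_LABELING = "LILO_LABELING"


@dataclass
class Trial:
    index: int
    trial_type: Optional[str]
    availability: Availability


def pick_target_trial_index(
    trials: list[Trial], is_lilo_experiment: bool
) -> Optional[int]:
    """Return the index of a trial usable as the relativization reference."""
    # Rule 2: LILO experiments also accept INCOMPLETE availability, because
    # non-LILO trials never carry the pairwise preference metric.
    accepted = {Availability.COMPLETE}
    if is_lilo_experiment:
        accepted.add(Availability.INCOMPLETE)

    # Rule 1: labeling trials are never candidates, even with COMPLETE data.
    candidates = [
        t
        for t in trials
        if t.trial_type != LILO_LABELING and t.availability in accepted
    ]
    if not candidates:
        return None
    # Assumption for illustration only: prefer the most recent candidate.
    return max(candidates, key=lambda t: t.index).index
```

With two non-LILO trials that only have base-metric data and one labeling trial with COMPLETE pairwise data, the sketch returns a non-LILO trial for a LILO experiment and `None` when INCOMPLETE availability is not accepted.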

Reviewed By: Balandat

Differential Revision: D96574746

meta-codesync · 2026-03-16T20:29:00Z

@ItsMrLin has exported this pull request. If you are a Meta employee, you can view the originating Diff in D96574746.

Summary: In LILO (LLM-In-the-Loop Optimization) experiments, the optimization config objective is `pairwise_pref_query`, a derived metric that only LILO labeling trials carry data for. `get_target_trial_index()` then selects these labeling trials (which have COMPLETE pairwise data) as the relativization reference instead of Sobol trials (which have base metric data). The target trial's SQ then lacks base metrics, causing TransformToNewSQ and downstream model fitting to fail.

Fix:

1. Exclude LILO labeling trials (`trial_type == LILO_LABELING`) from the target trial candidate set.
2. For LILO experiments, add a fallback that checks metric availability against all experiment metrics (not just opt config), so Sobol trials are found even when the opt config includes the pairwise metric.

Differential Revision: D96574746

codecov-commenter · 2026-03-17T23:54:36Z

Codecov Report

❌ Patch coverage is 94.44444% with 3 lines in your changes missing coverage. Please review.
✅ Project coverage is 96.74%. Comparing base (2d4750a) to head (7f1a99c).

Files with missing lines	Patch %	Lines
ax/storage/sqa_store/decoder.py	60.00%	2 Missing ⚠️
ax/storage/sqa_store/encoder.py	75.00%	1 Missing ⚠️

Additional details and impacted files

@@           Coverage Diff           @@
##             main    #5038   +/-   ##
=======================================
  Coverage   96.73%   96.74%           
=======================================
  Files         606      606           
  Lines       66242    66296   +54     
=======================================
+ Hits        64080    64136   +56     
+ Misses       2162     2160    -2

Summary: Move `LLMMessage` dict conversion from the `experiment.llm_messages` getter/setter to the storage encoders/decoders, following Ax convention that domain objects hold domain types and serialization happens at the storage boundary.

**`experiment.py`**: The setter now stores `LLMMessage` objects directly in `_properties`. The getter handles both `LLMMessage` objects (new path) and plain dicts (backward compat with previously stored data).

**JSON store**: No explicit changes needed. The encoder's generic dataclass fallback auto-serializes `LLMMessage` with a `__type` tag, and `LLMMessage` is already registered in `CORE_DECODER_REGISTRY`.

**SQA store**: The encoder converts `LLMMessage` → dict via `dataclasses.asdict()` in the properties copy before DB write (same pattern as `pruning_target_parameterization`). The decoder converts dicts → `LLMMessage` after loading properties, in both `_init_experiment_from_sqa` and `_init_mt_experiment_from_sqa`.

Reviewed By: lena-kashtelyan

Differential Revision: D96434290
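
The conversion at the storage boundary might look roughly like the sketch below. It is an illustration under stated assumptions, not the Ax implementation: the `LLMMessage` here is a stand-in dataclass whose `role` and `content` fields are guesses, and `encode_messages` / `decode_messages` are hypothetical helper names. Only `dataclasses.asdict()` is taken from the description above.

```python
import dataclasses
from dataclasses import dataclass
from typing import Any, Union


@dataclass
class LLMMessage:
    """Stand-in for the real class; the field names are assumptions."""

    role: str
    content: str


def encode_messages(messages: list[LLMMessage]) -> list[dict[str, Any]]:
    """Encoder side: turn domain objects into plain dicts before the DB write."""
    return [dataclasses.asdict(m) for m in messages]


def decode_messages(
    raw: list[Union[LLMMessage, dict[str, Any]]],
) -> list[LLMMessage]:
    """Decoder/getter side: accept both new objects and previously stored dicts."""
    return [m if isinstance(m, LLMMessage) else LLMMessage(**m) for m in raw]
```

Keeping the dict form confined to these two helpers means the experiment object itself only ever holds `LLMMessage` instances, while data written before the change still loads.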

meta-codesync · 2026-03-18T16:48:40Z

This pull request has been merged in 0ecc5d9.

meta-cla bot added the CLA Signed label Mar 16, 2026

meta-codesync bot added the fb-exported and meta-exported labels Mar 16, 2026

ItsMrLin force-pushed the export-D96574746 branch from 6f8c609 to e864223 March 17, 2026 23:17

ItsMrLin force-pushed the export-D96574746 branch 2 times, most recently from 1064d78 to 649afea March 18, 2026 00:10

ItsMrLin force-pushed the export-D96574746 branch from 649afea to 22f3397 March 18, 2026 00:12

meta-codesync bot changed the title ~~Fix get_target_trial_index for LILO experiments~~ Fix get_target_trial_index for LILO experiments (#5038) Mar 18, 2026

ItsMrLin force-pushed the export-D96574746 branch from 22f3397 to 28987d9 March 18, 2026 00:51

ItsMrLin force-pushed the export-D96574746 branch from 28987d9 to ee2481a March 18, 2026 05:49

ItsMrLin added 2 commits March 17, 2026 22:50

ItsMrLin force-pushed the export-D96574746 branch from ee2481a to 7f1a99c March 18, 2026 05:50

meta-codesync bot closed this in 0ecc5d9 Mar 18, 2026

facebook-github-tools bot added the Merged label Mar 18, 2026

ItsMrLin wants to merge 2 commits into facebook:main from ItsMrLin:export-D96574746
