Potential bug: custom feature_dimensions provided via program.metrics should not be binarized #207

@agadetsky

Description

The LLM prompt optimization example uses custom features as MAP-Elites coordinates:
https://github.com/codelion/openevolve/blob/599cd54036745a6f4e4d22a74760a77e22778fca/examples/llm_prompt_optimization/evaluator.py#L62

As the implementation of calculate_prompt_features shows, the function returns bin indices rather than raw scores.
This leads to unintended behavior: the values that are already bin indices get binarized (binned) a second time in the database.add method:
https://github.com/codelion/openevolve/blob/599cd54036745a6f4e4d22a74760a77e22778fca/openevolve/database.py#L718
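
To illustrate why binning twice is a problem, here is a minimal sketch. It does not reproduce the actual code in database.py: to_bin is a hypothetical helper, and it assumes raw scores lie in [0, 1] and are mapped to equal-width bins.

```python
def to_bin(value: float, num_bins: int = 10) -> int:
    """Hypothetical binning: map a score in [0, 1] to an index in [0, num_bins - 1]."""
    clamped = min(max(value, 0.0), 1.0)  # assumption: scores outside [0, 1] are clamped
    return min(int(clamped * num_bins), num_bins - 1)


raw_scores = [0.05, 0.35, 0.62, 0.91]

# What the evaluator does: turn raw scores into bin indices.
once = [to_bin(s) for s in raw_scores]

# What happens if the database then treats those indices as raw scores.
twice = [to_bin(b) for b in once]

print(once)   # [0, 3, 6, 9]
print(twice)  # [0, 9, 9, 9]
```

Under these assumptions, every nonzero bin index collapses into the last bin, so distinct MAP-Elites cells are merged. The real effect depends on how the database scales values before binning.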

From a modeling perspective, I suggest moving binarization entirely out of the ProgramDatabase class, since in practice the really useful feature coordinates are very problem-specific and should be defined by the user. The user would then also define how to bin them and provide bin indices in program.metrics instead of raw scores.

Then, in database._calculate_feature_coords, the branch for custom dimensions could simply pass the value through:

elif dim in program.metrics:
    coords.append(program.metrics[dim])
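
As a rough, standalone sketch of that idea (not the actual ProgramDatabase method): calculate_feature_coords, its parameters, and the metric names in the usage lines are all hypothetical, and the type and range checks are my own addition rather than part of the proposal above.

```python
from typing import Dict, List, Union

Number = Union[int, float]


def calculate_feature_coords(
    metrics: Dict[str, Number],
    feature_dimensions: List[str],
    num_bins: int = 10,
) -> List[int]:
    """Use user-provided bin indices from metrics directly as coordinates."""
    coords: List[int] = []
    for dim in feature_dimensions:
        if dim not in metrics:
            raise KeyError(f"feature dimension {dim!r} not found in metrics")
        value = metrics[dim]
        # Assumption: a pre-binned coordinate must be a plain int (bool excluded).
        if isinstance(value, bool) or not isinstance(value, int):
            raise TypeError(f"{dim!r} must be an integer bin index, got {value!r}")
        if not 0 <= value < num_bins:
            raise ValueError(f"{dim!r}={value} is outside [0, {num_bins - 1}]")
        coords.append(value)  # pass through, no second binning
    return coords


# Hypothetical metrics as an evaluator might return them.
metrics = {"prompt_length": 3, "reasoning_strategy": 7, "combined_score": 0.8}
coords = calculate_feature_coords(metrics, ["prompt_length", "reasoning_strategy"])
print(coords)  # [3, 7]
```

Rejecting non-integer values makes the contract explicit: if an evaluator returns a raw score for a custom dimension, it fails loudly instead of being silently placed in the wrong cell.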


Labels: bug
