Feature: Parallel iterations #63
Conversation
Solves: #32
Thanks for contributing! Can you rebase from main? I can then test and review this PR.
Hello @codelion, thank you for this great project, and thank you @MashAliK for the parallelization effort. I saw the request to rebase this PR to the latest main. I needed the parallel evaluation as well, so I rebased it and resolved the conflicts in my fork. One thing I am not sure about, which was causing loading/storing of large artifacts to fail, is this code snippet (lines 361-363 in the save method in database.py):

    # Create directory and remove old path if it exists
    if os.path.exists(save_path):
        shutil.rmtree(save_path)

The code now passes all the tests, but I am not sure what I did is correct. Let me know if you'd like me to open a new PR, or if @MashAliK prefers to pull from this branch to update this one. Thank you.
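For concreteness, a minimal sketch of what the save path handling might look like without the deletion, using os.makedirs instead of shutil.rmtree. The function name and file layout below are only illustrative and not the actual database.py code:

```python
import json
import os


def save_program(save_path: str, program_id: str, program: dict) -> None:
    """Illustrative only: write one program's JSON under save_path without
    clearing the directory first, so previously stored artifacts are kept."""
    # Create the directory if it is missing, but keep any existing contents.
    os.makedirs(save_path, exist_ok=True)
    with open(os.path.join(save_path, f"{program_id}.json"), "w", encoding="utf-8") as f:
        json.dump(program, f, indent=2)
```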
I don't think that code is in main? It was added in this PR, perhaps to handle multiple processes updating the DB.
How have you solved it? Happy to look at the PR if you can send it across. Also, can you test some of the existing examples with and without parallel execution, to see whether there is a speedup and whether both reach similar convergence in terms of the best_program found?
@codelion yes, it's not in main; I was referring to the PR version done by @MashAliK. Erasing the path was causing a failure in loading large artifacts, which are stored in that path, so I commented it out in my rebased version: https://github.com/SuhailB/openevolve/tree/updated-parallel-iterations. I am not sure whether this breaks any dependencies; @MashAliK probably knows more about this. I will try to run examples tomorrow.
Hello, sorry I wasn't able to get to updating this PR earlier. I will rebase it soon. @SuhailB I originally deleted the folder containing the programs each time to prevent duplicate programs from being added, but now I'm seeing that the files are just overwritten inside it. The only other benefit I can think of for deleting the folder is that if a new database is saved at the same location, the previous programs will all be cleared; again, this doesn't matter for how it's being used right now. Since you're saying that deleting tends to cause issues for larger databases, I'd prefer to just remove those lines for now, so I will do that here. Thanks for pointing this out!
@MashAliK Sounds good. And to clarify, deleting it is causing an issue for large artifacts (a feature added after your PR), not large databases. Also, feel free to use my branch for rebasing (if it has no issues).
I've tested the updated-parallel-iterations version on the circle_packing_with_artifacts example, with 50 iterations for each stage instead of 100 to reduce API costs. I used gemini-2.0-flash-lite and gemini-2.0-flash-lite.

Results:

{
"id": "788596e9-6d2b-4134-ade9-c309fdf812c2",
"generation": 2,
"iteration": 6,
"timestamp": 1751574058.320835,
"parent_id": "6eb73296-9b3c-42e1-91da-2063ce7acfaf",
"metrics": {
"validity": 1.0,
"sum_radii": 1.8801588665719113,
"target_ratio": 0.7135327766876325,
"combined_score": 0.7135327766876325,
"eval_time": 0.11612915992736816
},
"language": "python",
"saved_at": 1751574423.6830008
}

Stage 2:

{
"id": "084465de-efce-4e42-81b0-79f6a044e819",
"generation": 0,
"iteration": 0,
"timestamp": 1751574425.2020102,
"parent_id": null,
"metrics": {
"validity": 1.0,
"sum_radii": 1.8801588665719113,
"target_ratio": 0.7135327766876325,
"combined_score": 0.7135327766876325,
"eval_time": 0.11661648750305176
},
"language": "python",
"saved_at": 1751575856.2608302
}

Parallel (25 cores):

{
"id": "b96d6e67-76bb-4765-9f51-584d92e5d2c2",
"generation": 1,
"iteration": 20,
"timestamp": 1751577546.0471964,
"parent_id": "49307652-3ead-485f-8c21-1ce11a16ec29",
"metrics": {
"validity": 1.0,
"sum_radii": 1.8598312591600656,
"target_ratio": 0.7058183146717517,
"combined_score": 0.7058183146717517,
"eval_time": 0.22881340980529785
},
"language": "python",
"saved_at": 1751577556.0780315
}

Stage 2:

{
"id": "1a044db7-811c-4ea6-92e3-1918c34b28a1",
"generation": 1,
"iteration": 16,
"timestamp": 1751577569.0512633,
"parent_id": "74a8ac96-f519-49e5-b3da-f4b9058cdc2e",
"metrics": {
"validity": 1.0,
"sum_radii": 2.038107874942713,
"target_ratio": 0.7734754743615609,
"combined_score": 0.7734754743615609,
"eval_time": 0.3030838966369629
},
"language": "python",
"saved_at": 1751577604.0499
}

The rebased version is here: https://github.com/SuhailB/openevolve/tree/updated-parallel-iterations

@MashAliK could you rebase for @codelion to review, or should I open a new PR?
@SuhailB Got it, thanks. |
Add parallel iterations, along with a couple of bug fixes/improvements. I would consider this a pretty important feature because of the significant speedup it provides to training. This is the implementation I tried, and it has worked for my use cases.
Primary changes:
- Use concurrent.futures in controller.py to spawn processes for performing the iterations of training (a rough sketch of this flow is included after this list)
- Add run_iteration_sync in a new file, iteration.py, which is run by the workers to allow concurrent execution
- Recreate LLMEnsemble and ProgramDatabase inside each worker (these classes are not pickleable, so this is the best approach I could think of to use these classes in each of the worker processes)
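Roughly, the flow is like the sketch below. This is a simplified, self-contained illustration of the pattern (a process pool plus worker-local construction of objects that cannot be pickled), not the exact code in controller.py or iteration.py; DummyEnsemble stands in for LLMEnsemble / ProgramDatabase:

```python
# Simplified illustration of the parallel-iteration pattern: a process pool
# runs iterations, and each worker process builds its own copies of
# non-picklable objects the first time it needs them.
from concurrent.futures import ProcessPoolExecutor, as_completed

_worker_state: dict = {}  # per-process cache, filled lazily inside each worker


class DummyEnsemble:
    """Stand-in for an object (like LLMEnsemble) that cannot be pickled."""

    def generate(self, iteration: int) -> str:
        return f"candidate-for-iteration-{iteration}"


def run_iteration_sync(iteration: int) -> dict:
    # On the first call inside a worker process, construct worker-local state
    # instead of trying to send it from the parent process.
    if "llm" not in _worker_state:
        _worker_state["llm"] = DummyEnsemble()
    candidate = _worker_state["llm"].generate(iteration)
    return {"iteration": iteration, "candidate": candidate}


if __name__ == "__main__":
    # Submit iterations to a pool of worker processes and collect results
    # as they complete (completion order is not guaranteed).
    with ProcessPoolExecutor(max_workers=4) as pool:
        futures = [pool.submit(run_iteration_sync, i) for i in range(8)]
        for fut in as_completed(futures):
            print(fut.result())
```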
Minor changes:
- The calculate_edit_distance function was crashing the database when I was using it, and since there are already libraries that do this routine I ended up using one of them (levenshtein); a short usage sketch is included after this list
- Used it in _calculate_island_diversity, since it's easier to read
- Used it in _calculate_feature_coords, since it's normalized for code length
- Added allowed_population_overflow, since otherwise the database was adding and removing a program every iteration when it reached the allowed program limit
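As a small illustration of the edit-distance swap (assuming the Levenshtein package from PyPI; the exact call sites in the database code are not reproduced here), the library provides both a raw edit distance and a length-normalized similarity ratio:

```python
# Sketch of the `Levenshtein` library calls (pip install levenshtein); how they
# are wired into _calculate_island_diversity / _calculate_feature_coords in
# this PR is not shown here.
import Levenshtein

code_a = "def f(x):\n    return x + 1\n"
code_b = "def f(x):\n    return x + 2\n"

# Number of single-character edits needed to turn code_a into code_b.
distance = Levenshtein.distance(code_a, code_b)

# Similarity in [0, 1], already normalized for string length, which is handy
# when comparing programs of very different sizes.
similarity = Levenshtein.ratio(code_a, code_b)

print(distance, similarity)
```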