@zachcran

We are benchmarking reaction optimization algorithms, including EDBO+. While benchmarking, we noticed that each EDBO+ iteration is slow, which becomes a real cost when we need ~10000 iteration steps per benchmarking function. After some performance profiling, we found and optimized a few sections to speed up the process.

The first improvement comes from optimizing how the training and testing indices are calculated in EDBOplus.run(). This sped up each EDBOplus.run() call by 40x. To do this, I removed a redundant calculation of internal_df and used a more efficient lookup for rows containing "PENDING" to create it.
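For readers unfamiliar with this kind of lookup, here is a minimal sketch of a vectorized way to split rows into training and testing indices based on "PENDING" entries. It is not the code from this PR: the function name `split_pending_indices`, the example column names, and the choice of an exact string match are all assumptions made for illustration.

```python
import pandas as pd


def split_pending_indices(df, objectives, pending_label="PENDING"):
    """Hypothetical helper: return (train_idx, test_idx) as index arrays.

    A row counts as "testing" if any of its objective columns holds the
    pending label, and as "training" otherwise.
    """
    # Compare whole columns at once instead of applying a Python function
    # to every row; astype(str) handles columns mixing numbers and strings.
    is_pending = df[objectives].astype(str).eq(pending_label).any(axis=1)
    train_idx = df.index[~is_pending].to_numpy()
    test_idx = df.index[is_pending].to_numpy()
    return train_idx, test_idx


# Small illustrative reaction scope (made-up values).
scope = pd.DataFrame({
    "temperature": [25, 50, 75, 100],
    "yield": [62.0, "PENDING", 71.5, "PENDING"],
    "ee": [90.1, "PENDING", "PENDING", "PENDING"],
})
train_idx, test_idx = split_pending_indices(scope, ["yield", "ee"])
```

Restricting the comparison to the objective columns, rather than scanning every column of the reaction scope, keeps the work proportional to the number of objectives.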

Additionally, I added an optional parameter to EDBOplus.run() called write_extra_data, which makes writing the predictions file at each step optional. This saves time, since the file can be quite large depending on your reaction space. To maintain the status quo, the default value is True, so the predictions file is still written.
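As a rough illustration of the flag's behavior, the sketch below guards a CSV write behind a keyword argument that defaults to True. Only the parameter name write_extra_data comes from this PR; the function `run_step`, the file name `pred_example.csv`, and the column name are hypothetical stand-ins and do not reflect the actual EDBOplus.run() signature.

```python
import os
import tempfile

import pandas as pd


def run_step(predictions, directory, filename_pred="pred_example.csv",
             write_extra_data=True):
    """Hypothetical stand-in for one optimization step.

    The predictions file is written only when write_extra_data is True,
    so existing callers that omit the argument see no change in behavior.
    """
    if write_extra_data:
        predictions.to_csv(os.path.join(directory, filename_pred), index=False)
    return predictions


# Made-up predictions table for demonstration.
preds = pd.DataFrame({"yield_predicted_mean": [0.4, 0.7]})

with tempfile.TemporaryDirectory() as tmp:
    pred_path = os.path.join(tmp, "pred_example.csv")

    # Benchmark-style call: skip the (potentially large) predictions file.
    run_step(preds, tmp, write_extra_data=False)
    skipped = not os.path.exists(pred_path)

    # Default call: the file is written as before.
    run_step(preds, tmp)
    written = os.path.exists(pred_path)
```

Defaulting the flag to True keeps the change backward compatible, while a benchmarking loop can pass False on every step.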

I also included some comments about the details of the timing improvements for reference, but they can be removed if necessary.

@zachcran
Author

It has been some time since I submitted this PR. Can I get some guidance or feedback on what needs to be done to get this merged? These performance improvements are substantial, but the code changes are pretty minor. I can definitely resolve the merge conflicts if I have some assurance that this will move forward.
