Retrieving Gaussian Process model details in BayBE #737
Hello, I am working on an active learning project using the BayBE library (GP surrogate model + Bayesian optimization) and have a few questions about retrieving details of the Gaussian Process model that BayBE computes internally. Briefly, our model has eight inputs (x1-x8) and one target. The active learning loop recommends 10 experiments per cycle to help minimize the target. We would like to report the following information in our publication:
When I tried to retrieve the surrogate model, I could only obtain the following. Could you please help us retrieve this information? If there are other internal model parameters in BayBE that are important to report, we would be grateful for your advice and guidance on how to retrieve them as well. Thank you in advance for your help!

Code:

# Imports (added here for completeness)
from baybe import Campaign
from baybe.constraints import ContinuousLinearConstraint
from baybe.objectives import SingleTargetObjective
from baybe.parameters import NumericalContinuousParameter
from baybe.recommenders import BotorchRecommender
from baybe.searchspace import SearchSpace
from baybe.surrogates import GaussianProcessSurrogate
from baybe.targets import NumericalTarget
import pandas as pd

# Define the optimization objective
target_name = "Target"
target_mode = "MIN"
target = NumericalTarget(name=target_name, mode=target_mode, bounds=(-5,5))
objective = SingleTargetObjective(target=target)
# Define the parameters for the search space and their bounds
parameters = [
    NumericalContinuousParameter(name="x1", bounds=(0, 100)),
    NumericalContinuousParameter(name="x2", bounds=(0, 100)),
    NumericalContinuousParameter(name="x3", bounds=(0, 100)),
    NumericalContinuousParameter(name="x4", bounds=(0, 100)),
    NumericalContinuousParameter(name="x5", bounds=(553, 613)),
    NumericalContinuousParameter(name="x6", bounds=(10, 50)),
    NumericalContinuousParameter(name="x7", bounds=(12000, 192000)),
    NumericalContinuousParameter(name="x8", bounds=(1.5, 4)),
]
# Define the constraint: the sum of x1-x4 must equal 100
constraint = ContinuousLinearConstraint(
    parameters=["x1", "x2", "x3", "x4"],
    operator="=",
    coefficients=[1.0, 1.0, 1.0, 1.0],
    rhs=100,
)
# Construct the search space
searchspace = SearchSpace.from_product(parameters=parameters, constraints=[constraint])
# Create the recommender and campaign
acquisition_function = "qLogEI"
recommender = BotorchRecommender(
    surrogate_model=GaussianProcessSurrogate(),
    acquisition_function=acquisition_function,
)
campaign = Campaign(searchspace=searchspace, objective=objective, recommender=recommender)

# Add prior measurements (df: DataFrame with columns x1-x8 and "Target", defined elsewhere)
campaign.add_measurements(df)
# Get recommendations
import time
start_time = time.time()
recommendation = campaign.recommend(batch_size=10)
print(recommendation.round(2))
end_time = time.time()
elapsed_time = end_time - start_time
print(f"Execution time: {elapsed_time:.2f} seconds")
# Retrieve the surrogate model
surrogate_model = campaign.recommender.get_surrogate(
    searchspace=campaign.searchspace,
    objective=campaign.objective,
    measurements=campaign.measurements,
)
recommendation_df = pd.DataFrame(recommendation)  # note: recommend() already returns a DataFrame
# Display the model details and recommended experiments
print(f"Target: {target_name}, Mode: {target_mode}")
print(f"Model: {surrogate_model}")
print(f"Acquisition function: {acquisition_function}")
display(recommendation_df.round(2))
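For anyone else looking to report GP hyperparameters: once you can reach the fitted BoTorch/GPyTorch model behind the surrogate (how to access it in BayBE is exactly the open question here, so no attribute path is claimed), its state_dict() maps parameter names to values, which is easy to flatten into a report. A minimal sketch using a stand-in dictionary; the parameter names below are typical GPyTorch names, shown purely for illustration:

```python
# Hedged sketch: formatting GP hyperparameters for a publication table.
# With a real fitted model, one would do something along the lines of
#   state = {k: v.tolist() for k, v in botorch_model.state_dict().items()}
# Note that GPyTorch stores "raw_" parameters; the constrained (actual)
# values are obtained via the corresponding constraint transforms.

# Stand-in state dict with illustrative GPyTorch-style parameter names:
state = {
    "covar_module.base_kernel.raw_lengthscale": [[0.5] * 8],  # one per input x1-x8
    "covar_module.raw_outputscale": 1.3,
    "likelihood.noise_covar.raw_noise": [0.12],
}

def hyperparameter_report(state: dict) -> str:
    """Format a name -> value mapping as one line per parameter."""
    return "\n".join(f"{name}: {value}" for name, value in sorted(state.items()))

print(hyperparameter_report(state))
```

The same loop works unchanged on a real state_dict once the tensors are converted to lists.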
Hi @inogueroles 👋🏼 Great to hear that you've decided to use our framework. Before I answer your questions, here are a few general comments:
I'll come back with answers to your questions later; I need to finish a few other things first 🙃
@inogueroles The lockfile is a snapshot of the exact environment you used for this publication. Anyone can use this lockfile to recreate your results; it even includes secondary dependencies that BayBE uses indirectly. It is also, implicitly, a complete snapshot of all the algorithmic details (hyperparameters etc.) that BayBE uses, because the snapshot pins the exact version of BayBE. Here is an example of how a lockfile looks and how you can create yours. If you provide this plus your data and your production scripts, that is everything a journal could expect for reproducibility; it is actually much more complete and practical than just writing down the hyperparameter values. This is just a suggestion. We can nonetheless provide an answer to the retrieval question, but I'd have to look it up, and @AdrianSosic is likely much faster.
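As a concrete illustration (not necessarily the exact tooling the BayBE maintainers use), plain pip can produce a minimal lockfile; richer tools such as pip-tools, poetry, or conda-lock work too:

```shell
# Pin the exact versions of every installed package, including BayBE's
# transitive dependencies, into a lockfile:
python -m pip freeze > requirements.lock

# Anyone can then recreate the environment from it:
#   python -m pip install -r requirements.lock

# Quick look at the pinned versions:
head -n 5 requirements.lock
```

Committing this file alongside the data and scripts gives reviewers a one-command path to the exact software state.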
Hi @inogueroles, now I have some more time to answer.
Note that I fully agree with @Scienfitz's suggestion of submitting a lock file. This adds much more to reproducibility than just reporting model details in text form, especially since things are subject to constant change/improvement (see details below):
Nevertheless, here is my input on your points:
BayBE uses a DefaultKernelFactory to create an appropriate kernel for the specified optimization problem. Currently, we're using a smoothed version of the EDBO model (see docstring). But please note: this default behavior is subject to occasional change, i.e. whenever there is a good reason t…