This tutorial uses synthetic functions to illustrate Bayesian optimization using a multi-task Gaussian Process in Ax. A typical use case is optimizing an expensive-to-evaluate (online) system with supporting (offline) simulations of that system.
Bayesian optimization with a multi-task kernel (Multi-task Bayesian optimization) is described by Swersky et al. (2013). Letham and Bakshy (2019) describe using multi-task Bayesian optimization to tune a ranking system with a mix of online and offline (simulator) experiments.
This tutorial reproduces the results of Online Appendix 2 from that paper.
The synthetic problem used here is to minimize the Hartmann 6 function, a classic optimization test problem in 6 dimensions, subject to a constraint on the 2-norm of $x$. Both the objective and the constraint are treated as unknown and are modeled with separate GPs. Both the objective and the constraint are observed with noise.
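As a concrete reference for the test problem, the sketch below evaluates the Hartmann 6 function and the 2-norm constraint in plain Python. The coefficient tables `ALPHA`, `A`, and `P` are the commonly published Hartmann 6 constants rather than values taken from this tutorial, and the helper names are hypothetical; the tutorial itself relies on Ax's `Hartmann6Metric` and `L2NormMetric`. The default bound of 1.25 matches the outcome constraint used in the tutorial code.

```python
import math

# Commonly published Hartmann 6 constants (an assumption; not read from Ax).
ALPHA = [1.0, 1.2, 3.0, 3.2]
A = [
    [10.0, 3.0, 17.0, 3.5, 1.7, 8.0],
    [0.05, 10.0, 17.0, 0.1, 8.0, 14.0],
    [3.0, 3.5, 1.7, 10.0, 17.0, 8.0],
    [17.0, 8.0, 0.05, 10.0, 0.1, 14.0],
]
P = [  # entries are scaled by 1e-4 when used
    [1312, 1696, 5569, 124, 8283, 5886],
    [2329, 4135, 8307, 3736, 1004, 9991],
    [2348, 1451, 3522, 2883, 3047, 6650],
    [4047, 8828, 8732, 5743, 1091, 381],
]


def hartmann6(x):
    """Hartmann 6 value at a point x in [0, 1]^6 (lower is better)."""
    total = 0.0
    for alpha_i, a_row, p_row in zip(ALPHA, A, P):
        inner = sum(a * (xj - 1e-4 * p) ** 2 for a, xj, p in zip(a_row, x, p_row))
        total += alpha_i * math.exp(-inner)
    return -total


def l2_norm(x):
    """Euclidean (2-) norm of x."""
    return math.sqrt(sum(xj * xj for xj in x))


def is_feasible(x, bound=1.25):
    """True when the 2-norm constraint ||x||_2 <= bound holds."""
    return l2_norm(x) <= bound


# Commonly reported location of the unconstrained global minimum.
x_star = [0.20169, 0.150011, 0.476874, 0.275332, 0.311652, 0.6573]
print(hartmann6(x_star), l2_norm(x_star), is_feasible(x_star))
```

Under these constants the commonly reported minimizer also satisfies the norm constraint, whereas a corner such as `[1.0] * 6` does not.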
Throughout the optimization we can make noisy observations directly of the objective and constraint (online observations), and we can make noisy observations of a biased version of the objective and constraint (offline observations). Bias is simulated by passing the function values through a piecewise linear function. Offline observations are much less time-consuming than online observations, so we wish to use them to improve our ability to optimize the online objective.
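The distinction between the two kinds of observation can be sketched without Ax. The helpers `piecewise_linear_bias` and `observe` below are hypothetical names introduced for illustration; the knot and slopes are the ones used for the offline objective in this tutorial's code, and applying the bias before adding the noise is an assumption about how the simulated metrics behave.

```python
import random


def piecewise_linear_bias(value, knot, slope_below, slope_above):
    """Continuous piecewise linear map with a single kink at `knot`."""
    slope = slope_below if value < knot else slope_above
    return slope * (value - knot) + knot


def offline_objective_bias(value):
    """Bias with the knot and slopes used for the offline objective."""
    return piecewise_linear_bias(value, knot=-0.35, slope_below=1.5, slope_above=6.0)


def observe(true_value, noise_sd, rng, bias=None):
    """Noisy observation of `true_value`; `bias`, if given, is applied first."""
    mean = true_value if bias is None else bias(true_value)
    return mean + rng.gauss(0.0, noise_sd)


rng = random.Random(0)
true_value = -1.0
online_obs = observe(true_value, noise_sd=0.1, rng=rng)
offline_obs = observe(true_value, noise_sd=0.1, rng=rng, bias=offline_objective_bias)
print(online_obs, offline_obs)
```

Because both linear pieces pass through the knot, the map is continuous there, so the offline signal is a distorted but monotone function of the online one.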
from copy import deepcopy
import numpy as np
import pandas as pd
from scipy.stats import norm
import time
from ax.core.data import Data
from ax.core.observation import ObservationFeatures, observations_from_data
from ax.core.optimization_config import OptimizationConfig
from ax.core.search_space import SearchSpace
from ax.core.objective import Objective
from ax.runners.synthetic import SyntheticRunner
from ax.modelbridge.random import RandomModelBridge
from ax.core.outcome_constraint import OutcomeConstraint
from ax.core.types import ComparisonOp
from ax.core.parameter import RangeParameter, ParameterType
from ax.core.multi_type_experiment import MultiTypeExperiment
from ax.metrics.hartmann6 import Hartmann6Metric
from ax.metrics.l2norm import L2NormMetric
from ax.modelbridge.factory import get_sobol, get_GPEI, get_MTGP
from ax.core.generator_run import GeneratorRun
from ax.plot.diagnostic import interact_batch_comparison
from ax.plot.trace import optimization_trace_all_methods
from ax.utils.notebook.plotting import init_notebook_plotting, render
init_notebook_plotting()
For this example, the online system is optimizing a Hartmann6 function subject to a 2-norm constraint. The Metric objects for these are imported directly above. We create analogous offline versions of these two metrics, which are identical except that a transform (a piecewise linear function) is applied to their values. We construct Metric objects for each of them.
# Create metrics with artificial offline bias, for both objective and constraint
# by passing the true values through a piecewise linear function.
class OfflineHartmann6Metric(Hartmann6Metric):
    def f(self, x: np.ndarray) -> float:
        raw_res = super().f(x)
        m = -0.35
        if raw_res < m:
            return (1.5 * (raw_res - m)) + m
        else:
            return (6.0 * (raw_res - m)) + m

class OfflineL2NormMetric(L2NormMetric):
    def f(self, x: np.ndarray) -> float:
        raw_res = super().f(x)
        m = 1.25
        if raw_res < m:
            return (0.8 * (raw_res - m)) + m
        else:
            return (4 * (raw_res - m)) + m
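A MultiTypeExperiment is used for managing online and offline trials together. It is constructed in several steps:
1. Create the search space, here the unit hypercube [0, 1]^6.
2. Specify the optimization config, with the online objective and the online outcome constraint.
3. Initialize the experiment, with "online" as the default trial type.
4. Establish the "offline" trial type and specify how those trials are deployed.
5. Add the offline metrics, associating each with the corresponding online metric.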
Finally, because this is a synthetic benchmark problem where the true function values are known, we will also register metrics with the true (noiseless) function values for plotting below.
def get_experiment(include_true_metric=True):
    noise_sd = 0.1  # Observations will have this much Normal noise added to them
    # 1. Create simple search space for [0,1]^d, d=6
    param_names = [f"x{i}" for i in range(6)]
    parameters = [
        RangeParameter(
            name=param_names[i], parameter_type=ParameterType.FLOAT, lower=0.0, upper=1.0
        )
        for i in range(6)
    ]
    search_space = SearchSpace(parameters=parameters)
    # 2. Specify optimization config
    online_objective = Hartmann6Metric("objective", param_names=param_names, noise_sd=noise_sd)
    online_constraint = L2NormMetric("constraint", param_names=param_names, noise_sd=noise_sd)
    opt_config = OptimizationConfig(
        objective=Objective(online_objective, minimize=True),
        outcome_constraints=[OutcomeConstraint(online_constraint, op=ComparisonOp.LEQ, bound=1.25, relative=False)]
    )
    # 3. Init experiment
    exp = MultiTypeExperiment(
        name="mt_exp",
        search_space=search_space,
        default_trial_type="online",
        default_runner=SyntheticRunner(),
        optimization_config=opt_config,
    )
    # 4. Establish offline trial_type, and how those trials are deployed
    exp.add_trial_type("offline", SyntheticRunner())
    # 5. Add offline metrics that provide biased estimates of the online metrics
    offline_objective = OfflineHartmann6Metric("offline_objective", param_names=param_names, noise_sd=noise_sd)
    offline_constraint = OfflineL2NormMetric("offline_constraint", param_names=param_names, noise_sd=noise_sd)
    # Associate each offline metric with corresponding online metric
    exp.add_tracking_metric(metric=offline_objective, trial_type="offline", canonical_name="objective")
    exp.add_tracking_metric(metric=offline_constraint, trial_type="offline", canonical_name="constraint")
    # Add a noiseless equivalent for each metric, for tracking the true value of each observation
    # for the purposes of benchmarking.
    if include_true_metric:
        exp.add_tracking_metric(Hartmann6Metric("objective_noiseless", param_names=param_names, noise_sd=0.0), "online")
        exp.add_tracking_metric(L2NormMetric("constraint_noiseless", param_names=param_names, noise_sd=0.0), "online")
    return exp
These figures compare the online measurements to the offline measurements on a random set of points, for both the objective metric and the constraint metric. You can see the offline measurements are biased but highly correlated. This produces Fig. S3 from the paper.
# Generate 50 points from a Sobol sequence
exp = get_experiment(include_true_metric=False)
s = get_sobol(exp.search_space, scramble=False)
gr = s.gen(50)
# Deploy them both online and offline
exp.new_batch_trial(trial_type="online", generator_run=gr).run()
exp.new_batch_trial(trial_type="offline", generator_run=gr).run()
# Fetch data
data = exp.fetch_data()
observations = observations_from_data(exp, data)
# Plot the arms in batch 0 (online) vs. batch 1 (offline)
render(interact_batch_comparison(observations, exp, 1, 0))
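One way to quantify "biased but highly correlated" is a Pearson correlation between paired online and offline values. The sketch below is a stand-alone illustration under stated assumptions: `pearson` and `offline_objective_bias` are hypothetical helpers, the online values are a noiseless illustrative grid rather than measurements from the experiment above, and the bias uses the knot and slopes of the offline objective in this tutorial.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def offline_objective_bias(value, knot=-0.35):
    """Piecewise linear bias with slopes 1.5 below and 6.0 above the knot."""
    slope = 1.5 if value < knot else 6.0
    return slope * (value - knot) + knot


# Illustrative noiseless "online" values on a grid from -3.0 to 0.0.
online = [-3.0 + 0.25 * i for i in range(13)]
offline = [offline_objective_bias(v) for v in online]

r = pearson(online, offline)
mean_shift = sum(o - v for o, v in zip(offline, online)) / len(online)
print(r, mean_shift)
```

The correlation is positive because the bias is increasing, and it falls short of 1 only because the map is not a single straight line across the grid; the nonzero mean shift is the bias itself.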
Here we construct a Bayesian optimization loop that interleaves online and offline batches. The loop defined here is described in Algorithm 1 of the paper. We compare multi-task Bayesian optimization to regular Bayesian optimization using only online observations.
Here we measure performance over 3 repetitions of the loop. Each one takes 1-2 hours so the whole benchmark run will take several hours to complete.
# Settings for the optimization benchmark.
# This should be changed to 50 to reproduce the results from the paper.
n_reps = 3 # Number of repeated experiments, each with independent observation noise
n_init_online = 5 # Size of the quasirandom initialization run online
n_init_offline = 20 # Size of the quasirandom initialization run offline
n_opt_online = 5 # Batch size for BO selected points to be run online
n_opt_offline = 20 # Batch size for BO selected to be run offline
n_batches = 3 # Number of optimized BO batches
For the online-only case, we run n_init_online Sobol points followed by n_batches batches of n_opt_online points selected by the GP. This is a normal Bayesian optimization loop.
# This function runs a Bayesian optimization loop, making online observations only.
def run_online_only_bo():
    t1 = time.time()
    ### Do BO with online only
    ## Quasi-random initialization
    exp_online = get_experiment()
    m = get_sobol(exp_online.search_space, scramble=False)
    gr = m.gen(n=n_init_online)
    exp_online.new_batch_trial(trial_type="online", generator_run=gr).run()
    ## Do BO
    for b in range(n_batches):
        print('Online-only batch', b, time.time() - t1)
        # Fit the GP
        m = get_GPEI(
            experiment=exp_online,
            data=exp_online.fetch_data(),
            search_space=exp_online.search_space,
        )
        # Generate the new batch
        gr = m.gen(
            n=n_opt_online,
            search_space=exp_online.search_space,
            optimization_config=exp_online.optimization_config,
        )
        exp_online.new_batch_trial(trial_type="online", generator_run=gr).run()
    ## Extract true objective and constraint at each iteration
    df = exp_online.fetch_data().df
    obj = df[df['metric_name'] == 'objective_noiseless']['mean'].values
    con = df[df['metric_name'] == 'constraint_noiseless']['mean'].values
    return obj, con
Here we incorporate offline observations to accelerate the optimization, while using the same total number of online observations as in the loop above. The strategy here is that outlined in Algorithm 1 of the paper.
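1. Run n_init_online Sobol points online, and n_init_offline Sobol points offline.
2. Fit the MTGP to all online and offline observations.
3. Generate n_opt_offline candidates using NEI.
4. Launch the n_opt_offline candidates offline and observe their offline metrics.
5. Update the MTGP with the new offline observations.
6. Select the best n_opt_online of the NEI candidates, after incorporating their offline observations, and run them online.
7. Update the MTGP with the new online observations and repeat from step 3.
# Online batches are constructed by selecting the maximum utility points from the offline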
# batch, after updating the model with the offline results. This function selects the max utility points according
# to the MTGP predictions.
def max_utility_from_GP(n, m, experiment, search_space, gr):
    obsf = []
    for arm in gr.arms:
        params = deepcopy(arm.parameters)
        params['trial_type'] = 'online'
        obsf.append(ObservationFeatures(parameters=params))
    # Make predictions
    f, cov = m.predict(obsf)
    # Compute expected utility
    mu_c = np.array(f['constraint'])
    sigma_c = np.sqrt(cov['constraint']['constraint'])
    pfeas = norm.cdf((1.25 - mu_c) / sigma_c)
    u = -np.array(f['objective']) * pfeas
    best_arm_indx = np.flip(np.argsort(u))[:n]
    gr_new = GeneratorRun(
        arms=[
            gr.arms[i] for i in best_arm_indx
        ],
        weights=[1.] * n,
    )
    return gr_new
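The utility computed in the function above can be sketched in plain Python: the negated predicted objective is weighted by the probability that the constraint prediction falls below the bound. The helper names and the prediction values below are hypothetical and purely illustrative; `math.erf` stands in for `scipy.stats.norm.cdf`.

```python
import math


def normal_cdf(z):
    """Standard normal CDF, written with the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def expected_utility(mu_obj, mu_con, var_con, bound=1.25):
    """Negated predicted objective times the predicted probability of feasibility."""
    p_feasible = normal_cdf((bound - mu_con) / math.sqrt(var_con))
    return -mu_obj * p_feasible


def top_n_indices(utilities, n):
    """Indices of the n largest utilities, best first."""
    order = sorted(range(len(utilities)), key=lambda i: utilities[i], reverse=True)
    return order[:n]


# Hypothetical model predictions for four candidates (illustrative numbers only).
mu_obj = [-2.0, -3.0, -1.0, -2.5]   # predicted objective (lower is better)
mu_con = [1.0, 2.0, 0.5, 1.25]      # predicted constraint value
var_con = [0.01, 0.01, 0.01, 0.01]  # predictive variance of the constraint

utilities = [expected_utility(o, c, v) for o, c, v in zip(mu_obj, mu_con, var_con)]
best = top_n_indices(utilities, n=2)
print(best)  # [0, 3]
```

In this toy example the candidate with the best predicted objective (index 1) is not selected, because its predicted constraint value is far above the bound and its feasibility probability is close to zero.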
# This function runs a multi-task Bayesian optimization loop, as outlined in Algorithm 1 and above.
def run_mtbo():
    t1 = time.time()
    online_trials = []
    ## 1. Quasi-random initialization, online and offline
    exp_multitask = get_experiment()
    # Online points
    m = get_sobol(exp_multitask.search_space, scramble=False)
    gr = m.gen(
        n=n_init_online,
    )
    tr = exp_multitask.new_batch_trial(trial_type="online", generator_run=gr)
    tr.run()
    online_trials.append(tr.index)
    # Offline points
    m = get_sobol(exp_multitask.search_space, scramble=False)
    gr = m.gen(
        n=n_init_offline,
    )
    exp_multitask.new_batch_trial(trial_type="offline", generator_run=gr).run()
    ## Do BO
    for b in range(n_batches):
        print('Multi-task batch', b, time.time() - t1)
        # (2 / 7). Fit the MTGP
        m = get_MTGP(
            experiment=exp_multitask,
            data=exp_multitask.fetch_data(),
            search_space=exp_multitask.search_space,
        )
        # 3. Finding the best points for the online task
        gr = m.gen(
            n=n_opt_offline,
            optimization_config=exp_multitask.optimization_config,
            fixed_features=ObservationFeatures(parameters={'trial_type': 'online'}),
        )
        # 4. But launch them offline
        exp_multitask.new_batch_trial(trial_type="offline", generator_run=gr).run()
        # 5. Update the model
        m = get_MTGP(
            experiment=exp_multitask,
            data=exp_multitask.fetch_data(),
            search_space=exp_multitask.search_space,
        )
        # 6. Select max-utility points from the offline batch to generate an online batch
        gr = max_utility_from_GP(
            n=n_opt_online,
            m=m,
            experiment=exp_multitask,
            search_space=exp_multitask.search_space,
            gr=gr,
        )
        tr = exp_multitask.new_batch_trial(trial_type="online", generator_run=gr)
        tr.run()
        online_trials.append(tr.index)
    # Extract true objective at each online iteration for creating benchmark plot
    obj = np.array([])
    con = np.array([])
    for tr in online_trials:
        df_t = exp_multitask.trials[tr].fetch_data().df
        df_tobj = df_t[df_t['metric_name'] == 'objective_noiseless']
        obj = np.hstack((obj, df_tobj['mean'].values))
        df_tcon = df_t[df_t['metric_name'] == 'constraint_noiseless']
        con = np.hstack((con, df_tcon['mean'].values))
    return obj, con
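To check that the two loops really use the same number of online observations, the batch schedule of each can be tallied. This is a minimal sketch that only counts points; the function names are hypothetical, and the settings are the benchmark values defined earlier in this tutorial.

```python
def online_only_schedule(n_init_online, n_opt_online, n_batches):
    """(trial_type, n_points) batches launched by the online-only loop."""
    return [("online", n_init_online)] + [("online", n_opt_online)] * n_batches


def mtbo_schedule(n_init_online, n_init_offline, n_opt_online, n_opt_offline, n_batches):
    """(trial_type, n_points) batches launched by the multi-task loop."""
    schedule = [("online", n_init_online), ("offline", n_init_offline)]
    for _ in range(n_batches):
        schedule.append(("offline", n_opt_offline))  # candidates generated for the online task
        schedule.append(("online", n_opt_online))    # max-utility subset of those candidates
    return schedule


def count_points(schedule, trial_type):
    """Total number of points launched with the given trial type."""
    return sum(n for kind, n in schedule if kind == trial_type)


settings = dict(n_init_online=5, n_init_offline=20, n_opt_online=5, n_opt_offline=20, n_batches=3)
mtbo = mtbo_schedule(**settings)
online_only = online_only_schedule(
    settings["n_init_online"], settings["n_opt_online"], settings["n_batches"]
)

print(count_points(online_only, "online"))  # 20
print(count_points(mtbo, "online"))         # 20
print(count_points(mtbo, "offline"))        # 80
```

With these settings both strategies spend 20 online observations, and the multi-task loop adds 80 offline ones.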
Run both Bayesian optimization loops and aggregate results.
runners = {
    'GP, online only': run_online_only_bo,
    'MTGP': run_mtbo,
}
iteration_objectives = {k: [] for k in runners}
iteration_constraints = {k: [] for k in runners}
for rep in range(n_reps):
    print('Running rep', rep)
    for k, r in runners.items():
        obj, con = r()
        iteration_objectives[k].append(obj)
        iteration_constraints[k].append(con)
for k, v in iteration_objectives.items():
    iteration_objectives[k] = np.array(v)
    iteration_constraints[k] = np.array(iteration_constraints[k])
Running rep 0
Online-only batch 0 0.0038979053497314453
Online-only batch 1 41.28154897689819
Online-only batch 2 97.07738637924194
Multi-task batch 0 0.00837850570678711
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-8-6b4facc37f42> in <module>
---> 10         obj, con = r()
<ipython-input-7-88fb8b2e972e> in run_mtbo()
---> 53             search_space=exp_multitask.search_space,
~/build/facebook/Ax/ax/modelbridge/factory.py in get_MTGP(experiment, data, search_space, trial_index)
--> 273         status_quo_features=status_quo_features,
[intermediate frames in ax.modelbridge, ax.models.torch, botorch, scipy.optimize, and gpytorch omitted]
~/virtualenv/python3.7.1/lib/python3.7/site-packages/botorch/models/multitask.py in forward(self, x)
--> 167         covar = covar_x.mul(covar_i)
~/virtualenv/python3.7.1/lib/python3.7/site-packages/gpytorch/utils/interpolation.py in left_t_interp(interp_indices, interp_values, rhs, output_dim)
--> 230         summing_matrix = cls(summing_matrix_indices, summing_matrix_values, size)
RuntimeError: size is inconsistent with indices: for dim 1, size is 1 but found index 1
We plot the cumulative best point found by each online iteration of the optimization for each of the methods. We see that despite the bias in the offline observations, incorporating them into the optimization with the multi-task model allows the optimization to converge to the optimal point much faster. With n_reps=50, this generates Fig. S4 of the paper.
best_objectives = {}
for m, obj in iteration_objectives.items():
    x = obj.copy()
    z = iteration_constraints[m].copy()
    best_objectives[m] = np.array([np.minimum.accumulate(obj_i) for obj_i in x])
render(
    optimization_trace_all_methods({k: best_objectives[k] for k in runners})
)
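The cumulative-best trace is a running minimum over the online observations of one repetition. The sketch below shows that operation with the standard library. The second helper, which ignores points whose constraint value exceeds the bound, is an assumption added for illustration and is not what the plotting code above computes.

```python
import math
from itertools import accumulate


def cumulative_best(values):
    """Running minimum: best (lowest) objective seen up to each iteration."""
    return list(accumulate(values, min))


def feasible_cumulative_best(values, constraints, bound=1.25):
    """Running minimum that skips infeasible points (hypothetical variant)."""
    masked = [v if c <= bound else math.inf for v, c in zip(values, constraints)]
    return list(accumulate(masked, min))


objective_trace = [-1.0, -0.5, -2.0, -1.5]
print(cumulative_best(objective_trace))  # [-1.0, -1.0, -2.0, -2.0]
```

A trace computed this way never increases, which is why the curves in the benchmark plot are monotone.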
Benjamin Letham and Eytan Bakshy. Bayesian optimization for policy search via online-offline experimentation. arXiv preprint arXiv:1904.01049, 2019.
Kevin Swersky, Jasper Snoek, and Ryan P Adams. Multi-task Bayesian optimization. In Advances in Neural Information Processing Systems 26, NIPS, pages 2004–2012, 2013.