This tutorial illustrates the core Ax classes and their usage by constructing, running, and saving an experiment through the Developer API.
import pandas as pd
from ax import *
The core Experiment class has only one required parameter, search_space. A SearchSpace is composed of a set of parameters to be tuned in the experiment, and optionally a set of parameter constraints that define restrictions across these parameters.
Here we range over two parameters, each of which can take on values between 0 and 10.
range_param1 = RangeParameter(name="x1", lower=0.0, upper=10.0, parameter_type=ParameterType.FLOAT)
range_param2 = RangeParameter(name="x2", lower=0.0, upper=10.0, parameter_type=ParameterType.FLOAT)
search_space = SearchSpace(
    parameters=[range_param1, range_param2],
)
Note that there are two other types of parameters, FixedParameter and ChoiceParameter. Although we won't use these in this example, you can create them as follows.
choice_param = ChoiceParameter(name="choice", values=["foo", "bar"], parameter_type=ParameterType.STRING)
fixed_param = FixedParameter(name="fixed", value=True, parameter_type=ParameterType.BOOL)
Sum constraints enforce that the sum of a set of parameters is greater than or less than some bound, and order constraints enforce that one parameter is smaller than another. We won't use these either, but two examples are shown below.
sum_constraint = SumConstraint(
    parameters=[range_param1, range_param2],
    is_upper_bound=True,
    bound=5.0,
)
order_constraint = OrderConstraint(
    lower_parameter=range_param1,
    upper_parameter=range_param2,
)
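To make the meaning of these constraints concrete, here is a minimal plain-Python sketch of what it means for a parameterization to lie within the bounds and to satisfy the two example constraints above. It does not use Ax; the helper names `in_bounds`, `satisfies_sum`, and `satisfies_order` are hypothetical and exist only for this illustration.

```python
from typing import Dict, Sequence

# Bounds copied from the two range parameters defined above.
BOUNDS = {"x1": (0.0, 10.0), "x2": (0.0, 10.0)}


def in_bounds(params: Dict[str, float]) -> bool:
    """True if every parameter lies within its [lower, upper] range."""
    return all(lo <= params[name] <= hi for name, (lo, hi) in BOUNDS.items())


def satisfies_sum(
    params: Dict[str, float],
    names: Sequence[str] = ("x1", "x2"),
    bound: float = 5.0,
    is_upper_bound: bool = True,
) -> bool:
    """Check the sum of the named parameters against the bound."""
    total = sum(params[name] for name in names)
    return total <= bound if is_upper_bound else total >= bound


def satisfies_order(
    params: Dict[str, float], lower: str = "x1", upper: str = "x2"
) -> bool:
    """Check that the `lower` parameter does not exceed the `upper` one."""
    return params[lower] <= params[upper]


candidate = {"x1": 1.0, "x2": 3.0}
print(in_bounds(candidate), satisfies_sum(candidate), satisfies_order(candidate))
```

In this sketch a point such as x1=4, x2=3 is inside the search space bounds but would be rejected by the sum constraint, since 4 + 3 exceeds 5.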
Once we have a search space, we can create an experiment.
experiment = Experiment(
    name="experiment_building_blocks",
    search_space=search_space,
)
We can also define control values for each parameter by adding a status quo arm to the experiment.
experiment.status_quo = Arm(
    name="control",
    parameters={"x1": 0.0, "x2": 0.0},
)
We can now generate arms, i.e. assignments of parameters to values, that lie within the search space. Below we use a Sobol generator to generate five quasi-random arms.
sobol = Models.SOBOL(search_space=experiment.search_space)
generator_run = sobol.gen(5)
for arm in generator_run.arms:
    print(arm)
Arm(parameters={'x1': 5.460216403007507, 'x2': 5.888960957527161})
Arm(parameters={'x1': 7.51576840877533, 'x2': 3.9955589175224304})
Arm(parameters={'x1': 1.0644689202308655, 'x2': 9.575881958007812})
Arm(parameters={'x1': 2.46075838804245, 'x2': 3.2335907220840454})
Arm(parameters={'x1': 8.97851824760437, 'x2': 7.565061450004578})
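For intuition about what "quasi-random" means, the sketch below generates a low-discrepancy sequence in pure Python. Note that it uses a Halton sequence rather than Sobol: a simpler construction with the same space-filling goal, chosen here only because it fits in a few lines. The function names are illustrative and are not part of Ax.

```python
from typing import Dict, List


def van_der_corput(index: int, base: int) -> float:
    """Radical inverse of `index` in the given base, a value in [0, 1)."""
    value, denom = 0.0, 1.0
    while index > 0:
        index, digit = divmod(index, base)
        denom *= base
        value += digit / denom
    return value


def halton_arms(n: int, lower: float = 0.0, upper: float = 10.0) -> List[Dict[str, float]]:
    """Generate n two-dimensional Halton points scaled to [lower, upper]."""
    width = upper - lower
    arms = []
    for i in range(1, n + 1):
        # Coprime bases (2 and 3) keep the two dimensions from lining up.
        arms.append({
            "x1": lower + width * van_der_corput(i, 2),
            "x2": lower + width * van_der_corput(i, 3),
        })
    return arms


for arm in halton_arms(5):
    print(arm)
```

Unlike independent uniform draws, successive points of such a sequence deliberately avoid each other, which tends to cover the search space more evenly for a small number of arms.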
In order to perform an optimization, we also need to define an optimization config for the experiment. An optimization config is composed of an objective metric to be minimized or maximized in the experiment, and optionally a set of outcome constraints that place restrictions on how other metrics can be moved by the experiment.
In order to define an objective or outcome constraint, we first need to subclass Metric. Metrics are used to evaluate trials, which are individual steps of the experiment sequence. Each trial contains one or more arms for which we will collect data at the same time.
Our custom metric(s) will determine how, given a trial, to compute the mean and standard error of the mean (sem) of each of the trial's arms.
The only method that needs to be defined for most metric subclasses is fetch_trial_data, which defines how a single trial is evaluated, and returns the results as a Data object wrapping a pandas dataframe.
class BoothMetric(Metric):
    def fetch_trial_data(self, trial):
        records = []
        # Evaluate the Booth function for every arm in the trial.
        for arm_name, arm in trial.arms_by_name.items():
            params = arm.parameters
            records.append({
                "arm_name": arm_name,
                "metric_name": self.name,
                "mean": (params["x1"] + 2*params["x2"] - 7)**2 + (2*params["x1"] + params["x2"] - 5)**2,
                "sem": 0.0,  # the function is deterministic, so no noise is reported
                "trial_index": trial.index,
            })
        return Data(df=pd.DataFrame.from_records(records))
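As a sanity check on the formula used in the metric, the sketch below evaluates the same Booth function outside of Ax and builds records with the same keys that fetch_trial_data emits. The `booth` helper, the hard-coded arm dictionary, and the trial index 0 are assumptions made for this illustration only.

```python
def booth(x1: float, x2: float) -> float:
    """Booth function, the same expression used for "mean" in BoothMetric."""
    return (x1 + 2 * x2 - 7) ** 2 + (2 * x1 + x2 - 5) ** 2


# Hypothetical arms: the status quo from above and a single hand-picked point.
arms_by_name = {
    "control": {"x1": 0.0, "x2": 0.0},
    "single_arm": {"x1": 1, "x2": 1},
}

records = [
    {
        "arm_name": arm_name,
        "metric_name": "booth",
        "mean": booth(params["x1"], params["x2"]),
        "sem": 0.0,
        "trial_index": 0,  # placeholder index for this sketch
    }
    for arm_name, params in arms_by_name.items()
]

for record in records:
    print(record["arm_name"], record["mean"])
```

The status quo at (0, 0) evaluates to 74, and the Booth function reaches its global minimum of 0 at (1, 3), which lies inside the search space defined earlier.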
Once we have our metric subclasses, we can define an optimization config.
optimization_config = OptimizationConfig(
    objective=Objective(
        metric=BoothMetric(name="booth"),
        minimize=True,
    ),
)
experiment.optimization_config = optimization_config
Outcome constraints can also be defined as follows and passed into the optimization config.
outcome_constraint = OutcomeConstraint(
    metric=Metric("constraint"),
    op=ComparisonOp.LEQ,
    bound=0.5,
)
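One way to picture an outcome constraint: among candidate arms, only those whose constrained metric satisfies the bound are eligible, and the objective is then compared among the eligible ones. The sketch below is plain Python with made-up numbers, not Ax's internal logic. It treats the bound as an absolute threshold; Ax can also interpret bounds relative to the status quo, which this sketch ignores.

```python
import operator
from typing import Dict

# Hypothetical observations: metric values per arm (invented for illustration).
observations = {
    "arm_a": {"booth": 13.7, "constraint": 0.3},
    "arm_b": {"booth": 2.1, "constraint": 0.9},  # best objective, violates the bound
    "arm_c": {"booth": 4.6, "constraint": 0.5},
}

# Comparison operators keyed by name, mirroring LEQ / GEQ.
OPS = {"LEQ": operator.le, "GEQ": operator.ge}


def feasible(
    metrics: Dict[str, float],
    metric_name: str = "constraint",
    op: str = "LEQ",
    bound: float = 0.5,
) -> bool:
    """True if the constrained metric satisfies `op` against `bound`."""
    return OPS[op](metrics[metric_name], bound)


eligible = {name: m for name, m in observations.items() if feasible(m)}
# The objective is minimized, so pick the smallest "booth" value among eligible arms.
best = min(eligible, key=lambda name: eligible[name]["booth"])
print(sorted(eligible), best)
```

Here arm_b has the lowest objective value overall, but it is excluded because its constraint metric exceeds 0.5, so arm_c is selected instead.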
Before an experiment can collect data, it must have a Runner attached. A runner handles the deployment of trials. A trial must be "run" before it can be evaluated.
Here, we have a dummy runner that does nothing. In practice, a runner might be in charge of pushing an experiment to production.
The only method that needs to be defined for runner subclasses is run, which performs any necessary deployment logic, and returns a dictionary of resulting metadata.
class MyRunner(Runner):
    def run(self, trial):
        return {"name": str(trial.index)}
experiment.runner = MyRunner()
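The division of labor between a trial and its runner can be pictured with a toy model. The classes below (`ToyStatus`, `ToyTrial`, `ToyRunner`) are hypothetical stand-ins written for this illustration and are not Ax classes. They only mimic the idea that running a trial delegates to the runner, stores the returned metadata, and moves the trial from a candidate state to a running state.

```python
from enum import Enum
from typing import Any, Dict


class ToyStatus(Enum):
    CANDIDATE = "candidate"
    RUNNING = "running"


class ToyRunner:
    def run(self, trial: "ToyTrial") -> Dict[str, Any]:
        # A real runner would deploy the trial here; this one only returns metadata.
        return {"name": str(trial.index)}


class ToyTrial:
    def __init__(self, index: int, runner: ToyRunner) -> None:
        self.index = index
        self.runner = runner
        self.status = ToyStatus.CANDIDATE
        self.run_metadata: Dict[str, Any] = {}

    def run(self) -> "ToyTrial":
        # Delegate deployment to the runner, keep its metadata, update the status.
        self.run_metadata = self.runner.run(self)
        self.status = ToyStatus.RUNNING
        return self


trial = ToyTrial(index=0, runner=ToyRunner())
print(trial.status.name)
trial.run()
print(trial.status.name, trial.run_metadata)
```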
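Now we can collect data for arms within our search space and begin the optimization. We do this by adding the generated arms to a trial, running the trial, fetching the resulting data, and then using that data to generate new arms.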
experiment.new_batch_trial(generator_run=generator_run)
BatchTrial(experiment_name='experiment_building_blocks', index=0, status=TrialStatus.CANDIDATE)
Note that the arms attached to the trial are the same as those in the generator run above, except for the status quo, which is automatically added to each trial.
for arm in experiment.trials[0].arms:
    print(arm)
Arm(name='0_0', parameters={'x1': 5.460216403007507, 'x2': 5.888960957527161})
Arm(name='0_1', parameters={'x1': 7.51576840877533, 'x2': 3.9955589175224304})
Arm(name='0_2', parameters={'x1': 1.0644689202308655, 'x2': 9.575881958007812})
Arm(name='0_3', parameters={'x1': 2.46075838804245, 'x2': 3.2335907220840454})
Arm(name='0_4', parameters={'x1': 8.97851824760437, 'x2': 7.565061450004578})
Arm(name='control', parameters={'x1': 0.0, 'x2': 0.0})
If our trial should contain only one arm, we can use experiment.new_trial instead.
experiment.new_trial().add_arm(Arm(name='single_arm', parameters={'x1': 1, 'x2': 1}))
Trial(experiment_name='experiment_building_blocks', index=1, status=TrialStatus.CANDIDATE)
print(experiment.trials[1].arm)
Arm(name='single_arm', parameters={'x1': 1, 'x2': 1})
experiment.trials[0].run()
BatchTrial(experiment_name='experiment_building_blocks', index=0, status=TrialStatus.RUNNING)
data = experiment.fetch_data()
We can inspect the data that was fetched for each (arm, metric) pair.
data.df
index | arm_name | mean | metric_name | sem | trial_index |
---|---|---|---|---|---|
0 | 0_0 | 244.281257 | booth | 0.0 | 0 |
1 | 0_1 | 269.126528 | booth | 0.0 | 0 |
2 | 0_2 | 219.623419 | booth | 0.0 | 0 |
3 | 0_3 | 13.671655 | booth | 0.0 | 0 |
4 | 0_4 | 713.862106 | booth | 0.0 | 0 |
5 | control | 74.000000 | booth | 0.0 | 0 |
Now we can model the data collected for the initial set of arms using a Gaussian process, and use the Expected Improvement acquisition function to determine the new arms for which to fetch data next.
gpei = Models.GPEI(experiment=experiment, data=data)
generator_run = gpei.gen(5)
experiment.new_batch_trial(generator_run=generator_run)
/data/users/lilidworkin/fbsource/fbcode/buck-out/dev/gen/bento/kernels/bento_kernel_ae#link-tree/gpytorch/utils/cholesky.py:41: RuntimeWarning: A not p.d., added jitter of 1e-08 to the diagonal
/data/users/lilidworkin/fbsource/fbcode/buck-out/dev/gen/bento/kernels/bento_kernel_ae#link-tree/gpytorch/utils/cholesky.py:41: RuntimeWarning: A not p.d., added jitter of 1e-08 to the diagonal
BatchTrial(experiment_name='experiment_building_blocks', index=2, status=TrialStatus.CANDIDATE)
for arm in experiment.trials[2].arms:
    print(arm)
Arm(name='2_0', parameters={'x1': 3.976978462464544, 'x2': 0.3738528311099245})
Arm(name='2_1', parameters={'x1': 3.906218969525253e-16, 'x2': 4.025368511850622})
Arm(name='2_2', parameters={'x1': 7.0545208724123745, 'x2': 1.1494985717796068e-16})
Arm(name='2_3', parameters={'x1': 2.0709359453742247, 'x2': 1.4285988425124616})
Arm(name='2_4', parameters={'x1': 1.4571514910958328, 'x2': 5.382555705918932})
Arm(name='control', parameters={'x1': 0.0, 'x2': 0.0})
experiment.trials[2].run()
data = experiment.fetch_data()
data.df
index | arm_name | mean | metric_name | sem | trial_index |
---|---|---|---|---|---|
0 | 0_0 | 244.281257 | booth | 0.0 | 0 |
1 | 0_1 | 269.126528 | booth | 0.0 | 0 |
2 | 0_2 | 219.623419 | booth | 0.0 | 0 |
3 | 0_3 | 13.671655 | booth | 0.0 | 0 |
4 | 0_4 | 713.862106 | booth | 0.0 | 0 |
5 | control | 74.000000 | booth | 0.0 | 0 |
6 | 2_0 | 16.251380 | booth | 0.0 | 2 |
7 | 2_1 | 2.053955 | booth | 0.0 | 2 |
8 | 2_2 | 82.977614 | booth | 0.0 | 2 |
9 | 2_3 | 4.618067 | booth | 0.0 | 2 |
10 | 2_4 | 38.141307 | booth | 0.0 | 2 |
11 | control | 74.000000 | booth | 0.0 | 2 |
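To read off the result of the two batches, the fetched rows can be scanned for the smallest mean, since the objective is being minimized. The sketch below does this in plain Python with the values copied from the table above; the tuple layout and variable names are choices made for this illustration.

```python
# (arm_name, mean, trial_index) rows copied from the table above.
rows = [
    ("0_0", 244.281257, 0),
    ("0_1", 269.126528, 0),
    ("0_2", 219.623419, 0),
    ("0_3", 13.671655, 0),
    ("0_4", 713.862106, 0),
    ("control", 74.000000, 0),
    ("2_0", 16.251380, 2),
    ("2_1", 2.053955, 2),
    ("2_2", 82.977614, 2),
    ("2_3", 4.618067, 2),
    ("2_4", 38.141307, 2),
    ("control", 74.000000, 2),
]

# Overall best arm: smallest mean, because the objective is minimized.
best_arm, best_mean, best_trial = min(rows, key=lambda row: row[1])
print(best_arm, best_mean, best_trial)

# Best mean within each trial, to compare the Sobol batch with the model-based batch.
best_per_trial = {}
for arm_name, mean, trial_index in rows:
    if trial_index not in best_per_trial or mean < best_per_trial[trial_index][1]:
        best_per_trial[trial_index] = (arm_name, mean)
print(best_per_trial)
```

The best arm of the model-based batch (2_1) improves on the best arm of the initial Sobol batch (0_3), and two of the five new arms beat every arm from the first batch.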
At any point, we can also save our experiment to a JSON file. To ensure that our custom metrics and runner are saved properly, we first need to register them.
from ax.storage.metric_registry import register_metric
from ax.storage.runner_registry import register_runner
register_metric(BoothMetric)
register_runner(MyRunner)
save(experiment, "experiment.json")
loaded_experiment = load("experiment.json")
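Why registration is needed can be pictured with a toy example: when an object is written to JSON, only its class name and field values survive, so the loader needs a lookup table from names back to classes. The sketch below is a generic illustration with hypothetical names (`REGISTRY`, `register`, `to_json`, `from_json`, `ToyMetric`); it is not how Ax implements its storage layer.

```python
import json
from typing import Any, Dict

# Maps a class name stored in the JSON back to the Python class.
REGISTRY: Dict[str, type] = {}


def register(cls: type) -> type:
    """Make a custom class known to the loader."""
    REGISTRY[cls.__name__] = cls
    return cls


class ToyMetric:
    def __init__(self, name: str) -> None:
        self.name = name


def to_json(obj: Any) -> str:
    """Serialize the class name alongside the instance fields."""
    return json.dumps({"__type": type(obj).__name__, "fields": vars(obj)})


def from_json(text: str) -> Any:
    payload = json.loads(text)
    # Raises KeyError if the class was never registered.
    cls = REGISTRY[payload["__type"]]
    return cls(**payload["fields"])


register(ToyMetric)
restored = from_json(to_json(ToyMetric(name="booth")))
print(type(restored).__name__, restored.name)
```

Without the call to `register`, the lookup in `from_json` would fail, which is the toy analogue of a custom metric or runner that the storage layer does not know how to reconstruct.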
To save our experiment to SQL, we must first specify a connection to a database and create all necessary tables.
from ax.storage.sqa_store.db import init_engine_and_session_factory, get_engine, create_all_tables
init_engine_and_session_factory(url='sqlite:///foo.db')
engine = get_engine()
create_all_tables(engine)
sqa_store.save(experiment)
sqa_store.load(experiment.name)
Experiment(experiment_building_blocks)
SimpleExperiment is a subclass of Experiment that assumes synchronous evaluation of trials and is therefore able to abstract away certain details and enable faster instantiation.
Rather than defining custom metrics and an optimization config, we define an evaluation function that determines the mean and sem for a given parameterization.
def evaluation_function(params):
    return (params["x1"] + 2*params["x2"] - 7)**2 + (2*params["x1"] + params["x2"] - 5)**2
simple_experiment = SimpleExperiment(
    search_space=search_space,
    evaluation_function=evaluation_function,
)
We add trials and evaluate them as before.
simple_experiment.new_trial().add_arm(Arm(name='single_arm', parameters={'x1': 1, 'x2': 1}))
Trial(experiment_name='None', index=0, status=TrialStatus.CANDIDATE)
data = simple_experiment.fetch_data()
data.df
index | arm_name | mean | metric_name | sem | trial_index |
---|---|---|---|---|---|
0 | single_arm | 20.0 | objective | 0.0 | 0 |