ax.benchmark

Benchmark

Benchmark Method

class ax.benchmark.benchmark_method.BenchmarkMethod(name: str, generation_strategy: GenerationStrategy, scheduler_options: SchedulerOptions, distribute_replications: bool = False)[source]

Bases: Base

Benchmark method, represented in terms of an Ax generation strategy (which specifies which models to use and when) and scheduler options (which provide additional execution information such as maximum parallelism, early stopping configuration, etc.).

Note: If BenchmarkMethod.scheduler_options.total_trials is less than BenchmarkProblem.num_trials then only the number of trials specified in the former will be run.

Note: The generation_strategy passed in is assumed to be in its “base state”, as it will be cloned and reset.

distribute_replications: bool = False
generation_strategy: GenerationStrategy
name: str
scheduler_options: SchedulerOptions
ax.benchmark.benchmark_method.get_benchmark_scheduler_options(timeout_hours: int = 4, batch_size: int = 1) SchedulerOptions[source]

The typical SchedulerOptions used in benchmarking.

Currently, regardless of batch size, all pending trials must complete before new ones are generated. That is, when batch_size > 1, the design is “batch sequential”, and when batch_size = 1, the design is “fully sequential.”

Parameters:
  • timeout_hours – The maximum amount of time (in hours) to run each benchmark replication. Defaults to 4 hours.

  • batch_size – Number of trials to generate at once.
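For illustration, a minimal sketch of combining these options with a Sobol-only generation strategy to build a BenchmarkMethod; the GenerationStrategy construction assumes the ax.modelbridge registry API and is not part of this module:

>>> from ax.benchmark.benchmark_method import (
...     BenchmarkMethod,
...     get_benchmark_scheduler_options,
... )
>>> from ax.modelbridge.generation_strategy import GenerationStep, GenerationStrategy
>>> from ax.modelbridge.registry import Models
>>>
>>> # A Sobol-only strategy in its "base state" (see the note above).
>>> gs = GenerationStrategy(
...     name="Sobol",
...     steps=[GenerationStep(model=Models.SOBOL, num_trials=-1)],
... )
>>> method = BenchmarkMethod(
...     name="Sobol",
...     generation_strategy=gs,
...     scheduler_options=get_benchmark_scheduler_options(timeout_hours=1),
... )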

Benchmark Problem

class ax.benchmark.benchmark_problem.BenchmarkProblem(name: str, search_space: SearchSpace, optimization_config: OptimizationConfig, runner: Runner, num_trials: int, is_noiseless: bool = False, observe_noise_sd: bool = False, has_ground_truth: bool = False, tracking_metrics: Optional[List[BenchmarkMetricBase]] = None)[source]

Bases: Base

Benchmark problem, represented in terms of Ax search space, optimization config, and runner.

classmethod from_botorch(test_problem_class: Type[BaseTestProblem], test_problem_kwargs: Dict[str, Any], lower_is_better: bool, num_trials: int, observe_noise_sd: bool = False) BenchmarkProblem[source]

Create a BenchmarkProblem from a BoTorch BaseTestProblem using specialized Metrics and Runners. The test problem’s result will be computed on the Runner and retrieved by the Metric.

Parameters:
  • test_problem_class – The BoTorch test problem class which will be used to define the search_space, optimization_config, and runner.

  • test_problem_kwargs – Keyword arguments used to instantiate the test_problem_class.

  • num_trials – The num_trials of the BenchmarkProblem that is created.

  • observe_noise_sd – Whether the standard deviation of the observation noise is observed or not (in which case it must be inferred by the model). This is separate from whether synthetic noise is added to the problem, which is controlled by the noise_std of the test problem.
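For illustration, a minimal sketch of constructing a problem from BoTorch's Branin test function (an arbitrary choice; any BaseTestProblem subclass works the same way):

>>> from ax.benchmark.benchmark_problem import BenchmarkProblem
>>> from botorch.test_functions.synthetic import Branin
>>>
>>> problem = BenchmarkProblem.from_botorch(
...     test_problem_class=Branin,
...     test_problem_kwargs={},
...     lower_is_better=True,  # Branin is minimized
...     num_trials=20,
... )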

property observe_noise_stds: Union[bool, Dict[str, bool]]
property runner: Runner
class ax.benchmark.benchmark_problem.BenchmarkProblemProtocol(*args, **kwargs)[source]

Bases: Protocol

Specifies the interface any benchmark problem must adhere to.

Classes implementing this interface include BenchmarkProblem, SurrogateBenchmarkProblem, and MOOSurrogateBenchmarkProblem.

has_ground_truth: bool
is_noiseless: bool
name: str
num_trials: int
observe_noise_stds: Union[bool, Dict[str, bool]]
optimization_config: OptimizationConfig
abstract property runner: Runner
search_space: SearchSpace
tracking_metrics: List[BenchmarkMetricBase]
class ax.benchmark.benchmark_problem.BenchmarkProblemWithKnownOptimum(*args, **kwargs)[source]

Bases: Protocol

optimal_value: float
class ax.benchmark.benchmark_problem.MultiObjectiveBenchmarkProblem(maximum_hypervolume: float, reference_point: List[float], *, name: str, search_space: SearchSpace, optimization_config: OptimizationConfig, runner: Runner, num_trials: int, is_noiseless: bool = False, observe_noise_sd: bool = False, has_ground_truth: bool = False, tracking_metrics: Optional[List[BenchmarkMetricBase]] = None)[source]

Bases: BenchmarkProblem

A BenchmarkProblem supporting multiple objectives. Rather than tracking each objective’s optimal value, we track a known maximum hypervolume computed from a given reference point.

classmethod from_botorch_multi_objective(test_problem_class: Type[MultiObjectiveTestProblem], test_problem_kwargs: Dict[str, Any], num_trials: int, observe_noise_sd: bool = False) MultiObjectiveBenchmarkProblem[source]

Create a BenchmarkProblem from a BoTorch BaseTestProblem using specialized Metrics and Runners. The test problem’s result will be computed on the Runner once per trial and each Metric will retrieve its own result by index.

property optimal_value: float
class ax.benchmark.benchmark_problem.SingleObjectiveBenchmarkProblem(optimal_value: float, *, name: str, search_space: SearchSpace, optimization_config: OptimizationConfig, runner: Runner, num_trials: int, is_noiseless: bool = False, observe_noise_sd: bool = False, has_ground_truth: bool = False, tracking_metrics: Optional[List[BenchmarkMetricBase]] = None)[source]

Bases: BenchmarkProblem

The most basic BenchmarkProblem, with a single objective and a known optimal value.

classmethod from_botorch_synthetic(test_problem_class: Type[SyntheticTestFunction], test_problem_kwargs: Dict[str, Any], lower_is_better: bool, num_trials: int, observe_noise_sd: bool = False) SingleObjectiveBenchmarkProblem[source]

Create a BenchmarkProblem from a BoTorch BaseTestProblem using specialized Metrics and Runners. The test problem’s result will be computed on the Runner and retrieved by the Metric.

Benchmark Result

class ax.benchmark.benchmark_result.AggregatedBenchmarkResult(name: str, results: List[BenchmarkResult], optimization_trace: pandas.DataFrame, score_trace: pandas.DataFrame, fit_time: List[float], gen_time: List[float])[source]

Bases: Base

The result of a benchmark test, or series of replications. Scalar data present in the BenchmarkResult is here represented as (mean, sem) pairs.

fit_time: List[float]
classmethod from_benchmark_results(results: List[BenchmarkResult]) AggregatedBenchmarkResult[source]

Aggregates a list of BenchmarkResults. For various reasons (timeouts, errors, etc.), each BenchmarkResult may have a different number of trials; aggregated traces and statistics are truncated to the minimum trial count across results so that every replication is included.

gen_time: List[float]
name: str
optimization_trace: pandas.DataFrame
results: List[BenchmarkResult]
score_trace: pandas.DataFrame
class ax.benchmark.benchmark_result.BenchmarkResult(name: str, seed: int, optimization_trace: ndarray, score_trace: ndarray, fit_time: float, gen_time: float, experiment: Optional[Experiment] = None, experiment_storage_id: Optional[str] = None)[source]

Bases: Base

The result of a single optimization loop from one (BenchmarkProblem, BenchmarkMethod) pair.

experiment: Optional[Experiment] = None
experiment_storage_id: Optional[str] = None
fit_time: float
gen_time: float
name: str
optimization_trace: ndarray
score_trace: ndarray
seed: int

Benchmark

Module for benchmarking Ax algorithms.

Key terms used:

  • Replication: one run of an optimization loop for a (BenchmarkProblem, BenchmarkMethod) pair.

  • Test: multiple replications, run for statistical significance.

  • Full run: multiple tests on many (BenchmarkProblem, BenchmarkMethod) pairs.

  • Method: (one of) the algorithm(s) being benchmarked.

  • Problem: a synthetic function, a surrogate surface, or an ML model, on which to assess the performance of algorithms.

ax.benchmark.benchmark.benchmark_multiple_problems_methods(problems: Iterable[BenchmarkProblemProtocol], methods: Iterable[BenchmarkMethod], seeds: Iterable[int]) List[AggregatedBenchmarkResult][source]

For each (problem, method) pair in the Cartesian product of problems and methods, run a replication for each seed in seeds, aggregate the results into an AggregatedBenchmarkResult, and return the list of AggregatedBenchmarkResults.

ax.benchmark.benchmark.benchmark_one_method_problem(problem: BenchmarkProblemProtocol, method: BenchmarkMethod, seeds: Iterable[int]) AggregatedBenchmarkResult[source]
ax.benchmark.benchmark.benchmark_replication(problem: BenchmarkProblemProtocol, method: BenchmarkMethod, seed: int) BenchmarkResult[source]

Runs one benchmarking replication (equivalent to one optimization loop).

Parameters:
  • problem – The BenchmarkProblem to test against (can be synthetic or real)

  • method – The BenchmarkMethod to test

  • seed – The seed to use for this replication.
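A minimal end-to-end sketch, assuming a single-objective Branin problem built with the from_botorch_synthetic helper documented above and the Sobol method documented below:

>>> from ax.benchmark.benchmark import (
...     benchmark_one_method_problem,
...     benchmark_replication,
... )
>>> from ax.benchmark.benchmark_problem import SingleObjectiveBenchmarkProblem
>>> from ax.benchmark.methods.sobol import get_sobol_benchmark_method
>>> from botorch.test_functions.synthetic import Branin
>>>
>>> problem = SingleObjectiveBenchmarkProblem.from_botorch_synthetic(
...     test_problem_class=Branin,
...     test_problem_kwargs={},
...     lower_is_better=True,
...     num_trials=10,
... )
>>> method = get_sobol_benchmark_method(distribute_replications=False)
>>> # One replication: a single optimization loop with a fixed seed.
>>> result = benchmark_replication(problem=problem, method=method, seed=0)
>>> # Multiple replications across seeds, aggregated into one result.
>>> agg = benchmark_one_method_problem(
...     problem=problem, method=method, seeds=(0, 1, 2)
... )

Here result.optimization_trace holds one value per trial, and agg carries (mean, sem) traces across the seeds (see Benchmark Result above).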

ax.benchmark.benchmark.compute_score_trace(optimization_trace: ndarray, num_baseline_trials: int, problem: BenchmarkProblemProtocol) ndarray[source]

Computes a score trace from the optimization trace.

ax.benchmark.benchmark.make_ground_truth_metrics(problem: BenchmarkProblemProtocol, include_tracking_metrics: bool = True) Dict[str, Metric][source]

Makes a ground truth version for each metric defined on the problem.

Parameters:
  • problem – The BenchmarkProblem to test against (can be synthetic or real).

  • include_tracking_metrics – Whether or not to include tracking metrics.

Returns:

A dict mapping (original) metric names to their respective ground truth metric.

ax.benchmark.benchmark.make_ground_truth_optimization_config(experiment: Experiment) OptimizationConfig[source]

Makes a clone of the OptimizationConfig on the experiment in which each metric is replaced by its respective “ground truth” counterpart, which has been added to the experiment’s tracking metrics in _create_benchmark_experiment and which returns the ground truth (i.e., uncorrupted by noise) observations.

Benchmark Methods Modular BoTorch

ax.benchmark.methods.modular_botorch.get_sobol_botorch_modular_acquisition(model_cls: Type[Model], acquisition_cls: Type[AcquisitionFunction], distribute_replications: bool, scheduler_options: Optional[SchedulerOptions] = None, name: Optional[str] = None, num_sobol_trials: int = 5, model_gen_kwargs: Optional[Dict[str, Any]] = None) BenchmarkMethod[source]

Get a BenchmarkMethod that uses Sobol followed by MBM.

Parameters:
  • model_cls – BoTorch model class, e.g. SingleTaskGP

  • acquisition_cls – Acquisition function class, e.g. qLogNoisyExpectedImprovement.

  • distribute_replications – Whether to use multiple machines

  • scheduler_options – Passed as-is to scheduler. Default: get_benchmark_scheduler_options().

  • name – Name that will be attached to the GenerationStrategy.

  • num_sobol_trials – Number of Sobol trials; if the scheduler_options specify the use of BatchTrials, this is the number of BatchTrials.

  • model_gen_kwargs – Passed to the BoTorch GenerationStep and ultimately to the BoTorch Model.

Example

>>> # A simple example
>>> from ax.benchmark.methods.modular_botorch import (
...     get_sobol_botorch_modular_acquisition
... )
>>> from ax.benchmark.benchmark_method import get_benchmark_scheduler_options
>>> from botorch.acquisition.logei import qLogNoisyExpectedImprovement
>>> from botorch.models.gp_regression import SingleTaskGP
>>>
>>> method = get_sobol_botorch_modular_acquisition(
...     model_cls=SingleTaskGP,
...     acquisition_cls=qLogNoisyExpectedImprovement,
...     distribute_replications=False,
... )
>>> # Pass sequential=False to BoTorch's optimize_acqf
>>> batch_method = get_sobol_botorch_modular_acquisition(
...     model_cls=SingleTaskGP,
...     acquisition_cls=qLogNoisyExpectedImprovement,
...     distribute_replications=False,
...     scheduler_options=get_benchmark_scheduler_options(
...         batch_size=5,
...     ),
...     model_gen_kwargs={
...         "model_gen_options": {
...             "optimizer_kwargs": {"sequential": False}
...         }
...     },
...     num_sobol_trials=1,
... )

Benchmark Methods Sobol

ax.benchmark.methods.sobol.get_sobol_benchmark_method(distribute_replications: bool, scheduler_options: Optional[SchedulerOptions] = None) BenchmarkMethod[source]

Benchmark Metrics Base

Module containing the metric base classes for benchmarks. The key property of a benchmark metric is whether or not it has a ground truth, which is indicated by the has_ground_truth attribute of BenchmarkMetricBase. All metrics used in Ax benchmarks need to subclass BenchmarkMetricBase.

For metrics that do have a ground truth, we can compute the performance of the optimization directly in terms of the ground truth observations (or the ground truth of the out-of-sample model-suggested best point). For metrics that do not have a ground truth, this is not possible.

The benchmarks are designed so that (unless the metric is noiseless) no ground truth observations are available to the optimization algorithm. Instead, separate “ground truth metrics” are attached to the experiment as tracking metrics and are used to evaluate performance after the optimization is complete. GroundTruthMetricMixin can be used to construct such ground truth metrics (with the is_ground_truth property indicating that the metric provides the ground truth); it implements naming conventions and helpers for associating a ground truth metric with the corresponding metric used during the optimization.
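An illustrative sketch of this machinery, using the concrete BenchmarkMetric documented further down this page:

>>> from ax.benchmark.metrics.base import GroundTruthMetricMixin
>>> from ax.benchmark.metrics.benchmark import BenchmarkMetric
>>>
>>> metric = BenchmarkMetric(name="branin", lower_is_better=True)
>>> gt_metric = metric.make_ground_truth_metric()  # noiseless counterpart
>>> # The mixin's naming helpers map the ground truth metric name back to
>>> # the name of the original metric.
>>> original_name = GroundTruthMetricMixin.get_original_name(gt_metric.name)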

class ax.benchmark.metrics.base.BenchmarkMetricBase(name: str, lower_is_better: Optional[bool] = None, properties: Optional[Dict[str, Any]] = None)[source]

Bases: Metric, ABC

A generic metric used for Ax Benchmarks.

has_ground_truth

Whether or not there exists a ground truth for this metric, i.e. whether each observation has an associated ground truth value. This is trivially true for deterministic metrics, and is also true for metrics where synthetic observation noise is added to its (deterministic) values. This is not true for metrics that are inherently noisy.

Type:

bool

has_ground_truth: bool
abstract make_ground_truth_metric() BenchmarkMetricBase[source]

Create a ground truth version of this metric. If metric observations are noisy, the ground truth would be the underlying noiseless values.

class ax.benchmark.metrics.base.GroundTruthMetricMixin[source]

Bases: ABC

A mixin for metrics that defines a naming convention and associated helper methods that allow mapping from a ground truth metric to its original metric and vice versa.

classmethod get_ground_truth_name(metric: Metric) str[source]
classmethod get_original_name(full_name: str) str[source]
is_ground_truth: bool = True

Benchmark Metrics Benchmark

class ax.benchmark.metrics.benchmark.BenchmarkMetric(name: str, lower_is_better: bool, observe_noise_sd: bool = True, outcome_index: Optional[int] = None)[source]

Bases: BenchmarkMetricBase

A generic metric used for observed values produced by Ax Benchmarks.

Compatible e.g. with results generated by BotorchTestProblemRunner and SurrogateRunner.

has_ground_truth

Whether or not there exists a ground truth for this metric, i.e. whether each observation has an associated ground truth value. This is trivially true for deterministic metrics, and is also true for metrics where synthetic observation noise is added to its (deterministic) values. This is not true for metrics that are inherently noisy.

Type:

bool

fetch_trial_data(trial: BaseTrial, **kwargs: Any) Result[Data, MetricFetchE][source]

Fetch data for one trial.

has_ground_truth: bool = True
make_ground_truth_metric() BenchmarkMetricBase[source]

Create a ground truth version of this metric.

class ax.benchmark.metrics.benchmark.GroundTruthBenchmarkMetric(original_metric: BenchmarkMetric)[source]

Bases: BenchmarkMetric, GroundTruthMetricMixin

fetch_trial_data(trial: BaseTrial, **kwargs: Any) Result[Data, MetricFetchE][source]

Fetch data for one trial.

make_ground_truth_metric() BenchmarkMetricBase[source]

Create a ground truth version of this metric.

Benchmark Metrics Jenatton

class ax.benchmark.metrics.jenatton.GroundTruthJenattonMetric(original_metric: JenattonMetric)[source]

Bases: JenattonMetric, GroundTruthMetricMixin

class ax.benchmark.metrics.jenatton.JenattonMetric(name: str = 'jenatton', noise_std: float = 0.0, observe_noise_sd: bool = False)[source]

Bases: BenchmarkMetricBase

Jenatton metric for hierarchical search spaces.

fetch_trial_data(trial: BaseTrial, **kwargs: Any) Result[Data, MetricFetchE][source]

Fetch data for one trial.

has_ground_truth: bool = True
make_ground_truth_metric() GroundTruthJenattonMetric[source]

Create a ground truth version of this metric. If metric observations are noisy, the ground truth would be the underlying noiseless values.

ax.benchmark.metrics.jenatton.jenatton_test_function(x1: Optional[int] = None, x2: Optional[int] = None, x3: Optional[int] = None, x4: Optional[float] = None, x5: Optional[float] = None, x6: Optional[float] = None, x7: Optional[float] = None, r8: Optional[float] = None, r9: Optional[float] = None) float[source]

Jenatton test function for hierarchical search spaces.

This function is taken from:

R. Jenatton, C. Archambeau, J. González, and M. Seeger. Bayesian optimization with tree-structured dependencies. ICML 2017.

Benchmark Metrics Utils

Benchmark Problems Registry

class ax.benchmark.problems.registry.BenchmarkProblemRegistryEntry(factory_fn: Callable[..., ax.benchmark.benchmark_problem.BenchmarkProblem], factory_kwargs: Dict[str, Any])[source]

Bases: object

factory_fn: Callable[[...], BenchmarkProblem]
factory_kwargs: Dict[str, Any]
ax.benchmark.problems.registry.get_problem(problem_name: str, **additional_kwargs: Any) BenchmarkProblem[source]
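An illustrative sketch; the available registry keys depend on the Ax version, and "branin" below is assumed to be one of them:

>>> from ax.benchmark.problems.registry import get_problem
>>> problem = get_problem(problem_name="branin")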

Benchmark Problems High Dimensional Embedding

ax.benchmark.problems.hd_embedding.embed_higher_dimension(problem: TProblem, total_dimensionality: int) TProblem[source]

Return a new BenchmarkProblem with enough RangeParameters added to the search space to make its total dimensionality equal to total_dimensionality, and append total_dimensionality to its name.

The search space of the original problem is within the search space of the new problem, and the constraints are copied from the original problem.
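A brief sketch, embedding a 6-dimensional Hartmann problem (built via from_botorch_synthetic, documented above) into a 30-dimensional search space:

>>> from ax.benchmark.benchmark_problem import SingleObjectiveBenchmarkProblem
>>> from ax.benchmark.problems.hd_embedding import embed_higher_dimension
>>> from botorch.test_functions.synthetic import Hartmann
>>>
>>> base = SingleObjectiveBenchmarkProblem.from_botorch_synthetic(
...     test_problem_class=Hartmann,
...     test_problem_kwargs={"dim": 6},
...     lower_is_better=True,
...     num_trials=30,
... )
>>> hd_problem = embed_higher_dimension(problem=base, total_dimensionality=30)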

Benchmark Problems Surrogate

class ax.benchmark.problems.surrogate.MOOSurrogateBenchmarkProblem(maximum_hypervolume: float, reference_point: List[float], *, name: str, search_space: SearchSpace, optimization_config: MultiObjectiveOptimizationConfig, num_trials: int, outcome_names: List[str], observe_noise_stds: Union[bool, Dict[str, bool]] = False, noise_stds: Union[float, Dict[str, float]] = 0.0, get_surrogate_and_datasets: Optional[Callable[[], Tuple[TorchModelBridge, List[SupervisedDataset]]]] = None, tracking_metrics: Optional[List[BenchmarkMetricBase]] = None, _runner: Optional[Runner] = None)[source]

Bases: SurrogateBenchmarkProblemBase

Has the same attributes/properties as a MultiObjectiveBenchmarkProblem, but its runner is not constructed until needed, to allow for deferring constructing the surrogate.

Simple aspects of the problem, such as its search space, are defined immediately, while the surrogate is only constructed when needed, in order to avoid expensive operations like downloading files and fitting a model.

property optimal_value: float
optimization_config: MultiObjectiveOptimizationConfig
class ax.benchmark.problems.surrogate.SOOSurrogateBenchmarkProblem(optimal_value: float, *, name: str, search_space: SearchSpace, optimization_config: OptimizationConfig, num_trials: int, outcome_names: List[str], observe_noise_stds: Union[bool, Dict[str, bool]] = False, noise_stds: Union[float, Dict[str, float]] = 0.0, get_surrogate_and_datasets: Optional[Callable[[], Tuple[TorchModelBridge, List[SupervisedDataset]]]] = None, tracking_metrics: Optional[List[BenchmarkMetricBase]] = None, _runner: Optional[Runner] = None)[source]

Bases: SurrogateBenchmarkProblemBase

Has the same attributes/properties as a SingleObjectiveBenchmarkProblem, but allows for constructing from a surrogate.

class ax.benchmark.problems.surrogate.SurrogateBenchmarkProblemBase(*, name: str, search_space: SearchSpace, optimization_config: OptimizationConfig, num_trials: int, outcome_names: List[str], observe_noise_stds: Union[bool, Dict[str, bool]] = False, noise_stds: Union[float, Dict[str, float]] = 0.0, get_surrogate_and_datasets: Optional[Callable[[], Tuple[TorchModelBridge, List[SupervisedDataset]]]] = None, tracking_metrics: Optional[List[BenchmarkMetricBase]] = None, _runner: Optional[Runner] = None)[source]

Bases: Base

Base class for SOOSurrogateBenchmarkProblem and MOOSurrogateBenchmarkProblem.

Allows for lazy creation of objects needed to construct a runner, including a surrogate and datasets.

property has_ground_truth: bool
property is_noiseless: bool
property runner: Runner
set_runner() None[source]

Benchmark Problems Mixed Integer Synthetic

Mixed integer extensions of some common synthetic test functions. These are adapted from [Daulton2022bopr].

References

[Daulton2022bopr]

S. Daulton, X. Wan, D. Eriksson, M. Balandat, M. A. Osborne, E. Bakshy. Bayesian Optimization over Discrete and Mixed Spaces via Probabilistic Reparameterization. Advances in Neural Information Processing Systems 35, 2022.

ax.benchmark.problems.synthetic.discretized.mixed_integer.get_discrete_ackley(num_trials: int = 50, observe_noise_sd: bool = False, bounds: Optional[List[Tuple[float, float]]] = None) BenchmarkProblem[source]

13D Ackley problem where first 10 dimensions are discretized.

This also restricts Ackley evaluation bounds to [0, 1].

ax.benchmark.problems.synthetic.discretized.mixed_integer.get_discrete_hartmann(num_trials: int = 50, observe_noise_sd: bool = False, bounds: Optional[List[Tuple[float, float]]] = None) BenchmarkProblem[source]

6D Hartmann problem where first 4 dimensions are discretized.

ax.benchmark.problems.synthetic.discretized.mixed_integer.get_discrete_rosenbrock(num_trials: int = 50, observe_noise_sd: bool = False, bounds: Optional[List[Tuple[float, float]]] = None) BenchmarkProblem[source]

10D Rosenbrock problem where first 6 dimensions are discretized.
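For example (an illustrative sketch using the default arguments documented above):

>>> from ax.benchmark.problems.synthetic.discretized.mixed_integer import (
...     get_discrete_ackley,
...     get_discrete_hartmann,
... )
>>>
>>> hartmann = get_discrete_hartmann(num_trials=30)
>>> ackley = get_discrete_ackley(observe_noise_sd=True)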

Benchmark Problems Jenatton

ax.benchmark.problems.synthetic.hss.jenatton.get_jenatton_benchmark_problem(num_trials: int = 50, observe_noise_sd: bool = False) SingleObjectiveBenchmarkProblem[source]
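An illustrative sketch using the defaults:

>>> from ax.benchmark.problems.synthetic.hss.jenatton import (
...     get_jenatton_benchmark_problem,
... )
>>> jenatton = get_jenatton_benchmark_problem(num_trials=25)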

Benchmark Problems PyTorchCNN

class ax.benchmark.problems.hpo.pytorch_cnn.PyTorchCNNBenchmarkProblem(optimal_value: float, *, name: str, search_space: SearchSpace, optimization_config: OptimizationConfig, runner: Runner, num_trials: int, is_noiseless: bool = False, observe_noise_sd: bool = False, has_ground_truth: bool = False, tracking_metrics: Optional[List[BenchmarkMetricBase]] = None)[source]

Bases: SingleObjectiveBenchmarkProblem

classmethod from_datasets(name: str, num_trials: int, train_set: Dataset, test_set: Dataset) PyTorchCNNBenchmarkProblem[source]
class ax.benchmark.problems.hpo.pytorch_cnn.PyTorchCNNMetric[source]

Bases: Metric

fetch_trial_data(trial: BaseTrial, **kwargs: Any) Result[Data, MetricFetchE][source]

Fetch data for one trial.

class ax.benchmark.problems.hpo.pytorch_cnn.PyTorchCNNRunner(name: str, train_set: Dataset, test_set: Dataset)[source]

Bases: Runner

class CNN[source]

Bases: Module

forward(x: Tensor) Tensor[source]

Define the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

poll_trial_status(trials: Iterable[BaseTrial]) Dict[TrialStatus, Set[int]][source]

Checks the status of any non-terminal trials and returns their indices as a mapping from TrialStatus to a list of indices. Required for runners used with Ax Scheduler.

NOTE: Does not need to handle waiting between polling calls while trials are running; this function should just perform a single poll.

Parameters:

trials – Trials to poll.

Returns:

A dictionary mapping TrialStatus to a list of trial indices that have the respective status at the time of the polling. This does not need to include trials that at the time of polling already have a terminal (ABANDONED, FAILED, COMPLETED) status (but it may).

run(trial: BaseTrial) Dict[str, Any][source]

Deploys a trial based on custom runner subclass implementation.

Parameters:

trial – The trial to deploy.

Returns:

Dict of run metadata from the deployment process.

train_and_evaluate(lr: float, momentum: float, weight_decay: float, step_size: int, gamma: float) float[source]

Benchmark Problems PyTorchCNN TorchVision

class ax.benchmark.problems.hpo.torchvision.PyTorchCNNTorchvisionBenchmarkProblem(optimal_value: float, *, name: str, search_space: SearchSpace, optimization_config: OptimizationConfig, runner: Runner, num_trials: int, is_noiseless: bool = False, observe_noise_sd: bool = False, has_ground_truth: bool = False, tracking_metrics: Optional[List[BenchmarkMetricBase]] = None)[source]

Bases: PyTorchCNNBenchmarkProblem

classmethod from_dataset_name(name: str, num_trials: int) PyTorchCNNTorchvisionBenchmarkProblem[source]
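An illustrative sketch; the dataset name is assumed to be one that the TorchVision-backed problem knows how to load (e.g. "MNIST"), and the dataset may be downloaded on first use:

>>> from ax.benchmark.problems.hpo.torchvision import (
...     PyTorchCNNTorchvisionBenchmarkProblem,
... )
>>> problem = PyTorchCNNTorchvisionBenchmarkProblem.from_dataset_name(
...     name="MNIST", num_trials=10
... )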
class ax.benchmark.problems.hpo.torchvision.PyTorchCNNTorchvisionRunner(name: str, train_set: Dataset, test_set: Dataset)[source]

Bases: PyTorchCNNRunner

A subclass to aid in serialization. This allows us to save only the name of the dataset and reload it from TorchVision at deserialization time.

classmethod deserialize_init_args(args: Dict[str, Any], decoder_registry: Optional[Dict[str, Union[Type[T], Callable[[...], T]]]] = None, class_decoder_registry: Optional[Dict[str, Callable[[Dict[str, Any]], Any]]] = None) Dict[str, Any][source]

Given a dictionary, deserialize the properties needed to initialize the object. Used for storage.

classmethod serialize_init_args(obj: Any) Dict[str, Any][source]

Serialize the properties needed to initialize the object. Used for storage.

Benchmark Runners Base

class ax.benchmark.runners.base.BenchmarkRunner[source]

Bases: Runner, ABC

get_Y_Ystd(arm: Arm) Tuple[Tensor, Optional[Tensor]][source]

Function returning the observed values and their standard errors for a given arm. This function is unused for problems that have a ground truth (in this case get_Y_true() is used), and is required for problems that do not have a ground truth.

get_Y_true(arm: Arm) Tensor[source]

Function returning the ground truth values for a given arm. The synthetic noise is added as part of the Runner’s run() method. For problems that do not have a ground truth, the Runner must implement the get_Y_Ystd() method instead.

get_noise_stds() Union[None, float, Dict[str, float]][source]

Function returning the standard errors for the synthetic noise to be applied to the observed values. For problems that do not have a ground truth, the Runner must implement the get_Y_Ystd() method instead.

abstract property outcome_names: List[str]

The names of the outcomes of the problem (in the order of the outcomes).

run(trial: BaseTrial) Dict[str, Any][source]

Run the trial by evaluating its parameterization(s).

Parameters:

trial – The trial to evaluate.

Returns:

A dictionary with the following keys:

  • Ys: A dict mapping arm names to lists of corresponding outcomes, where the order of the outcomes is the same as in outcome_names.

  • Ystds: A dict mapping arm names to lists of corresponding outcome noise standard deviations (possibly NaN if the noise level is unobserved), where the order of the outcomes is the same as in outcome_names.

  • Ys_true: A dict mapping arm names to lists of corresponding ground truth outcomes, where the order of the outcomes is the same as in outcome_names. If the benchmark problem does not provide a ground truth, this key will not be present in the returned dict.

  • outcome_names: A list of metric names.

Benchmark Runners BoTorch Test

class ax.benchmark.runners.botorch_test.BotorchTestProblemRunner(test_problem_class: Type[BaseTestProblem], test_problem_kwargs: Dict[str, Any], outcome_names: List[str], modified_bounds: Optional[List[Tuple[float, float]]] = None)[source]

Bases: BenchmarkRunner

A Runner for evaluating Botorch BaseTestProblems.

Given a trial the Runner will evaluate the BaseTestProblem.forward method for each arm in the trial, as well as return some metadata about the underlying Botorch problem such as the noise_std. We compute the full result on the Runner (as opposed to the Metric as is typical in synthetic test problems) because the BoTorch problem computes all metrics in one stacked tensor in the MOO case, and we wish to avoid recomputation per metric.
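An illustrative sketch of constructing such a runner directly; the outcome name below is chosen arbitrarily for illustration:

>>> from ax.benchmark.runners.botorch_test import BotorchTestProblemRunner
>>> from botorch.test_functions.synthetic import Branin
>>>
>>> runner = BotorchTestProblemRunner(
...     test_problem_class=Branin,
...     test_problem_kwargs={},
...     outcome_names=["branin"],
... )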

classmethod deserialize_init_args(args: Dict[str, Any], decoder_registry: Optional[Dict[str, Union[Type[T], Callable[[...], T]]]] = None, class_decoder_registry: Optional[Dict[str, Callable[[Dict[str, Any]], Any]]] = None) Dict[str, Any][source]

Given a dictionary, deserialize the properties needed to initialize the runner. Used for storage.

get_Y_true(arm: Arm) Tensor[source]

Converts X to original bounds – only if modified bounds were provided – and evaluates the test problem. See __init__ docstring for details.

Parameters:

X – A batch_shape x d-dim tensor of point(s) at which to evaluate the test problem.

Returns:

A batch_shape x m-dim tensor of ground truth (noiseless) evaluations.

get_noise_stds() Union[None, float, Dict[str, float]][source]

Function returning the standard errors for the synthetic noise to be applied to the observed values. For problems that do not have a ground truth, the Runner must implement the get_Y_Ystd() method instead.

property outcome_names: List[str]

The names of the outcomes of the problem (in the order of the outcomes).

poll_trial_status(trials: Iterable[BaseTrial]) Dict[TrialStatus, Set[int]][source]

Checks the status of any non-terminal trials and returns their indices as a mapping from TrialStatus to a list of indices. Required for runners used with Ax Scheduler.

NOTE: Does not need to handle waiting between polling calls while trials are running; this function should just perform a single poll.

Parameters:

trials – Trials to poll.

Returns:

A dictionary mapping TrialStatus to a list of trial indices that have the respective status at the time of the polling. This does not need to include trials that at the time of polling already have a terminal (ABANDONED, FAILED, COMPLETED) status (but it may).

classmethod serialize_init_args(obj: Any) Dict[str, Any][source]

Serialize the properties needed to initialize the runner. Used for storage.

test_problem: BaseTestProblem

Benchmark Runners Surrogate

class ax.benchmark.runners.surrogate.SurrogateRunner(name: str, surrogate: TorchModelBridge, datasets: List[SupervisedDataset], search_space: SearchSpace, outcome_names: List[str], noise_stds: Union[float, Dict[str, float]] = 0.0)[source]

Bases: BenchmarkRunner

classmethod deserialize_init_args(args: Dict[str, Any], decoder_registry: Optional[Dict[str, Union[Type[T], Callable[[...], T]]]] = None, class_decoder_registry: Optional[Dict[str, Callable[[Dict[str, Any]], Any]]] = None) Dict[str, Any][source]

Given a dictionary, deserialize the properties needed to initialize the object. Used for storage.

get_Y_true(arm: Arm) Tensor[source]

Function returning the ground truth values for a given arm. The synthetic noise is added as part of the Runner’s run() method. For problems that do not have a ground truth, the Runner must implement the get_Y_Ystd() method instead.

get_noise_stds() Union[None, float, Dict[str, float]][source]

Function returning the standard errors for the synthetic noise to be applied to the observed values. For problems that do not have a ground truth, the Runner must implement the get_Y_Ystd() method instead.

property outcome_names: List[str]

The names of the outcomes of the problem (in the order of the outcomes).

poll_trial_status(trials: Iterable[BaseTrial]) Dict[TrialStatus, Set[int]][source]

Checks the status of any non-terminal trials and returns their indices as a mapping from TrialStatus to a list of indices. Required for runners used with Ax Scheduler.

NOTE: Does not need to handle waiting between polling calls while trials are running; this function should just perform a single poll.

Parameters:

trials – Trials to poll.

Returns:

A dictionary mapping TrialStatus to a list of trial indices that have the respective status at the time of the polling. This does not need to include trials that at the time of polling already have a terminal (ABANDONED, FAILED, COMPLETED) status (but it may).

run(trial: BaseTrial) Dict[str, Any][source]

Run the trial by evaluating its parameterization(s) on the surrogate model.

Note: This also sets the status of the trial to COMPLETED.

Parameters:

trial – The trial to evaluate.

Returns:

A dictionary with the following keys:

  • outcome_names: The names of the metrics being evaluated.

  • Ys: A dict mapping arm names to lists of corresponding outcomes, where the order of the outcomes is the same as in outcome_names.

  • Ystds: A dict mapping arm names to lists of corresponding outcome noise standard deviations (possibly NaN if the noise level is unobserved), where the order of the outcomes is the same as in outcome_names.

  • Ys_true: A dict mapping arm names to lists of corresponding ground truth outcomes, where the order of the outcomes is the same as in outcome_names.

classmethod serialize_init_args(obj: Any) Dict[str, Any][source]

Serialize the properties needed to initialize the runner. Used for storage.

WARNING: Because of issues with consistently saving and loading BoTorch and GPyTorch modules, the SurrogateRunner cannot be serialized at this time. At load time, the runner will be replaced with a SyntheticRunner.