ax.benchmark

Benchmark

Benchmark Method

class ax.benchmark.benchmark_method.BenchmarkMethod(name: str, generation_strategy: GenerationStrategy, scheduler_options: SchedulerOptions)[source]

Bases: Base

Benchmark method, represented in terms of an Ax generation strategy (which tells us which models to use when) and scheduler options (which tell us extra execution information such as maximum parallelism, early-stopping configuration, etc.). Note: if BenchmarkMethod.scheduler_options.total_trials is lower than BenchmarkProblem.num_trials, only the number of trials specified in the former will be run.

generation_strategy: GenerationStrategy
name: str
scheduler_options: SchedulerOptions
ax.benchmark.benchmark_method.get_sequential_optimization_scheduler_options() SchedulerOptions[source]

The typical SchedulerOptions used in benchmarking.
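A minimal sketch of assembling a BenchmarkMethod from a hand-built Sobol + GPEI generation strategy paired with the sequential scheduler options above; the step counts and names below are illustrative assumptions, not defaults of this module.

# Hypothetical Sobol + GPEI method; step counts are illustrative.
from ax.benchmark.benchmark_method import (
    BenchmarkMethod,
    get_sequential_optimization_scheduler_options,
)
from ax.modelbridge.generation_strategy import GenerationStep, GenerationStrategy
from ax.modelbridge.registry import Models

generation_strategy = GenerationStrategy(
    name="Sobol+GPEI",
    steps=[
        GenerationStep(model=Models.SOBOL, num_trials=5),
        GenerationStep(model=Models.GPEI, num_trials=-1),  # -1: no per-step limit
    ],
)

method = BenchmarkMethod(
    name="Sobol+GPEI",
    generation_strategy=generation_strategy,
    scheduler_options=get_sequential_optimization_scheduler_options(),
)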

Benchmark Problem

class ax.benchmark.benchmark_problem.BenchmarkProblem(name: str, search_space: ~ax.core.search_space.SearchSpace, optimization_config: ~ax.core.optimization_config.OptimizationConfig, runner: ~ax.core.runner.Runner, num_trials: int, infer_noise: bool, tracking_metrics: ~typing.List[~ax.core.metric.Metric] = <factory>)[source]

Bases: Base

Benchmark problem, represented in terms of Ax search space, optimization config, and runner.

classmethod from_botorch(test_problem_class: Type[BaseTestProblem], test_problem_kwargs: Dict[str, Any], num_trials: int, infer_noise: bool = True) BenchmarkProblem[source]

Create a BenchmarkProblem from a BoTorch BaseTestProblem using specialized Metrics and Runners. The test problem’s result will be computed on the Runner and retrieved by the Metric.

infer_noise: bool
name: str
num_trials: int
optimization_config: OptimizationConfig
runner: Runner
search_space: SearchSpace
tracking_metrics: List[Metric]
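A minimal sketch of building a benchmark problem from a BoTorch synthetic test function; from_botorch derives the search space, optimization config, and runner from the test problem's bounds and objective. The trial count is an arbitrary choice.

# Assumes BoTorch's Branin test function.
from ax.benchmark.benchmark_problem import BenchmarkProblem
from botorch.test_functions.synthetic import Branin

branin_problem = BenchmarkProblem.from_botorch(
    test_problem_class=Branin,
    test_problem_kwargs={},
    num_trials=30,
    infer_noise=True,
)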
class ax.benchmark.benchmark_problem.MultiObjectiveBenchmarkProblem(maximum_hypervolume: float, reference_point: List[float], **kwargs: Any)[source]

Bases: BenchmarkProblem

A BenchmarkProblem that supports multiple objectives. Rather than knowing each objective’s optimal value, we track a known maximum hypervolume computed from a given reference point.

classmethod from_botorch_multi_objective(test_problem_class: Type[MultiObjectiveTestProblem], test_problem_kwargs: Dict[str, Any], num_trials: int, infer_noise: bool = True) MultiObjectiveBenchmarkProblem[source]

Create a MultiObjectiveBenchmarkProblem from a BoTorch MultiObjectiveTestProblem using specialized Metrics and Runners. The test problem’s result will be computed on the Runner once per trial, and each Metric will retrieve its own result by index.

maximum_hypervolume: float
reference_point: List[float]
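A minimal sketch using a BoTorch multi-objective test problem; the reference point and maximum hypervolume are taken from the test problem itself. The trial count is an arbitrary choice.

# Assumes BoTorch's BraninCurrin bi-objective test problem.
from ax.benchmark.benchmark_problem import MultiObjectiveBenchmarkProblem
from botorch.test_functions.multi_objective import BraninCurrin

moo_problem = MultiObjectiveBenchmarkProblem.from_botorch_multi_objective(
    test_problem_class=BraninCurrin,
    test_problem_kwargs={},
    num_trials=30,
)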
class ax.benchmark.benchmark_problem.SingleObjectiveBenchmarkProblem(optimal_value: float, **kwargs: Any)[source]

Bases: BenchmarkProblem

The most basic BenchmarkProblem, with a single objective and a known optimal value.

classmethod from_botorch_synthetic(test_problem_class: Type[SyntheticTestFunction], test_problem_kwargs: Dict[str, Any], num_trials: int, infer_noise: bool = True) SingleObjectiveBenchmarkProblem[source]

Create a SingleObjectiveBenchmarkProblem from a BoTorch SyntheticTestFunction using specialized Metrics and Runners. The test problem’s result will be computed on the Runner and retrieved by the Metric.

optimal_value: float
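A minimal sketch; the resulting problem's optimal_value is populated from the synthetic test function's known optimum. The Hartmann function and trial count below are illustrative assumptions.

# Assumes BoTorch's six-dimensional Hartmann function.
from ax.benchmark.benchmark_problem import SingleObjectiveBenchmarkProblem
from botorch.test_functions.synthetic import Hartmann

hartmann6 = SingleObjectiveBenchmarkProblem.from_botorch_synthetic(
    test_problem_class=Hartmann,
    test_problem_kwargs={"dim": 6},
    num_trials=50,
)
print(hartmann6.optimal_value)  # known optimum of the test function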

Benchmark Result

class ax.benchmark.benchmark_result.AggregatedBenchmarkResult(name: str, results: List[BenchmarkResult], optimization_trace: pandas.DataFrame, score_trace: pandas.DataFrame, fit_time: List[float], gen_time: List[float])[source]

Bases: Base

The result of a benchmark test, or a series of replications. Scalar data present in the BenchmarkResult is represented here as (mean, sem) pairs. More information will be added to the AggregatedBenchmarkResult as the suite develops.

fit_time: List[float]
classmethod from_benchmark_results(results: List[BenchmarkResult]) AggregatedBenchmarkResult[source]

Aggregates a list of BenchmarkResults. For various reasons (timeout, errors, etc.) each BenchmarkResult may have a different number of trials; aggregated traces and statistics are computed over, and truncated to, the minimum trial count so that every replication is included.

gen_time: List[float]
name: str
optimization_trace: pandas.DataFrame
optimization_trace_by_progression(final_progression_only: bool = False) pandas.DataFrame[source]
progression_trace() pandas.DataFrame[source]
results: List[BenchmarkResult]
score_trace: pandas.DataFrame
total_progression() List[float][source]
class ax.benchmark.benchmark_result.BenchmarkResult(name: str, seed: int, experiment: Experiment, optimization_trace: ndarray, score_trace: ndarray, fit_time: float, gen_time: float)[source]

Bases: Base

The result of a single optimization loop from one (BenchmarkProblem, BenchmarkMethod) pair. More information will be added to the BenchmarkResult as the suite develops.

experiment: Experiment
fit_time: float
gen_time: float
name: str
optimization_trace: ndarray
optimization_trace_by_progression(final_progression_only: bool = False) Tuple[ndarray, ndarray][source]
progression_trace() ndarray[source]

Computes progressions used as a function of trials and also the total progression across all trials.

score_trace: ndarray
seed: int
total_progression() float[source]

Benchmark

Module for benchmarking Ax algorithms.

Key terms used:

  • Replication: one run of an optimization loop for a given (BenchmarkProblem, BenchmarkMethod) pair.

  • Test: multiple replications, run for statistical significance.

  • Full run: multiple tests on many (BenchmarkProblem, BenchmarkMethod) pairs.

  • Method: (one of) the algorithm(s) being benchmarked.

  • Problem: a synthetic function, a surrogate surface, or an ML model, on which to assess the performance of algorithms.

ax.benchmark.benchmark.benchmark_full_run(problems: Iterable[BenchmarkProblem], methods: Iterable[BenchmarkMethod], seeds: Iterable[int], **kwargs: Any) List[AggregatedBenchmarkResult][source]
ax.benchmark.benchmark.benchmark_replication(problem: BenchmarkProblem, method: BenchmarkMethod, seed: int) BenchmarkResult[source]

Runs one benchmarking replication (equivalent to one optimization loop).

Parameters:
  • problem – The BenchmarkProblem to test against (can be synthetic or real)

  • method – The BenchmarkMethod to test

  • seed – The seed to use for this replication, set using manual_seed from botorch.utils.sampling.

ax.benchmark.benchmark.benchmark_test(problem: BenchmarkProblem, method: BenchmarkMethod, seeds: Iterable[int], **kwargs: Any) AggregatedBenchmarkResult[source]
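The sketch below ties the three entry points together under assumed settings (a Branin problem, the default GPEI method, five seeds); it is illustrative, not a prescribed workflow.

from ax.benchmark.benchmark import (
    benchmark_full_run,
    benchmark_replication,
    benchmark_test,
)
from ax.benchmark.benchmark_problem import BenchmarkProblem
from ax.benchmark.methods.gpei_and_moo import get_gpei_default
from botorch.test_functions.synthetic import Branin

problem = BenchmarkProblem.from_botorch(
    test_problem_class=Branin, test_problem_kwargs={}, num_trials=20
)
method = get_gpei_default()

# One optimization loop -> BenchmarkResult
result = benchmark_replication(problem=problem, method=method, seed=0)

# Several seeded replications -> AggregatedBenchmarkResult with (mean, sem) traces
aggregated = benchmark_test(problem=problem, method=method, seeds=range(5))

# Many (problem, method) pairs -> one AggregatedBenchmarkResult per pair
all_results = benchmark_full_run(
    problems=[problem], methods=[method], seeds=range(5)
)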

Scored Benchmark

Benchmark Methods GPEI and MOO

ax.benchmark.methods.gpei_and_moo.get_gpei_default() BenchmarkMethod[source]
ax.benchmark.methods.gpei_and_moo.get_moo_default() BenchmarkMethod[source]

Benchmark Methods Modular BoTorch

ax.benchmark.methods.modular_botorch.get_sobol_botorch_modular_acquisition(acquisition_cls: Type[AcquisitionFunction], acquisition_options: Optional[Dict[str, Any]] = None) BenchmarkMethod[source]
ax.benchmark.methods.modular_botorch.get_sobol_botorch_modular_default() BenchmarkMethod[source]
ax.benchmark.methods.modular_botorch.get_sobol_botorch_modular_fixed_noise_gp_qnehvi() BenchmarkMethod[source]
ax.benchmark.methods.modular_botorch.get_sobol_botorch_modular_fixed_noise_gp_qnei() BenchmarkMethod[source]
ax.benchmark.methods.modular_botorch.get_sobol_botorch_modular_saas_fully_bayesian_single_task_gp_qnehvi() BenchmarkMethod[source]
ax.benchmark.methods.modular_botorch.get_sobol_botorch_modular_saas_fully_bayesian_single_task_gp_qnei() BenchmarkMethod[source]
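A minimal sketch of building a Modular BoTorch method around a specific acquisition function; qNoisyExpectedImprovement is just one example of a BoTorch AcquisitionFunction subclass that could be passed here.

from ax.benchmark.methods.modular_botorch import (
    get_sobol_botorch_modular_acquisition,
)
from botorch.acquisition import qNoisyExpectedImprovement

method = get_sobol_botorch_modular_acquisition(
    acquisition_cls=qNoisyExpectedImprovement,
    acquisition_options=None,  # optional acquisition settings; None uses defaults
)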

Benchmark Methods SAASBO

ax.benchmark.methods.saasbo.get_saasbo_default() BenchmarkMethod[source]
ax.benchmark.methods.saasbo.get_saasbo_moo_default() BenchmarkMethod[source]

Benchmark Methods Choose Generation Strategy

ax.benchmark.methods.choose_generation_strategy.get_choose_generation_strategy_method(problem: BenchmarkProblem) BenchmarkMethod[source]

Benchmark Problems Registry

class ax.benchmark.problems.registry.BenchmarkProblemRegistryEntry(factory_fn: Callable[..., ax.benchmark.benchmark_problem.BenchmarkProblem], factory_kwargs: Dict[str, Any])[source]

Bases: object

factory_fn: Callable[[...], BenchmarkProblem]
factory_kwargs: Dict[str, Any]
ax.benchmark.problems.registry.get_problem(problem_name: str) BenchmarkProblem[source]
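A minimal sketch; "branin" is an assumed registry key, and the available names depend on the BenchmarkProblemRegistryEntry instances registered in this module.

from ax.benchmark.problems.registry import get_problem

problem = get_problem(problem_name="branin")  # key is an assumption, not guaranteed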

Benchmark Problems High Dimensional Embedding

ax.benchmark.problems.hd_embedding.embed_higher_dimension(problem: BenchmarkProblem, total_dimensionality: int) BenchmarkProblem[source]
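A minimal sketch: wrap a low-dimensional problem in a larger search space, e.g. for benchmarking high-dimensional methods. The base problem and dimensionality below are illustrative assumptions.

from ax.benchmark.benchmark_problem import BenchmarkProblem
from ax.benchmark.problems.hd_embedding import embed_higher_dimension
from botorch.test_functions.synthetic import Hartmann

base = BenchmarkProblem.from_botorch(
    test_problem_class=Hartmann, test_problem_kwargs={"dim": 6}, num_trials=30
)
hd_problem = embed_higher_dimension(problem=base, total_dimensionality=30)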

Benchmark Problems Surrogate

class ax.benchmark.problems.surrogate.SurrogateBenchmarkProblem(optimal_value: float, **kwargs: Any)[source]

Bases: SingleObjectiveBenchmarkProblem

classmethod from_surrogate(name: str, search_space: SearchSpace, surrogate: Surrogate, datasets: List[SupervisedDataset], minimize: bool, optimal_value: float, num_trials: int, infer_noise: bool = True) SurrogateBenchmarkProblem[source]
infer_noise: bool
name: str
num_trials: int
optimal_value: float
optimization_config: OptimizationConfig
runner: Runner
search_space: SearchSpace
tracking_metrics: List[Metric]
class ax.benchmark.problems.surrogate.SurrogateMetric(infer_noise: bool = True)[source]

Bases: Metric

fetch_trial_data(trial: BaseTrial, **kwargs) Result[Data, MetricFetchE][source]

Fetch data for one trial.

class ax.benchmark.problems.surrogate.SurrogateRunner(name: str, surrogate: Surrogate, datasets: List[SupervisedDataset], search_space: SearchSpace)[source]

Bases: Runner

classmethod deserialize_init_args(args: Dict[str, Any]) Dict[str, Any][source]

Given a dictionary, deserialize the properties needed to initialize the object. Used for storage.

poll_trial_status(trials: Iterable[BaseTrial]) Dict[TrialStatus, Set[int]][source]

Checks the status of any non-terminal trials and returns their indices as a mapping from TrialStatus to a list of indices. Required for runners used with Ax Scheduler.

NOTE: Does not need to handle waiting between polling calls while trials are running; this function should just perform a single poll.

Parameters:

trials – Trials to poll.

Returns:

A dictionary mapping TrialStatus to a list of trial indices that have the respective status at the time of the polling. This does not need to include trials that at the time of polling already have a terminal (ABANDONED, FAILED, COMPLETED) status (but it may).

run(trial: BaseTrial) Dict[str, Any][source]

Deploys a trial based on custom runner subclass implementation.

Parameters:

trial – The trial to deploy.

Returns:

Dict of run metadata from the deployment process.

classmethod serialize_init_args(obj: Any) Dict[str, Any][source]

Serialize the properties needed to initialize the runner. Used for storage.

WARNING: Because of issues with consistently saving and loading BoTorch and GPyTorch modules, the SurrogateRunner cannot be serialized at this time. At load time, the runner will be replaced with a SyntheticRunner.

Benchmark Problems Jenatton

ax.benchmark.problems.synthetic.hss.jenatton.get_jenatton_benchmark_problem(num_trials: int = 50, infer_noise: bool = True) SingleObjectiveBenchmarkProblem[source]

Benchmark Problems PyTorchCNN

class ax.benchmark.problems.hpo.pytorch_cnn.PyTorchCNNBenchmarkProblem(optimal_value: float, **kwargs: Any)[source]

Bases: SingleObjectiveBenchmarkProblem

classmethod from_datasets(name: str, num_trials: int, train_set: Dataset, test_set: Dataset, infer_noise: bool = True) PyTorchCNNBenchmarkProblem[source]
infer_noise: bool
name: str
num_trials: int
optimal_value: float
optimization_config: OptimizationConfig
runner: Runner
search_space: SearchSpace
tracking_metrics: List[Metric]
class ax.benchmark.problems.hpo.pytorch_cnn.PyTorchCNNMetric(infer_noise: bool = True)[source]

Bases: Metric

fetch_trial_data(trial: BaseTrial, **kwargs: Any) Result[Data, MetricFetchE][source]

Fetch data for one trial.

class ax.benchmark.problems.hpo.pytorch_cnn.PyTorchCNNRunner(name: str, train_set: Dataset, test_set: Dataset)[source]

Bases: Runner

class CNN[source]

Bases: Module

forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance itself rather than this method, since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool
poll_trial_status(trials: Iterable[BaseTrial]) Dict[TrialStatus, Set[int]][source]

Checks the status of any non-terminal trials and returns their indices as a mapping from TrialStatus to a list of indices. Required for runners used with Ax Scheduler.

NOTE: Does not need to handle waiting between polling calls while trials are running; this function should just perform a single poll.

Parameters:

trials – Trials to poll.

Returns:

A dictionary mapping TrialStatus to a list of trial indices that have the respective status at the time of the polling. This does not need to include trials that at the time of polling already have a terminal (ABANDONED, FAILED, COMPLETED) status (but it may).

run(trial: BaseTrial) Dict[str, Any][source]

Deploys a trial based on custom runner subclass implementation.

Parameters:

trial – The trial to deploy.

Returns:

Dict of run metadata from the deployment process.

train_and_evaluate(lr: float, momentum: float, weight_decay: float, step_size: int, gamma: float) float[source]

Benchmark Problems PyTorchCNN TorchVision

class ax.benchmark.problems.hpo.torchvision.PyTorchCNNTorchvisionBenchmarkProblem(optimal_value: float, **kwargs: Any)[source]

Bases: PyTorchCNNBenchmarkProblem

classmethod from_dataset_name(name: str, num_trials: int, infer_noise: bool = True) PyTorchCNNTorchvisionBenchmarkProblem[source]
infer_noise: bool
name: str
num_trials: int
optimal_value: float
optimization_config: OptimizationConfig
runner: Runner
search_space: SearchSpace
tracking_metrics: List[Metric]
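A minimal sketch; "MNIST" is an assumed dataset name, and the supported names depend on the torchvision datasets this module registers.

from ax.benchmark.problems.hpo.torchvision import (
    PyTorchCNNTorchvisionBenchmarkProblem,
)

mnist_problem = PyTorchCNNTorchvisionBenchmarkProblem.from_dataset_name(
    name="MNIST",  # assumed name; check the module's dataset registry
    num_trials=20,
)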
class ax.benchmark.problems.hpo.torchvision.PyTorchCNNTorchvisionRunner(name: str, train_set: Dataset, test_set: Dataset)[source]

Bases: PyTorchCNNRunner

A subclass to aid in serialization. This allows us to save only the name of the dataset and reload it from TorchVision at deserialization time.

classmethod deserialize_init_args(args: Dict[str, Any]) Dict[str, Any][source]

Given a dictionary, deserialize the properties needed to initialize the object. Used for storage.

classmethod serialize_init_args(obj: Any) Dict[str, Any][source]

Serialize the properties needed to initialize the object. Used for storage.