ax.benchmark¶
Benchmark¶
Benchmark Method¶
- class ax.benchmark.benchmark_method.BenchmarkMethod(name: str, generation_strategy: GenerationStrategy, scheduler_options: SchedulerOptions, distribute_replications: bool = False)[source]¶
Bases:
Base
Benchmark method, represented in terms of Ax generation strategy (which tells us which models to use when) and scheduler options (which tell us extra execution information like maximum parallelism, early stopping configuration, etc.).
Note: If BenchmarkMethod.scheduler_options.total_trials is less than BenchmarkProblem.num_trials then only the number of trials specified in the former will be run.
Note: The generation_strategy passed in is assumed to be in its “base state”, as it will be cloned and reset.
- generation_strategy: GenerationStrategy¶
- scheduler_options: SchedulerOptions¶
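For illustration, a minimal sketch of constructing a BenchmarkMethod from a Sobol-only GenerationStrategy; the strategy composition, its name, and the trial counts below are arbitrary choices for the example, not prescribed by the API:
>>> from ax.benchmark.benchmark_method import (
...     BenchmarkMethod, get_benchmark_scheduler_options
... )
>>> from ax.modelbridge.generation_strategy import GenerationStep, GenerationStrategy
>>> from ax.modelbridge.registry import Models
>>>
>>> # A Sobol-only strategy; real benchmarks typically append a model-based step.
>>> gs = GenerationStrategy(
...     name="Sobol",
...     steps=[GenerationStep(model=Models.SOBOL, num_trials=-1)],
... )
>>> method = BenchmarkMethod(
...     name="Sobol",
...     generation_strategy=gs,
...     scheduler_options=get_benchmark_scheduler_options(),
...     distribute_replications=False,
... )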
- ax.benchmark.benchmark_method.get_benchmark_scheduler_options(timeout_hours: int = 4, batch_size: int = 1) SchedulerOptions [source]¶
The typical SchedulerOptions used in benchmarking.
Currently, regardless of batch size, all pending trials must complete before new ones are generated. That is, when batch_size > 1, the design is “batch sequential”, and when batch_size = 1, the design is “fully sequential.”
- Parameters:
timeout_hours – The maximum amount of time (in hours) to run each benchmark replication. Defaults to 4 hours.
batch_size – Number of trials to generate at once.
Benchmark Problem¶
- class ax.benchmark.benchmark_problem.BenchmarkProblem(name: str, search_space: SearchSpace, optimization_config: OptimizationConfig, runner: Runner, num_trials: int, is_noiseless: bool = False, observe_noise_sd: bool = False, has_ground_truth: bool = False, tracking_metrics: Optional[List[BenchmarkMetricBase]] = None)[source]¶
Bases:
Base
Benchmark problem, represented in terms of Ax search space, optimization config, and runner.
- classmethod from_botorch(test_problem_class: Type[BaseTestProblem], test_problem_kwargs: Dict[str, Any], lower_is_better: bool, num_trials: int, observe_noise_sd: bool = False) BenchmarkProblem [source]¶
Create a BenchmarkProblem from a BoTorch BaseTestProblem using specialized Metrics and Runners. The test problem’s result will be computed on the Runner and retrieved by the Metric.
- Parameters:
test_problem_class – The BoTorch test problem class which will be used to define the search_space, optimization_config, and runner.
test_problem_kwargs – Keyword arguments used to instantiate the test_problem_class.
num_trials – The number of trials to run; this becomes the num_trials of the created BenchmarkProblem.
observe_noise_sd – Whether the standard deviation of the observation noise is observed or not (in which case it must be inferred by the model). This is separate from whether synthetic noise is added to the problem, which is controlled by the noise_std of the test problem.
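As a hedged example, from_botorch could be used with a standard BoTorch synthetic function such as Branin; the keyword arguments and trial count below are arbitrary choices for the sketch:
>>> from ax.benchmark.benchmark_problem import BenchmarkProblem
>>> from botorch.test_functions.synthetic import Branin
>>>
>>> problem = BenchmarkProblem.from_botorch(
...     test_problem_class=Branin,
...     test_problem_kwargs={"noise_std": 0.1},  # synthetic noise added to evaluations
...     lower_is_better=True,  # Branin is minimized
...     num_trials=30,
...     observe_noise_sd=False,  # noise level is not revealed to the model
... )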
- class ax.benchmark.benchmark_problem.BenchmarkProblemProtocol(*args, **kwargs)[source]¶
Bases:
Protocol
Specifies the interface any benchmark problem must adhere to.
Classes implementing this interface include BenchmarkProblem, SurrogateBenchmarkProblem, and MOOSurrogateBenchmarkProblem.
- optimization_config: OptimizationConfig¶
- search_space: SearchSpace¶
- tracking_metrics: List[BenchmarkMetricBase]¶
- class ax.benchmark.benchmark_problem.BenchmarkProblemWithKnownOptimum(*args, **kwargs)[source]¶
Bases:
Protocol
- class ax.benchmark.benchmark_problem.MultiObjectiveBenchmarkProblem(maximum_hypervolume: float, reference_point: List[float], *, name: str, search_space: SearchSpace, optimization_config: OptimizationConfig, runner: Runner, num_trials: int, is_noiseless: bool = False, observe_noise_sd: bool = False, has_ground_truth: bool = False, tracking_metrics: Optional[List[BenchmarkMetricBase]] = None)[source]¶
Bases:
BenchmarkProblem
A BenchmarkProblem that supports multiple objectives. Rather than knowing each objective’s optimal value, we track a known maximum hypervolume computed from a given reference point.
- classmethod from_botorch_multi_objective(test_problem_class: Type[MultiObjectiveTestProblem], test_problem_kwargs: Dict[str, Any], num_trials: int, observe_noise_sd: bool = False) MultiObjectiveBenchmarkProblem [source]¶
Create a BenchmarkProblem from a BoTorch BaseTestProblem using specialized Metrics and Runners. The test problem’s result will be computed on the Runner once per trial and each Metric will retrieve its own result by index.
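A hedged sketch using BoTorch’s BraninCurrin as the multi-objective test problem; the problem choice and trial count are assumptions made for illustration:
>>> from ax.benchmark.benchmark_problem import MultiObjectiveBenchmarkProblem
>>> from botorch.test_functions.multi_objective import BraninCurrin
>>>
>>> moo_problem = MultiObjectiveBenchmarkProblem.from_botorch_multi_objective(
...     test_problem_class=BraninCurrin,
...     test_problem_kwargs={},
...     num_trials=30,
... )
>>> moo_problem.maximum_hypervolume  # doctest: +SKIP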
- class ax.benchmark.benchmark_problem.SingleObjectiveBenchmarkProblem(optimal_value: float, *, name: str, search_space: SearchSpace, optimization_config: OptimizationConfig, runner: Runner, num_trials: int, is_noiseless: bool = False, observe_noise_sd: bool = False, has_ground_truth: bool = False, tracking_metrics: Optional[List[BenchmarkMetricBase]] = None)[source]¶
Bases:
BenchmarkProblem
The most basic BenchmarkProblem, with a single objective and a known optimal value.
- classmethod from_botorch_synthetic(test_problem_class: Type[SyntheticTestFunction], test_problem_kwargs: Dict[str, Any], lower_is_better: bool, num_trials: int, observe_noise_sd: bool = False) SingleObjectiveBenchmarkProblem [source]¶
Create a BenchmarkProblem from a BoTorch BaseTestProblem using specialized Metrics and Runners. The test problem’s result will be computed on the Runner and retrieved by the Metric.
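A hedged sketch with BoTorch’s six-dimensional Hartmann function; the constructor arguments below are illustrative choices, not defaults:
>>> from ax.benchmark.benchmark_problem import SingleObjectiveBenchmarkProblem
>>> from botorch.test_functions.synthetic import Hartmann
>>>
>>> problem = SingleObjectiveBenchmarkProblem.from_botorch_synthetic(
...     test_problem_class=Hartmann,
...     test_problem_kwargs={"dim": 6},
...     lower_is_better=True,
...     num_trials=50,
... )
>>> problem.optimal_value  # doctest: +SKIP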
Benchmark Result¶
- class ax.benchmark.benchmark_result.AggregatedBenchmarkResult(name: str, results: List[BenchmarkResult], optimization_trace: pandas.DataFrame, score_trace: pandas.DataFrame, fit_time: List[float], gen_time: List[float])[source]¶
Bases:
Base
The result of a benchmark test, or series of replications. Scalar data present in the BenchmarkResult is here represented as (mean, sem) pairs.
- classmethod from_benchmark_results(results: List[BenchmarkResult]) AggregatedBenchmarkResult [source]¶
Aggregates a list of BenchmarkResults. For various reasons (timeout, errors, etc.) each BenchmarkResult may have a different number of trials; aggregated traces and statistics are therefore computed on, and truncated to, the minimum trial count so that every replication is included.
- optimization_trace: pandas.DataFrame¶
- results: List[BenchmarkResult]¶
- score_trace: pandas.DataFrame¶
- class ax.benchmark.benchmark_result.BenchmarkResult(name: str, seed: int, optimization_trace: ndarray, score_trace: ndarray, fit_time: float, gen_time: float, experiment: Optional[Experiment] = None, experiment_storage_id: Optional[str] = None)[source]¶
Bases:
Base
The result of a single optimization loop from one (BenchmarkProblem, BenchmarkMethod) pair.
- experiment: Optional[Experiment] = None¶
- optimization_trace: ndarray¶
- score_trace: ndarray¶
Benchmark¶
Module for benchmarking Ax algorithms.
Key terms used:
Replication: 1 run of an optimization loop; (BenchmarkProblem, BenchmarkMethod) pair.
Test: multiple replications, run for statistical significance.
Full run: multiple tests on many (BenchmarkProblem, BenchmarkMethod) pairs.
Method: (one of) the algorithm(s) being benchmarked.
Problem: a synthetic function, a surrogate surface, or an ML model, on which to assess the performance of algorithms.
- ax.benchmark.benchmark.benchmark_multiple_problems_methods(problems: Iterable[BenchmarkProblemProtocol], methods: Iterable[BenchmarkMethod], seeds: Iterable[int]) List[AggregatedBenchmarkResult] [source]¶
For each (problem, method) pair in the Cartesian product of problems and methods, run a replication for each seed in seeds, aggregate the results into an AggregatedBenchmarkResult, and return the list of aggregated results.
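A hedged sketch; the registry names "branin" and "hartmann6" are assumed entries (see get_problem in the registry section) and the seeds are arbitrary:
>>> from ax.benchmark.benchmark import benchmark_multiple_problems_methods
>>> from ax.benchmark.methods.sobol import get_sobol_benchmark_method
>>> from ax.benchmark.problems.registry import get_problem
>>>
>>> results = benchmark_multiple_problems_methods(
...     problems=[get_problem("branin"), get_problem("hartmann6")],
...     methods=[get_sobol_benchmark_method(distribute_replications=False)],
...     seeds=range(3),
... )  # doctest: +SKIP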
- ax.benchmark.benchmark.benchmark_one_method_problem(problem: BenchmarkProblemProtocol, method: BenchmarkMethod, seeds: Iterable[int]) AggregatedBenchmarkResult [source]¶
- ax.benchmark.benchmark.benchmark_replication(problem: BenchmarkProblemProtocol, method: BenchmarkMethod, seed: int) BenchmarkResult [source]¶
Runs one benchmarking replication (equivalent to one optimization loop).
- Parameters:
problem – The BenchmarkProblem to test against (can be synthetic or real)
method – The BenchmarkMethod to test
seed – The seed to use for this replication.
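A hedged end-to-end sketch of a single replication; the registry name "branin", the choice of a Sobol method, and the seed are assumptions made for illustration:
>>> from ax.benchmark.benchmark import benchmark_replication
>>> from ax.benchmark.methods.sobol import get_sobol_benchmark_method
>>> from ax.benchmark.problems.registry import get_problem
>>>
>>> result = benchmark_replication(
...     problem=get_problem("branin"),
...     method=get_sobol_benchmark_method(distribute_replications=False),
...     seed=0,
... )  # doctest: +SKIP
>>> result.optimization_trace, result.score_trace  # doctest: +SKIP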
- ax.benchmark.benchmark.compute_score_trace(optimization_trace: ndarray, num_baseline_trials: int, problem: BenchmarkProblemProtocol) ndarray [source]¶
Computes a score trace from the optimization trace.
- ax.benchmark.benchmark.make_ground_truth_metrics(problem: BenchmarkProblemProtocol, include_tracking_metrics: bool = True) Dict[str, Metric] [source]¶
Makes a ground truth version for each metric defined on the problem.
- Parameters:
problem – The BenchmarkProblem to test against (can be synthetic or real).
include_tracking_metrics – Whether or not to include tracking metrics.
- Returns:
A dict mapping (original) metric names to their respective ground truth metric.
- ax.benchmark.benchmark.make_ground_truth_optimization_config(experiment: Experiment) OptimizationConfig [source]¶
Makes a clone of the experiment’s OptimizationConfig in which each metric is replaced by its respective “ground truth” counterpart. These counterparts were added to the experiment’s tracking metrics in _create_benchmark_experiment and return the ground truth (i.e., noise-free) observations.
Benchmark Methods Modular BoTorch¶
- ax.benchmark.methods.modular_botorch.get_sobol_botorch_modular_acquisition(model_cls: Type[Model], acquisition_cls: Type[AcquisitionFunction], distribute_replications: bool, scheduler_options: Optional[SchedulerOptions] = None, name: Optional[str] = None, num_sobol_trials: int = 5, model_gen_kwargs: Optional[Dict[str, Any]] = None) BenchmarkMethod [source]¶
Get a BenchmarkMethod that uses Sobol followed by MBM.
- Parameters:
model_cls – BoTorch model class, e.g. SingleTaskGP
acquisition_cls – Acquisition function class, e.g. qLogNoisyExpectedImprovement.
distribute_replications – Whether to use multiple machines
scheduler_options – Passed as-is to scheduler. Default: get_benchmark_scheduler_options().
name – Name that will be attached to the GenerationStrategy.
num_sobol_trials – Number of Sobol trials; if the scheduler_options specify the use of BatchTrials, then this refers to the number of BatchTrials.
model_gen_kwargs – Passed to the BoTorch GenerationStep and ultimately to the BoTorch Model.
Example
>>> # A simple example
>>> from botorch.acquisition.logei import qLogNoisyExpectedImprovement
>>> from botorch.models import SingleTaskGP
>>> from ax.benchmark.methods.modular_botorch import (
...     get_sobol_botorch_modular_acquisition
... )
>>> from ax.benchmark.benchmark_method import get_benchmark_scheduler_options
>>>
>>> method = get_sobol_botorch_modular_acquisition(
...     model_cls=SingleTaskGP,
...     acquisition_cls=qLogNoisyExpectedImprovement,
...     distribute_replications=False,
... )
>>> # Pass sequential=False to BoTorch's optimize_acqf
>>> batch_method = get_sobol_botorch_modular_acquisition(
...     model_cls=SingleTaskGP,
...     acquisition_cls=qLogNoisyExpectedImprovement,
...     distribute_replications=False,
...     scheduler_options=get_benchmark_scheduler_options(
...         batch_size=5,
...     ),
...     model_gen_kwargs={
...         "model_gen_options": {
...             "optimizer_kwargs": {"sequential": False}
...         }
...     },
...     num_sobol_trials=1,
... )
Benchmark Methods Sobol¶
- ax.benchmark.methods.sobol.get_sobol_benchmark_method(distribute_replications: bool, scheduler_options: Optional[SchedulerOptions] = None) BenchmarkMethod [source]¶
Benchmark Metrics Base¶
Module containing the metric base classes for benchmarks. The key property of a benchmark metric is whether or not it has a ground truth, which is indicated by the has_ground_truth attribute of BenchmarkMetricBase. All metrics used in Ax benchmarks need to be subclassed from BenchmarkMetricBase.
For metrics that do have a ground truth, we can compute the performance of the optimization directly in terms of the ground truth observations (or the ground truth of the out-of-sample model-suggested best point). For metrics that do not have a ground truth, this is not possible.
The benchmarks are designed so that (unless the metric is noiseless) no ground truth observations are available to the optimization algorithm. Instead, we use separate “ground truth metrics”, attached as tracking metrics to the experiment, that are used to evaluate performance after the optimization is complete. GroundTruthMetricMixin can be used to construct such ground truth metrics (with the is_ground_truth property indicating that the metric provides the ground truth) and implements naming conventions and helpers for associating a ground truth metric with the respective metric used during the optimization.
- class ax.benchmark.metrics.base.BenchmarkMetricBase(name: str, lower_is_better: Optional[bool] = None, properties: Optional[Dict[str, Any]] = None)[source]¶
Bases:
Metric, ABC
A generic metric used for Ax Benchmarks.
- has_ground_truth¶
Whether or not there exists a ground truth for this metric, i.e. whether each observation has an associated ground truth value. This is trivially true for deterministic metrics, and is also true for metrics where synthetic observation noise is added to their (deterministic) values. It is not true for metrics that are inherently noisy.
- Type:
bool
- abstract make_ground_truth_metric() BenchmarkMetricBase [source]¶
Create a ground truth version of this metric. If metric observations are noisy, the ground truth would be the underlying noiseless values.
Benchmark Metrics Benchmark¶
- class ax.benchmark.metrics.benchmark.BenchmarkMetric(name: str, lower_is_better: bool, observe_noise_sd: bool = True, outcome_index: Optional[int] = None)[source]¶
Bases:
BenchmarkMetricBase
A generic metric used for observed values produced by Ax Benchmarks.
Compatible e.g. with results generated by BotorchTestProblemRunner and SurrogateRunner.
- has_ground_truth¶
Whether or not there exists a ground truth for this metric, i.e. whether each observation has an associated ground truth value. This is trivially true for deterministic metrics, and is also true for metrics where synthetic observation noise is added to their (deterministic) values. It is not true for metrics that are inherently noisy.
- Type:
bool
- fetch_trial_data(trial: BaseTrial, **kwargs: Any) Result[Data, MetricFetchE] [source]¶
Fetch data for one trial.
- make_ground_truth_metric() BenchmarkMetricBase [source]¶
Create a ground truth version of this metric.
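For illustration (the metric name is arbitrary), a BenchmarkMetric and its ground-truth counterpart might be created as follows:
>>> from ax.benchmark.metrics.benchmark import BenchmarkMetric
>>>
>>> metric = BenchmarkMetric(name="branin", lower_is_better=True, observe_noise_sd=False)
>>> gt_metric = metric.make_ground_truth_metric()  # a GroundTruthBenchmarkMetric
>>> gt_metric.is_ground_truth  # doctest: +SKIP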
- class ax.benchmark.metrics.benchmark.GroundTruthBenchmarkMetric(original_metric: BenchmarkMetric)[source]¶
Bases:
BenchmarkMetric, GroundTruthMetricMixin
- fetch_trial_data(trial: BaseTrial, **kwargs: Any) Result[Data, MetricFetchE] [source]¶
Fetch data for one trial.
- make_ground_truth_metric() BenchmarkMetricBase [source]¶
Create a ground truth version of this metric.
Benchmark Metrics Jenatton¶
- class ax.benchmark.metrics.jenatton.GroundTruthJenattonMetric(original_metric: JenattonMetric)[source]¶
Bases:
JenattonMetric, GroundTruthMetricMixin
- class ax.benchmark.metrics.jenatton.JenattonMetric(name: str = 'jenatton', noise_std: float = 0.0, observe_noise_sd: bool = False)[source]¶
Bases:
BenchmarkMetricBase
Jenatton metric for hierarchical search spaces.
- fetch_trial_data(trial: BaseTrial, **kwargs: Any) Result[Data, MetricFetchE] [source]¶
Fetch data for one trial.
- make_ground_truth_metric() GroundTruthJenattonMetric [source]¶
Create a ground truth version of this metric. If metric observations are noisy, the ground truth would be the underlying noiseless values.
- ax.benchmark.metrics.jenatton.jenatton_test_function(x1: Optional[int] = None, x2: Optional[int] = None, x3: Optional[int] = None, x4: Optional[float] = None, x5: Optional[float] = None, x6: Optional[float] = None, x7: Optional[float] = None, r8: Optional[float] = None, r9: Optional[float] = None) float [source]¶
Jenatton test function for hierarchical search spaces.
This function is taken from:
R. Jenatton, C. Archambeau, J. González, and M. Seeger. Bayesian optimization with tree-structured dependencies. ICML 2017.
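A hedged example call; which arguments are required depends on the active branch of the hierarchical search space, and the specific argument combination below is an assumption made for illustration:
>>> from ax.benchmark.metrics.jenatton import jenatton_test_function
>>>
>>> # Hypothetical call for one branch of the tree; parameters off that branch are omitted.
>>> value = jenatton_test_function(x1=0, x2=0, x4=0.5, r8=0.05)  # doctest: +SKIP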
Benchmark Metrics Utils¶
Benchmark Problems Registry¶
- class ax.benchmark.problems.registry.BenchmarkProblemRegistryEntry(factory_fn: Callable[..., ax.benchmark.benchmark_problem.BenchmarkProblem], factory_kwargs: Dict[str, Any])[source]¶
Bases:
object
- factory_fn: Callable[[...], BenchmarkProblem]¶
- ax.benchmark.problems.registry.get_problem(problem_name: str, **additional_kwargs: Any) BenchmarkProblem [source]¶
Benchmark Problems High Dimensional Embedding¶
- ax.benchmark.problems.hd_embedding.embed_higher_dimension(problem: TProblem, total_dimensionality: int) TProblem [source]¶
Return a new BenchmarkProblem with enough RangeParameters added to the search space to make its total dimensionality equal to total_dimensionality, and append total_dimensionality to its name.
The search space of the original problem is within the search space of the new problem, and the constraints are copied from the original problem.
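For instance (the base problem and target dimensionality below are placeholders chosen for the sketch):
>>> from ax.benchmark.problems.hd_embedding import embed_higher_dimension
>>> from ax.benchmark.problems.registry import get_problem
>>>
>>> base_problem = get_problem("branin")  # doctest: +SKIP
>>> problem_30d = embed_higher_dimension(
...     problem=base_problem, total_dimensionality=30
... )  # doctest: +SKIP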
Benchmark Problems Surrogate¶
- class ax.benchmark.problems.surrogate.MOOSurrogateBenchmarkProblem(maximum_hypervolume: float, reference_point: List[float], *, name: str, search_space: SearchSpace, optimization_config: MultiObjectiveOptimizationConfig, num_trials: int, outcome_names: List[str], observe_noise_stds: Union[bool, Dict[str, bool]] = False, noise_stds: Union[float, Dict[str, float]] = 0.0, get_surrogate_and_datasets: Optional[Callable[[], Tuple[Surrogate, List[SupervisedDataset]]]] = None, tracking_metrics: Optional[List[BenchmarkMetricBase]] = None, _runner: Optional[Runner] = None)[source]¶
Bases:
SurrogateBenchmarkProblemBase
Has the same attributes/properties as a MultiObjectiveBenchmarkProblem, but its runner is not constructed until needed, to allow for deferring constructing the surrogate.
Simple aspects of the problem, such as its search space, are defined immediately, while the surrogate is only constructed when it is needed, in order to avoid expensive operations like downloading files and fitting a model.
- optimization_config: MultiObjectiveOptimizationConfig¶
- class ax.benchmark.problems.surrogate.SOOSurrogateBenchmarkProblem(optimal_value: float, *, name: str, search_space: SearchSpace, optimization_config: OptimizationConfig, num_trials: int, outcome_names: List[str], observe_noise_stds: Union[bool, Dict[str, bool]] = False, noise_stds: Union[float, Dict[str, float]] = 0.0, get_surrogate_and_datasets: Optional[Callable[[], Tuple[Surrogate, List[SupervisedDataset]]]] = None, tracking_metrics: Optional[List[BenchmarkMetricBase]] = None, _runner: Optional[Runner] = None)[source]¶
Bases:
SurrogateBenchmarkProblemBase
Has the same attributes/properties as a SingleObjectiveBenchmarkProblem, but allows for constructing from a surrogate.
- class ax.benchmark.problems.surrogate.SurrogateBenchmarkProblemBase(*, name: str, search_space: SearchSpace, optimization_config: OptimizationConfig, num_trials: int, outcome_names: List[str], observe_noise_stds: Union[bool, Dict[str, bool]] = False, noise_stds: Union[float, Dict[str, float]] = 0.0, get_surrogate_and_datasets: Optional[Callable[[], Tuple[Surrogate, List[SupervisedDataset]]]] = None, tracking_metrics: Optional[List[BenchmarkMetricBase]] = None, _runner: Optional[Runner] = None)[source]¶
Bases:
Base
Base class for SOOSurrogateBenchmarkProblem and MOOSurrogateBenchmarkProblem.
Allows for lazy creation of objects needed to construct a runner, including a surrogate and datasets.
Benchmark Problems Mixed Integer Synthetic¶
Mixed integer extensions of some common synthetic test functions. These are adapted from [Daulton2022bopr].
References
S. Daulton, X. Wan, D. Eriksson, M. Balandat, M. A. Osborne, E. Bakshy. Bayesian Optimization over Discrete and Mixed Spaces via Probabilistic Reparameterization. Advances in Neural Information Processing Systems 35, 2022.
- ax.benchmark.problems.synthetic.discretized.mixed_integer.get_discrete_ackley(num_trials: int = 50, observe_noise_sd: bool = False, bounds: Optional[List[Tuple[float, float]]] = None) BenchmarkProblem [source]¶
13D Ackley problem where first 10 dimensions are discretized.
This also restricts Ackley evaluation bounds to [0, 1].
Benchmark Problems Jenatton¶
- ax.benchmark.problems.synthetic.hss.jenatton.get_jenatton_benchmark_problem(num_trials: int = 50, observe_noise_sd: bool = False) SingleObjectiveBenchmarkProblem [source]¶
Benchmark Problems PyTorchCNN¶
- class ax.benchmark.problems.hpo.pytorch_cnn.PyTorchCNNBenchmarkProblem(optimal_value: float, *, name: str, search_space: SearchSpace, optimization_config: OptimizationConfig, runner: Runner, num_trials: int, is_noiseless: bool = False, observe_noise_sd: bool = False, has_ground_truth: bool = False, tracking_metrics: Optional[List[BenchmarkMetricBase]] = None)[source]¶
Bases:
SingleObjectiveBenchmarkProblem
- classmethod from_datasets(name: str, num_trials: int, train_set: Dataset, test_set: Dataset) PyTorchCNNBenchmarkProblem [source]¶
- class ax.benchmark.problems.hpo.pytorch_cnn.PyTorchCNNRunner(name: str, train_set: Dataset, test_set: Dataset)[source]¶
Bases:
Runner
- class CNN[source]¶
Bases:
Module
- forward(x: Tensor) Tensor [source]¶
Define the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- poll_trial_status(trials: Iterable[BaseTrial]) Dict[TrialStatus, Set[int]] [source]¶
Checks the status of any non-terminal trials and returns their indices as a mapping from TrialStatus to a list of indices. Required for runners used with the Ax Scheduler.
NOTE: Does not need to handle waiting between polling calls while trials are running; this function should just perform a single poll.
- Parameters:
trials – Trials to poll.
- Returns:
A dictionary mapping TrialStatus to a list of trial indices that have the respective status at the time of the polling. This does not need to include trials that at the time of polling already have a terminal (ABANDONED, FAILED, COMPLETED) status (but it may).
Benchmark Problems PyTorchCNN TorchVision¶
- class ax.benchmark.problems.hpo.torchvision.PyTorchCNNTorchvisionBenchmarkProblem(optimal_value: float, *, name: str, search_space: SearchSpace, optimization_config: OptimizationConfig, runner: Runner, num_trials: int, is_noiseless: bool = False, observe_noise_sd: bool = False, has_ground_truth: bool = False, tracking_metrics: Optional[List[BenchmarkMetricBase]] = None)[source]¶
Bases:
PyTorchCNNBenchmarkProblem
- classmethod from_dataset_name(name: str, num_trials: int) PyTorchCNNTorchvisionBenchmarkProblem [source]¶
- class ax.benchmark.problems.hpo.torchvision.PyTorchCNNTorchvisionRunner(name: str, train_set: Dataset, test_set: Dataset)[source]¶
Bases:
PyTorchCNNRunner
A subclass to aid in serialization. This allows us to save only the name of the dataset and reload it from TorchVision at deserialization time.
- classmethod deserialize_init_args(args: Dict[str, Any], decoder_registry: Optional[Dict[str, Type]] = None, class_decoder_registry: Optional[Dict[str, Callable[[Dict[str, Any]], Any]]] = None) Dict[str, Any] [source]¶
Given a dictionary, deserialize the properties needed to initialize the object. Used for storage.
Benchmark Runners Base¶
- class ax.benchmark.runners.base.BenchmarkRunner[source]¶
- get_Y_Ystd(arm: Arm) Tuple[Tensor, Optional[Tensor]] [source]¶
Function returning the observed values and their standard errors for a given arm. This function is unused for problems that have a ground truth (in this case get_Y_true() is used), and is required for problems that do not have a ground truth.
- get_Y_true(arm: Arm) Tensor [source]¶
Function returning the ground truth values for a given arm. The synthetic noise is added as part of the Runner’s run() method. For problems that do not have a ground truth, the Runner must implement the get_Y_Ystd() method instead.
- get_noise_stds() Union[None, float, Dict[str, float]] [source]¶
Function returning the standard errors for the synthetic noise to be applied to the observed values. For problems that do not have a ground truth, the Runner must implement the get_Y_Ystd() method instead.
- abstract property outcome_names: List[str]¶
The names of the outcomes of the problem (in the order of the outcomes).
- run(trial: BaseTrial) Dict[str, Any] [source]¶
Run the trial by evaluating its parameterization(s).
- Parameters:
trial – The trial to evaluate.
- Returns:
- Ys: A dict mapping arm names to lists of corresponding outcomes, where the order of the outcomes is the same as in outcome_names.
- Ystds: A dict mapping arm names to lists of corresponding outcome noise standard deviations (possibly nan if the noise level is unobserved), where the order of the outcomes is the same as in outcome_names.
- Ys_true: A dict mapping arm names to lists of corresponding ground truth outcomes, where the order of the outcomes is the same as in outcome_names. If the benchmark problem does not provide a ground truth, this key will not be present in the dict returned by this function.
- outcome_names: A list of metric names.
- Return type:
A dictionary with the following keys
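To make the return structure concrete, a hypothetical example of the returned dict for a trial with a single arm "0_0" and a single outcome "f" (all names and values are illustrative):
>>> {
...     "Ys": {"0_0": [0.42]},
...     "Ystds": {"0_0": [0.1]},
...     "Ys_true": {"0_0": [0.40]},  # omitted when the problem has no ground truth
...     "outcome_names": ["f"],
... }  # doctest: +SKIP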
Benchmark Runners BoTorch Test¶
- class ax.benchmark.runners.botorch_test.BotorchTestProblemRunner(test_problem_class: Type[BaseTestProblem], test_problem_kwargs: Dict[str, Any], outcome_names: List[str], modified_bounds: Optional[List[Tuple[float, float]]] = None)[source]¶
Bases:
BenchmarkRunner
A Runner for evaluating BoTorch BaseTestProblems.
Given a trial, the Runner evaluates the BaseTestProblem.forward method for each arm in the trial and also returns some metadata about the underlying BoTorch problem, such as the noise_std. We compute the full result on the Runner (as opposed to the Metric, as is typical in synthetic test problems) because, in the MOO case, the BoTorch problem computes all metrics in one stacked tensor, and we wish to avoid recomputation per metric.
- classmethod deserialize_init_args(args: Dict[str, Any], decoder_registry: Optional[Dict[str, Type]] = None, class_decoder_registry: Optional[Dict[str, Callable[[Dict[str, Any]], Any]]] = None) Dict[str, Any] [source]¶
Given a dictionary, deserialize the properties needed to initialize the runner. Used for storage.
- get_Y_true(arm: Arm) Tensor [source]¶
Converts X to original bounds – only if modified bounds were provided – and evaluates the test problem. See __init__ docstring for details.
- Parameters:
X – A batch_shape x d-dim tensor of point(s) at which to evaluate the test problem.
- Returns:
A batch_shape x m-dim tensor of ground truth (noiseless) evaluations.
- get_noise_stds() Union[None, float, Dict[str, float]] [source]¶
Function returning the standard errors for the synthetic noise to be applied to the observed values. For problems that do not have a ground truth, the Runner must implement the get_Y_Ystd() method instead.
- property outcome_names: List[str]¶
The names of the outcomes of the problem (in the order of the outcomes).
- poll_trial_status(trials: Iterable[BaseTrial]) Dict[TrialStatus, Set[int]] [source]¶
Checks the status of any non-terminal trials and returns their indices as a mapping from TrialStatus to a list of indices. Required for runners used with the Ax Scheduler.
NOTE: Does not need to handle waiting between polling calls while trials are running; this function should just perform a single poll.
- Parameters:
trials – Trials to poll.
- Returns:
A dictionary mapping TrialStatus to a list of trial indices that have the respective status at the time of the polling. This does not need to include trials that at the time of polling already have a terminal (ABANDONED, FAILED, COMPLETED) status (but it may).
- classmethod serialize_init_args(obj: Any) Dict[str, Any] [source]¶
Serialize the properties needed to initialize the runner. Used for storage.
- test_problem: BaseTestProblem¶
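A hedged construction sketch; the test problem, noise level, and outcome name are arbitrary choices for the example:
>>> from ax.benchmark.runners.botorch_test import BotorchTestProblemRunner
>>> from botorch.test_functions.synthetic import Branin
>>>
>>> runner = BotorchTestProblemRunner(
...     test_problem_class=Branin,
...     test_problem_kwargs={"noise_std": 0.1},
...     outcome_names=["branin"],
... )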
Benchmark Runners Surrogate¶
- class ax.benchmark.runners.surrogate.SurrogateRunner(name: str, surrogate: Surrogate, datasets: List[SupervisedDataset], search_space: SearchSpace, outcome_names: List[str], noise_stds: Union[float, Dict[str, float]] = 0.0)[source]¶
Bases:
BenchmarkRunner
- classmethod deserialize_init_args(args: Dict[str, Any], decoder_registry: Optional[Dict[str, Type]] = None, class_decoder_registry: Optional[Dict[str, Callable[[Dict[str, Any]], Any]]] = None) Dict[str, Any] [source]¶
Given a dictionary, deserialize the properties needed to initialize the object. Used for storage.
- get_Y_true(arm: Arm) Tensor [source]¶
Function returning the ground truth values for a given arm. The synthetic noise is added as part of the Runner’s run() method. For problems that do not have a ground truth, the Runner must implement the get_Y_Ystd() method instead.
- get_noise_stds() Union[None, float, Dict[str, float]] [source]¶
Function returning the standard errors for the synthetic noise to be applied to the observed values. For problems that do not have a ground truth, the Runner must implement the get_Y_Ystd() method instead.
- property outcome_names: List[str]¶
The names of the outcomes of the problem (in the order of the outcomes).
- poll_trial_status(trials: Iterable[BaseTrial]) Dict[TrialStatus, Set[int]] [source]¶
Checks the status of any non-terminal trials and returns their indices as a mapping from TrialStatus to a list of indices. Required for runners used with the Ax Scheduler.
NOTE: Does not need to handle waiting between polling calls while trials are running; this function should just perform a single poll.
- Parameters:
trials – Trials to poll.
- Returns:
A dictionary mapping TrialStatus to a list of trial indices that have the respective status at the time of the polling. This does not need to include trials that at the time of polling already have a terminal (ABANDONED, FAILED, COMPLETED) status (but it may).
- run(trial: BaseTrial) Dict[str, Any] [source]¶
Run the trial by evaluating its parameterization(s) on the surrogate model.
Note: This also sets the status of the trial to COMPLETED.
- Parameters:
trial – The trial to evaluate.
- Returns:
- outcome_names: The names of the metrics being evaluated.
- Ys: A dict mapping arm names to lists of corresponding outcomes, where the order of the outcomes is the same as in outcome_names.
- Ystds: A dict mapping arm names to lists of corresponding outcome noise standard deviations (possibly nan if the noise level is unobserved), where the order of the outcomes is the same as in outcome_names.
- Ys_true: A dict mapping arm names to lists of corresponding ground truth outcomes, where the order of the outcomes is the same as in outcome_names.
- Return type:
A dictionary with the following keys
- classmethod serialize_init_args(obj: Any) Dict[str, Any] [source]¶
Serialize the properties needed to initialize the runner. Used for storage.
WARNING: Because of issues with consistently saving and loading BoTorch and GPyTorch modules, the SurrogateRunner cannot be serialized at this time. At load time, the runner will be replaced with a SyntheticRunner.