ax.models

Base Models & Utilities

ax.models.base

class ax.models.base.Model[source]

Bases: object

Base class for an Ax model.

Note: the core methods of each model (fit, predict, gen, cross_validate, and best_point) are not present in this base class, because their signatures vary with the type of the model. This class contains only the methods that all models have in common and whose signatures are shared by all of them.

classmethod deserialize_state(serialized_state: Dict[str, Any]) Dict[str, Any][source]

Restores the model’s state from its serialized form to the format it expects to receive as kwargs.

feature_importances() Any[source]
classmethod serialize_state(raw_state: Dict[str, Any]) Dict[str, Any][source]

Serializes the output of self._get_state to a JSON-ready dict. This may involve storing part of the state in files or external storage and saving handles for that storage in the resulting serialized state.
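The round trip between the two class methods can be sketched as follows. ToyModel, its state keys, and the choice of a set as the non-JSON value are all hypothetical; the class deliberately does not inherit from ax.models.base.Model, so this only illustrates the contract described above, not Ax's own implementation.

```python
import json
from typing import Any, Dict


class ToyModel:
    """Hypothetical stand-in; it does not inherit from ax.models.base.Model."""

    @classmethod
    def serialize_state(cls, raw_state: Dict[str, Any]) -> Dict[str, Any]:
        # Convert values JSON cannot represent (here: a set) into JSON-ready ones.
        state = dict(raw_state)
        state["seen_arms"] = sorted(state["seen_arms"])
        return state

    @classmethod
    def deserialize_state(cls, serialized_state: Dict[str, Any]) -> Dict[str, Any]:
        # Restore the format the model expects to receive as kwargs.
        state = dict(serialized_state)
        state["seen_arms"] = set(state["seen_arms"])
        return state


raw = {"seed": 7, "seen_arms": {"b", "a"}}
payload = json.dumps(ToyModel.serialize_state(raw))  # now safe to store as JSON
restored = ToyModel.deserialize_state(json.loads(payload))
```

After the round trip, restored compares equal to raw, which is the property a real subclass would presumably want to preserve.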

ax.models.discrete_base module

class ax.models.discrete_base.DiscreteModel[source]

Bases: Model

This class specifies the interface for a model based on discrete parameters.

These methods should be implemented to have access to all of the features of Ax.

best_point(n: int, parameter_values: List[List[Union[None, str, bool, float, int]]], objective_weights: Optional[ndarray], outcome_constraints: Optional[Tuple[ndarray, ndarray]] = None, fixed_features: Optional[Dict[int, Union[None, str, bool, float, int]]] = None, pending_observations: Optional[List[List[List[Union[None, str, bool, float, int]]]]] = None, model_gen_options: Optional[Dict[str, Optional[Union[int, float, str, AcquisitionFunction, Dict[int, Any], Dict[str, Any], OptimizationConfig, WinsorizationConfig]]]] = None) Optional[List[Union[None, str, bool, float, int]]][source]

Obtains the point that has the best value according to the model’s predictions.

Returns:

(1 x d) parameter value list representing the point with the best value according to the model prediction. None if this function is not implemented for the given model.

cross_validate(Xs_train: List[List[List[Union[None, str, bool, float, int]]]], Ys_train: List[List[float]], Yvars_train: List[List[float]], X_test: List[List[Union[None, str, bool, float, int]]]) Tuple[ndarray, ndarray][source]

Do cross validation with the given training and test sets.

The training set is given in the same format as for fit. The test set is given in the same format as for predict.

Parameters:
  • Xs_train – A list of m lists X of parameterizations (each parameterization is a list of parameter values of length d), each of length k_i, for each outcome.

  • Ys_train – The corresponding list of m lists Y, each of length k_i, for each outcome.

  • Yvars_train – The variances of each entry in Ys, same shape.

  • X_test – List of the j parameterizations at which to make predictions.

Returns:

2-element tuple containing

  • (j x m) array of outcome predictions at X.

  • (j x m x m) array of predictive covariances at X. cov[j, m1, m2] is Cov[m1@j, m2@j].

fit(Xs: List[List[List[Union[None, str, bool, float, int]]]], Ys: List[List[float]], Yvars: List[List[float]], parameter_values: List[List[Union[None, str, bool, float, int]]], outcome_names: List[str]) None[source]

Fit model to m outcomes.

Parameters:
  • Xs – A list of m lists X of parameterizations (each parameterization is a list of parameter values of length d), each of length k_i, for each outcome.

  • Ys – The corresponding list of m lists Y, each of length k_i, for each outcome.

  • Yvars – The variances of each entry in Ys, same shape.

  • parameter_values – A list of possible values for each parameter.

  • outcome_names – A list of m outcome names.
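The nested-list layout of the fit arguments is easy to get wrong, so here is a small sketch of it for d = 2 parameters and m = 2 outcomes. The parameter values, outcome names, and the check_layout helper are made up for illustration and are not part of Ax.

```python
from typing import List, Union

TParamValue = Union[None, str, bool, float, int]

# d = 2 parameters, m = 2 outcomes (all values here are illustrative).
parameter_values: List[List[TParamValue]] = [["red", "blue"], [1, 2, 3]]
outcome_names = ["clicks", "latency"]

# Xs[i] holds the k_i parameterizations observed for outcome i.
Xs: List[List[List[TParamValue]]] = [
    [["red", 1], ["blue", 2], ["blue", 3]],  # k_0 = 3 observations
    [["red", 1], ["blue", 2]],               # k_1 = 2 observations
]
Ys = [[0.10, 0.30, 0.20], [120.0, 95.0]]     # same shape as Xs, minus the d axis
Yvars = [[0.01, 0.01, 0.02], [4.0, 9.0]]     # variance of each entry in Ys


def check_layout(Xs, Ys, Yvars, parameter_values, outcome_names) -> bool:
    """Hypothetical helper: True if the lists follow the documented shapes."""
    m, d = len(outcome_names), len(parameter_values)
    if not (len(Xs) == len(Ys) == len(Yvars) == m):
        return False
    for X, Y, Yvar in zip(Xs, Ys, Yvars):
        if not (len(X) == len(Y) == len(Yvar)):
            return False
        for x in X:
            if len(x) != d:
                return False
            if any(v not in allowed for v, allowed in zip(x, parameter_values)):
                return False
    return True


layout_ok = check_layout(Xs, Ys, Yvars, parameter_values, outcome_names)
```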

gen(n: int, parameter_values: List[List[Union[None, str, bool, float, int]]], objective_weights: Optional[ndarray], outcome_constraints: Optional[Tuple[ndarray, ndarray]] = None, fixed_features: Optional[Dict[int, Union[None, str, bool, float, int]]] = None, pending_observations: Optional[List[List[List[Union[None, str, bool, float, int]]]]] = None, model_gen_options: Optional[Dict[str, Optional[Union[int, float, str, AcquisitionFunction, Dict[int, Any], Dict[str, Any], OptimizationConfig, WinsorizationConfig]]]] = None) Tuple[List[List[Union[None, str, bool, float, int]]], List[float], Dict[str, Any]][source]

Generate new candidates.

Parameters:
  • n – Number of candidates to generate.

  • parameter_values – A list of possible values for each parameter.

  • objective_weights – The objective is to maximize a weighted sum of the columns of f(x). These are the weights.

  • outcome_constraints – A tuple of (A, b). For k outcome constraints and m outputs at f(x), A is (k x m) and b is (k x 1) such that A f(x) <= b.

  • fixed_features – A map {feature_index: value} for features that should be fixed to a particular value during generation.

  • pending_observations – A list of m lists of parameterizations (each parameterization is a list of parameter values of length d), each of length k_i, for each outcome i.

  • model_gen_options – A config dictionary that can contain model-specific options.

Returns:

3-element tuple containing

  • List of n generated points, where each point is represented by a list of parameter values.

  • List of weights for each of the n points.

  • Dictionary of model-specific generation metadata (the Dict[str, Any] in the signature).
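To make the objective_weights and outcome_constraints conventions concrete, the sketch below evaluates a weighted objective and the feasibility condition A f(x) <= b for one vector of outcomes. It uses plain lists rather than ndarrays and a flat b instead of a (k x 1) column; both function names are hypothetical.

```python
from typing import List


def weighted_objective(f_x: List[float], objective_weights: List[float]) -> float:
    # The quantity to maximize: a weighted sum of the m outcome values.
    return sum(w * f for w, f in zip(objective_weights, f_x))


def satisfies_outcome_constraints(
    f_x: List[float], A: List[List[float]], b: List[float]
) -> bool:
    # Row i of A encodes constraint i: sum_j A[i][j] * f_x[j] <= b[i].
    return all(
        sum(a * f for a, f in zip(row, f_x)) <= b_i for row, b_i in zip(A, b)
    )


f_x = [0.8, 120.0]              # m = 2 outcomes at some point x
objective_weights = [1.0, 0.0]  # maximize the first outcome only
A = [[0.0, 1.0]]                # k = 1 constraint: second outcome <= 150
b = [150.0]

value = weighted_objective(f_x, objective_weights)
feasible = satisfies_outcome_constraints(f_x, A, b)
```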

predict(X: List[List[Union[None, str, bool, float, int]]]) Tuple[ndarray, ndarray][source]

Predict outcomes at the given parameterizations.

Parameters:

X – List of the j parameterizations at which to make predictions.

Returns:

2-element tuple containing

  • (j x m) array of outcome predictions at X.

  • (j x m x m) array of predictive covariances at X. cov[j, m1, m2] is Cov[m1@j, m2@j].

ax.models.torch_base module

class ax.models.torch_base.TorchGenResults(points: ~torch.Tensor, weights: ~torch.Tensor, gen_metadata: ~typing.Dict[str, ~typing.Any] = <factory>, candidate_metadata: ~typing.Optional[~typing.List[~typing.Optional[~typing.Dict[str, ~typing.Any]]]] = None)[source]

Bases: object

Attributes:

  • points – (n x d) Tensor of generated points.

  • weights – n-tensor of weights for each point.

  • gen_metadata – Generation metadata: a dictionary of model-specific metadata for the given generation candidates.

candidate_metadata: Optional[List[Optional[Dict[str, Any]]]] = None
gen_metadata: Dict[str, Any]
points: Tensor
weights: Tensor
class ax.models.torch_base.TorchModel[source]

Bases: Model

This class specifies the interface for a torch-based model.

These methods should be implemented to have access to all of the features of Ax.

best_point(search_space_digest: SearchSpaceDigest, torch_opt_config: TorchOptConfig) Optional[Tensor][source]

Identify the current best point, satisfying the constraints in the same format as to gen.

Return None if no such point can be identified.

Parameters:
  • search_space_digest – A SearchSpaceDigest object containing metadata about the search space (e.g. bounds, parameter types).

  • torch_opt_config – A TorchOptConfig object containing optimization arguments (e.g., objective weights, constraints).

Returns:

d-tensor of the best point.

cross_validate(datasets: List[SupervisedDataset], X_test: Tensor, search_space_digest: SearchSpaceDigest) Tuple[Tensor, Tensor][source]

Do cross validation with the given training and test sets.

The training set is given in the same format as for fit. The test set is given in the same format as for predict.

Parameters:
  • datasets – A list of SupervisedDataset containers, each corresponding to the data of one metric (outcome).

  • X_test – (j x d) tensor of the j points at which to make predictions.

  • search_space_digest – A SearchSpaceDigest object containing metadata on the features in X.

Returns:

2-element tuple containing

  • (j x m) tensor of outcome predictions at X.

  • (j x m x m) tensor of predictive covariances at X. cov[j, m1, m2] is Cov[m1@j, m2@j].

device: Optional[device] = None
dtype: Optional[dtype] = None
evaluate_acquisition_function(X: Tensor, search_space_digest: SearchSpaceDigest, torch_opt_config: TorchOptConfig, acq_options: Optional[Dict[str, Any]] = None) Tensor[source]

Evaluate the acquisition function on the candidate set X.

Parameters:
  • X – (j x d) tensor of the j points at which to evaluate the acquisition function.

  • search_space_digest – A dataclass used to compactly represent a search space.

  • torch_opt_config – A TorchOptConfig object containing optimization arguments (e.g., objective weights, constraints).

  • acq_options – Keyword arguments used to construct the acquisition function.

Returns:

A single-element tensor with the acquisition value for these points.

fit(datasets: List[SupervisedDataset], search_space_digest: SearchSpaceDigest, candidate_metadata: Optional[List[List[Optional[Dict[str, Any]]]]] = None) None[source]

Fit model to m outcomes.

Parameters:
  • datasets – A list of SupervisedDataset containers, each corresponding to the data of one metric (outcome).

  • search_space_digest – A SearchSpaceDigest object containing metadata on the features in the datasets.

  • candidate_metadata – Model-produced metadata for candidates, in the order corresponding to the Xs.

gen(n: int, search_space_digest: SearchSpaceDigest, torch_opt_config: TorchOptConfig) TorchGenResults[source]

Generate new candidates.

Parameters:
  • n – Number of candidates to generate.

  • search_space_digest – A SearchSpaceDigest object containing metadata about the search space (e.g. bounds, parameter types).

  • torch_opt_config – A TorchOptConfig object containing optimization arguments (e.g., objective weights, constraints).

Returns:

A TorchGenResults container.

predict(X: Tensor) Tuple[Tensor, Tensor][source]

Predict outcomes at the given points.

Parameters:

X – (j x d) tensor of the j points at which to make predictions.

Returns:

2-element tuple containing

  • (j x m) tensor of outcome predictions at X.

  • (j x m x m) tensor of predictive covariances at X. cov[j, m1, m2] is Cov[m1@j, m2@j].

update(datasets: List[SupervisedDataset], metric_names: List[str], search_space_digest: SearchSpaceDigest, candidate_metadata: Optional[List[List[Optional[Dict[str, Any]]]]] = None) None[source]

Update the model.

Updating the model requires both existing and additional data. The data passed into this method will become the new training data.

Parameters:
  • datasets – A list of SupervisedDataset containers, each corresponding to the data of one metric (outcome). None means that there is no additional data for the corresponding outcome.

  • metric_names – A list of metric names, with the i-th metric corresponding to the i-th dataset.

  • search_space_digest – A SearchSpaceDigest object containing metadata on the features in X.

  • candidate_metadata – Model-produced metadata for candidates, in the order corresponding to the Xs.

class ax.models.torch_base.TorchOptConfig(objective_weights: ~torch.Tensor, outcome_constraints: ~typing.Optional[~typing.Tuple[~torch.Tensor, ~torch.Tensor]] = None, objective_thresholds: ~typing.Optional[~torch.Tensor] = None, linear_constraints: ~typing.Optional[~typing.Tuple[~torch.Tensor, ~torch.Tensor]] = None, fixed_features: ~typing.Optional[~typing.Dict[int, float]] = None, pending_observations: ~typing.Optional[~typing.List[~torch.Tensor]] = None, model_gen_options: ~typing.Dict[str, ~typing.Optional[~typing.Union[int, float, str, ~botorch.acquisition.acquisition.AcquisitionFunction, ~typing.Dict[int, ~typing.Any], ~typing.Dict[str, ~typing.Any], ~ax.core.optimization_config.OptimizationConfig, ~ax.models.winsorization_config.WinsorizationConfig]]] = <factory>, rounding_func: ~typing.Optional[~typing.Callable[[~torch.Tensor], ~torch.Tensor]] = None, opt_config_metrics: ~typing.Optional[~typing.Dict[str, ~ax.core.metric.Metric]] = None, is_moo: bool = False, risk_measure: ~typing.Optional[~botorch.acquisition.risk_measures.RiskMeasureMCObjective] = None)[source]

Bases: object

Container for lightweight representation of optimization arguments.

This is used for communicating between modelbridge and models. This is an ephemeral object and not meant to be stored / serialized.

objective_weights

If doing multi-objective optimization, these denote which objectives should be maximized and which should be minimized. Otherwise, the objective is to maximize a weighted sum of the columns of f(x). These are the weights.

Type:

torch.Tensor

outcome_constraints

A tuple of (A, b). For k outcome constraints and m outputs at f(x), A is (k x m) and b is (k x 1) such that A f(x) <= b.

Type:

Optional[Tuple[torch.Tensor, torch.Tensor]]

objective_thresholds

A tensor containing thresholds forming a reference point from which to calculate pareto frontier hypervolume. Points that do not dominate the objective_thresholds contribute nothing to hypervolume.

Type:

Optional[torch.Tensor]

linear_constraints

A tuple of (A, b). For k linear constraints on d-dimensional x, A is (k x d) and b is (k x 1) such that A x <= b for feasible x.

Type:

Optional[Tuple[torch.Tensor, torch.Tensor]]

fixed_features

A map {feature_index: value} for features that should be fixed to a particular value during generation.

Type:

Optional[Dict[int, float]]

pending_observations

A list of m (k_i x d) feature tensors X for m outcomes and k_i pending observations for outcome i.

Type:

Optional[List[torch.Tensor]]

model_gen_options

A config dictionary that can contain model-specific options. This commonly includes optimizer_kwargs, which specifies the options to be passed to the optimizer while optimizing the acquisition function. These are generally expected to mimic the signature of optimize_acqf, though not all models support all possible arguments, and some models support additional arguments that are not passed to the optimizer. While constructing a generation strategy, these options can be passed in as follows:

>>> model_gen_kwargs = {
>>>     "model_gen_options": {
>>>         "optimizer_kwargs": {
>>>             "num_restarts": 20,
>>>             "sequential": False,
>>>             "options": {
>>>                 "batch_limit": 5,
>>>                 "maxiter": 200,
>>>             },
>>>         },
>>>     },
>>> }

Type:

Dict[str, Optional[Union[int, float, str, botorch.acquisition.acquisition.AcquisitionFunction, Dict[int, Any], Dict[str, Any], ax.core.optimization_config.OptimizationConfig, ax.models.winsorization_config.WinsorizationConfig]]]

rounding_func

A function that rounds an optimization result appropriately (i.e., according to round-trip transformations).

Type:

Optional[Callable[[torch.Tensor], torch.Tensor]]

opt_config_metrics

A dictionary of metrics that are included in the optimization config.

Type:

Optional[Dict[str, ax.core.metric.Metric]]

is_moo

A boolean denoting whether this is for an MOO problem.

Type:

bool

risk_measure

An optional risk measure, used for robust optimization.

Type:

Optional[botorch.acquisition.risk_measures.RiskMeasureMCObjective]

fixed_features: Optional[Dict[int, float]] = None
is_moo: bool = False
linear_constraints: Optional[Tuple[Tensor, Tensor]] = None
model_gen_options: Dict[str, Optional[Union[int, float, str, AcquisitionFunction, Dict[int, Any], Dict[str, Any], OptimizationConfig, WinsorizationConfig]]]
objective_thresholds: Optional[Tensor] = None
objective_weights: Tensor
opt_config_metrics: Optional[Dict[str, Metric]] = None
outcome_constraints: Optional[Tuple[Tensor, Tensor]] = None
pending_observations: Optional[List[Tensor]] = None
risk_measure: Optional[RiskMeasureMCObjective] = None
rounding_func: Optional[Callable[[Tensor], Tensor]] = None
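The statement under objective_thresholds that points which do not dominate the thresholds contribute nothing to hypervolume can be illustrated in two dimensions. This is a minimal sketch, assuming both objectives are maximized and taking "dominates" to mean strictly better in every objective; hypervolume_2d is a hypothetical helper, not a BoTorch or Ax function.

```python
from typing import List, Tuple


def hypervolume_2d(
    points: List[Tuple[float, float]], ref: Tuple[float, float]
) -> float:
    """Area dominated by `points` and bounded below by `ref` (maximization)."""
    # Points that are not strictly better than the reference point are dropped.
    kept = [p for p in points if p[0] > ref[0] and p[1] > ref[1]]
    # Sweep from the largest first objective to the smallest.
    kept.sort(key=lambda p: p[0], reverse=True)
    hv = 0.0
    best_y = ref[1]
    for x, y in kept:
        if y > best_y:
            # Add the horizontal strip this point contributes above best_y.
            hv += (x - ref[0]) * (y - best_y)
            best_y = y
    return hv


thresholds = (0.0, 0.0)
front = [(3.0, 1.0), (2.0, 2.0), (1.0, 3.0), (5.0, -1.0)]
hv = hypervolume_2d(front, thresholds)
```

The last point has a large first objective but falls below the threshold on the second, so it adds nothing to hv.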

ax.models.model_utils module

class ax.models.model_utils.TorchModelLike(*args, **kwargs)[source]

Bases: Protocol

A protocol that stands in for TorchModel like objects that have a predict method.

predict(X: Tensor) Tuple[Tensor, Tensor][source]

Predicts outcomes given an input tensor.

Parameters:

X – An n x d tensor of input parameters.

Returns:

2-element tuple containing

  • The predicted posterior mean as an n x o tensor.

  • The predicted posterior covariance as an n x o x o tensor.

Return type:

Tuple[Tensor, Tensor]

ax.models.model_utils.add_fixed_features(tunable_points: ndarray, d: int, fixed_features: Optional[Dict[int, float]], tunable_feature_indices: ndarray) ndarray[source]

Add fixed features to points in tunable space.

Parameters:
  • tunable_points – Points in tunable space.

  • d – Dimension of parameter space.

  • fixed_features – A map {feature_index: value} for features that should be fixed to a particular value during generation.

  • tunable_feature_indices – Parameter indices (in d) which are tunable.

Returns:

Points in the full d-dimensional space, defined by bounds.
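A minimal sketch of the idea behind add_fixed_features, using plain lists instead of ndarrays. The function name and the list-based types are assumptions for illustration; the real utility operates on arrays.

```python
from typing import Dict, List, Optional


def add_fixed_features_sketch(
    tunable_points: List[List[float]],
    d: int,
    fixed_features: Optional[Dict[int, float]],
    tunable_feature_indices: List[int],
) -> List[List[float]]:
    points = []
    for tunable_point in tunable_points:
        full = [0.0] * d
        # Scatter the tunable coordinates into their positions in the full space.
        for idx, value in zip(tunable_feature_indices, tunable_point):
            full[idx] = value
        # Fill the remaining coordinates with their fixed values.
        for idx, value in (fixed_features or {}).items():
            full[idx] = value
        points.append(full)
    return points


full_points = add_fixed_features_sketch(
    tunable_points=[[0.1, 0.2], [0.3, 0.4]],
    d=3,
    fixed_features={1: 0.5},
    tunable_feature_indices=[0, 2],
)
```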

ax.models.model_utils.as_array(x: Union[Tensor, ndarray, Tuple[Union[Tensor, ndarray], ...]]) Union[ndarray, Tuple[ndarray, ...]][source]

Convert every item in a tuple of tensors/arrays into an array.

Parameters:

x – A tensor, array, or a tuple of potentially mixed tensors and arrays.

Returns:

x, with everything converted to array.

ax.models.model_utils.best_in_sample_point(Xs: Union[List[Tensor], List[ndarray]], model: TorchModelLike, bounds: List[Tuple[float, float]], objective_weights: Optional[Union[Tensor, ndarray]], outcome_constraints: Optional[Tuple[Union[Tensor, ndarray], Union[Tensor, ndarray]]] = None, linear_constraints: Optional[Tuple[Union[Tensor, ndarray], Union[Tensor, ndarray]]] = None, fixed_features: Optional[Dict[int, float]] = None, risk_measure: Optional[RiskMeasureMCObjective] = None, options: Optional[Dict[str, Optional[Union[int, float, str, AcquisitionFunction, Dict[int, Any], Dict[str, Any], OptimizationConfig, WinsorizationConfig]]]] = None) Optional[Tuple[Union[Tensor, ndarray], float]][source]

Select the best point that has been observed.

Implements two approaches to selecting the best point.

For both approaches, only points that satisfy parameter space constraints (bounds, linear_constraints, fixed_features) will be returned. Points must also be observed for all objective and constraint outcomes. Returned points may violate outcome constraints, depending on the method below.

1: Select the point that maximizes the expected utility (objective_weights^T posterior_objective_means - baseline) * Prob(feasible). Here, baseline should be selected so that at least one point has positive utility. It can be specified in the options dict; otherwise min(objective_weights^T posterior_objective_means) is used, where the min is over observed points.

2: Select the best-objective point that is feasible with at least probability p.

The following quantities may be specified in the options dict:

  • best_point_method: ‘max_utility’ (default) or ‘feasible_threshold’ to select between the two approaches described above.

  • utility_baseline: Value for the baseline used in max_utility approach. If not provided, defaults to min objective value.

  • probability_threshold: Threshold for the feasible_threshold approach. Defaults to p=0.95.

  • feasibility_mc_samples: Number of MC samples used for estimating the probability of feasibility (defaults to 10k).

Parameters:
  • Xs – Training data for the points, among which to select the best.

  • model – A Torch model or Surrogate.

  • bounds – A list of (lower, upper) tuples for each feature.

  • objective_weights – The objective is to maximize a weighted sum of the columns of f(x). These are the weights.

  • outcome_constraints – A tuple of (A, b). For k outcome constraints and m outputs at f(x), A is (k x m) and b is (k x 1) such that A f(x) <= b.

  • linear_constraints – A tuple of (A, b). For k linear constraints on d-dimensional x, A is (k x d) and b is (k x 1) such that A x <= b.

  • fixed_features – A map {feature_index: value} for features that should be fixed to a particular value in the best point.

  • risk_measure – An optional risk measure for reporting best robust point.

  • options – A config dictionary with settings described above.

Returns:

A two-element tuple, or None if no feasible point exists. The tuple contains

  • d-array of the best point,

  • utility at the best point.
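The difference between the two selection methods can be seen on a toy example. The sketch below assumes the weighted objective means and the feasibility probabilities have already been computed for each observed point (in Ax they would come from the model posterior); best_point_sketch and its return of an index rather than a d-array are simplifications made here.

```python
from typing import List, Optional, Tuple


def best_point_sketch(
    objective_means: List[float],
    prob_feasible: List[float],
    best_point_method: str = "max_utility",
    utility_baseline: Optional[float] = None,
    probability_threshold: float = 0.95,
) -> Optional[Tuple[int, float]]:
    """Return (index of best point, its utility), or None if nothing qualifies."""
    if not objective_means:
        return None
    if best_point_method == "max_utility":
        # Default baseline: the minimum objective value over observed points.
        baseline = min(objective_means) if utility_baseline is None else utility_baseline
        utility = [(mu - baseline) * p for mu, p in zip(objective_means, prob_feasible)]
    elif best_point_method == "feasible_threshold":
        # Points below the feasibility threshold are excluded outright.
        utility = [
            mu if p >= probability_threshold else float("-inf")
            for mu, p in zip(objective_means, prob_feasible)
        ]
    else:
        raise ValueError(f"Unknown best_point_method: {best_point_method}")
    best = max(range(len(utility)), key=utility.__getitem__)
    if utility[best] == float("-inf"):
        return None
    return best, utility[best]


means = [1.0, 4.0, 2.0]
probs = [0.99, 0.60, 0.97]
by_utility = best_point_sketch(means, probs)
by_threshold = best_point_sketch(means, probs, best_point_method="feasible_threshold")
```

With these numbers, max_utility picks the second point (high objective, moderate feasibility), while feasible_threshold picks the third, since the second point's feasibility probability is below 0.95.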

ax.models.model_utils.best_observed_point(model: TorchModelLike, bounds: List[Tuple[float, float]], objective_weights: Optional[Union[Tensor, ndarray]], outcome_constraints: Optional[Tuple[Union[Tensor, ndarray], Union[Tensor, ndarray]]] = None, linear_constraints: Optional[Tuple[Union[Tensor, ndarray], Union[Tensor, ndarray]]] = None, fixed_features: Optional[Dict[int, float]] = None, risk_measure: Optional[RiskMeasureMCObjective] = None, options: Optional[Dict[str, Optional[Union[int, float, str, AcquisitionFunction, Dict[int, Any], Dict[str, Any], OptimizationConfig, WinsorizationConfig]]]] = None) Optional[Union[Tensor, ndarray]][source]

Select the best point that has been observed.

Implements two approaches to selecting the best point.

For both approaches, only points that satisfy parameter space constraints (bounds, linear_constraints, fixed_features) will be returned. Points must also be observed for all objective and constraint outcomes. Returned points may violate outcome constraints, depending on the method below.

1: Select the point that maximizes the expected utility (objective_weights^T posterior_objective_means - baseline) * Prob(feasible). Here, baseline should be selected so that at least one point has positive utility. It can be specified in the options dict; otherwise min(objective_weights^T posterior_objective_means) is used, where the min is over observed points.

2: Select the best-objective point that is feasible with at least probability p.

The following quantities may be specified in the options dict:

  • best_point_method: ‘max_utility’ (default) or ‘feasible_threshold’ to select between the two approaches described above.

  • utility_baseline: Value for the baseline used in max_utility approach. If not provided, defaults to min objective value.

  • probability_threshold: Threshold for the feasible_threshold approach. Defaults to p=0.95.

  • feasibility_mc_samples: Number of MC samples used for estimating the probability of feasibility (defaults to 10k).

Parameters:
  • model – A Torch model or Surrogate.

  • bounds – A list of (lower, upper) tuples for each feature.

  • objective_weights – The objective is to maximize a weighted sum of the columns of f(x). These are the weights.

  • outcome_constraints – A tuple of (A, b). For k outcome constraints and m outputs at f(x), A is (k x m) and b is (k x 1) such that A f(x) <= b.

  • linear_constraints – A tuple of (A, b). For k linear constraints on d-dimensional x, A is (k x d) and b is (k x 1) such that A x <= b.

  • fixed_features – A map {feature_index: value} for features that should be fixed to a particular value in the best point.

  • risk_measure – An optional risk measure for reporting best robust point.

  • options – A config dictionary with settings described above.

Returns:

A d-array of the best point, or None if no feasible point exists.

ax.models.model_utils.check_duplicate(point: ndarray, points: ndarray) bool[source]

Check if a point exists in another array.

Parameters:
  • point – Newly generated point to check.

  • points – Points previously generated.

Returns:

True if the point is contained in points, else False

ax.models.model_utils.check_param_constraints(linear_constraints: Tuple[ndarray, ndarray], point: ndarray) Tuple[bool, ndarray][source]

Check if a point satisfies parameter constraints.

Parameters:
  • linear_constraints – A tuple of (A, b). For k linear constraints on d-dimensional x, A is (k x d) and b is (k x 1) such that A x <= b.

  • point – A candidate point in d-dimensional space, as a (1 x d) matrix.

Returns:

2-element tuple containing

  • Flag that is True if all constraints are satisfied by the point.

  • Indices of constraints which are violated by the point.
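The check A x <= b and the two return values can be sketched with plain lists. The function name is hypothetical, b is flattened to a list of length k, and the violated indices are returned as a list rather than an ndarray.

```python
from typing import List, Tuple


def check_param_constraints_sketch(
    A: List[List[float]], b: List[float], point: List[float]
) -> Tuple[bool, List[int]]:
    # Constraint i is violated when the i-th row of A x exceeds b[i].
    violated = [
        i
        for i, (row, b_i) in enumerate(zip(A, b))
        if sum(a * x for a, x in zip(row, point)) > b_i
    ]
    return len(violated) == 0, violated


# Two constraints on a 2-d point: x0 + x1 <= 1 and x0 >= 0.2 (i.e. -x0 <= -0.2).
A = [[1.0, 1.0], [-1.0, 0.0]]
b = [1.0, -0.2]

ok_result = check_param_constraints_sketch(A, b, [0.5, 0.25])
bad_result = check_param_constraints_sketch(A, b, [0.125, 1.0])
```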

ax.models.model_utils.enumerate_discrete_combinations(discrete_choices: Dict[int, List[Union[int, float]]]) List[Dict[int, Union[float, int]]][source]
ax.models.model_utils.filter_constraints_and_fixed_features(X: Union[Tensor, ndarray], bounds: List[Tuple[float, float]], linear_constraints: Optional[Tuple[Union[Tensor, ndarray], Union[Tensor, ndarray]]] = None, fixed_features: Optional[Dict[int, float]] = None) Union[Tensor, ndarray][source]

Filter points to those that satisfy bounds, linear_constraints, and fixed_features.

Parameters:
  • X – A tensor or array of points.

  • bounds – A list of (lower, upper) tuples for each feature.

  • linear_constraints – A tuple of (A, b). For k linear constraints on d-dimensional x, A is (k x d) and b is (k x 1) such that A x <= b.

  • fixed_features – A map {feature_index: value} for features that should be fixed to a particular value in the best point.

Returns:

Feasible points.

ax.models.model_utils.get_observed(Xs: Union[List[Tensor], List[ndarray]], objective_weights: Union[Tensor, ndarray], outcome_constraints: Optional[Tuple[Union[Tensor, ndarray], Union[Tensor, ndarray]]] = None) Union[Tensor, ndarray][source]

Filter points to those that are observed for objective outcomes and outcomes that show up in outcome_constraints (if there are any).

Parameters:
  • Xs – A list of m (k_i x d) feature matrices X. Number of rows k_i can vary from i=1,…,m.

  • objective_weights – The objective is to maximize a weighted sum of the columns of f(x). These are the weights.

  • outcome_constraints – A tuple of (A, b). For k outcome constraints and m outputs at f(x), A is (k x m) and b is (k x 1) such that A f(x) <= b.

Returns:

Points observed for all objective outcomes and outcome constraints.

ax.models.model_utils.mk_discrete_choices(ssd: SearchSpaceDigest, fixed_features: Optional[Dict[int, float]] = None) Dict[int, List[Union[int, float]]][source]
ax.models.model_utils.rejection_sample(gen_unconstrained: Callable[[int, int, ndarray, Optional[Dict[int, float]]], ndarray], n: int, d: int, tunable_feature_indices: ndarray, linear_constraints: Optional[Tuple[ndarray, ndarray]] = None, deduplicate: bool = False, max_draws: Optional[int] = None, fixed_features: Optional[Dict[int, float]] = None, rounding_func: Optional[Callable[[ndarray], ndarray]] = None, existing_points: Optional[ndarray] = None) Tuple[ndarray, int][source]

Rejection sample in parameter space. Parameter space is typically [0, 1] for all tunable parameters.

Models must implement a gen_unconstrained method in order to support rejection sampling via this utility.

Parameters:
  • gen_unconstrained – A callable that generates unconstrained points in the parameter space. This is typically the _gen_unconstrained method of a RandomModel.

  • n – Number of samples to generate.

  • d – Dimensionality of the parameter space.

  • tunable_feature_indices – Indices of the tunable features in the parameter space.

  • linear_constraints – A tuple of (A, b). For k linear constraints on d-dimensional x, A is (k x d) and b is (k x 1) such that A x <= b.

  • deduplicate – If true, reject points that are duplicates of previously generated points. The points are deduplicated after applying the rounding function.

  • max_draws – Maximum number of attempted draws before giving up.

  • fixed_features – A map {feature_index: value} for features that should be fixed to a particular value during generation.

  • rounding_func – A function that rounds an optimization result appropriately (e.g., according to round-trip transformations).

  • existing_points – A set of previously generated points to use for deduplication. These should be provided in the parameter space that the model operates in.
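The general rejection-sampling loop can be sketched with the standard library only. This is a simplified illustration, not the Ax implementation: it omits fixed features, rounding, and deduplication, takes a one-argument generator, and returns the number of attempted draws as the second tuple element, which is an assumption about the meaning of the int in the signature above.

```python
import random
from typing import Callable, List, Tuple


def rejection_sample_sketch(
    gen_unconstrained: Callable[[int], List[float]],
    n: int,
    d: int,
    A: List[List[float]],
    b: List[float],
    max_draws: int = 1000,
) -> Tuple[List[List[float]], int]:
    accepted: List[List[float]] = []
    draws = 0
    while len(accepted) < n and draws < max_draws:
        point = gen_unconstrained(d)
        draws += 1
        # Keep the point only if every linear constraint A x <= b holds.
        if all(
            sum(a * x for a, x in zip(row, point)) <= b_i
            for row, b_i in zip(A, b)
        ):
            accepted.append(point)
    if len(accepted) < n:
        raise RuntimeError(f"Only {len(accepted)} of {n} points found in {max_draws} draws.")
    return accepted, draws


rng = random.Random(0)
samples, attempted = rejection_sample_sketch(
    gen_unconstrained=lambda dim: [rng.random() for _ in range(dim)],  # uniform on [0, 1]^d
    n=5,
    d=2,
    A=[[1.0, 1.0]],  # x0 + x1 <= 1
    b=[1.0],
)
```

Tight constraints make the acceptance rate small, which is why a cap such as max_draws is useful.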

ax.models.model_utils.tunable_feature_indices(bounds: List[Tuple[float, float]], fixed_features: Optional[Dict[int, float]] = None) ndarray[source]

Get the feature indices of tunable features.

Parameters:
  • bounds – A list of (lower, upper) tuples for each column of X.

  • fixed_features – A map {feature_index: value} for features that should be fixed to a particular value during generation.

Returns:

The indices of tunable features.
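A sketch of the behavior described above, under the assumption that every feature not named in fixed_features is tunable. The function name is hypothetical and a list is returned instead of an ndarray.

```python
from typing import Dict, List, Optional, Tuple


def tunable_feature_indices_sketch(
    bounds: List[Tuple[float, float]],
    fixed_features: Optional[Dict[int, float]] = None,
) -> List[int]:
    fixed = set(fixed_features or {})
    # One feature per entry in bounds; drop the ones that are fixed.
    return [i for i in range(len(bounds)) if i not in fixed]


indices = tunable_feature_indices_sketch(
    bounds=[(0.0, 1.0), (0.0, 1.0), (0.0, 1.0)],
    fixed_features={1: 0.5},
)
```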

ax.models.model_utils.validate_bounds(bounds: List[Tuple[float, float]], fixed_feature_indices: ndarray) None[source]

Ensure the requested space is [0,1]^d.

Parameters:
  • bounds – A list of d (lower, upper) tuples for each column of X.

  • fixed_feature_indices – Indices of features which are fixed at a particular value.

ax.models.types

ax.models.winsorization_config module

class ax.models.winsorization_config.WinsorizationConfig(lower_quantile_margin: float = 0.0, upper_quantile_margin: float = 0.0, lower_boundary: Optional[float] = None, upper_boundary: Optional[float] = None)[source]

Bases: object

Dataclass for storing Winsorization configuration parameters.

Attributes:

  • lower_quantile_margin – Winsorization will increase any metric value below this quantile to this quantile’s value.

  • upper_quantile_margin – Winsorization will decrease any metric value above this quantile to this quantile’s value. NOTE: this quantile will be inverted before any operations, e.g., a value of 0.2 will decrease values above the 80th percentile to the value of the 80th percentile.

  • lower_boundary – If this value is less than the metric value corresponding to lower_quantile_margin, set metric values below lower_boundary to lower_boundary and leave larger values unaffected.

  • upper_boundary – If this value is greater than the metric value corresponding to upper_quantile_margin, set metric values above upper_boundary to upper_boundary and leave smaller values unaffected.

lower_boundary: Optional[float] = None
lower_quantile_margin: float = 0.0
upper_boundary: Optional[float] = None
upper_quantile_margin: float = 0.0
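One reading of how these four settings interact is sketched below. It is an interpretation of the attribute descriptions, not the Ax transform itself: the quantile uses linear interpolation between order statistics (an assumption), and the function names are hypothetical.

```python
import math
from typing import List, Optional


def quantile(sorted_values: List[float], q: float) -> float:
    # Linear interpolation between neighboring order statistics (assumed convention).
    pos = q * (len(sorted_values) - 1)
    lo, hi = math.floor(pos), math.ceil(pos)
    return sorted_values[lo] + (pos - lo) * (sorted_values[hi] - sorted_values[lo])


def winsorize_sketch(
    values: List[float],
    lower_quantile_margin: float = 0.0,
    upper_quantile_margin: float = 0.0,
    lower_boundary: Optional[float] = None,
    upper_boundary: Optional[float] = None,
) -> List[float]:
    s = sorted(values)
    lower_cut = quantile(s, lower_quantile_margin)
    # The upper margin is inverted: 0.25 means "clip above the 75th percentile".
    upper_cut = quantile(s, 1.0 - upper_quantile_margin)
    # A boundary only takes effect when it is less aggressive than the quantile cut.
    if lower_boundary is not None and lower_boundary < lower_cut:
        lower_cut = lower_boundary
    if upper_boundary is not None and upper_boundary > upper_cut:
        upper_cut = upper_boundary
    return [min(max(v, lower_cut), upper_cut) for v in values]


data = [1.0, 2.0, 3.0, 4.0, 100.0]
clipped = winsorize_sketch(data, lower_quantile_margin=0.25, upper_quantile_margin=0.25)
bounded = winsorize_sketch(
    data, lower_quantile_margin=0.25, upper_quantile_margin=0.25, upper_boundary=50.0
)
```

With the default margins of 0.0 the cuts fall on the minimum and maximum, so the data is left unchanged.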

Discrete Models

ax.models.discrete.eb_thompson module

class ax.models.discrete.eb_thompson.EmpiricalBayesThompsonSampler(num_samples: int = 10000, min_weight: Optional[float] = None, uniform_weights: bool = False)[source]

Bases: ThompsonSampler

Generator for Thompson sampling using Empirical Bayes estimates.

The generator applies positive-part James-Stein Estimator to the data passed in via fit and then performs Thompson Sampling.
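For intuition, the textbook positive-part James-Stein estimator that shrinks arm means toward their grand mean can be sketched as below. It assumes a single known variance shared by all arms and at least four arms; Ax's estimator may differ (for example in how it handles per-arm variances), so treat this only as an illustration of the shrinkage idea.

```python
from typing import List


def positive_part_james_stein(means: List[float], sigma2: float) -> List[float]:
    """Shrink observed arm means toward the grand mean (equal-variance form)."""
    k = len(means)
    if k < 4:
        raise ValueError("This form of the estimator needs at least 4 arms.")
    grand_mean = sum(means) / k
    spread = sum((y - grand_mean) ** 2 for y in means)
    # "Positive part": the shrinkage factor is clipped at zero, so estimates
    # never overshoot to the far side of the grand mean.
    shrink = 0.0 if spread == 0 else max(0.0, 1.0 - (k - 3) * sigma2 / spread)
    return [grand_mean + shrink * (y - grand_mean) for y in means]


observed = [1.0, 2.0, 3.0, 6.0]
estimates = positive_part_james_stein(observed, sigma2=2.0)
fully_shrunk = positive_part_james_stein(observed, sigma2=100.0)
```

Noisier data (larger sigma2 relative to the spread of the means) leads to stronger shrinkage; in the limit every arm is estimated at the grand mean.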

ax.models.discrete.full_factorial module

class ax.models.discrete.full_factorial.FullFactorialGenerator(max_cardinality: int = 100, check_cardinality: bool = True)[source]

Bases: DiscreteModel

Generator for full factorial designs.

Generates arms for all possible combinations of parameter values, each with weight 1.

The value of n supplied to gen is ignored, because the number of arms generated is determined by the list of parameter values; a warning is issued when this happens. To suppress the warning, use n = -1.
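The enumeration itself amounts to a Cartesian product over the per-parameter value lists. The sketch below uses itertools.product with made-up parameter values; it illustrates the design, not the generator's actual code.

```python
import itertools
from typing import List, Union

TParamValue = Union[None, str, bool, float, int]

# Illustrative values: 2 choices for the first parameter, 3 for the second.
parameter_values: List[List[TParamValue]] = [["red", "blue"], [1, 2, 3]]

# One arm per combination of parameter values, each with weight 1.
arms = [list(combo) for combo in itertools.product(*parameter_values)]
weights = [1.0] * len(arms)
```

The number of arms is the product of the list lengths (here 2 x 3 = 6), which grows quickly as parameters are added.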

gen(n: int, parameter_values: List[List[Union[None, str, bool, float, int]]], objective_weights: Optional[ndarray], outcome_constraints: Optional[Tuple[ndarray, ndarray]] = None, fixed_features: Optional[Dict[int, Union[None, str, bool, float, int]]] = None, pending_observations: Optional[List[List[List[Union[None, str, bool, float, int]]]]] = None, model_gen_options: Optional[Dict[str, Optional[Union[int, float, str, AcquisitionFunction, Dict[int, Any], Dict[str, Any], OptimizationConfig, WinsorizationConfig]]]] = None) Tuple[List[List[Union[None, str, bool, float, int]]], List[float], Dict[str, Any]][source]

Generate new candidates.

Parameters:
  • n – Number of candidates to generate.

  • parameter_values – A list of possible values for each parameter.

  • objective_weights – The objective is to maximize a weighted sum of the columns of f(x). These are the weights.

  • outcome_constraints – A tuple of (A, b). For k outcome constraints and m outputs at f(x), A is (k x m) and b is (k x 1) such that A f(x) <= b.

  • fixed_features – A map {feature_index: value} for features that should be fixed to a particular value during generation.

  • pending_observations – A list of m lists of parameterizations (each parameterization is a list of parameter values of length d), each of length k_i, for each outcome i.

  • model_gen_options – A config dictionary that can contain model-specific options.

Returns:

2-element tuple containing

  • List of n generated points, where each point is represented by a list of parameter values.

  • List of weights for each of the n points.

ax.models.discrete.thompson module

class ax.models.discrete.thompson.ThompsonSampler(num_samples: int = 10000, min_weight: Optional[float] = None, uniform_weights: bool = False)[source]

Bases: DiscreteModel

Generator for Thompson sampling.

The generator performs Thompson sampling on the data passed in via fit. Arms are given weight proportional to the probability that they are winners, according to Monte Carlo simulations.
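A minimal sketch of the Monte Carlo weighting, assuming each arm's outcome is modeled as an independent normal with the observed mean and variance and that larger values are better. The function thompson_weights and its inputs are hypothetical and are not part of ThompsonSampler.

```python
import random


def thompson_weights(means, variances, num_samples=10000, seed=0):
    """Estimate P(arm is the winner) by repeatedly sampling every arm."""
    rng = random.Random(seed)
    wins = [0] * len(means)
    for _ in range(num_samples):
        # One simulated outcome per arm; the largest draw wins this round.
        draws = [rng.gauss(m, v ** 0.5) for m, v in zip(means, variances)]
        wins[draws.index(max(draws))] += 1
    # Weight is the fraction of simulations each arm won.
    return [w / num_samples for w in wins]


weights = thompson_weights(means=[1.0, 1.2, 0.5], variances=[0.04, 0.04, 0.04])
```

The weights sum to 1 by construction, and the arm with the highest mean receives the largest share when all variances are equal.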

fit(Xs: List[List[List[Union[None, str, bool, float, int]]]], Ys: List[List[float]], Yvars: List[List[float]], parameter_values: List[List[Union[None, str, bool, float, int]]], outcome_names: List[str]) None[source]

Fit model to m outcomes.

Parameters:
  • Xs – A list of m lists X of parameterizations (each parameterization is a list of parameter values of length d), each of length k_i, for each outcome.

  • Ys – The corresponding list of m lists Y, each of length k_i, for each outcome.

  • Yvars – The variances of each entry in Ys, same shape.

  • parameter_values – A list of possible values for each parameter.

  • outcome_names – A list of m outcome names.

gen(n: int, parameter_values: List[List[Union[None, str, bool, float, int]]], objective_weights: Optional[ndarray], outcome_constraints: Optional[Tuple[ndarray, ndarray]] = None, fixed_features: Optional[Dict[int, Union[None, str, bool, float, int]]] = None, pending_observations: Optional[List[List[List[Union[None, str, bool, float, int]]]]] = None, model_gen_options: Optional[Dict[str, Optional[Union[int, float, str, AcquisitionFunction, Dict[int, Any], Dict[str, Any], OptimizationConfig, WinsorizationConfig]]]] = None) Tuple[List[List[Union[None, str, bool, float, int]]], List[float], Dict[str, Any]][source]

Generate new candidates.

Parameters:
  • n – Number of candidates to generate.

  • parameter_values – A list of possible values for each parameter.

  • objective_weights – The objective is to maximize a weighted sum of the columns of f(x). These are the weights.

  • outcome_constraints – A tuple of (A, b). For k outcome constraints and m outputs at f(x), A is (k x m) and b is (k x 1) such that A f(x) <= b.

  • fixed_features – A map {feature_index: value} for features that should be fixed to a particular value during generation.

  • pending_observations – A list of m lists of parameterizations (each parameterization is a list of parameter values of length d), each of length k_i, for each outcome i.

  • model_gen_options – A config dictionary that can contain model-specific options.

Returns:

2-element tuple containing

  • List of n generated points, where each point is represented by a list of parameter values.

  • List of weights for each of the n points.
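The objective_weights and outcome_constraints arguments can be read as in the sketch below, which evaluates the weighted objective and checks A f(x) <= b for a single vector of outcomes. The helper name and the numbers are illustrative assumptions.

```python
def objective_and_feasibility(f_x, objective_weights, A, b):
    """Return the weighted objective and whether A f(x) <= b holds row by row."""
    objective = sum(w * f for w, f in zip(objective_weights, f_x))
    feasible = all(
        sum(a_ij * f_j for a_ij, f_j in zip(row, f_x)) <= b_i
        for row, b_i in zip(A, b)
    )
    return objective, feasible


# Two outcomes (m = 2): maximize the first, constrain the second to be <= 2.5.
objective, feasible = objective_and_feasibility(
    f_x=[2.0, 3.0],
    objective_weights=[1.0, 0.0],
    A=[[0.0, 1.0]],  # k = 1 constraint, so A is (1 x 2)
    b=[2.5],
)
```

Here the second outcome is 3.0, so the point violates the constraint even though its objective value is 2.0.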

predict(X: List[List[Union[None, str, bool, float, int]]]) Tuple[ndarray, ndarray][source]

Predict

Parameters:

X – List of the j parameterizations at which to make predictions.

Returns:

2-element tuple containing

  • (j x m) array of outcome predictions at X.

  • (j x m x m) array of predictive covariances at X. cov[j, m1, m2] is Cov[m1@j, m2@j].

Random Models

ax.models.random.base module

class ax.models.random.base.RandomModel(deduplicate: bool = True, seed: Optional[int] = None, generated_points: Optional[ndarray] = None, fallback_to_sample_polytope: bool = False)[source]

Bases: Model

This class specifies the basic skeleton for a random model.

As random generators do not make use of models, they do not implement the fit or predict methods.

These models do not need data, or optimization configs.

To satisfy search space parameter constraints, these models can use rejection sampling. To enable rejection sampling for a subclass, only _gen_samples needs to be implemented, or alternatively, _gen_unconstrained/gen can be directly implemented.

deduplicate

If True (defaults to True), a single instantiation of the model will not return the same point twice. This flag is used in rejection sampling.

seed

An optional seed value for scrambling.

generated_points

A set of previously generated points to use for deduplication. These should be provided in the raw transformed space the model operates in.

fallback_to_sample_polytope

If True, when rejection sampling fails, we fall back to the HitAndRunPolytopeSampler.
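Rejection sampling with deduplication can be sketched as follows. This is a simplified illustration on the unit cube with linear constraints A x <= b; the function name, the max_draws cutoff, and the error raised are assumptions, not RandomModel behavior.

```python
import random


def rejection_sample(n, d, A, b, seed=0, max_draws=10000, deduplicate=True):
    """Draw n points uniformly from [0, 1]^d that satisfy A x <= b."""
    rng = random.Random(seed)
    accepted, seen = [], set()
    for _ in range(max_draws):
        if len(accepted) == n:
            break
        x = tuple(rng.random() for _ in range(d))
        feasible = all(
            sum(a_ij * x_j for a_ij, x_j in zip(row, x)) <= b_i
            for row, b_i in zip(A, b)
        )
        # Keep the point only if it is feasible and not a repeat.
        if feasible and not (deduplicate and x in seen):
            accepted.append(x)
            seen.add(x)
    if len(accepted) < n:
        raise RuntimeError("rejection sampling did not find enough points")
    return accepted


# One constraint: x0 + x1 <= 1.
points = rejection_sample(n=5, d=2, A=[[1.0, 1.0]], b=[1.0])
```

When the feasible region is a small fraction of the box, most draws are rejected, which is the situation where a fallback sampler becomes useful.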

gen(n: int, bounds: List[Tuple[float, float]], linear_constraints: Optional[Tuple[ndarray, ndarray]] = None, fixed_features: Optional[Dict[int, float]] = None, model_gen_options: Optional[Dict[str, Optional[Union[int, float, str, AcquisitionFunction, Dict[int, Any], Dict[str, Any], OptimizationConfig, WinsorizationConfig]]]] = None, rounding_func: Optional[Callable[[ndarray], ndarray]] = None) Tuple[ndarray, ndarray][source]

Generate new candidates.

Parameters:
  • n – Number of candidates to generate.

  • bounds – A list of (lower, upper) tuples for each column of X. Defined on [0, 1]^d.

  • linear_constraints – A tuple of (A, b). For k linear constraints on d-dimensional x, A is (k x d) and b is (k x 1) such that A x <= b.

  • fixed_features – A map {feature_index: value} for features that should be fixed to a particular value during generation.

  • model_gen_options – A config dictionary that is passed along to the model.

  • rounding_func – A function that rounds an optimization result appropriately (e.g., according to round-trip transformations).

Returns:

2-element tuple containing

  • (n x d) array of generated points.

  • Uniform weights, an n-array of ones for each point.

ax.models.random.uniform module

class ax.models.random.uniform.UniformGenerator(deduplicate: bool = True, seed: Optional[int] = None, generated_points: Optional[ndarray] = None, fallback_to_sample_polytope: bool = False)[source]

Bases: RandomModel

This class specifies a uniform random generation algorithm.

As a uniform generator does not make use of a model, it does not implement the fit or predict methods.

See base RandomModel for a description of model attributes.

ax.models.random.sobol module

class ax.models.random.sobol.SobolGenerator(seed: Optional[int] = None, deduplicate: bool = True, init_position: int = 0, scramble: bool = True, generated_points: Optional[ndarray] = None, fallback_to_sample_polytope: bool = False)[source]

Bases: RandomModel

This class specifies the generation algorithm for a Sobol generator.

As Sobol does not make use of a model, it does not implement the fit or predict methods.

init_position

The initial state of the Sobol generator. Starts at 0 by default.

scramble

If True, permutes the parameter values among the elements of the Sobol sequence. Default is True.

See base RandomModel for a description of remaining attributes.

property engine: Optional[SobolEngine]

Return a singleton SobolEngine.

gen(n: int, bounds: List[Tuple[float, float]], linear_constraints: Optional[Tuple[ndarray, ndarray]] = None, fixed_features: Optional[Dict[int, float]] = None, model_gen_options: Optional[Dict[str, Optional[Union[int, float, str, AcquisitionFunction, Dict[int, Any], Dict[str, Any], OptimizationConfig, WinsorizationConfig]]]] = None, rounding_func: Optional[Callable[[ndarray], ndarray]] = None) Tuple[ndarray, ndarray][source]

Generate new candidates.

Parameters:
  • n – Number of candidates to generate.

  • bounds – A list of (lower, upper) tuples for each column of X.

  • linear_constraints – A tuple of (A, b). For k linear constraints on d-dimensional x, A is (k x d) and b is (k x 1) such that A x <= b.

  • fixed_features – A map {feature_index: value} for features that should be fixed to a particular value during generation.

  • rounding_func – A function that rounds an optimization result appropriately (e.g., according to round-trip transformations).

Returns:

2-element tuple containing

  • (n x d) array of generated points.

  • Uniform weights, an n-array of ones for each point.

init_engine(n_tunable_features: int) SobolEngine[source]

Initialize singleton SobolEngine, only on gen.

Parameters:

n_tunable_features – The number of features which can be searched over.

Returns:

SobolEngine, which can generate Sobol points.

ax.models.random.alebo_initializer module

class ax.models.random.alebo_initializer.ALEBOInitializer(B: ndarray, nsamp: int = 10000, init_bound: int = 16, seed: Optional[int] = None)[source]

Bases: UniformGenerator

Sample in a low-dimensional linear embedding, to initialize ALEBO.

Generates points on a linear subspace of [-1, 1]^D by generating points in [-b, b]^D, projecting them down with a matrix B, and then projecting them back up with the pseudoinverse of B. The points thus all lie in a linear subspace defined by B. Points whose up-projection falls outside of [-1, 1]^D are thrown out, via rejection sampling.

To generate n points, we start with nsamp points in [-b, b]^D, which are mapped down to the embedding and back up as described above. If >=n points fall within [-1, 1]^D after being mapped up, then the first n are returned. If there are fewer than n points in [-1, 1]^D, then b is constricted (halved) and the process is repeated until there are at least n points in [-1, 1]^D. There exists a b small enough that all points will project to [-1, 1]^D, so this is guaranteed to terminate, typically after a few rounds.

Parameters:
  • B – A (dxD) projection down.

  • nsamp – Number of samples to use for rejection sampling.

  • init_bound – b for the initial sampling space described above.

  • seed – seed for UniformGenerator
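The sampling loop described above can be sketched with NumPy. This is a reading of the description rather than the ALEBOInitializer source; the function name and the random B used in the usage lines are assumptions.

```python
import numpy as np


def alebo_init_sketch(B, n, nsamp=10000, init_bound=16, seed=0):
    """Sample n points in the subspace defined by B that lie in [-1, 1]^D."""
    rng = np.random.default_rng(seed)
    d, D = B.shape
    B_pinv = np.linalg.pinv(B)  # (D x d) projection back up
    b = float(init_bound)
    while True:
        X = rng.uniform(-b, b, size=(nsamp, D))  # points in [-b, b]^D
        X_up = (B_pinv @ (B @ X.T)).T  # down to d dims, then back up to D
        inside = np.all(np.abs(X_up) <= 1.0, axis=1)
        if inside.sum() >= n:
            return X_up[inside][:n]  # the first n points that survived
        b /= 2.0  # constrict the sampling box and retry


B = np.random.default_rng(1).standard_normal((2, 5))  # hypothetical (d x D) matrix
X0 = alebo_init_sketch(B, n=10, nsamp=1000)
```

Every returned row is unchanged by a further down-and-up projection, which is what it means for the points to lie in the subspace defined by B.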

gen(n: int, bounds: List[Tuple[float, float]], linear_constraints: Optional[Tuple[ndarray, ndarray]] = None, fixed_features: Optional[Dict[int, float]] = None, model_gen_options: Optional[Dict[str, Optional[Union[int, float, str, AcquisitionFunction, Dict[int, Any], Dict[str, Any], OptimizationConfig, WinsorizationConfig]]]] = None, rounding_func: Optional[Callable[[ndarray], ndarray]] = None) Tuple[ndarray, ndarray][source]

Generate new candidates.

Parameters:
  • n – Number of candidates to generate.

  • bounds – A list of (lower, upper) tuples for each column of X. Defined on [0, 1]^d.

  • linear_constraints – A tuple of (A, b). For k linear constraints on d-dimensional x, A is (k x d) and b is (k x 1) such that A x <= b.

  • fixed_features – A map {feature_index: value} for features that should be fixed to a particular value during generation.

  • model_gen_options – A config dictionary that is passed along to the model.

  • rounding_func – A function that rounds an optimization result appropriately (e.g., according to round-trip transformations).

Returns:

2-element tuple containing

  • (n x d) array of generated points.

  • Uniform weights, an n-array of ones for each point.

ax.models.random.rembo_initializer module

class ax.models.random.rembo_initializer.REMBOInitializer(A: ndarray, bounds_d: List[Tuple[float, float]], seed: Optional[int] = None)[source]

Bases: UniformGenerator

Sample in a low-dimensional linear embedding.

Generates points in [-1, 1]^D by generating points in a d-dimensional embedding, with box bounds as specified. When points are projected up, if they fall outside [-1, 1]^D they are clamped to those bounds.

Parameters:
  • A – A (Dxd) linear embedding

  • bounds_d – Box bounds in the low-d space

  • seed – seed for UniformGenerator
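A minimal sketch of the up-projection with clamping, assuming points are stored as rows. The helper project_up_sketch, the random A, and the bounds are illustrative and are not taken from REMBOInitializer.

```python
import numpy as np


def project_up_sketch(A, X_d):
    """Map (n x d) embedding points to (n x D) and clamp them to [-1, 1]^D."""
    return np.clip(X_d @ A.T, -1.0, 1.0)


rng = np.random.default_rng(0)
A = rng.standard_normal((6, 2))  # hypothetical (D x d) embedding with D=6, d=2
bounds_d = [(-2.0, 2.0), (-2.0, 2.0)]  # hypothetical box bounds in the low-d space
low, high = zip(*bounds_d)
X_d = rng.uniform(low, high, size=(4, 2))  # uniform samples inside the low-d box
X_D = project_up_sketch(A, X_d)
```

Unlike rejection sampling, clamping keeps every sample, at the cost of moving out-of-range points onto the boundary of the box.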

gen(n: int, bounds: List[Tuple[float, float]], linear_constraints: Optional[Tuple[ndarray, ndarray]] = None, fixed_features: Optional[Dict[int, float]] = None, model_gen_options: Optional[Dict[str, Optional[Union[int, float, str, AcquisitionFunction, Dict[int, Any], Dict[str, Any], OptimizationConfig, WinsorizationConfig]]]] = None, rounding_func: Optional[Callable[[ndarray], ndarray]] = None) Tuple[ndarray, ndarray][source]

Generate new candidates.

Parameters:
  • n – Number of candidates to generate.

  • bounds – A list of (lower, upper) tuples for each column of X. Defined on [0, 1]^d.

  • linear_constraints – A tuple of (A, b). For k linear constraints on d-dimensional x, A is (k x d) and b is (k x 1) such that A x <= b.

  • fixed_features – A map {feature_index: value} for features that should be fixed to a particular value during generation.

  • model_gen_options – A config dictionary that is passed along to the model.

  • rounding_func – A function that rounds an optimization result appropriately (e.g., according to round-trip transformations).

Returns:

2-element tuple containing

  • (n x d) array of generated points.

  • Uniform weights, an n-array of ones for each point.

project_up(X: ndarray) ndarray[source]

Project to high-dimensional space.

Torch Models & Utilities

ax.models.torch.alebo module

class ax.models.torch.alebo.ALEBO(B: Tensor, laplace_nsamp: int = 25, fit_restarts: int = 10)[source]

Bases: BotorchModel

Does Bayesian optimization in a linear subspace with ALEBO.

The (d x D) projection down matrix B must be provided, and must be the same matrix that was used for the initialization.

Function evaluations happen in the high-D space. We only evaluate points such that x = pinverse(B) @ B @ x (that is, points inside the subspace). Under that constraint, the projection is invertible.

Parameters:
  • B – (d x D) projection matrix (projects down).

  • laplace_nsamp – Number of samples for posterior sampling of kernel hyperparameters.

  • fit_restarts – Number of random restarts for MAP estimation.
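The subspace condition x = pinverse(B) @ B @ x can be checked numerically. The sketch below uses a random B as a stand-in and assumes it has full row rank; it only illustrates the linear algebra and does not call ALEBO.

```python
import numpy as np

rng = np.random.default_rng(0)
B = rng.standard_normal((2, 6))  # hypothetical (d x D) projection with d=2, D=6
B_pinv = np.linalg.pinv(B)  # (D x d)

y = rng.standard_normal(2)  # a point in the low-d embedding
x = B_pinv @ y  # its high-D counterpart, inside the subspace

in_subspace = np.allclose(B_pinv @ (B @ x), x)  # x = pinverse(B) @ B @ x
round_trip = np.allclose(B @ x, y)  # projecting down recovers y

x_off = rng.standard_normal(6)  # a generic high-D point
off_subspace = not np.allclose(B_pinv @ (B @ x_off), x_off)
```

Points inside the subspace map down and back up without loss, while a generic high-D point does not.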

Xs: List[Tensor]
Ys: List[Tensor]
Yvars: List[Tensor]

best_point(search_space_digest: SearchSpaceDigest, torch_opt_config: TorchOptConfig) Optional[Tensor][source]

Identify the current best point, satisfying the constraints in the same format as to gen.

Return None if no such point can be identified.

Parameters:
  • search_space_digest – A SearchSpaceDigest object containing metadata about the search space (e.g. bounds, parameter types).

  • torch_opt_config – A TorchOptConfig object containing optimization arguments (e.g., objective weights, constraints).

Returns:

d-tensor of the best point.

cross_validate(datasets: List[SupervisedDataset], X_test: Tensor, **kwargs: Any) Tuple[Tensor, Tensor][source]

Do cross validation with the given training and test sets.

Training set is given in the same format as to fit. Test set is given in the same format as to predict.

Parameters:
  • datasets – A list of SupervisedDataset containers, each corresponding to the data of one metric (outcome).

  • X_test – (j x d) tensor of the j points at which to make predictions.

  • search_space_digest – A SearchSpaceDigest object containing metadata on the features in X.

Returns:

2-element tuple containing

  • (j x m) tensor of outcome predictions at X.

  • (j x m x m) tensor of predictive covariances at X. cov[j, m1, m2] is Cov[m1@j, m2@j].

fit(datasets: List[SupervisedDataset], search_space_digest: SearchSpaceDigest, candidate_metadata: Optional[List[List[Optional[Dict[str, Any]]]]] = None) None[source]

Fit model to m outcomes.

Parameters:
  • datasets – A list of SupervisedDataset containers, each corresponding to the data of one metric (outcome).

  • search_space_digest – A SearchSpaceDigest object containing metadata on the features in the datasets.

  • candidate_metadata – Model-produced metadata for candidates, in the order corresponding to the Xs.

gen(n: int, search_space_digest: SearchSpaceDigest, torch_opt_config: TorchOptConfig) TorchGenResults[source]

Generate candidates.

Candidates are generated in the linear embedding with the polytope constraints described in the paper.

model_gen_options can contain ‘raw_samples’ (number of samples used for initializing the acquisition function optimization) and ‘num_restarts’ (number of restarts for acquisition function optimization).

get_and_fit_model(Xs: List[Tensor], Ys: List[Tensor], Yvars: List[Tensor], state_dicts: Optional[List[MutableMapping[str, Tensor]]] = None) GPyTorchModel[source]

Get a fitted ALEBO model for each outcome.

Parameters:
  • Xs – X for each outcome, already projected down.

  • Ys – Y for each outcome.

  • Yvars – Noise variance of Y for each outcome.

  • state_dicts – State dicts to initialize model fitting.

Returns: Fitted ALEBO model.

predict(X: Tensor) Tuple[Tensor, Tensor][source]

Predict

Parameters:

X – (j x d) tensor of the j points at which to make predictions.

Returns:

2-element tuple containing

  • (j x m) tensor of outcome predictions at X.

  • (j x m x m) tensor of predictive covariances at X. cov[j, m1, m2] is Cov[m1@j, m2@j].

class ax.models.torch.alebo.ALEBOGP(B: Tensor, train_X: Tensor, train_Y: Tensor, train_Yvar: Tensor)[source]

Bases: SingleTaskGP

The GP for ALEBO.

Uses the Mahalanobis kernel defined in ALEBOKernel, along with a ScaleKernel to add a kernel variance and a fitted constant mean.

In non-batch mode, there is a single kernel that produces MVN predictions as usual for a GP. With b batches, each batch has its own set of kernel hyperparameters and each batch represents a sample from the hyperparameter posterior distribution. When making a prediction (with __call__), these samples are integrated over using moment matching. So, the predictions are an MVN as usual with the same shape as in non-batch mode.

Parameters:
  • B – (d x D) Projection matrix.

  • train_X – (n x d) X training data.

  • train_Y – (n x 1) Y training data.

  • train_Yvar – (n x 1) Noise variances of each training Y.
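Moment matching over hyperparameter samples can be illustrated for a single test point, assuming every batch carries equal weight. This is the standard mean and variance of an equally weighted Gaussian mixture, written as a hypothetical helper; it is not the ALEBOGP code, which works with full multivariate normals.

```python
def moment_match(means, variances):
    """Collapse equally weighted Gaussians into one mean and one variance."""
    k = len(means)
    mean = sum(means) / k
    # Second moment of the mixture: the average of (variance + mean^2).
    second_moment = sum(v + m * m for m, v in zip(means, variances)) / k
    # Law of total variance: mixture variance = second moment - mean^2.
    return mean, second_moment - mean * mean


# Two batches predicting N(0, 1) and N(2, 1) at the same point.
mean, var = moment_match(means=[0.0, 2.0], variances=[1.0, 1.0])
```

The matched variance (2.0) exceeds each batch's own variance (1.0) because disagreement between the batch means adds to the uncertainty.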

posterior(X: Tensor, output_indices: Optional[List[int]] = None, observation_noise: Union[bool, Tensor] = False, posterior_transform: Optional[PosteriorTransform] = None, **kwargs: Any) GPyTorchPosterior[source]

Computes the posterior over model outputs at the provided points.

Parameters:
  • X – A (batch_shape) x q x d-dim Tensor, where d is the dimension of the feature space and q is the number of points considered jointly.

  • output_indices – A list of indices, corresponding to the outputs over which to compute the posterior (if the model is multi-output). Can be used to speed up computation if only a subset of the model’s outputs are required for optimization. If omitted, computes the posterior over all model outputs.

  • observation_noise – If True, add the observation noise from the likelihood to the posterior. If a Tensor, use it directly as the observation noise (must be of shape (batch_shape) x q x m).

  • posterior_transform – An optional PosteriorTransform.

Returns:

A GPyTorchPosterior object, representing batch_shape joint distributions over q points and the outputs selected by output_indices each. Includes observation noise if specified.

class ax.models.torch.alebo.ALEBOKernel(B: Tensor, batch_shape: Size)[source]

Bases: Kernel

The kernel for ALEBO.

Suppose there exists an ARD RBF GP on an (unknown) linear embedding with projection matrix A. We make function evaluations in a different linear embedding with projection matrix B (known). This is the appropriate kernel for fitting those data.

This kernel computes a Mahalanobis distance, and the (d x d) PD distance matrix Gamma is a parameter that must be fit. This is done by fitting its upper Cholesky decomposition, U.

Parameters:
  • B – (d x D) Projection matrix.

  • batch_shape – Batch shape as usual for gpytorch kernels.
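A sketch of a Mahalanobis-distance kernel parameterized through an upper-triangular factor U, so that Gamma = U^T U. The RBF form and the factor 0.5 are assumptions chosen for illustration, and the function is not the ALEBOKernel implementation.

```python
import math


def mahalanobis_rbf(x1, x2, U):
    """exp(-0.5 * (x1 - x2)^T U^T U (x1 - x2)) for an upper-triangular U."""
    diff = [a - b for a, b in zip(x1, x2)]
    # z = U @ diff, so the squared distance diff^T Gamma diff equals ||z||^2.
    z = [sum(u_ij * d_j for u_ij, d_j in zip(row, diff)) for row in U]
    return math.exp(-0.5 * sum(z_i * z_i for z_i in z))


U = [[1.0, 0.5], [0.0, 2.0]]  # hypothetical upper-triangular factor
k_same = mahalanobis_rbf([0.3, 0.7], [0.3, 0.7], U)
k_far = mahalanobis_rbf([0.0, 0.0], [1.0, 1.0], U)
```

Fitting U instead of Gamma directly keeps Gamma positive semi-definite by construction, and positive definite whenever U has a nonzero diagonal.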

forward(x1: Tensor, x2: Tensor, diag: bool = False, last_dim_is_batch: bool = False, **params: Any) Tensor[source]

Compute kernel distance.

training: bool

ax.models.torch.alebo.alebo_acqf_optimizer(acq_function: AcquisitionFunction, bounds: Tensor, n: int, inequality_constraints: Optional[List[Tuple[Tensor, Tensor, float]]], fixed_features: Optional[Dict[int, float]], rounding_func: Optional[Callable[[Tensor], Tensor]], raw_samples: int, num_restarts: int, B: Tensor) Tuple[Tensor, Tensor][source]

Optimize the acquisition function for ALEBO.

We are optimizing over a polytope within the subspace, and so begin each random restart of the acquisition function optimization with points that lie within that polytope.

ax.models.torch.alebo.ei_or_nei(model: Union[ALEBOGP, ModelListGP], objective_weights: Tensor, outcome_constraints: Optional[Tuple[Tensor, Tensor]], X_observed: Tensor, X_pending: Optional[Tensor], q: int, noiseless: bool) AcquisitionFunction[source]

Use analytic EI if appropriate, otherwise Monte Carlo NEI.

Analytic EI can be used if: Single outcome, no constraints, no pending points, not batch, and no noise.

Parameters:
  • model – GP.

  • objective_weights – Weights on each outcome for the objective.

  • outcome_constraints – Outcome constraints.

  • X_observed – Observed points for NEI.

  • X_pending – Pending points.

  • q – Batch size.

  • noiseless – True if evaluations are noiseless.

Returns: An AcquisitionFunction, either analytic EI or MC NEI.
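The conditions for analytic EI listed above amount to a simple predicate. The function below is a hypothetical restatement of those conditions with made-up argument names; it does not construct an acquisition function.

```python
def can_use_analytic_ei(num_outcomes, has_outcome_constraints, has_pending, q, noiseless):
    """True only when every condition for analytic EI holds."""
    return (
        num_outcomes == 1  # single outcome
        and not has_outcome_constraints  # no constraints
        and not has_pending  # no pending points
        and q == 1  # not batch
        and noiseless  # no noise
    )


use_ei = can_use_analytic_ei(1, False, False, q=1, noiseless=True)
use_nei = not can_use_analytic_ei(1, False, False, q=3, noiseless=True)
```

If any single condition fails, the Monte Carlo NEI branch applies.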

ax.models.torch.alebo.extract_map_statedict(m_b: Union[ALEBOGP, ModelListGP], num_outputs: int) List[MutableMapping[str, Tensor]][source]

Extract MAP statedict from the batch-mode ALEBO GP.

The batch GP can be either a single ALEBO GP or a ModelListGP of ALEBO GPs.

Parameters:
  • m_b – Batch-mode GP.

  • num_outputs – Number of outputs being modeled.

ax.models.torch.alebo.get_batch_model(B: Tensor, train_X: Tensor, train_Y: Tensor, train_Yvar: Tensor, Uvec_batch: Tensor, mean_constant_batch: Tensor, output_scale_batch: Tensor) ALEBOGP[source]

Construct a batch-mode ALEBO GP using batch tensors of hyperparameters.

Parameters:
  • B – Projection matrix.

  • train_X – X training data.

  • train_Y – Y training data.

  • train_Yvar – Noise variances of each training Y.

  • Uvec_batch – Batch tensor of Uvec hyperparameters.

  • mean_constant_batch – Batch tensor of mean constant hyperparameter.

  • output_scale_batch – Batch tensor of output scale hyperparameter.

Returns: Batch-mode ALEBO GP.

ax.models.torch.alebo.get_fitted_model(B: Tensor, train_X: Tensor, train_Y: Tensor, train_Yvar: Tensor, restarts: int, nsamp: int, init_state_dict: Optional[Dict[str, Tensor]]) ALEBOGP[source]

Get a fitted ALEBO GP.

We do random restart optimization to get a MAP model, then use the Laplace approximation to draw posterior samples of kernel hyperparameters, and finally construct a batch-mode model where each batch is one of those sampled sets of kernel hyperparameters.

Parameters:
  • B – Projection matrix.

  • train_X – X training data.

  • train_Y – Y training data.

  • train_Yvar – Noise variances of each training Y.

  • restarts – Number of restarts for MAP estimation.

  • nsamp – Number of samples to draw from kernel hyperparameter posterior.

  • init_state_dict – Optionally begin MAP estimation with this state dict.

Returns: Batch-mode (nsamp batches) fitted ALEBO GP.

ax.models.torch.alebo.get_map_model(B: Tensor, train_X: Tensor, train_Y: Tensor, train_Yvar: Tensor, restarts: int, init_state_dict: Optional[Dict[str, Tensor]]) ExactMarginalLogLikelihood[source]

Do random-restart optimization for MAP fitting of an ALEBO GP model.

Parameters:
  • B – Projection matrix.

  • train_X – X training data.

  • train_Y – Y training data.

  • train_Yvar – Noise variances of each training Y.

  • restarts – Number of restarts for MAP estimation.

  • init_state_dict – Optionally begin MAP estimation with this state dict.

Returns: non-batch ALEBO GP with MAP kernel hyperparameters.

ax.models.torch.alebo.laplace_sample_U(mll: ExactMarginalLogLikelihood, nsamp: int) Tuple[Tensor, Tensor, Tensor][source]

Draw posterior samples of kernel hyperparameters using Laplace approximation.

Only the Mahalanobis distance matrix is sampled.

The diagonal of the Hessian is estimated using finite differences of the autograd gradients. The Laplace approximation is then N(p_map, inv(-H)). We construct a set of nsamp kernel hyperparameters by drawing nsamp-1 values from this distribution, and prepending as the first sample the MAP parameters.

Parameters:
  • mll – MLL object of MAP ALEBO GP.

  • nsamp – Number of samples to return.

Returns: Batch tensors of the kernel hyperparameters Uvec, mean constant, and output scale.
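The sampling step can be sketched for a diagonal Hessian, where inv(-H) is diagonal with entries -1 / H_ii. The helper and its inputs are hypothetical; the finite-difference Hessian estimate is not shown, and the diagonal entries are assumed to be negative, as they are at a maximum.

```python
import random


def laplace_samples(p_map, hess_diag, nsamp, seed=0):
    """Return nsamp parameter vectors: the MAP first, then nsamp - 1 draws."""
    rng = random.Random(seed)
    # Standard deviations of N(p_map, inv(-H)) for a diagonal H with H_ii < 0.
    stds = [(-1.0 / h) ** 0.5 for h in hess_diag]
    samples = [list(p_map)]  # the MAP parameters are prepended as the first sample
    for _ in range(nsamp - 1):
        samples.append([rng.gauss(m, s) for m, s in zip(p_map, stds)])
    return samples


samples = laplace_samples(p_map=[0.5, -1.0], hess_diag=[-4.0, -25.0], nsamp=5)
```

A more sharply curved posterior (a larger |H_ii|) yields a smaller standard deviation, so those parameters are sampled closer to the MAP value.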

ax.models.torch.alebo.module_to_array(module: Module) Tuple[ndarray, Dict[str, TorchAttr], Optional[ndarray]][source]

Extract named parameters from a module into a numpy array.

Only extracts parameters with requires_grad, since it is meant for optimizing.

NOTE: module_to_array was originally a BoTorch function and was later deprecated. It has been copied here because ALEBO depends on it, and because ALEBO itself is deprecated, it is not worth moving ALEBO to the new syntax.

Parameters:

module – A module with parameters. May specify parameter constraints in a named_parameters_and_constraints method.

Returns:

3-element tuple containing

  • The parameter values as a numpy array.

  • An ordered dictionary with the name and tensor attributes of each parameter.

  • A 2 x n_params numpy array with lower and upper bounds if at least one constraint is finite, and None otherwise.

Example

>>> mll = ExactMarginalLogLikelihood(model.likelihood, model)
>>> parameter_array, property_dict, bounds_out = module_to_array(mll)

ax.models.torch.alebo.set_params_with_array(module: TModule, x: ndarray, property_dict: Dict[str, TorchAttr]) TModule[source]

Set module parameters with values from numpy array.

NOTE: set_params_with_array was originally a BoTorch function and was later deprecated. It has been copied here because ALEBO depends on it, and because ALEBO itself is deprecated, it is not worth moving ALEBO to the new syntax.

Parameters:
  • module – Module with parameters to be set

  • x – Numpy array with parameter values

  • property_dict – Dictionary of parameter names and torch attributes as returned by module_to_array.

Returns:

module with parameters updated in-place.

Return type:

Module

Example

>>> mll = ExactMarginalLogLikelihood(model.likelihood, model)
>>> parameter_array, property_dict, bounds_out = module_to_array(mll)
>>> parameter_array += 0.1  # perturb parameters (for example only)
>>> mll = set_params_with_array(mll, parameter_array,  property_dict)

ax.models.torch.botorch module

class ax.models.torch.botorch.BotorchModel(model_constructor: ~typing.Callable[[~typing.List[~torch.Tensor], ~typing.List[~torch.Tensor], ~typing.List[~torch.Tensor], ~typing.List[int], ~typing.List[int], ~typing.List[str], ~typing.Optional[~typing.Dict[str, ~torch.Tensor]], ~typing.Any], ~botorch.models.model.Model] = <function get_and_fit_model>, model_predictor: ~typing.Callable[[~botorch.models.model.Model, ~torch.Tensor], ~typing.Tuple[~torch.Tensor, ~torch.Tensor]] = <function predict_from_model>, acqf_constructor: ~ax.models.torch.botorch_defaults.TAcqfConstructor = <function get_qLogNEI>, acqf_optimizer: ~typing.Callable[[~botorch.acquisition.acquisition.AcquisitionFunction, ~torch.Tensor, int, ~typing.Optional[~typing.List[~typing.Tuple[~torch.Tensor, ~torch.Tensor, float]]], ~typing.Optional[~typing.List[~typing.Tuple[~torch.Tensor, ~torch.Tensor, float]]], ~typing.Optional[~typing.Dict[int, float]], ~typing.Optional[~typing.Callable[[~torch.Tensor], ~torch.Tensor]], ~typing.Any], ~typing.Tuple[~torch.Tensor, ~torch.Tensor]] = <function scipy_optimizer>, best_point_recommender: ~typing.Callable[[~ax.models.torch_base.TorchModel, ~typing.List[~typing.Tuple[float, float]], ~torch.Tensor, ~typing.Optional[~typing.Tuple[~torch.Tensor, ~torch.Tensor]], ~typing.Optional[~typing.Tuple[~torch.Tensor, ~torch.Tensor]], ~typing.Optional[~typing.Dict[int, float]], ~typing.Optional[~typing.Dict[str, ~typing.Optional[~typing.Union[int, float, str, ~botorch.acquisition.acquisition.AcquisitionFunction, ~typing.Dict[int, ~typing.Any], ~typing.Dict[str, ~typing.Any], ~ax.core.optimization_config.OptimizationConfig, ~ax.models.winsorization_config.WinsorizationConfig]]]], ~typing.Optional[~typing.Dict[int, float]]], ~typing.Optional[~torch.Tensor]] = <function recommend_best_observed_point>, refit_on_cv: bool = False, refit_on_update: bool = True, warm_start_refitting: bool = True, use_input_warping: bool = False, use_loocv_pseudo_likelihood: bool = False, prior: ~typing.Optional[~typing.Dict[str, ~typing.Any]] = None, **kwargs: ~typing.Any)[source]

Bases: TorchModel

Customizable botorch model.

By default, this uses a noisy Log Expected Improvement (qLogNEI) acquisition function on top of a model made up of separate GPs, one for each outcome. This behavior can be modified by providing custom implementations of the following components:

  • a model_constructor that instantiates and fits a model on data

  • a model_predictor that predicts outcomes using the fitted model

  • an acqf_constructor that creates an acquisition function from a fitted model

  • an acqf_optimizer that optimizes the acquisition function

  • a best_point_recommender that recommends a current “best” point (i.e., what the model recommends if the learning process ended now)

Parameters:
  • model_constructor – A callable that instantiates and fits a model on data, with signature as described below.

  • model_predictor – A callable that predicts using the fitted model, with signature as described below.

  • acqf_constructor – A callable that creates an acquisition function from a fitted model, with signature as described below.

  • acqf_optimizer – A callable that optimizes the acquisition function, with signature as described below.

  • best_point_recommender – A callable that recommends the best point, with signature as described below.

  • refit_on_cv – If True, refit the model for each fold when performing cross-validation.

  • refit_on_update – If True, refit the model after updating the training data using the update method.

  • warm_start_refitting – If True, start model refitting from previous model parameters in order to speed up the fitting process.

  • prior – An optional dictionary that contains the specification of GP model prior. Currently, the keys include:

    • covar_module_prior: prior on covariance matrix, e.g. {“lengthscale_prior”: GammaPrior(3.0, 6.0)}.

    • type: type of prior on task covariance matrix, e.g. LKJCovariancePrior.

    • sd_prior: A scalar prior over nonnegative numbers, which is used for the default LKJCovariancePrior task_covar_prior.

    • eta: The eta parameter on the default LKJ task_covar_prior.

Call signatures:

model_constructor(
    Xs,
    Ys,
    Yvars,
    task_features,
    fidelity_features,
    metric_names,
    state_dict,
    **kwargs,
) -> model

Here Xs, Ys, Yvars are lists of tensors (one element per outcome), task_features identifies columns of Xs that should be modeled as a task, fidelity_features is a list of ints that specify the positions of fidelity parameters in ‘Xs’, metric_names provides the names of each Y in Ys, state_dict is a pytorch module state dict, and model is a BoTorch Model. Optional kwargs are passed through from the BotorchModel constructor. This callable is assumed to return a fitted BoTorch model that has the same dtype and lives on the same device as the input tensors.

model_predictor(model, X) -> [mean, cov]

Here model is a fitted botorch model, X is a tensor of candidate points, and mean and cov are the posterior mean and covariance, respectively.

acqf_constructor(
    model,
    objective_weights,
    outcome_constraints,
    X_observed,
    X_pending,
    **kwargs,
) -> acq_function

Here model is a botorch Model, objective_weights is a tensor of weights for the model outputs, outcome_constraints is a tuple of tensors describing the (linear) outcome constraints, X_observed are previously observed points, and X_pending are points whose evaluation is pending. acq_function is a BoTorch acquisition function crafted from these inputs. For additional details on the arguments, see get_qLogNEI.

acqf_optimizer(
    acq_function,
    bounds,
    n,
    inequality_constraints,
    equality_constraints,
    fixed_features,
    rounding_func,
    **kwargs,
) -> candidates

Here acq_function is a BoTorch AcquisitionFunction, bounds is a tensor containing bounds on the parameters, n is the number of candidates to be generated, inequality_constraints and equality_constraints are inequality and equality constraints on parameter values, fixed_features specifies features that should be fixed during generation, and rounding_func is a callback that rounds an optimization result appropriately. candidates is a tensor of generated candidates. For additional details on the arguments, see scipy_optimizer.

best_point_recommender(
    model,
    bounds,
    objective_weights,
    outcome_constraints,
    linear_constraints,
    fixed_features,
    model_gen_options,
    target_fidelities,
) -> candidates

Here model is a TorchModel, bounds is a list of tuples containing bounds on the parameters, objective_weights is a tensor of weights for the model outputs, outcome_constraints is a tuple of tensors describing the (linear) outcome constraints, linear_constraints is a tuple of tensors describing constraints on the design, fixed_features specifies features that should be fixed during generation, model_gen_options is a config dictionary that can contain model-specific options, and target_fidelities is a map from fidelity feature column indices to their respective target fidelities, used for multi-fidelity optimization problems.

Xs: List[Tensor]
Ys: List[Tensor]
Yvars: List[Tensor]
best_point(search_space_digest: SearchSpaceDigest, torch_opt_config: TorchOptConfig) Optional[Tensor][source]

Identify the current best point that satisfies the constraints. The constraints are given in the same format as for gen.

Return None if no such point can be identified.

Parameters:
  • search_space_digest – A SearchSpaceDigest object containing metadata about the search space (e.g. bounds, parameter types).

  • torch_opt_config – A TorchOptConfig object containing optimization arguments (e.g., objective weights, constraints).

Returns:

d-tensor of the best point.

cross_validate(datasets: List[SupervisedDataset], X_test: Tensor, **kwargs: Any) Tuple[Tensor, Tensor][source]

Do cross validation with the given training and test sets.

The training set is given in the same format as for fit. The test set is given in the same format as for predict.

Parameters:
  • datasets – A list of SupervisedDataset containers, each corresponding to the data of one metric (outcome).

  • X_test – (j x d) tensor of the j points at which to make predictions.

  • search_space_digest – A SearchSpaceDigest object containing metadata on the features in X.

Returns:

2-element tuple containing

  • (j x m) tensor of outcome predictions at X.

  • (j x m x m) tensor of predictive covariances at X. cov[j, m1, m2] is Cov[m1@j, m2@j].
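To make the return shapes concrete, the sketch below uses plain nested lists in place of tensors for j = 2 test points and m = 2 outcomes. The numbers are made up and the helper name marginal_variances is hypothetical; an actual call returns torch Tensors.

```python
# Illustrative only: nested lists stand in for the (j x m) and (j x m x m) tensors.
mean = [
    [0.5, 1.0],  # predictions at test point 0 for outcomes 0 and 1
    [0.7, 0.9],  # predictions at test point 1
]
cov = [
    [[0.04, 0.01], [0.01, 0.09]],  # covariance between outcomes at test point 0
    [[0.05, 0.00], [0.00, 0.02]],  # covariance between outcomes at test point 1
]


def marginal_variances(cov):
    """Return the (j x m) per-outcome variances: the diagonal of each m x m block."""
    return [[block[k][k] for k in range(len(block))] for block in cov]


variances = marginal_variances(cov)
```

Here cov[0][0][1] is the covariance between outcomes 0 and 1 at test point 0, matching the cov[j, m1, m2] indexing described above.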

feature_importances() ndarray[source]
fit(datasets: List[SupervisedDataset], search_space_digest: SearchSpaceDigest, candidate_metadata: Optional[List[List[Optional[Dict[str, Any]]]]] = None) None[source]

Fit model to m outcomes.

Parameters:
  • datasets – A list of SupervisedDataset containers, each corresponding to the data of one metric (outcome).

  • search_space_digest – A SearchSpaceDigest object containing metadata on the features in the datasets.

  • candidate_metadata – Model-produced metadata for candidates, in the order corresponding to the Xs.

gen(n: int, search_space_digest: SearchSpaceDigest, torch_opt_config: TorchOptConfig) TorchGenResults[source]

Generate new candidates.

Parameters:
  • n – Number of candidates to generate.

  • search_space_digest – A SearchSpaceDigest object containing metadata about the search space (e.g. bounds, parameter types).

  • torch_opt_config – A TorchOptConfig object containing optimization arguments (e.g., objective weights, constraints).

Returns:

A TorchGenResults container.

property model: Model
predict(X: Tensor) Tuple[Tensor, Tensor][source]

Predict outcomes at the given points.

Parameters:

X – (j x d) tensor of the j points at which to make predictions.

Returns:

2-element tuple containing

  • (j x m) tensor of outcome predictions at X.

  • (j x m x m) tensor of predictive covariances at X. cov[j, m1, m2] is Cov[m1@j, m2@j].

property search_space_digest: SearchSpaceDigest
ax.models.torch.botorch.get_feature_importances_from_botorch_model(model: Optional[Union[Model, ModuleList]]) ndarray[source]

Get feature importances from a BoTorch model or a list of BoTorch models.

Parameters:

model – BoTorch model (or ModuleList of models) to get feature importances from.

Returns:

The feature importances as a numpy array where each row sums to 1.
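As a rough illustration of an array in which each row sums to 1, the sketch below normalizes inverse kernel lengthscales per outcome. Using inverse lengthscales as the importance measure is an assumption made here for illustration, the lengthscale values are made up, and normalize_rows is a hypothetical helper rather than part of Ax.

```python
def normalize_rows(rows):
    """Scale each row so that its entries sum to 1."""
    return [[value / sum(row) for value in row] for row in rows]


# One row per outcome, one column per feature (made-up lengthscales).
lengthscales = [[0.5, 2.0, 1.0]]

# Assumption: a shorter lengthscale means the output varies faster along that
# feature, so the feature is treated as more important.
inverse = [[1.0 / ls for ls in row] for row in lengthscales]
importances = normalize_rows(inverse)
```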

ax.models.torch.botorch.get_rounding_func(rounding_func: Optional[Callable[[Tensor], Tensor]]) Optional[Callable[[Tensor], Tensor]][source]

ax.models.torch.botorch_defaults module

class ax.models.torch.botorch_defaults.TAcqfConstructor(*args, **kwargs)[source]

Bases: Protocol

ax.models.torch.botorch_defaults.get_NEI() None[source]

TAcqfConstructor instantiating qNEI. See docstring of get_qEI for details.

ax.models.torch.botorch_defaults.get_acqf(acquisition_function_name: str) Callable[[Callable[[], None]], TAcqfConstructor][source]

Returns a decorator whose wrapper function instantiates an acquisition function.

NOTE: This is a decorator factory instead of a simple factory because serialization of BoTorch model kwargs requires callables to have module-level paths, and closures created by a simple factory do not have such paths. Wrapping “empty” module-level functions with this decorator ensures that they are serialized correctly and also reduces code duplication.

Example

>>> @get_acqf("qEI")
... def get_qEI() -> None:
...     pass
>>> acqf = get_qEI(
...     model=model,
...     objective_weights=objective_weights,
...     outcome_constraints=outcome_constraints,
...     X_observed=X_observed,
...     X_pending=X_pending,
...     **kwargs,
... )
>>> type(acqf)
botorch.acquisition.monte_carlo.qExpectedImprovement

Parameters:

acquisition_function_name – The name of the acquisition function to be instantiated by the returned function.

Returns:

A decorator whose wrapper function is a TAcqfConstructor, i.e. it requires a model, objective_weights, and optional outcome_constraints, X_observed, and X_pending as inputs, as well as kwargs, and returns an AcquisitionFunction instance that corresponds to acquisition_function_name.
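The decorator-factory pattern described in the note can be sketched with the standard library alone. The names make_constructor, _REGISTRY, and get_demo are hypothetical stand-ins, and the registry returns a plain dict instead of an AcquisitionFunction; this is not the Ax implementation.

```python
import functools
from typing import Any, Callable, Dict

# Hypothetical registry standing in for a lookup of acquisition function classes.
_REGISTRY: Dict[str, Callable[..., Dict[str, Any]]] = {
    "demo": lambda **kwargs: {"name": "demo", "inputs": kwargs},
}


def make_constructor(name: str) -> Callable[[Callable[[], None]], Callable[..., Any]]:
    """Decorator factory: the returned decorator replaces an empty function's body."""

    def decorator(empty_fn: Callable[[], None]) -> Callable[..., Any]:
        # functools.wraps copies __name__, __qualname__, and __module__ from the
        # empty function, so the wrapper is addressable by the empty function's path.
        @functools.wraps(empty_fn)
        def wrapper(**kwargs: Any) -> Any:
            return _REGISTRY[name](**kwargs)

        return wrapper

    return decorator


@make_constructor("demo")
def get_demo() -> None:
    pass


result = get_demo(model="m", objective_weights=[1.0])
```

Because the wrapper keeps the name of the decorated function, a serializer that stores callables by module path can look up get_demo again, which a bare closure returned from a simple factory would not allow.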

ax.models.torch.botorch_defaults.get_and_fit_model(Xs: List[Tensor], Ys: List[Tensor], Yvars: List[Tensor], task_features: List[int], fidelity_features: List[int], metric_names: List[str], state_dict: Optional[Dict[str, Tensor]] = None, refit_model: bool = True, use_input_warping: bool = False, use_loocv_pseudo_likelihood: bool = False, prior: Optional[Dict[str, Any]] = None, *, multitask_gp_ranks: Optional[Dict[str, Union[Prior, float]]] = None, **kwargs: Any) GPyTorchModel[source]

Instantiates and fits a botorch GPyTorchModel using the given data. N.B. Currently, the logic for choosing ModelListGP vs other models is handled using if-else statements in lines 96-137. In the future, this logic should be taken care of by modular botorch.

Parameters:
  • Xs – List of X data, one tensor per outcome.

  • Ys – List of Y data, one tensor per outcome.

  • Yvars – List of observed variance of Ys.

  • task_features – List of columns of X that are tasks.

  • fidelity_features – List of columns of X that are fidelity parameters.

  • metric_names – Names of each outcome Y in Ys.

  • state_dict – If provided, will set model parameters to this state dictionary. Otherwise, will fit the model.

  • refit_model – Flag for refitting model.

  • prior – Optional[Dict]. A dictionary that contains the specification of the GP model prior. Currently, the keys include:

    • covar_module_prior: prior on covariance matrix, e.g. {“lengthscale_prior”: GammaPrior(3.0, 6.0)}.

    • type: type of prior on task covariance matrix, e.g. LKJCovariancePrior.

    • sd_prior: A scalar prior over nonnegative numbers, which is used for the default LKJCovariancePrior task_covar_prior.

    • eta: The eta parameter on the default LKJ task_covar_prior.

  • kwargs – Passed to _get_model.

Returns:

A fitted GPyTorchModel.

ax.models.torch.botorch_defaults.get_qEI() None[source]

A TAcqfConstructor to instantiate a qEI acquisition function. The function body is filled in by the decorator function get_acqf to simultaneously reduce code duplication and allow serialization in Ax. TODO: Deprecate with legacy Ax model.

ax.models.torch.botorch_defaults.get_qLogEI() None[source]

TAcqfConstructor instantiating qLogEI. See docstring of get_qEI for details.

ax.models.torch.botorch_defaults.get_qLogNEI() None[source]

TAcqfConstructor instantiating qLogNEI. See docstring of get_qEI for details.

ax.models.torch.botorch_defaults.get_warping_transform(d: int, batch_shape: Optional[Size] = None, task_feature: Optional[int] = None) Warp[source]

Construct input warping transform.

Parameters:
  • d – The dimension of the input, including task features

  • batch_shape – The batch_shape of the model

  • task_feature – The index of the task feature

Returns:

The input warping transform.

ax.models.torch.botorch_defaults.recommend_best_observed_point(model: TorchModel, bounds: List[Tuple[float, float]], objective_weights: Tensor, outcome_constraints: Optional[Tuple[Tensor, Tensor]] = None, linear_constraints: Optional[Tuple[Tensor, Tensor]] = None, fixed_features: Optional[Dict[int, float]] = None, model_gen_options: Optional[Dict[str, Optional[Union[int, float, str, AcquisitionFunction, Dict[int, Any], Dict[str, Any], OptimizationConfig, WinsorizationConfig]]]] = None, target_fidelities: Optional[Dict[int, float]] = None) Optional[Tensor][source]

A wrapper around ax.models.model_utils.best_observed_point for TorchModel that recommends a best point from previously observed points using either a “max_utility” or “feasible_threshold” strategy.

Parameters:
  • model – A TorchModel.

  • bounds – A list of (lower, upper) tuples for each column of X.

  • objective_weights – The objective is to maximize a weighted sum of the columns of f(x). These are the weights.

  • outcome_constraints – A tuple of (A, b). For k outcome constraints and m outputs at f(x), A is (k x m) and b is (k x 1) such that A f(x) <= b.

  • linear_constraints – A tuple of (A, b). For k linear constraints on d-dimensional x, A is (k x d) and b is (k x 1) such that A x <= b.

  • fixed_features – A map {feature_index: value} for features that should be fixed to a particular value in the best point.

  • model_gen_options – A config dictionary that can contain model-specific options. See TorchOptConfig for details.

  • target_fidelities – A map {feature_index: value} of fidelity feature column indices to their respective target fidelities. Used for multi-fidelity optimization.

Returns:

A d-array of the best point, or None if no feasible point was observed.
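The constraint convention A f(x) <= b and the weighted-sum objective can be illustrated with a simplified selection rule over observed points. This is only a sketch: it ignores predictive uncertainty and the “max_utility” and “feasible_threshold” strategies, works on plain lists, and best_feasible_observed is a hypothetical name rather than the function in ax.models.model_utils.

```python
from typing import List, Optional, Sequence


def best_feasible_observed(
    X: Sequence[Sequence[float]],
    F: Sequence[Sequence[float]],
    objective_weights: Sequence[float],
    A: Sequence[Sequence[float]],
    b: Sequence[float],
) -> Optional[List[float]]:
    """Return the observed x whose outcomes f(x) satisfy A f(x) <= b and
    maximize the weighted sum of outcomes, or None if nothing is feasible."""
    best_x, best_val = None, float("-inf")
    for x, f in zip(X, F):
        feasible = all(
            sum(a_ij * f_j for a_ij, f_j in zip(row, f)) <= b_i
            for row, b_i in zip(A, b)
        )
        if not feasible:
            continue
        val = sum(w * f_j for w, f_j in zip(objective_weights, f))
        if val > best_val:
            best_x, best_val = list(x), val
    return best_x


X = [[0.0, 0.0], [0.5, 0.5], [1.0, 1.0]]
F = [[1.0, 0.2], [2.0, 0.8], [3.0, 1.5]]  # columns: objective, constrained metric
# One outcome constraint: the second outcome must be at most 1.0.
best = best_feasible_observed(X, F, objective_weights=[1.0, 0.0], A=[[0.0, 1.0]], b=[1.0])
infeasible = best_feasible_observed(X, F, objective_weights=[1.0, 0.0], A=[[0.0, 1.0]], b=[0.1])
```

The third point has the largest objective value but violates the constraint, so the second point is selected; with a bound that no point meets, the result is None.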

ax.models.torch.botorch_defaults.recommend_best_out_of_sample_point(model: TorchModel, bounds: List[Tuple[float, float]], objective_weights: Tensor, outcome_constraints: Optional[Tuple[Tensor, Tensor]] = None, linear_constraints: Optional[Tuple[Tensor, Tensor]] = None, fixed_features: Optional[Dict[int, float]] = None, model_gen_options: Optional[Dict[str, Optional[Union[int, float, str, AcquisitionFunction, Dict[int, Any], Dict[str, Any], OptimizationConfig, WinsorizationConfig]]]] = None, target_fidelities: Optional[Dict[int, float]] = None) Optional[Tensor][source]

Identify the current best point by optimizing the posterior mean of the model. This is “out-of-sample” because it considers un-observed designs as well.

Return None if no such point can be identified.

Parameters:
  • model – A TorchModel.

  • bounds – A list of (lower, upper) tuples for each column of X.

  • objective_weights – The objective is to maximize a weighted sum of the columns of f(x). These are the weights.

  • outcome_constraints – A tuple of (A, b). For k outcome constraints and m outputs at f(x), A is (k x m) and b is (k x 1) such that A f(x) <= b.

  • linear_constraints – A tuple of (A, b). For k linear constraints on d-dimensional x, A is (k x d) and b is (k x 1) such that A x <= b.

  • fixed_features – A map {feature_index: value} for features that should be fixed to a particular value in the best point.

  • model_gen_options – A config dictionary that can contain model-specific options. See TorchOptConfig for details.

  • target_fidelities – A map {feature_index: value} of fidelity feature column indices to their respective target fidelities. Used for multi-fidelity optimization.

Returns:

A d-array of the best point, or None if no feasible point exists.

ax.models.torch.botorch_defaults.scipy_optimizer(acq_function: AcquisitionFunction, bounds: Tensor, n: int, inequality_constraints: Optional[List[Tuple[Tensor, Tensor, float]]] = None, equality_constraints: Optional[List[Tuple[Tensor, Tensor, float]]] = None, fixed_features: Optional[Dict[int, float]] = None, rounding_func: Optional[Callable[[Tensor], Tensor]] = None, *, num_restarts: int = 20, raw_samples: Optional[int] = None, joint_optimization: bool = False, options: Optional[Dict[str, Union[bool, float, int, str]]] = None) Tuple[Tensor, Tensor][source]

Optimizer using scipy’s minimize module on a numpy adaptor.

Parameters:
  • acq_function – A botorch AcquisitionFunction.

  • bounds – A 2 x d-dim tensor, where bounds[0] (bounds[1]) are the lower (upper) bounds of the feasible hyperrectangle.

  • n – The number of candidates to generate.

  • inequality_constraints – A list of tuples (indices, coefficients, rhs), with each tuple encoding an inequality constraint of the form sum_i (X[indices[i]] * coefficients[i]) >= rhs

  • equality_constraints – A list of tuples (indices, coefficients, rhs), with each tuple encoding an equality constraint of the form sum_i (X[indices[i]] * coefficients[i]) == rhs

  • fixed_features – A map {feature_index: value} for features that should be fixed to a particular value during generation.

  • rounding_func – A function that rounds an optimization result appropriately (i.e., according to round-trip transformations).

Returns:

2-element tuple containing

  • An n x d-dim tensor of generated candidates.

  • In the case of joint optimization, a scalar tensor containing the joint acquisition value of the n points. In the case of sequential optimization, an n-dim tensor of conditional acquisition values, where the i-th element is the expected acquisition value conditional on having observed candidates 0,1,…,i-1.
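The (indices, coefficients, rhs) constraint encoding and the {feature_index: value} map for fixed features can be checked by hand on a single candidate. The sketch below uses plain lists instead of tensors; lhs, satisfies, and apply_fixed_features are hypothetical helpers for illustration and are not part of Ax or BoTorch.

```python
from typing import Dict, List, Sequence, Tuple

Constraint = Tuple[Sequence[int], Sequence[float], float]


def lhs(x: Sequence[float], constraint: Constraint) -> float:
    """Evaluate sum_i (x[indices[i]] * coefficients[i])."""
    indices, coefficients, _ = constraint
    return sum(x[i] * c for i, c in zip(indices, coefficients))


def satisfies(
    x: Sequence[float],
    inequality_constraints: List[Constraint],
    equality_constraints: List[Constraint],
    tol: float = 1e-9,
) -> bool:
    """Check the '>= rhs' inequalities and '== rhs' equalities up to a tolerance."""
    ok_ineq = all(lhs(x, c) >= c[2] - tol for c in inequality_constraints)
    ok_eq = all(abs(lhs(x, c) - c[2]) <= tol for c in equality_constraints)
    return ok_ineq and ok_eq


def apply_fixed_features(x: Sequence[float], fixed_features: Dict[int, float]) -> List[float]:
    """Overwrite the entries of x listed in fixed_features."""
    out = list(x)
    for idx, value in fixed_features.items():
        out[idx] = value
    return out


ineq = [([0, 1], [1.0, 1.0], 1.0)]  # x0 + x1 >= 1
eq = [([2], [1.0], 0.5)]            # x2 == 0.5
x = apply_fixed_features([0.75, 0.5, 0.0], {2: 0.5})
```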

ax.models.torch.botorch_kg module

class ax.models.torch.botorch_kg.KnowledgeGradient(cost_intercept: float = 1.0, linear_truncated: bool = True, use_input_warping: bool = False, **kwargs: Any)[source]

Bases: BotorchModel

The Knowledge Gradient with one shot optimization.

Parameters:
  • cost_intercept – The cost intercept for the affine cost of the form cost_intercept + n, where n is the number of generated points. Only used for multi-fidelity optimization (i.e., if fidelity_features are present).

  • linear_truncated – If False, use an alternate downsampling + exponential decay Kernel instead of the default LinearTruncatedFidelityKernel (only relevant for multi-fidelity optimization).

  • kwargs – Model-specific kwargs.

Xs: List[Tensor]
Ys: List[Tensor]
Yvars: List[Tensor]
fidelity_features: List[int]
gen(n: int, search_space_digest: SearchSpaceDigest, torch_opt_config: TorchOptConfig) TorchGenResults[source]

Generate new candidates.

Parameters:
  • n – Number of candidates to generate.

  • search_space_digest – A SearchSpaceDigest object containing metadata about the search space (e.g. bounds, parameter types).

  • torch_opt_config – A TorchOptConfig object containing optimization arguments (e.g., objective weights, constraints).

Returns:

A TorchGenResults container, containing

  • (n x d) tensor of generated points.

  • n-tensor of weights for each point.

  • Dictionary of model-specific metadata for the given generation candidates.

metric_names: List[str]
task_features: List[int]

ax.models.torch.botorch_mes module

class ax.models.torch.botorch_mes.MaxValueEntropySearch(cost_intercept: float = 1.0, linear_truncated: bool = True, use_input_warping: bool = False, **kwargs: Any)[source]

Bases: BotorchModel

Max-value entropy search.

Parameters:
  • cost_intercept – The cost intercept for the affine cost of the form cost_intercept + n, where n is the number of generated points. Only used for multi-fidelity optimization (i.e., if fidelity_features are present).

  • linear_truncated – If False, use an alternate downsampling + exponential decay Kernel instead of the default LinearTruncatedFidelityKernel (only relevant for multi-fidelity optimization).

  • kwargs – Model-specific kwargs.

Xs: List[Tensor]
Ys: List[Tensor]
Yvars: List[Tensor]
fidelity_features: List[int]
gen(n: int, search_space_digest: SearchSpaceDigest, torch_opt_config: TorchOptConfig) TorchGenResults[source]

Generate new candidates.

Parameters:
  • n – Number of candidates to generate.

  • search_space_digest – A SearchSpaceDigest object containing metadata about the search space (e.g. bounds, parameter types).

  • torch_opt_config – A TorchOptConfig object containing optimization arguments (e.g., objective weights, constraints).

Returns:

A TorchGenResults container.

metric_names: List[str]
task_features: List[int]

ax.models.torch.botorch_moo module

class ax.models.torch.botorch_moo.MultiObjectiveBotorchModel(model_constructor: ~typing.Callable[[~typing.List[~torch.Tensor], ~typing.List[~torch.Tensor], ~typing.List[~torch.Tensor], ~typing.List[int], ~typing.List[int], ~typing.List[str], ~typing.Optional[~typing.Dict[str, ~torch.Tensor]], ~typing.Any], ~botorch.models.model.Model] = <function get_and_fit_model>, model_predictor: ~typing.Callable[[~botorch.models.model.Model, ~torch.Tensor], ~typing.Tuple[~torch.Tensor, ~torch.Tensor]] = <function predict_from_model>, acqf_constructor: ~ax.models.torch.botorch_defaults.TAcqfConstructor = <function get_qLogNEHVI>, acqf_optimizer: ~typing.Callable[[~botorch.acquisition.acquisition.AcquisitionFunction, ~torch.Tensor, int, ~typing.Optional[~typing.List[~typing.Tuple[~torch.Tensor, ~torch.Tensor, float]]], ~typing.Optional[~typing.List[~typing.Tuple[~torch.Tensor, ~torch.Tensor, float]]], ~typing.Optional[~typing.Dict[int, float]], ~typing.Optional[~typing.Callable[[~torch.Tensor], ~torch.Tensor]], ~typing.Any], ~typing.Tuple[~torch.Tensor, ~torch.Tensor]] = <function scipy_optimizer>, best_point_recommender: ~typing.Callable[[~ax.models.torch_base.TorchModel, ~typing.List[~typing.Tuple[float, float]], ~torch.Tensor, ~typing.Optional[~typing.Tuple[~torch.Tensor, ~torch.Tensor]], ~typing.Optional[~typing.Tuple[~torch.Tensor, ~torch.Tensor]], ~typing.Optional[~typing.Dict[int, float]], ~typing.Optional[~typing.Dict[str, ~typing.Optional[~typing.Union[int, float, str, ~botorch.acquisition.acquisition.AcquisitionFunction, ~typing.Dict[int, ~typing.Any], ~typing.Dict[str, ~typing.Any], ~ax.core.optimization_config.OptimizationConfig, ~ax.models.winsorization_config.WinsorizationConfig]]]], ~typing.Optional[~typing.Dict[int, float]]], ~typing.Optional[~torch.Tensor]] = <function recommend_best_observed_point>, frontier_evaluator: ~typing.Callable[[~ax.models.torch_base.TorchModel, ~torch.Tensor, ~typing.Optional[~torch.Tensor], ~typing.Optional[~torch.Tensor], 
~typing.Optional[~torch.Tensor], ~typing.Optional[~torch.Tensor], ~typing.Optional[~typing.Tuple[~torch.Tensor, ~torch.Tensor]]], ~typing.Tuple[~torch.Tensor, ~torch.Tensor, ~torch.Tensor]] = <function pareto_frontier_evaluator>, refit_on_cv: bool = False, refit_on_update: bool = True, warm_start_refitting: bool = False, use_input_warping: bool = False, use_loocv_pseudo_likelihood: bool = False, prior: ~typing.Optional[~typing.Dict[str, ~typing.Any]] = None, **kwargs: ~typing.Any)[source]

Bases: BotorchModel

Customizable multi-objective model.

By default, this uses an Expected Hypervolume Improvement function to find the Pareto frontier of a function with multiple outcomes. This behavior can be modified by providing custom implementations of the following components:

  • a model_constructor that instantiates and fits a model on data

  • a model_predictor that predicts outcomes using the fitted model

  • an acqf_constructor that creates an acquisition function from a fitted model

  • an acqf_optimizer that optimizes the acquisition function

Parameters:
  • model_constructor – A callable that instantiates and fits a model on data, with signature as described below.

  • model_predictor – A callable that predicts using the fitted model, with signature as described below.

  • acqf_constructor – A callable that creates an acquisition function from a fitted model, with signature as described below.

  • acqf_optimizer – A callable that optimizes an acquisition function, with signature as described below.

Call signatures:

model_constructor(
    Xs,
    Ys,
    Yvars,
    task_features,
    fidelity_features,
    metric_names,
    state_dict,
    **kwargs,
) -> model

Here Xs, Ys, Yvars are lists of tensors (one element per outcome), task_features identifies columns of Xs that should be modeled as a task, fidelity_features is a list of ints that specify the positions of fidelity parameters in ‘Xs’, metric_names provides the names of each Y in Ys, state_dict is a pytorch module state dict, and model is a BoTorch Model. Optional kwargs are being passed through from the BotorchModel constructor. This callable is assumed to return a fitted BoTorch model that has the same dtype and lives on the same device as the input tensors.

model_predictor(model, X) -> [mean, cov]

Here model is a fitted botorch model, X is a tensor of candidate points, and mean and cov are the posterior mean and covariance, respectively.

acqf_constructor(
    model,
    objective_weights,
    outcome_constraints,
    X_observed,
    X_pending,
    **kwargs,
) -> acq_function

Here model is a botorch Model, objective_weights is a tensor of weights for the model outputs, outcome_constraints is a tuple of tensors describing the (linear) outcome constraints, X_observed are previously observed points, and X_pending are points whose evaluation is pending. acq_function is a BoTorch acquisition function crafted from these inputs. For additional details on the arguments, see get_qLogNEHVI.

acqf_optimizer(
    acq_function,
    bounds,
    n,
    inequality_constraints,
    equality_constraints,
    fixed_features,
    rounding_func,
    **kwargs,
) -> candidates

Here acq_function is a BoTorch AcquisitionFunction, bounds is a tensor containing bounds on the parameters, n is the number of candidates to be generated, inequality_constraints and equality_constraints are inequality and equality constraints on parameter values, fixed_features specifies features that should be fixed during generation, and rounding_func is a callback that rounds an optimization result appropriately. candidates is a tensor of generated candidates. For additional details on the arguments, see scipy_optimizer.

frontier_evaluator(
    model,
    objective_weights,
    objective_thresholds,
    X,
    Y,
    Yvar,
    outcome_constraints,
)

Here model is a botorch Model, objective_thresholds is used in hypervolume evaluations, objective_weights is a tensor of weights applied to the objectives (sign represents direction), X, Y, Yvar are tensors, outcome_constraints is a tuple of tensors describing the (linear) outcome constraints.
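To illustrate how the sign of each objective weight sets the optimization direction when a frontier is evaluated, the sketch below flags non-dominated rows of a small outcome matrix. It uses plain lists, ignores objective thresholds and outcome constraints, and pareto_mask is a hypothetical helper rather than the default frontier evaluator.

```python
def pareto_mask(Y, objective_weights):
    """Flag non-dominated rows of Y after multiplying each column by its weight.

    A negative weight means the outcome is minimized; a zero weight means the
    outcome is not an objective and is ignored.
    """
    cols = [k for k, w in enumerate(objective_weights) if w != 0]
    Z = [[objective_weights[k] * y[k] for k in cols] for y in Y]

    def dominates(a, b):
        # a dominates b if it is at least as good everywhere and better somewhere.
        return all(ai >= bi for ai, bi in zip(a, b)) and any(
            ai > bi for ai, bi in zip(a, b)
        )

    return [not any(dominates(other, z) for other in Z) for z in Z]


Y = [[1.0, 5.0], [2.0, 4.0], [1.5, 4.5], [0.5, 6.0]]
# Maximize the first outcome, minimize the second.
mask = pareto_mask(Y, objective_weights=[1.0, -1.0])
# With both outcomes maximized, no row dominates another in this data set.
mask_max = pareto_mask(Y, objective_weights=[1.0, 1.0])
```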

Xs: List[Tensor]
Ys: List[Tensor]
Yvars: List[Tensor]
gen(n: int, search_space_digest: SearchSpaceDigest, torch_opt_config: TorchOptConfig) TorchGenResults[source]

Generate new candidates.

Parameters:
  • n – Number of candidates to generate.

  • search_space_digest – A SearchSpaceDigest object containing metadata about the search space (e.g. bounds, parameter types).

  • torch_opt_config – A TorchOptConfig object containing optimization arguments (e.g., objective weights, constraints).

Returns:

A TorchGenResults container.

ax.models.torch.botorch_moo_defaults module

References

[Daulton2020qehvi]

S. Daulton, M. Balandat, and E. Bakshy. Differentiable Expected Hypervolume Improvement for Parallel Multi-Objective Bayesian Optimization. Advances in Neural Information Processing Systems 33, 2020.

[Daulton2021nehvi]

S. Daulton, M. Balandat, and E. Bakshy. Parallel Bayesian Optimization of Multiple Noisy Objectives with Expected Hypervolume Improvement. Advances in Neural Information Processing Systems 34, 2021.

[Ament2023logei]

S. Ament, S. Daulton, D. Eriksson, M. Balandat, and E. Bakshy. Unexpected Improvements to Expected Improvement for Bayesian Optimization. Advances in Neural Information Processing Systems 36, 2023.

ax.models.torch.botorch_moo_defaults.get_EHVI(model: Model, objective_weights: Tensor, objective_thresholds: Tensor, outcome_constraints: Optional[Tuple[Tensor, Tensor]] = None, X_observed: Optional[Tensor] = None, X_pending: Optional[Tensor] = None, *, mc_samples: int = 128, alpha: Optional[float] = None, seed: Optional[int] = None) qExpectedHypervolumeImprovement[source]

Instantiates a qExpectedHypervolumeImprovement acquisition function.

Parameters:
  • model – The underlying model which the acquisition function uses to estimate acquisition values of candidates.

  • objective_weights – The objective is to maximize a weighted sum of the columns of f(x). These are the weights.

  • objective_thresholds – A tensor containing thresholds forming a reference point from which to calculate pareto frontier hypervolume. Points that do not dominate the objective_thresholds contribute nothing to hypervolume.

  • outcome_constraints – A tuple of (A, b). For k outcome constraints and m outputs at f(x), A is (k x m) and b is (k x 1) such that A f(x) <= b. (Not used by single task models)

  • X_observed – A tensor containing points observed for all objective outcomes and outcomes that appear in the outcome constraints (if there are any).

  • X_pending – A tensor containing points whose evaluation is pending (i.e. that have been submitted for evaluation) present for all objective outcomes and outcomes that appear in the outcome constraints (if there are any).

  • mc_samples – The number of MC samples to use (default: 128).

  • alpha – The hyperparameter controlling the approximate non-dominated partitioning. A value of 0.0 means an exact partitioning is used. As the number of objectives m increases, consider increasing this parameter in order to limit computational complexity (default: None).

  • seed – The random seed for generating random starting points for optimization.

Returns:

The instantiated acquisition function.

Return type:

qExpectedHypervolumeImprovement
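The role of objective_thresholds as a reference point can be seen in a small two-objective hypervolume computation. The sketch below assumes both objectives are maximized and uses plain tuples; hypervolume_2d is a hypothetical helper for illustration, not the partitioning-based computation used by the acquisition function.

```python
def hypervolume_2d(points, ref):
    """Area dominated by `points` and bounded below by `ref` (both objectives maximized)."""
    # Only points strictly better than the reference in both objectives contribute.
    pts = [p for p in points if p[0] > ref[0] and p[1] > ref[1]]
    # Sweep from the largest first objective to the smallest, adding the strip
    # of new area each time the second objective improves.
    pts.sort(key=lambda p: p[0], reverse=True)
    volume, best_y = 0.0, ref[1]
    for x, y in pts:
        if y > best_y:
            volume += (x - ref[0]) * (y - best_y)
            best_y = y
    return volume


points = [(3.0, 1.0), (2.0, 2.0), (1.0, 3.0), (0.5, 0.5), (-1.0, 5.0)]
hv = hypervolume_2d(points, ref=(0.0, 0.0))
```

The point (-1.0, 5.0) does not dominate the reference point and adds nothing, and (0.5, 0.5) is dominated by the other points, so only the first three points determine the result.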

ax.models.torch.botorch_moo_defaults.get_NEHVI(model: Model, objective_weights: Tensor, objective_thresholds: Tensor, outcome_constraints: Optional[Tuple[Tensor, Tensor]] = None, X_observed: Optional[Tensor] = None, X_pending: Optional[Tensor] = None, *, prune_baseline: bool = True, mc_samples: int = 128, alpha: Optional[float] = None, marginalize_dim: Optional[int] = None, cache_root: bool = True, seed: Optional[int] = None) qNoisyExpectedHypervolumeImprovement[source]

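Instantiates a qNoisyExpectedHypervolumeImprovement acquisition function.

Parameters:
  • model – The underlying model which the acquisition function uses to estimate acquisition values of candidates.

  • objective_weights – The objective is to maximize a weighted sum of the columns of f(x). These are the weights.

  • objective_thresholds – A tensor containing thresholds forming a reference point from which to calculate pareto frontier hypervolume. Points that do not dominate the objective_thresholds contribute nothing to hypervolume.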

  • outcome_constraints – A tuple of (A, b). For k outcome constraints and m outputs at f(x), A is (k x m) and b is (k x 1) such that A f(x) <= b. (Not used by single task models)

  • X_observed – A tensor containing points observed for all objective outcomes and outcomes that appear in the outcome constraints (if there are any).

  • X_pending – A tensor containing points whose evaluation is pending (i.e. that have been submitted for evaluation) present for all objective outcomes and outcomes that appear in the outcome constraints (if there are any).

  • prune_baseline – If True, prune the baseline points for NEI (default: True).

  • mc_samples – The number of MC samples to use (default: 128).

  • alpha – The hyperparameter controlling the approximate non-dominated partitioning. A value of 0.0 means an exact partitioning is used. As the number of objectives m increases, consider increasing this parameter in order to limit computational complexity (default: None).

  • marginalize_dim – The dimension along which to marginalize over, used for fully Bayesian models (default: None).

  • cache_root – If True, cache the root of the covariance matrix (default: True).

  • seed – The random seed for generating random starting points for optimization (default: None).

Returns:

The instantiated acquisition function.

Return type:

qNoisyExpectedHypervolumeImprovement

ax.models.torch.botorch_moo_defaults.get_default_frontier_evaluator() Callable[[TorchModel, Tensor, Optional[Tensor], Optional[Tensor], Optional[Tensor], Optional[Tensor], Optional[Tuple[Tensor, Tensor]]], Tuple[Tensor, Tensor, Tensor]][source]
ax.models.torch.botorch_moo_defaults.get_qLogEHVI(model: Model, objective_weights: Tensor, objective_thresholds: Tensor, outcome_constraints: Optional[Tuple[Tensor, Tensor]] = None, X_observed: Optional[Tensor] = None, X_pending: Optional[Tensor] = None, *, mc_samples: int = 128, alpha: Optional[float] = None, seed: Optional[int] = None) qLogExpectedHypervolumeImprovement[source]

Instantiates a qLogExpectedHypervolumeImprovement acquisition function.

Parameters:
  • model – The underlying model which the acquisition function uses to estimate acquisition values of candidates.

  • objective_weights – The objective is to maximize a weighted sum of the columns of f(x). These are the weights.

  • objective_thresholds – A tensor containing thresholds forming a reference point from which to calculate pareto frontier hypervolume. Points that do not dominate the objective_thresholds contribute nothing to hypervolume.

  • outcome_constraints – A tuple of (A, b). For k outcome constraints and m outputs at f(x), A is (k x m) and b is (k x 1) such that A f(x) <= b. (Not used by single task models)

  • X_observed – A tensor containing points observed for all objective outcomes and outcomes that appear in the outcome constraints (if there are any).

  • X_pending – A tensor containing points whose evaluation is pending (i.e. that have been submitted for evaluation) present for all objective outcomes and outcomes that appear in the outcome constraints (if there are any).

  • mc_samples – The number of MC samples to use (default: 128).

  • alpha – The hyperparameter controlling the approximate non-dominated partitioning. A value of 0.0 means an exact partitioning is used. As the number of objectives m increases, consider increasing this parameter in order to limit computational complexity (default: None).

  • seed – The random seed for generating random starting points for optimization.

Returns:

The instantiated acquisition function.

Return type:

qLogExpectedHypervolumeImprovement

ax.models.torch.botorch_moo_defaults.get_qLogNEHVI(model: Model, objective_weights: Tensor, objective_thresholds: Tensor, outcome_constraints: Optional[Tuple[Tensor, Tensor]] = None, X_observed: Optional[Tensor] = None, X_pending: Optional[Tensor] = None, *, prune_baseline: bool = True, mc_samples: int = 128, alpha: Optional[float] = None, marginalize_dim: Optional[int] = None, cache_root: bool = True, seed: Optional[int] = None) qLogNoisyExpectedHypervolumeImprovement[source]

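Instantiates a qLogNoisyExpectedHypervolumeImprovement acquisition function.

Parameters:
  • model – The underlying model which the acquisition function uses to estimate acquisition values of candidates.

  • objective_weights – The objective is to maximize a weighted sum of the columns of f(x). These are the weights.

  • objective_thresholds – A tensor containing thresholds forming a reference point from which to calculate pareto frontier hypervolume. Points that do not dominate the objective_thresholds contribute nothing to hypervolume.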

  • outcome_constraints – A tuple of (A, b). For k outcome constraints and m outputs at f(x), A is (k x m) and b is (k x 1) such that A f(x) <= b. (Not used by single task models)

  • X_observed – A tensor containing points observed for all objective outcomes and outcomes that appear in the outcome constraints (if there are any).

  • X_pending – A tensor containing points whose evaluation is pending (i.e. that have been submitted for evaluation) present for all objective outcomes and outcomes that appear in the outcome constraints (if there are any).

  • prune_baseline – If True, prune the baseline points for NEI (default: True).

  • mc_samples – The number of MC samples to use (default: 128).

  • alpha – The hyperparameter controlling the approximate non-dominated partitioning. A value of 0.0 means an exact partitioning is used. As the number of objectives m increases, consider increasing this parameter in order to limit computational complexity (default: None).

  • marginalize_dim – The dimension along which to marginalize over, used for fully Bayesian models (default: None).

  • cache_root – If True, cache the root of the covariance matrix (default: True).

  • seed – The random seed for generating random starting points for optimization (default: None).

Returns:

The instantiated acquisition function.

Return type:

qLogNoisyExpectedHypervolumeImprovement

ax.models.torch.botorch_moo_defaults.get_weighted_mc_objective_and_objective_thresholds(objective_weights: Tensor, objective_thresholds: Tensor) Tuple[WeightedMCMultiOutputObjective, Tensor][source]

Construct weighted objective and apply the weights to objective thresholds.

Parameters:
  • objective_weights – The objective is to maximize a weighted sum of the columns of f(x). These are the weights.

  • objective_thresholds – A tensor containing thresholds forming a reference point from which to calculate pareto frontier hypervolume. Points that do not dominate the objective_thresholds contribute nothing to hypervolume.

Returns:

  • The objective

  • The objective thresholds

Return type:

A two-element tuple with the objective and objective thresholds
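
As a rough illustration of what "apply the weights to objective thresholds" means, the sketch below uses plain Python lists instead of tensors. The helper name weight_outcomes_and_thresholds is hypothetical, and dropping zero-weight outcomes is an assumption made for this example rather than something stated above.

```python
from typing import List, Tuple


def weight_outcomes_and_thresholds(
    objective_weights: List[float],
    objective_thresholds: List[float],
) -> Tuple[List[int], List[float], List[float]]:
    """Hypothetical helper: keep outcomes with nonzero weight and scale
    their thresholds by the same weights, so every objective is maximized."""
    idcs = [i for i, w in enumerate(objective_weights) if w != 0]
    weights = [objective_weights[i] for i in idcs]
    thresholds = [objective_weights[i] * objective_thresholds[i] for i in idcs]
    return idcs, weights, thresholds


# Outcome 0 is maximized, outcome 1 is ignored, outcome 2 is minimized.
idcs, weights, thresholds = weight_outcomes_and_thresholds(
    [1.0, 0.0, -1.0], [0.5, float("nan"), 2.0]
)
```

Multiplying a minimized outcome and its threshold by the same negative weight turns "stay below 2.0" into "stay above -2.0", which keeps the dominance comparison consistent.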

ax.models.torch.botorch_moo_defaults.infer_objective_thresholds(model: Model, objective_weights: Tensor, bounds: Optional[List[Tuple[float, float]]] = None, outcome_constraints: Optional[Tuple[Tensor, Tensor]] = None, linear_constraints: Optional[Tuple[Tensor, Tensor]] = None, fixed_features: Optional[Dict[int, float]] = None, subset_idcs: Optional[Tensor] = None, Xs: Optional[List[Tensor]] = None, X_observed: Optional[Tensor] = None, objective_thresholds: Optional[Tensor] = None) Tensor[source]

Infer objective thresholds.

This method uses the model-estimated Pareto frontier over the in-sample points to infer absolute (not relativized) objective thresholds.

This uses a heuristic that sets the objective threshold to be a scaled nadir point, where the nadir point is scaled back based on the range of each objective across the current in-sample Pareto frontier.

See botorch.utils.multi_objective.hypervolume.infer_reference_point for details on the heuristic.

Parameters:
  • model – A fitted botorch Model.

  • objective_weights – The objective is to maximize a weighted sum of the columns of f(x). These are the weights. These should not be subsetted.

  • bounds – A list of (lower, upper) tuples for each column of X.

  • outcome_constraints – A tuple of (A, b). For k outcome constraints and m outputs at f(x), A is (k x m) and b is (k x 1) such that A f(x) <= b. These should not be subsetted.

  • linear_constraints – A tuple of (A, b). For k linear constraints on d-dimensional x, A is (k x d) and b is (k x 1) such that A x <= b.

  • fixed_features – A map {feature_index: value} for features that should be fixed to a particular value during generation.

  • subset_idcs – The indices of the outcomes that are modeled by the provided model. If subset_idcs is None, this method infers whether the model is subsetted.

  • Xs – A list of m (k_i x d) feature tensors X. Number of rows k_i can vary from i=1,…,m.

  • X_observed – A n x d-dim tensor of in-sample points to use for determining the current in-sample Pareto frontier.

  • objective_thresholds – Any known objective thresholds to pass to infer_reference_point heuristic. This should not be subsetted. If only a subset of the objectives have known thresholds, the remaining objectives should be NaN. If no objective threshold was provided, this can be None.

Returns:

An m-dim tensor of objective thresholds, where the objective threshold is NaN if the outcome is not an objective.
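
The scaled-nadir heuristic described above can be sketched in plain Python as follows. This is a simplified illustration, not the BoTorch implementation: it assumes all objectives are maximized, that the frontier has at least two points, and the name scaled_nadir and the default scale of 0.1 are assumptions made for this example.

```python
from typing import List


def scaled_nadir(pareto_Y: List[List[float]], scale: float = 0.1) -> List[float]:
    """Reference point for maximization: nadir - scale * (ideal - nadir)."""
    m = len(pareto_Y[0])
    # Worst and best value of each objective across the frontier.
    nadir = [min(row[k] for row in pareto_Y) for k in range(m)]
    ideal = [max(row[k] for row in pareto_Y) for k in range(m)]
    return [nadir[k] - scale * (ideal[k] - nadir[k]) for k in range(m)]


frontier = [[1.0, 4.0], [2.0, 3.0], [3.0, 1.0]]
ref_point = scaled_nadir(frontier, scale=0.5)
```

Here the nadir is [1.0, 1.0] and the per-objective ranges are [2.0, 3.0], so each threshold is pushed back from the nadir by half of its range.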

ax.models.torch.botorch_moo_defaults.pareto_frontier_evaluator(model: Optional[TorchModel], objective_weights: Tensor, objective_thresholds: Optional[Tensor] = None, X: Optional[Tensor] = None, Y: Optional[Tensor] = None, Yvar: Optional[Tensor] = None, outcome_constraints: Optional[Tuple[Tensor, Tensor]] = None) Tuple[Tensor, Tensor, Tensor][source]

Return outcomes predicted to lie on a pareto frontier.

Given a model and points to evaluate, use the model to predict which points lie on the Pareto frontier.

Parameters:
  • model – Model used to predict outcomes.

  • objective_weights – An m-dim tensor of values indicating the weight to put on different outcomes. For Pareto frontiers only the sign matters.

  • objective_thresholds – A tensor containing thresholds forming a reference point from which to calculate pareto frontier hypervolume. Points that do not dominate the objective_thresholds contribute nothing to hypervolume.

  • X – A n x d tensor of features to evaluate.

  • Y – A n x m tensor of outcomes to use instead of predictions.

  • Yvar – A n x m x m tensor of input covariances (NaN if unobserved).

  • outcome_constraints – A tuple of (A, b). For k outcome constraints and m outputs at f(x), A is (k x m) and b is (k x 1) such that A f(x) <= b.

Returns:

3-element tuple containing

  • A j x m tensor of outcomes on the Pareto frontier, where j is the number of frontier points.

  • A j x m x m tensor of predictive covariances. cov[j, m1, m2] is Cov[m1@j, m2@j].

  • A j tensor of the index of each frontier point in the input Y.
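
To make "only the sign matters" concrete, here is a minimal non-dominated filter in plain Python. It is a sketch only: it ignores objective_thresholds, outcome_constraints and covariances, and the function name pareto_frontier is hypothetical.

```python
from typing import List, Tuple


def pareto_frontier(
    Y: List[List[float]], objective_weights: List[float]
) -> Tuple[List[List[float]], List[int]]:
    """Return the non-dominated rows of Y and their indices in Y.

    Only the sign of each weight is used; zero-weight outcomes are ignored.
    """
    signs = [(w > 0) - (w < 0) for w in objective_weights]
    cols = [k for k, s in enumerate(signs) if s != 0]
    # Flip minimized outcomes so that larger is always better.
    Z = [[signs[k] * row[k] for k in cols] for row in Y]

    def dominates(a: List[float], b: List[float]) -> bool:
        at_least_as_good = all(x >= y for x, y in zip(a, b))
        strictly_better = any(x > y for x, y in zip(a, b))
        return at_least_as_good and strictly_better

    idcs = [
        i for i, z in enumerate(Z)
        if not any(dominates(other, z) for other in Z)
    ]
    return [Y[i] for i in idcs], idcs


Y = [[1.0, 1.0], [2.0, 3.0], [1.5, 4.0], [3.0, 5.0]]
# Maximize outcome 0, minimize outcome 1.
frontier_Y, frontier_idcs = pareto_frontier(Y, [1.0, -1.0])
```

The point [1.5, 4.0] is dropped because [2.0, 3.0] is better on both outcomes; the other three points trade one outcome against the other.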

ax.models.torch.botorch_moo_defaults.scipy_optimizer_list(acq_function_list: List[AcquisitionFunction], bounds: Tensor, inequality_constraints: Optional[List[Tuple[Tensor, Tensor, float]]] = None, fixed_features: Optional[Dict[int, float]] = None, rounding_func: Optional[Callable[[Tensor], Tensor]] = None, num_restarts: int = 20, raw_samples: Optional[int] = None, options: Optional[Dict[str, Union[bool, float, int, str]]] = None) Tuple[Tensor, Tensor][source]

Sequential optimizer using scipy’s minimize module on a numpy-adaptor.

The ith acquisition in the sequence uses the ith given acquisition_function.

Parameters:
  • acq_function_list – A list of botorch AcquisitionFunctions, optimized sequentially.

  • bounds – A 2 x d-dim tensor, where bounds[0] (bounds[1]) are the lower (upper) bounds of the feasible hyperrectangle.

  • n – The number of candidates to generate.

  • inequality_constraints – A list of tuples (indices, coefficients, rhs), with each tuple encoding an inequality constraint of the form sum_i (X[indices[i]] * coefficients[i]) >= rhs.

  • fixed_features – A map {feature_index: value} for features that should be fixed to a particular value during generation.

  • rounding_func – A function that rounds an optimization result appropriately (i.e., according to round-trip transformations).

Returns:

2-element tuple containing

  • A n x d-dim tensor of generated candidates.

  • A n-dim tensor of conditional acquisition values, where i-th element is the expected acquisition value conditional on having observed candidates 0,1,…,i-1.
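
The (indices, coefficients, rhs) encoding of inequality constraints can be checked by hand with a few lines of Python. In the actual API the indices and coefficients are tensors; plain lists are used here for illustration, and the helper name satisfies is hypothetical.

```python
from typing import List, Sequence, Tuple

Constraint = Tuple[List[int], List[float], float]


def satisfies(x: Sequence[float], constraints: List[Constraint]) -> bool:
    """Check sum_i x[indices[i]] * coefficients[i] >= rhs for every tuple."""
    return all(
        sum(x[i] * c for i, c in zip(indices, coefficients)) >= rhs
        for indices, coefficients, rhs in constraints
    )


# Encode x0 + x2 <= 1 in ">=" form as -x0 - x2 >= -1.
constraints = [([0, 2], [-1.0, -1.0], -1.0)]
feasible = satisfies([0.25, 0.9, 0.5], constraints)
infeasible = satisfies([0.75, 0.0, 0.5], constraints)
```

Because the form is always ">=", an upper-bound constraint has to be negated on both sides before it is passed in.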

ax.models.torch.botorch_modular.acquisition module

class ax.models.torch.botorch_modular.acquisition.Acquisition(surrogates: Dict[str, Surrogate], search_space_digest: SearchSpaceDigest, torch_opt_config: TorchOptConfig, botorch_acqf_class: Type[AcquisitionFunction], options: Optional[Dict[str, Any]] = None)[source]

Bases: Base

All classes in ‘botorch_modular’ directory are under construction, incomplete, and should be treated as alpha versions only.

Ax wrapper for BoTorch AcquisitionFunction. It is a subcomponent of BoTorchModel and is not meant to be used outside of it.

Parameters:
  • surrogates – Dict of name => Surrogate model pairs, with which this acquisition function will be used.

  • search_space_digest – A SearchSpaceDigest object containing metadata about the search space (e.g. bounds, parameter types).

  • torch_opt_config – A TorchOptConfig object containing optimization arguments (e.g., objective weights, constraints).

  • botorch_acqf_class – Type of BoTorch AcquisitionFunction that should be used. Subclasses of Acquisition often specify these via default_botorch_acqf_class attribute, in which case specifying one here is not required.

  • options – Optional mapping of kwargs to the underlying Acquisition Function in BoTorch.

acqf: AcquisitionFunction
property botorch_acqf_class: Type[AcquisitionFunction]

BoTorch AcquisitionFunction class underlying this Acquisition.

compute_model_dependencies(surrogates: Mapping[str, Surrogate], search_space_digest: SearchSpaceDigest, torch_opt_config: TorchOptConfig, options: Optional[Dict[str, Any]] = None) Dict[str, Any][source]

Computes inputs to acquisition function class based on the given surrogate model.

NOTE: When subclassing Acquisition from a superclass where this method returns a non-empty dictionary of kwargs to AcquisitionFunction, call super().compute_model_dependencies and then update that dictionary of options with the options for the subclass you are creating (unless the superclass’ model dependencies should not be propagated to the subclass). See MultiFidelityKnowledgeGradient.compute_model_dependencies for an example.

Parameters:
  • surrogates – Mapping from names to Surrogate objects containing BoTorch Models, with which this Acquisition is to be used.

  • search_space_digest – A SearchSpaceDigest object containing metadata about the search space (e.g. bounds, parameter types).

  • torch_opt_config – A TorchOptConfig object containing optimization arguments (e.g., objective weights, constraints).

  • options – The options kwarg dict, passed on initialization of the Acquisition object.

Returns:

A dictionary of surrogate model-dependent options, to be passed as kwargs to the BoTorch AcquisitionFunction constructor.
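
The subclassing pattern from the NOTE above (call super(), then update the returned dictionary) looks roughly like this. The classes and dictionary keys are hypothetical stand-ins, and the method signature is reduced to options only; the real method also takes surrogates, search_space_digest and torch_opt_config.

```python
from typing import Any, Dict, Optional


class ParentAcquisition:
    """Hypothetical stand-in for an Acquisition with model dependencies."""

    def compute_model_dependencies(
        self, options: Optional[Dict[str, Any]] = None
    ) -> Dict[str, Any]:
        return {"current_value": 0.0}


class ChildAcquisition(ParentAcquisition):
    def compute_model_dependencies(
        self, options: Optional[Dict[str, Any]] = None
    ) -> Dict[str, Any]:
        # Start from the superclass dependencies, then add our own.
        dependencies = super().compute_model_dependencies(options=options)
        dependencies.update({"cost_aware_utility": "hypothetical-utility"})
        return dependencies


deps = ChildAcquisition().compute_model_dependencies()
```

Skipping the super() call is only appropriate when the parent's dependencies should not reach the subclass's acquisition function.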

property device: Optional[device]

Torch device type of the tensors in the training data used in the model, of which this Acquisition is a subcomponent.

property dtype: Optional[dtype]

Torch data type of the tensors in the training data used in the model, of which this Acquisition is a subcomponent.

evaluate(X: Tensor) Tensor[source]

Evaluate the acquisition function on the candidate set X.

Parameters:

X – A batch_shape x q x d-dim Tensor of t-batches with q d-dim design points each.

Returns:

A batch_shape’-dim Tensor of acquisition values at the given design points X, where batch_shape’ is the broadcasted batch shape of model and input X.

get_botorch_objective_and_transform(botorch_acqf_class: Type[AcquisitionFunction], model: Model, objective_weights: Tensor, objective_thresholds: Optional[Tensor] = None, outcome_constraints: Optional[Tuple[Tensor, Tensor]] = None, X_observed: Optional[Tensor] = None, risk_measure: Optional[RiskMeasureMCObjective] = None) Tuple[Optional[MCAcquisitionObjective], Optional[PosteriorTransform]][source]
property objective_thresholds: Optional[Tensor]

The objective thresholds for all outcomes.

For non-objective outcomes, the objective thresholds are nans.

property objective_weights: Optional[Tensor]

The objective weights for all outcomes.

optimize(n: int, search_space_digest: SearchSpaceDigest, inequality_constraints: Optional[List[Tuple[Tensor, Tensor, float]]] = None, fixed_features: Optional[Dict[int, float]] = None, rounding_func: Optional[Callable[[Tensor], Tensor]] = None, optimizer_options: Optional[Dict[str, Any]] = None) Tuple[Tensor, Tensor][source]

Generate a set of candidates via multi-start optimization. Obtains candidates and their associated acquisition function values.

Parameters:
  • n – The number of candidates to generate.

  • search_space_digest – A SearchSpaceDigest object containing search space properties, e.g. bounds for optimization.

  • inequality_constraints – A list of tuples (indices, coefficients, rhs), with each tuple encoding an inequality constraint of the form sum_i (X[indices[i]] * coefficients[i]) >= rhs.

  • fixed_features – A map {feature_index: value} for features that should be fixed to a particular value during generation.

  • rounding_func – A function that post-processes an optimization result appropriately. This is typically passed down from ModelBridge to ensure compatibility of the candidates with Ax transforms. For additional post-processing, use the post_processing_func option in optimizer_options.

  • optimizer_options – Options for the optimizer function, e.g. sequential or raw_samples. This can also include a post_processing_func which is applied to the candidates before the rounding_func. post_processing_func can be used to support more customized options that typically only exist in MBM, such as BoTorch transforms. See the docstring of TorchOptConfig for more information on passing down these options while constructing a generation strategy.

Returns:

A two-element tuple containing an n x d-dim tensor of generated candidates and a tensor with the associated acquisition value.
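
The ordering described for optimizer_options (post_processing_func is applied before rounding_func) can be pictured with a small stand-alone sketch. All function names here are hypothetical, and candidates are plain nested lists rather than tensors.

```python
from typing import Callable, List

Candidates = List[List[float]]


def finalize_candidates(
    candidates: Candidates,
    post_processing_func: Callable[[Candidates], Candidates],
    rounding_func: Callable[[Candidates], Candidates],
) -> Candidates:
    """Apply post_processing_func first, then rounding_func."""
    return rounding_func(post_processing_func(candidates))


def clip_to_unit(candidates: Candidates) -> Candidates:
    # Example post-processing step: clip every value into [0, 1].
    return [[min(max(v, 0.0), 1.0) for v in row] for row in candidates]


def round_to_int(candidates: Candidates) -> Candidates:
    # Example rounding step: snap every value to the nearest integer.
    return [[float(round(v)) for v in row] for row in candidates]


final = finalize_candidates([[1.3, 0.2], [-0.4, 0.8]], clip_to_unit, round_to_int)
```

Running the rounding step last means the candidates that leave the function are always in the form the rounding function guarantees.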

options: Dict[str, Any]
surrogates: Dict[str, Surrogate]

ax.models.torch.botorch_modular.default_options module

ax.models.torch.botorch_modular.list_surrogate module

ax.models.torch.randomforest module

class ax.models.torch.randomforest.RandomForest(max_features: Optional[str] = 'sqrt', num_trees: int = 500)[source]

Bases: TorchModel

A Random Forest model.

Uses a parametric bootstrap to handle uncertainty in Y.

Can be used to fit data, make predictions, and do cross validation; however gen is not implemented and so this model cannot generate new points.

Parameters:
  • max_features – Maximum number of features at each split. With one-hot encoding, this should be set to None. Defaults to “sqrt”, which is Breiman’s version of Random Forest.

  • num_trees – Number of trees.
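
One way to picture a parametric bootstrap over noisy observations is to give every tree its own draw of the targets. The sketch below is an assumption about the general technique, not a description of this class's internals; the name parametric_bootstrap and the one-draw-per-tree layout are made up for illustration.

```python
import random
from typing import List


def parametric_bootstrap(
    Y: List[float], Yvar: List[float], num_trees: int, seed: int = 0
) -> List[List[float]]:
    """Draw one noisy copy of Y per tree, with y_i ~ Normal(Y[i], Yvar[i])."""
    rng = random.Random(seed)
    return [
        [rng.gauss(y, var ** 0.5) for y, var in zip(Y, Yvar)]
        for _ in range(num_trees)
    ]


samples = parametric_bootstrap(
    Y=[1.0, 2.0, 3.0], Yvar=[0.0, 0.0, 0.25], num_trees=5
)
```

Observations with zero variance are reproduced exactly in every draw, while noisy observations vary from tree to tree, so the spread of the trees' predictions reflects the observation noise.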

cross_validate(datasets: List[SupervisedDataset], X_test: Tensor, **kwargs: Any) Tuple[Tensor, Tensor][source]

Do cross validation with the given training and test sets.

Training set is given in the same format as to fit. Test set is given in the same format as to predict.

Parameters:
  • datasets – A list of SupervisedDataset containers, each corresponding to the data of one metric (outcome).

  • X_test – (j x d) tensor of the j points at which to make predictions.

  • search_space_digest – A SearchSpaceDigest object containing metadata on the features in X.

Returns:

2-element tuple containing

  • (j x m) tensor of outcome predictions at X.

  • (j x m x m) tensor of predictive covariances at X. cov[j, m1, m2] is Cov[m1@j, m2@j].

fit(datasets: List[SupervisedDataset], search_space_digest: SearchSpaceDigest, candidate_metadata: Optional[List[List[Optional[Dict[str, Any]]]]] = None) None[source]

Fit model to m outcomes.

Parameters:
  • datasets – A list of SupervisedDataset containers, each corresponding to the data of one metric (outcome).

  • search_space_digest – A SearchSpaceDigest object containing metadata on the features in the datasets.

  • candidate_metadata – Model-produced metadata for candidates, in the order corresponding to the Xs.

predict(X: Tensor) Tuple[Tensor, Tensor][source]

Predict

Parameters:

X – (j x d) tensor of the j points at which to make predictions.

Returns:

2-element tuple containing

  • (j x m) tensor of outcome predictions at X.

  • (j x m x m) tensor of predictive covariances at X. cov[j, m1, m2] is Cov[m1@j, m2@j].
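
The (j x m x m) covariance layout can be illustrated with nested lists. The sketch assumes a model that treats outcomes independently, so each per-point covariance matrix is diagonal; that assumption and the helper name diagonal_covariances are for illustration only.

```python
from typing import List


def diagonal_covariances(variances: List[List[float]]) -> List[List[List[float]]]:
    """Turn (j x m) per-outcome variances into (j x m x m) diagonal covariances."""
    m = len(variances[0])
    return [
        [[row[a] if a == b else 0.0 for b in range(m)] for a in range(m)]
        for row in variances
    ]


# Two points, two outcomes.
cov = diagonal_covariances([[0.1, 0.2], [0.3, 0.4]])
```

With this layout, cov[1][0][0] is the predictive variance of outcome 0 at point 1, and the off-diagonal entry cov[1][0][1] is the covariance between outcomes 0 and 1 at that point.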

ax.models.torch.botorch_modular.model module

class ax.models.torch.botorch_modular.model.BoTorchModel(surrogate_specs: Optional[Mapping[str, SurrogateSpec]] = None, surrogate: Optional[Surrogate] = None, acquisition_class: Optional[Type[Acquisition]] = None, acquisition_options: Optional[Dict[str, Any]] = None, botorch_acqf_class: Optional[Type[AcquisitionFunction]] = None, refit_on_update: bool = True, refit_on_cv: bool = False, warm_start_refit: bool = True)[source]

Bases: TorchModel, Base

All classes in ‘botorch_modular’ directory are under construction, incomplete, and should be treated as alpha versions only.

Modular Model class for combining BoTorch subcomponents in Ax. Specified via Surrogate and Acquisition, which wrap BoTorch Model and AcquisitionFunction, respectively, for convenient use in Ax.

Parameters:
  • acquisition_class – Type of Acquisition to be used in this model, auto-selected based on experiment and data if not specified.

  • acquisition_options – Optional dict of kwargs, passed to the constructor of BoTorch AcquisitionFunction.

  • botorch_acqf_class – Type of AcquisitionFunction to be used in this model, auto-selected based on experiment and data if not specified.

  • surrogate_specs – Optional Mapping of names onto SurrogateSpecs, which specify how to initialize specific Surrogates to model specific outcomes. If None is provided a single Surrogate will be created and set up automatically based on the data provided.

  • surrogate – In lieu of SurrogateSpecs, an instance of Surrogate may be provided to be used as the sole Surrogate for all outcomes.

  • refit_on_update – Unused.

  • refit_on_cv – Whether to reoptimize model parameters during call to BoTorchModel.cross_validate.

  • warm_start_refit – Whether to load parameters from either the provided state dict or the state dict of the current BoTorch Model during refitting. If False, model parameters will be reoptimized from scratch on refit. NOTE: This setting is ignored during cross_validate if the corresponding refit_on_… is False.

property Xs: List[Tensor]

A list of tensors, each of shape batch_shape x n_i x d, where n_i is the number of training inputs for the i-th model.

NOTE: This is an accessor for self.surrogate.Xs and returns it unchanged.

acquisition_class: Type[Acquisition]
acquisition_options: Dict[str, Any]
best_point(search_space_digest: SearchSpaceDigest, torch_opt_config: TorchOptConfig) Optional[Tensor][source]

Identify the current best point, satisfying the constraints in the same format as to gen.

Return None if no such point can be identified.

Parameters:
  • search_space_digest – A SearchSpaceDigest object containing metadata about the search space (e.g. bounds, parameter types).

  • torch_opt_config – A TorchOptConfig object containing optimization arguments (e.g., objective weights, constraints).

Returns:

d-tensor of the best point.

property botorch_acqf_class: Type[AcquisitionFunction]

BoTorch AcquisitionFunction class, associated with this model. Raises an error if one is not yet set.

cross_validate(datasets: List[SupervisedDataset], X_test: Tensor, search_space_digest: SearchSpaceDigest, **additional_model_inputs: Any) Tuple[Tensor, Tensor][source]

Do cross validation with the given training and test sets.

Training set is given in the same format as to fit. Test set is given in the same format as to predict.

Parameters:
  • datasets – A list of SupervisedDataset containers, each corresponding to the data of one metric (outcome).

  • X_test – (j x d) tensor of the j points at which to make predictions.

  • search_space_digest – A SearchSpaceDigest object containing metadata on the features in X.

Returns:

2-element tuple containing

  • (j x m) tensor of outcome predictions at X.

  • (j x m x m) tensor of predictive covariances at X. cov[j, m1, m2] is Cov[m1@j, m2@j].

property device: device

Torch device type of the tensors in the training data used in this model.

property dtype: dtype

Torch data type of the tensors in the training data used in this model.

evaluate_acquisition_function(X: Tensor, search_space_digest: SearchSpaceDigest, torch_opt_config: TorchOptConfig, acq_options: Optional[Dict[str, Any]] = None) Tensor[source]

Evaluate the acquisition function on the candidate set X.

Parameters:
  • X – (j x d) tensor of the j points at which to evaluate the acquisition function.

  • search_space_digest – A dataclass used to compactly represent a search space.

  • torch_opt_config – A TorchOptConfig object containing optimization arguments (e.g., objective weights, constraints).

  • acq_options – Keyword arguments used to construct the acquisition function.

Returns:

A single-element tensor with the acquisition value for these points.

feature_importances() ndarray[source]

Compute feature importances from the model.

Caveat: This assumes the following:
  1. There is a single surrogate model (potentially a ModelList).

  2. We can get model lengthscales from covar_module.base_kernel.lengthscale

Returns:

The feature importances as a numpy array of size len(metrics) x 1 x dim where each row sums to 1.
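
A common way to turn kernel lengthscales into importances is to take inverse lengthscales and normalize them. The sketch below assumes that approach; it is not confirmed by the text above. It also drops the middle dimension of size 1, returning len(metrics) x dim nested lists, and the function name is hypothetical.

```python
from typing import List


def importances_from_lengthscales(
    lengthscales: List[List[float]],
) -> List[List[float]]:
    """Inverse lengthscales, normalized so each row (one per metric) sums to 1."""
    result = []
    for row in lengthscales:
        inverse = [1.0 / ls for ls in row]
        total = sum(inverse)
        result.append([v / total for v in inverse])
    return result


# Two metrics, three input dimensions.
importances = importances_from_lengthscales([[0.5, 1.0, 2.0], [1.0, 1.0, 2.0]])
```

A short lengthscale means the modeled outcome changes quickly along that dimension, so the dimension receives a larger share of the importance.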

fit(datasets: List[SupervisedDataset], search_space_digest: SearchSpaceDigest, candidate_metadata: Optional[List[List[Optional[Dict[str, Any]]]]] = None, state_dicts: Optional[Mapping[str, OrderedDict[str, Tensor]]] = None, refit: bool = True, **additional_model_inputs: Any) None[source]

Fit model to m outcomes.

Parameters:
  • datasets – A list of SupervisedDataset containers, each corresponding to the data of one or more outcomes.

  • search_space_digest – A SearchSpaceDigest object containing metadata on the features in the datasets.

  • candidate_metadata – Model-produced metadata for candidates, in the order corresponding to the Xs.

  • state_dicts – Optional state dict to load by model label as passed in via surrogate_specs. If using a single, pre-instantiated model use Keys.ONLY_SURROGATE.

  • refit – Whether to re-optimize model parameters.

  • additional_model_inputs – Additional kwargs to pass to the model input constructor in Surrogate.fit.

gen(n: int, search_space_digest: SearchSpaceDigest, torch_opt_config: TorchOptConfig) TorchGenResults[source]

Generate new candidates.

Parameters:
  • n – Number of candidates to generate.

  • search_space_digest – A SearchSpaceDigest object containing metadata about the search space (e.g. bounds, parameter types).

  • torch_opt_config – A TorchOptConfig object containing optimization arguments (e.g., objective weights, constraints).

Returns:

A TorchGenResults container.

property outcomes_by_surrogate_label: Dict[str, List[str]]

Returns a dictionary mapping from surrogate label to a list of outcomes.

property output_order: List[int]
predict(X: Tensor) Tuple[Tensor, Tensor][source]

Predicts, potentially from multiple surrogates.

If predictions are from multiple surrogates, will stitch outputs together in same order as input datasets, using self.output_order.

Parameters:

X – (n x d) Tensor of input locations.

Returns: Tuple of tensors: (n x m) mean, (n x m x m) covariance.
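
Stitching predictions from several surrogates back into the order of the input datasets can be sketched as a column concatenation followed by a reindex. How output_order is actually derived is an assumption here, and stitch_means with its arguments is a hypothetical helper operating on nested lists.

```python
from typing import Dict, List


def stitch_means(
    means_by_surrogate: Dict[str, List[List[float]]],
    outcomes_by_surrogate: Dict[str, List[str]],
    dataset_outcomes: List[str],
) -> List[List[float]]:
    """Concatenate per-surrogate (n x m_s) means and reorder the columns
    to follow the order of the input datasets."""
    labels = list(means_by_surrogate)
    names = [o for s in labels for o in outcomes_by_surrogate[s]]
    n = len(means_by_surrogate[labels[0]])
    concatenated = [
        [v for s in labels for v in means_by_surrogate[s][i]] for i in range(n)
    ]
    # Column k of the result is column output_order[k] of the concatenation.
    output_order = [names.index(o) for o in dataset_outcomes]
    return [[row[k] for k in output_order] for row in concatenated]


stitched = stitch_means(
    means_by_surrogate={"a": [[20.0], [21.0]], "b": [[0.0, 10.0], [1.0, 11.0]]},
    outcomes_by_surrogate={"a": ["y2"], "b": ["y0", "y1"]},
    dataset_outcomes=["y0", "y1", "y2"],
)
```

Surrogate "a" comes first in the concatenation but models the last outcome, so its column is moved to the end of each row.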

predict_from_surrogate(surrogate_label: str, X: Tensor) Tuple[Tensor, Tensor][source]

Predict from the Surrogate with the given label.

property search_space_digest: SearchSpaceDigest
property surrogate: Surrogate

Surrogate, if there is only one.

surrogate_specs: Dict[str, SurrogateSpec]
property surrogates: Dict[str, Surrogate]

Surrogates by label

class ax.models.torch.botorch_modular.model.SurrogateSpec(botorch_model_class: Optional[Type[Model]] = None, botorch_model_kwargs: Dict[str, Any] = <factory>, mll_class: Type[MarginalLogLikelihood] = <class 'gpytorch.mlls.exact_marginal_log_likelihood.ExactMarginalLogLikelihood'>, mll_kwargs: Dict[str, Any] = <factory>, covar_module_class: Optional[Type[Kernel]] = None, covar_module_kwargs: Optional[Dict[str, Any]] = None, likelihood_class: Optional[Type[Likelihood]] = None, likelihood_kwargs: Optional[Dict[str, Any]] = None, input_transform_classes: Optional[List[Type[InputTransform]]] = None, input_transform_options: Optional[Dict[str, Dict[str, Any]]] = None, outcome_transform_classes: Optional[List[Type[OutcomeTransform]]] = None, outcome_transform_options: Optional[Dict[str, Dict[str, Any]]] = None, allow_batched_models: bool = True, outcomes: List[str] = <factory>)[source]

Bases: object

Fields in the SurrogateSpec dataclass correspond to arguments in Surrogate.__init__, except for outcomes, which is used to specify which outcomes the Surrogate is responsible for modeling. When BoTorchModel.fit is called, these fields will be used to construct the requisite Surrogate objects. If outcomes is left empty then no outcomes will be fit to the Surrogate.
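
The <factory> defaults in the signature are what a dataclass shows for fields declared with a default factory. The reduced stand-in below (ToySurrogateSpec, a hypothetical class with only four of the fields) illustrates that behavior and how specs might be keyed by label; it does not import Ax.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class ToySurrogateSpec:
    """Hypothetical, reduced stand-in for SurrogateSpec."""

    botorch_model_class: Optional[type] = None
    # default_factory gives every instance its own fresh dict / list.
    botorch_model_kwargs: Dict[str, Any] = field(default_factory=dict)
    allow_batched_models: bool = True
    outcomes: List[str] = field(default_factory=list)


# One spec per surrogate label, each responsible for its own outcomes.
specs = {
    "objective": ToySurrogateSpec(outcomes=["accuracy"]),
    "constraint": ToySurrogateSpec(outcomes=["latency"], allow_batched_models=False),
}
```

Because the mutable defaults come from factories, editing botorch_model_kwargs on one spec does not leak into another.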

allow_batched_models: bool = True
botorch_model_class: Optional[Type[Model]] = None
botorch_model_kwargs: Dict[str, Any]
covar_module_class: Optional[Type[Kernel]] = None
covar_module_kwargs: Optional[Dict[str, Any]] = None
input_transform_classes: Optional[List[Type[InputTransform]]] = None
input_transform_options: Optional[Dict[str, Dict[str, Any]]] = None
likelihood_class: Optional[Type[Likelihood]] = None
likelihood_kwargs: Optional[Dict[str, Any]] = None
mll_class

alias of ExactMarginalLogLikelihood

mll_kwargs: Dict[str, Any]
outcome_transform_classes: Optional[List[Type[OutcomeTransform]]] = None
outcome_transform_options: Optional[Dict[str, Dict[str, Any]]] = None
outcomes: List[str]
ax.models.torch.botorch_modular.model.single_surrogate_only(f: Callable[[...], T]) Callable[[...], T][source]

For use as a decorator on functions only implemented for BoTorchModels with a single Surrogate.
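
A decorator of this kind can be sketched as follows. This is not the Ax implementation: the name single_surrogate_only_sketch, the ToyModel class, the choice of NotImplementedError and the error message are all assumptions made for illustration.

```python
import functools
from typing import Any, Callable, Dict, TypeVar

T = TypeVar("T")


def single_surrogate_only_sketch(f: Callable[..., T]) -> Callable[..., T]:
    """Hypothetical decorator: refuse to run f when there are several surrogates."""

    @functools.wraps(f)
    def wrapper(self: Any, *args: Any, **kwargs: Any) -> T:
        if len(self.surrogates) != 1:
            raise NotImplementedError(
                f"{f.__name__} is only implemented for a single surrogate."
            )
        return f(self, *args, **kwargs)

    return wrapper


class ToyModel:
    def __init__(self, surrogates: Dict[str, Any]) -> None:
        self.surrogates = surrogates

    @single_surrogate_only_sketch
    def surrogate(self) -> Any:
        return next(iter(self.surrogates.values()))
```

Calling ToyModel({"only": s}).surrogate() returns s, while the same call on a model with two surrogates raises before the method body runs.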

ax.models.torch.botorch_modular.multi_fidelity module

class ax.models.torch.botorch_modular.multi_fidelity.MultiFidelityAcquisition(surrogates: Dict[str, Surrogate], search_space_digest: SearchSpaceDigest, torch_opt_config: TorchOptConfig, botorch_acqf_class: Type[AcquisitionFunction], options: Optional[Dict[str, Any]] = None)[source]

Bases: Acquisition

X_observed: Tensor
X_pending: Optional[Tensor]
acqf: AcquisitionFunction
compute_model_dependencies(surrogates: Mapping[str, Surrogate], search_space_digest: SearchSpaceDigest, torch_opt_config: TorchOptConfig, options: Optional[Dict[str, Any]] = None) Dict[str, Any][source]

Computes inputs to acquisition function class based on the given surrogate model.

NOTE: When subclassing Acquisition from a superclass where this method returns a non-empty dictionary of kwargs to AcquisitionFunction, call super().compute_model_dependencies and then update that dictionary of options with the options for the subclass you are creating (unless the superclass’ model dependencies should not be propagated to the subclass). See MultiFidelityKnowledgeGradient.compute_model_dependencies for an example.

Parameters:
  • surrogates – Mapping from names to Surrogate objects containing BoTorch Models, with which this Acquisition is to be used.

  • search_space_digest – A SearchSpaceDigest object containing metadata about the search space (e.g. bounds, parameter types).

  • torch_opt_config – A TorchOptConfig object containing optimization arguments (e.g., objective weights, constraints).

  • options – The options kwarg dict, passed on initialization of the Acquisition object.

Returns:

A dictionary of surrogate model-dependent options, to be passed as kwargs to the BoTorch AcquisitionFunction constructor.

options: Dict[str, Any]
surrogates: Dict[str, Surrogate]

ax.models.torch.botorch_modular.optimizer_argparse module

ax.models.torch.botorch_modular.sebo module

ax.models.torch.botorch_modular.sebo.L1_norm_func(X: Tensor, init_point: Tensor) Tensor[source]

L1_norm_func maps a batch_shape x n x d-dim input tensor X to a batch_shape x n x 1-dim L1 norm tensor. To be used for constructing a GenericDeterministicModel.
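
The shape mapping (n x d in, n x 1 out) can be seen in a plain-Python sketch without batch dimensions. Measuring the norm relative to init_point is an assumption inferred from the signature, and the helper name l1_norm is hypothetical.

```python
from typing import List


def l1_norm(X: List[List[float]], init_point: List[float]) -> List[List[float]]:
    """Map an (n x d) input to an (n x 1) list of L1 distances from init_point."""
    return [[sum(abs(x - p) for x, p in zip(row, init_point))] for row in X]


# Two 2-dimensional points measured from the origin.
norms = l1_norm([[0.0, 0.0], [1.0, -2.0]], init_point=[0.0, 0.0])
```

Each output row keeps a trailing dimension of size 1, which is what lets the result act like a single-outcome model output.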

class ax.models.torch.botorch_modular.sebo.SEBOAcquisition(surrogates: Dict[str, Surrogate], search_space_digest: SearchSpaceDigest, torch_opt_config: TorchOptConfig, botorch_acqf_class: Type[AcquisitionFunction], options: Optional[Dict[str, Any]] = None)[source]

Bases: Acquisition

Implement the acquisition function of Sparsity Exploring Bayesian Optimization (SEBO).

SEBO is a hyperparameter-free method to simultaneously maximize a target objective and sparsity. When the L0 norm is used, SEBO uses a novel differentiable relaxation based on homotopy continuation to efficiently optimize for sparsity.

X_observed: Tensor
X_pending: Optional[Tensor]
acqf: AcquisitionFunction
optimize(n: int, search_space_digest: SearchSpaceDigest, inequality_constraints: Optional[List[Tuple[Tensor, Tensor, float]]] = None, fixed_features: Optional[Dict[int, float]] = None, rounding_func: Optional[Callable[[Tensor], Tensor]] = None, optimizer_options: Optional[Dict[str, Any]] = None) Tuple[Tensor, Tensor][source]

Generate a set of candidates via multi-start optimization. Obtains candidates and their associated acquisition function values.

Parameters:
  • n – The number of candidates to generate.

  • search_space_digest – A SearchSpaceDigest object containing search space properties, e.g. bounds for optimization.

  • inequality_constraints – A list of tuples (indices, coefficients, rhs), with each tuple encoding an inequality constraint of the form sum_i (X[indices[i]] * coefficients[i]) >= rhs.

  • fixed_features – A map {feature_index: value} for features that should be fixed to a particular value during generation.

  • rounding_func – A function that post-processes an optimization result appropriately (i.e., according to round-trip transformations).

  • optimizer_options – Options for the optimizer function, e.g. sequential or raw_samples.

options: Dict[str, Any]
surrogates: Dict[str, Surrogate]
ax.models.torch.botorch_modular.sebo.clamp_candidates(X: Tensor, target_point: Tensor, clamp_tol: float, **tkwargs: Any) Tensor[source]

Clamp generated candidates within the given ranges to the target point.
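
A minimal sketch of this kind of clamping, using nested lists: any coordinate within clamp_tol of the target is snapped exactly onto the target value. Whether the real comparison is inclusive and the helper name clamp_to_target are assumptions.

```python
from typing import List


def clamp_to_target(
    X: List[List[float]], target_point: List[float], clamp_tol: float
) -> List[List[float]]:
    """Snap coordinates within clamp_tol of target_point onto target_point."""
    return [
        [t if abs(x - t) <= clamp_tol else x for x, t in zip(row, target_point)]
        for row in X
    ]


clamped = clamp_to_target(
    [[0.01, 0.5], [0.2, -0.03]], target_point=[0.0, 0.0], clamp_tol=0.05
)
```

Snapping near-target coordinates produces candidates that are exactly sparse relative to the target point instead of only approximately so.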

ax.models.torch.botorch_modular.sebo.get_batch_initial_conditions(acq_function: AcquisitionFunction, raw_samples: int, X_pareto: Tensor, target_point: Tensor, num_restarts: int = 20, **tkwargs: Any) Tensor[source]

Generate starting points for the SEBO acquisition function optimization.

ax.models.torch.botorch_modular.surrogate module

class ax.models.torch.botorch_modular.surrogate.Surrogate(botorch_model_class: Optional[Type[Model]] = None, model_options: Optional[Dict[str, Any]] = None, mll_class: Type[MarginalLogLikelihood] = <class 'gpytorch.mlls.exact_marginal_log_likelihood.ExactMarginalLogLikelihood'>, mll_options: Optional[Dict[str, Any]] = None, outcome_transform_classes: Optional[List[Type[OutcomeTransform]]] = None, outcome_transform_options: Optional[Dict[str, Dict[str, Any]]] = None, input_transform_classes: Optional[List[Type[InputTransform]]] = None, input_transform_options: Optional[Dict[str, Dict[str, Any]]] = None, covar_module_class: Optional[Type[Kernel]] = None, covar_module_options: Optional[Dict[str, Any]] = None, likelihood_class: Optional[Type[Likelihood]] = None, likelihood_options: Optional[Dict[str, Any]] = None, allow_batched_models: bool = True)[source]

Bases: Base

All classes in ‘botorch_modular’ directory are under construction, incomplete, and should be treated as alpha versions only.

Ax wrapper for BoTorch Model. It is a subcomponent of BoTorchModel and is not meant to be used outside of it.

Parameters:
  • botorch_model_class – Model class to be used as the underlying BoTorch model. If None is provided, a model class (either one for all outcomes or a ModelList with separate models for each outcome) will be selected automatically based on the datasets at construct time.

  • model_options – Dictionary of options / kwargs for the BoTorch Model constructed during Surrogate.fit. Note that the corresponding attribute will later be updated to include any additional kwargs passed into BoTorchModel.fit.

  • mll_class – MarginalLogLikelihood class to use for model-fitting.

  • mll_options – Dictionary of options / kwargs for the MLL.

  • outcome_transform_classes – List of BoTorch outcome transform classes. Passed down to the BoTorch Model. Multiple outcome transforms can be chained together using ChainedOutcomeTransform.

  • outcome_transform_options – Outcome transform classes kwargs. The keys are class string names and the values are dictionaries of outcome transform kwargs. For example:

        outcome_transform_classes = [Standardize]
        outcome_transform_options = {
            "Standardize": {"m": 1},
        }

    For more options see botorch/models/transforms/outcome.py.

  • input_transform_classes – List of BoTorch input transform classes. Passed down to the BoTorch Model. Multiple input transforms will be chained together using ChainedInputTransform.

  • input_transform_options – Input transform classes kwargs. The keys are class string names and the values are dictionaries of input transform kwargs. For example:

        input_transform_classes = [Normalize, Round]
        input_transform_options = {
            "Normalize": {"d": 3},
            "Round": {"integer_indices": [0], "categorical_features": {1: 2}},
        }

    For more input options see botorch/models/transforms/input.py.

  • covar_module_class – Covariance module class. This gets initialized after parsing the covar_module_options in covar_module_argparse, and gets passed to the model constructor as covar_module.

  • covar_module_options – Covariance module kwargs.

  • likelihood_class – Likelihood class. This gets initialized with likelihood_options and gets passed to the model constructor.

  • likelihood_options – Likelihood options.

  • allow_batched_models – Set to true to fit the models in a batch if supported. Set to false to fit individual models to each metric in a loop.

property Xs: List[Tensor]
best_in_sample_point(search_space_digest: SearchSpaceDigest, torch_opt_config: TorchOptConfig, options: Optional[Dict[str, Optional[Union[int, float, str, AcquisitionFunction, Dict[int, Any], Dict[str, Any], OptimizationConfig, WinsorizationConfig]]]] = None) Tuple[Tensor, float][source]

Finds the best observed point and the corresponding observed outcome values.

best_out_of_sample_point(search_space_digest: SearchSpaceDigest, torch_opt_config: TorchOptConfig, options: Optional[Dict[str, Optional[Union[int, float, str, AcquisitionFunction, Dict[int, Any], Dict[str, Any], OptimizationConfig, WinsorizationConfig]]]] = None) Tuple[Tensor, Tensor][source]

Finds the best predicted point and the corresponding value of the appropriate best point acquisition function.

clone_reset() Surrogate[source]
compute_diagnostics() Dict[str, Any][source]

Computes model diagnostics like cross-validation measure of fit, etc.

property device: device
property dtype: dtype
fit(datasets: List[SupervisedDataset], search_space_digest: SearchSpaceDigest, candidate_metadata: Optional[List[List[Optional[Dict[str, Any]]]]] = None, state_dict: Optional[OrderedDict[str, Tensor]] = None, refit: bool = True) None[source]

Fits the underlying BoTorch Model to m outcomes.

NOTE: The state_dict and refit keyword arguments control how the underlying BoTorch Model will be fit: whether its parameters will be reoptimized and whether it will be warm-started from a given state.

There are three possibilities:

  • fit(state_dict=None): fit model from scratch (optimize model parameters and set its training data used for inference),

  • fit(state_dict=some_state_dict, refit=True): warm-start refit with a state dict of parameters (still re-optimize model parameters and set the training data),

  • fit(state_dict=some_state_dict, refit=False): load model parameters without refitting, but set new training data (used in cross-validation, for example).

Parameters:
  • datasets – A list of SupervisedDataset containers, each corresponding to the data of one metric (outcome), to be passed to Model.construct_inputs in BoTorch.

  • search_space_digest – A SearchSpaceDigest object containing metadata on the features in the datasets.

  • candidate_metadata – Model-produced metadata for candidates, in the order corresponding to the Xs.

  • state_dict – Optional state dict to load.

  • refit – Whether to re-optimize model parameters.
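The three cases listed for fit can be summarized as a small decision helper. This is only an illustrative sketch: describe_fit_mode is a hypothetical function, not part of Ax, and it merely mirrors how the state_dict and refit arguments are described as combining.

```python
from typing import Any, Dict, Optional


def describe_fit_mode(state_dict: Optional[Dict[str, Any]], refit: bool = True) -> str:
    """Hypothetical helper: name the fitting mode implied by the arguments."""
    if state_dict is None:
        # No saved parameters: optimize model parameters from scratch.
        return "fit from scratch"
    if refit:
        # Saved parameters are loaded first, then re-optimized.
        return "warm-start refit"
    # Saved parameters are loaded and kept; only the training data changes.
    return "load without refitting"


print(describe_fit_mode(None))
print(describe_fit_mode({"some_param": 0.1}, refit=False))
```

The last mode is the one described as useful in cross-validation, where new training folds are set without re-optimizing the parameters.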

property model: Model
property outcomes: List[str]
pareto_frontier() Tuple[Tensor, Tensor][source]

For multi-objective optimization, retrieve Pareto frontier instead of best point.

Returns: A two-tuple of:
  • tensor of points in the feature space,

  • tensor of corresponding (multiple) outcomes.

predict(X: Tensor) Tuple[Tensor, Tensor][source]

Predicts outcomes given an input tensor.

Parameters:

X – A n x d tensor of input parameters.

Returns:

A two-element tuple containing the predicted posterior mean as an n x o-dim tensor and the predicted posterior covariance as an n x o x o-dim tensor.

Return type:

Tuple[Tensor, Tensor]

property training_data: List[SupervisedDataset]

ax.models.torch.botorch_modular.utils module

ax.models.torch.botorch_modular.utils.check_outcome_dataset_match(outcome_names: List[str], datasets: List[SupervisedDataset], exact_match: bool) None[source]

Check that the given outcome names match those of datasets.

Based on exact_match, we either require that the outcome names are a subset of all outcomes or require them to be the same.

Also checks that there are no duplicates in outcome names.

Parameters:
  • outcome_names – A list of outcome names.

  • datasets – A list of SupervisedDataset objects.

  • exact_match – If True, outcome_names must be the same as the union of outcome names of the datasets. Otherwise, we check that the outcome_names are a subset of all outcomes.

Raises:

ValueError – If there is no match.
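The matching rule can be sketched in plain Python. This is a hedged illustration, not the Ax implementation: check_outcome_names is a hypothetical name, and each dataset is represented only by its list of outcome names instead of a SupervisedDataset. Checking for duplicates on both sides is an assumption of this sketch.

```python
from typing import List


def check_outcome_names(
    outcome_names: List[str],
    dataset_outcomes: List[List[str]],
    exact_match: bool,
) -> None:
    """Hypothetical stand-in: each inner list holds one dataset's outcome names."""
    all_outcomes = [name for names in dataset_outcomes for name in names]
    # Reject duplicates (assumed here for both the datasets and the requested names).
    if len(set(all_outcomes)) != len(all_outcomes):
        raise ValueError(f"Duplicate outcome names in datasets: {all_outcomes}")
    if len(set(outcome_names)) != len(outcome_names):
        raise ValueError(f"Duplicate outcome names requested: {outcome_names}")
    if exact_match:
        # The requested names must equal the union of the dataset outcomes.
        if set(outcome_names) != set(all_outcomes):
            raise ValueError("Outcome names do not exactly match the datasets.")
    elif not set(outcome_names) <= set(all_outcomes):
        # Otherwise a subset is enough.
        raise ValueError("Outcome names are not a subset of the dataset outcomes.")


# A subset passes when exact_match is False.
check_outcome_names(["a"], [["a"], ["b"]], exact_match=False)
```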

ax.models.torch.botorch_modular.utils.choose_botorch_acqf_class(pending_observations: Optional[List[Tensor]] = None, outcome_constraints: Optional[Tuple[Tensor, Tensor]] = None, linear_constraints: Optional[Tuple[Tensor, Tensor]] = None, fixed_features: Optional[Dict[int, float]] = None, objective_thresholds: Optional[Tensor] = None, objective_weights: Optional[Tensor] = None) Type[AcquisitionFunction][source]

Chooses a BoTorch AcquisitionFunction class.

ax.models.torch.botorch_modular.utils.choose_model_class(datasets: List[SupervisedDataset], search_space_digest: SearchSpaceDigest) Type[Model][source]

Chooses a BoTorch Model using the given data (currently just Yvars) and its properties (information about task and fidelity features).

Parameters:
  • datasets – A list of SupervisedDataset containers, one per outcome. Currently only the observation noise (Yvar) of each dataset is inspected.

  • search_space_digest – A SearchSpaceDigest object providing the task features and fidelity features, i.e. the columns of X that are tasks or fidelity parameters.

Returns:

A BoTorch Model class.

ax.models.torch.botorch_modular.utils.construct_acquisition_and_optimizer_options(acqf_options: Dict[str, Optional[Union[int, float, str, AcquisitionFunction, Dict[int, Any], Dict[str, Any], OptimizationConfig, WinsorizationConfig]]], model_gen_options: Optional[Dict[str, Optional[Union[int, float, str, AcquisitionFunction, Dict[int, Any], Dict[str, Any], OptimizationConfig, WinsorizationConfig]]]] = None) Tuple[Dict[str, Optional[Union[int, float, str, AcquisitionFunction, Dict[int, Any], Dict[str, Any], OptimizationConfig, WinsorizationConfig]]], Dict[str, Optional[Union[int, float, str, AcquisitionFunction, Dict[int, Any], Dict[str, Any], OptimizationConfig, WinsorizationConfig]]]][source]

Extract acquisition and optimizer options from model_gen_options.

ax.models.torch.botorch_modular.utils.convert_to_block_design(datasets: List[SupervisedDataset], force: bool = False) List[SupervisedDataset][source]
ax.models.torch.botorch_modular.utils.fit_botorch_model(model: Model, mll_class: Type[MarginalLogLikelihood], mll_options: Optional[Dict[str, Any]] = None) None[source]

Fit a BoTorch model.

ax.models.torch.botorch_modular.utils.get_post_processing_func(rounding_func: Optional[Callable[[Tensor], Tensor]], optimizer_options: Dict[str, Any]) Optional[Callable[[Tensor], Tensor]][source]

Get the post processing function by combining the rounding function with the post processing function provided as part of the optimizer options. If both are given, the post processing function is applied before applying the rounding function. If only one of them is given, then it is used as the post processing function.
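The composition order described above can be sketched as follows. This is a minimal illustration under stated assumptions: combine_post_processing is a hypothetical name, and plain lists of floats stand in for tensors.

```python
from typing import Callable, List, Optional

Vector = List[float]
Func = Callable[[Vector], Vector]


def combine_post_processing(
    rounding_func: Optional[Func],
    post_processing_func: Optional[Func],
) -> Optional[Func]:
    """Hypothetical sketch of combining the two optional callables."""
    if rounding_func is None:
        return post_processing_func
    if post_processing_func is None:
        return rounding_func

    def combined(x: Vector) -> Vector:
        # Post-processing runs first, rounding runs last.
        return rounding_func(post_processing_func(x))

    return combined


shift = lambda x: [v + 0.6 for v in x]  # illustrative post-processing step
to_int = lambda x: [float(round(v)) for v in x]  # illustrative rounding step
print(combine_post_processing(to_int, shift)([0.0]))
```

Applying rounding last means the final candidates always land on valid rounded values, whatever the post-processing step does.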

ax.models.torch.botorch_modular.utils.get_subset_datasets(datasets: List[SupervisedDataset], subset_outcome_names: List[str]) List[SupervisedDataset][source]

Get the list of datasets corresponding to the given subset of outcome names. This is used to separate out datasets that are used by one surrogate.

Parameters:
  • datasets – A list of SupervisedDataset objects.

  • subset_outcome_names – A list of outcome names to get datasets for.

Returns:

A list of SupervisedDataset objects corresponding to the given subset of outcome names.

ax.models.torch.botorch_modular.utils.subset_state_dict(state_dict: OrderedDict[str, Tensor], submodel_index: int) OrderedDict[str, Tensor][source]

Get the state dict for a submodel from the state dict of a model list.

Parameters:
  • state_dict – A state dict.

  • submodel_index – The index of the submodel to extract.

Returns:

The state dict for the submodel.
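One way to picture the extraction is key filtering by prefix. The sketch below assumes that the state dict of a model list stores each submodel's entries under keys prefixed with models.<index>. (an assumption about the key layout, not something stated here); subset_state_dict_sketch is a hypothetical name and plain values stand in for tensors.

```python
from collections import OrderedDict
from typing import Any


def subset_state_dict_sketch(
    state_dict: "OrderedDict[str, Any]", submodel_index: int
) -> "OrderedDict[str, Any]":
    """Keep entries of one submodel and strip the assumed 'models.<i>.' prefix."""
    prefix = f"models.{submodel_index}."
    return OrderedDict(
        (key[len(prefix):], value)
        for key, value in state_dict.items()
        if key.startswith(prefix)
    )


full = OrderedDict(
    [("models.0.mean.c", 1.0), ("models.1.mean.c", 2.0), ("models.10.mean.c", 3.0)]
)
print(subset_state_dict_sketch(full, 1))
```

Including the trailing dot in the prefix keeps index 1 from also matching index 10.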

ax.models.torch.botorch_modular.utils.use_model_list(datasets: List[SupervisedDataset], botorch_model_class: Type[Model], allow_batched_models: bool = True) bool[source]

ax.models.torch.botorch_modular.kernels module

class ax.models.torch.botorch_modular.kernels.ScaleMaternKernel(ard_num_dims: Optional[int] = None, batch_shape: Optional[Size] = None, lengthscale_prior: Optional[Prior] = None, outputscale_prior: Optional[Prior] = None, lengthscale_constraint: Optional[Interval] = None, outputscale_constraint: Optional[Interval] = None, **kwargs: Any)[source]

Bases: ScaleKernel

training: bool
class ax.models.torch.botorch_modular.kernels.TemporalKernel(dim: int, temporal_features: List[int], matern_ard_num_dims: Optional[int] = None, batch_shape: Optional[Size] = None, lengthscale_prior: Optional[Prior] = None, temporal_lengthscale_prior: Optional[Prior] = None, period_length_prior: Optional[Prior] = None, fixed_period_length: Optional[float] = None, outputscale_prior: Optional[Prior] = None, lengthscale_constraint: Optional[Interval] = None, outputscale_constraint: Optional[Interval] = None, temporal_lengthscale_constraint: Optional[Interval] = None, period_length_constraint: Optional[Interval] = None, **kwargs: Any)[source]

Bases: ScaleKernel

A product kernel of a periodic kernel and a Matern kernel.

The periodic kernel computes the similarity between temporal features such as the time of day.

The Matern kernel computes the similarity between the tunable parameters.

training: bool
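The product structure of TemporalKernel can be illustrated with textbook kernel formulas. This is a sketch under stated assumptions, not the class's implementation: the function names are hypothetical, the periodic kernel and the Matern kernel (with smoothness 5/2) use their standard closed forms, and scalar lengthscales replace the priors and constraints of the real class.

```python
import math
from typing import Sequence


def periodic_kernel(t1: float, t2: float, period: float, lengthscale: float) -> float:
    """Textbook periodic kernel on a scalar temporal feature."""
    s = math.sin(math.pi * abs(t1 - t2) / period)
    return math.exp(-2.0 * s**2 / lengthscale**2)


def matern52_kernel(x1: Sequence[float], x2: Sequence[float], lengthscale: float) -> float:
    """Textbook Matern-5/2 kernel on the tunable parameters."""
    r = math.sqrt(sum((a - b) ** 2 for a, b in zip(x1, x2))) / lengthscale
    s = math.sqrt(5.0) * r
    return (1.0 + s + s**2 / 3.0) * math.exp(-s)


def temporal_product_kernel(
    t1: float,
    x1: Sequence[float],
    t2: float,
    x2: Sequence[float],
    period: float = 24.0,
    temporal_lengthscale: float = 1.0,
    lengthscale: float = 1.0,
    outputscale: float = 1.0,
) -> float:
    """Scaled product of the periodic (time) and Matern (parameters) kernels."""
    k_time = periodic_kernel(t1, t2, period, temporal_lengthscale)
    k_params = matern52_kernel(x1, x2, lengthscale)
    return outputscale * k_time * k_params


# Same parameters observed one full period apart are treated as (almost) identical.
print(temporal_product_kernel(1.0, [0.2, 0.3], 25.0, [0.2, 0.3]))
```

With a period of 24, two observations taken at the same hour on different days have temporal similarity close to 1, so their covariance is driven by the tunable parameters alone.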

ax.models.torch.botorch_modular.input_constructors.covar_modules module

ax.models.torch.botorch_modular.input_constructors.input_transforms module

ax.models.torch.botorch_modular.input_constructors.outcome_transform module

ax.models.torch.cbo_lcea module

class ax.models.torch.cbo_lcea.LCEABO(decomposition: Dict[str, List[str]], cat_feature_dict: Optional[Dict] = None, embs_feature_dict: Optional[Dict] = None, context_weight_dict: Optional[Dict] = None, embs_dim_list: Optional[List[int]] = None, gp_model_args: Optional[Dict[str, Any]] = None)[source]

Bases: BotorchModel

Does Bayesian optimization with Latent Context Embedding Additive (LCE-A) GP. The parameter space decomposition must be provided.

Parameters:
  • decomposition – Keys are context names. Values are the lists of parameter names belonging to the context, e.g. {'context1': ['p1_c1', 'p2_c1'], 'context2': ['p1_c2', 'p2_c2']}.

  • gp_model_args – Dictionary of kwargs to pass to GP model training. Supported key: train_embedding (Boolean). If True, the context embedding is trained; otherwise, only the pre-trained embeddings from embs_feature_dict are used. Default is True.

Xs: List[Tensor]
Ys: List[Tensor]
Yvars: List[Tensor]
best_point(search_space_digest: SearchSpaceDigest, torch_opt_config: TorchOptConfig) Optional[Tensor][source]

Identify the current best point that satisfies the constraints, which are given in the same format as for gen.

Return None if no such point can be identified.

Parameters:
  • search_space_digest – A SearchSpaceDigest object containing metadata about the search space (e.g. bounds, parameter types).

  • torch_opt_config – A TorchOptConfig object containing optimization arguments (e.g., objective weights, constraints).

Returns:

d-tensor of the best point.

fidelity_features: List[int]
fit(datasets: List[SupervisedDataset], search_space_digest: SearchSpaceDigest, candidate_metadata: Optional[List[List[Optional[Dict[str, Any]]]]] = None) None[source]

Fit model to m outcomes.

Parameters:
  • datasets – A list of SupervisedDataset containers, each corresponding to the data of one metric (outcome).

  • search_space_digest – A SearchSpaceDigest object containing metadata on the features in the datasets.

  • candidate_metadata – Model-produced metadata for candidates, in the order corresponding to the Xs.

get_and_fit_model(Xs: List[Tensor], Ys: List[Tensor], Yvars: List[Tensor], task_features: List[int], fidelity_features: List[int], metric_names: List[str], state_dict: Optional[Dict[str, Tensor]] = None, fidelity_model_id: Optional[int] = None, **kwargs: Any) GPyTorchModel[source]

Get a fitted LCEAGP model for each outcome.

Parameters:
  • Xs – X for each outcome.

  • Ys – Y for each outcome.

  • Yvars – Noise variance of Y for each outcome.

Returns: Fitted LCEAGP model.

metric_names: List[str]
property model: Union[LCEAGP, ModelListGP]
task_features: List[int]
ax.models.torch.cbo_lcea.get_map_model(train_X: Tensor, train_Y: Tensor, train_Yvar: Tensor, decomposition: Dict[str, List[int]], train_embedding: bool = True, cat_feature_dict: Optional[Dict] = None, embs_feature_dict: Optional[Dict] = None, embs_dim_list: Optional[List[int]] = None, context_weight_dict: Optional[Dict] = None) Tuple[LCEAGP, ExactMarginalLogLikelihood][source]

Obtain MAP fitting of Latent Context Embedding Additive (LCE-A) GP.

ax.models.torch.cbo_lcem module

class ax.models.torch.cbo_lcem.LCEMBO(context_cat_feature: Optional[Tensor] = None, context_emb_feature: Optional[Tensor] = None, embs_dim_list: Optional[List[int]] = None)[source]

Bases: BotorchModel

Does Bayesian optimization with LCE-M GP.

Xs: List[Tensor]
Ys: List[Tensor]
Yvars: List[Tensor]
fidelity_features: List[int]
get_and_fit_model(Xs: List[Tensor], Ys: List[Tensor], Yvars: List[Tensor], task_features: List[int], fidelity_features: List[int], metric_names: List[str], state_dict: Optional[Dict[str, Tensor]] = None, fidelity_model_id: Optional[int] = None, **kwargs: Any) ModelListGP[source]

Get a fitted multi-task contextual GP model for each outcome.

Parameters:
  • Xs – List of X data, one tensor per outcome.

  • Ys – List of Y data, one tensor per outcome.

  • Yvars – List of noise variances of the Y data, one tensor per outcome.

  • task_features – List of columns of X that are tasks.

Returns: A ModelListGP in which each model is a fitted LCEM GP model.

metric_names: List[str]
task_features: List[int]

ax.models.torch.cbo_sac module

class ax.models.torch.cbo_sac.SACBO(decomposition: Dict[str, List[str]])[source]

Bases: BotorchModel

Does Bayesian optimization with structural additive contextual GP (SACGP). The parameter space decomposition must be provided.

Parameters:

decomposition – Keys are context names. Values are the lists of parameter names belonging to the context, e.g. {'context1': ['p1_c1', 'p2_c1'], 'context2': ['p1_c2', 'p2_c2']}.

Xs: List[Tensor]
Ys: List[Tensor]
Yvars: List[Tensor]
fidelity_features: List[int]
fit(datasets: List[SupervisedDataset], search_space_digest: SearchSpaceDigest, candidate_metadata: Optional[List[List[Optional[Dict[str, Any]]]]] = None) None[source]

Fit model to m outcomes.

Parameters:
  • datasets – A list of SupervisedDataset containers, each corresponding to the data of one metric (outcome).

  • search_space_digest – A SearchSpaceDigest object containing metadata on the features in the datasets.

  • candidate_metadata – Model-produced metadata for candidates, in the order corresponding to the Xs.

get_and_fit_model(Xs: List[Tensor], Ys: List[Tensor], Yvars: List[Tensor], task_features: List[int], fidelity_features: List[int], metric_names: List[str], state_dict: Optional[Dict[str, Tensor]] = None, fidelity_model_id: Optional[int] = None, **kwargs: Any) GPyTorchModel[source]

Get a fitted StructuralAdditiveContextualGP model for each outcome.

Parameters:
  • Xs – X for each outcome.

  • Ys – Y for each outcome.

  • Yvars – Noise variance of Y for each outcome.

Returns: Fitted StructuralAdditiveContextualGP model.

metric_names: List[str]
task_features: List[int]
ax.models.torch.cbo_sac.generate_model_space_decomposition(decomposition: Dict[str, List[str]], feature_names: List[str]) Dict[str, List[int]][source]
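This function is undocumented here, but its signature suggests that it maps the parameter names of each context to column indices in feature_names. The sketch below shows that reading; it is an assumption, decomposition_to_indices is a hypothetical name, and raising ValueError for unknown parameters is a design choice of the sketch.

```python
from typing import Dict, List


def decomposition_to_indices(
    decomposition: Dict[str, List[str]], feature_names: List[str]
) -> Dict[str, List[int]]:
    """Map each context's parameter names to their positions in feature_names."""
    index_of = {name: i for i, name in enumerate(feature_names)}
    result: Dict[str, List[int]] = {}
    for context, param_names in decomposition.items():
        missing = [p for p in param_names if p not in index_of]
        if missing:
            raise ValueError(f"Unknown parameters in context {context!r}: {missing}")
        result[context] = [index_of[p] for p in param_names]
    return result


decomposition = {"context1": ["p1_c1", "p2_c1"], "context2": ["p1_c2", "p2_c2"]}
feature_names = ["p1_c1", "p1_c2", "p2_c1", "p2_c2"]
print(decomposition_to_indices(decomposition, feature_names))
```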

ax.models.torch.frontier_utils module

ax.models.torch.frontier_utils.get_default_frontier_evaluator() Callable[[TorchModel, Tensor, Optional[Tensor], Optional[Tensor], Optional[Tensor], Optional[Tensor], Optional[Tuple[Tensor, Tensor]]], Tuple[Tensor, Tensor, Tensor]][source]
ax.models.torch.frontier_utils.get_weighted_mc_objective_and_objective_thresholds(objective_weights: Tensor, objective_thresholds: Tensor) Tuple[WeightedMCMultiOutputObjective, Tensor][source]

Construct weighted objective and apply the weights to objective thresholds.

Parameters:
  • objective_weights – The objective is to maximize a weighted sum of the columns of f(x). These are the weights.

  • objective_thresholds – A tensor containing thresholds forming a reference point from which to calculate pareto frontier hypervolume. Points that do not dominate the objective_thresholds contribute nothing to hypervolume.

Returns:

A two-element tuple containing

  • The objective

  • The objective thresholds
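Applying the weights to the thresholds can be pictured with plain lists. This is a hedged sketch, not the function's code: weight_thresholds is a hypothetical name, and dropping outcomes with zero weight is an assumption of the sketch. Multiplying by the weight flips the sign of the threshold for a minimized outcome (negative weight), so that every weighted outcome is maximized against its weighted threshold.

```python
from typing import List, Tuple


def weight_thresholds(
    objective_weights: List[float], objective_thresholds: List[float]
) -> Tuple[List[int], List[float], List[float]]:
    """Return (outcome indices, weights, weighted thresholds) for nonzero weights."""
    # Assumption: outcomes with zero weight are not objectives and are dropped.
    outcomes = [i for i, w in enumerate(objective_weights) if w != 0.0]
    weights = [objective_weights[i] for i in outcomes]
    thresholds = [objective_weights[i] * objective_thresholds[i] for i in outcomes]
    return outcomes, weights, thresholds


# Outcome 0 is maximized, outcome 1 is not an objective, outcome 2 is minimized.
print(weight_thresholds([1.0, 0.0, -1.0], [0.5, 9.0, 2.0]))
```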

ax.models.torch.fully_bayesian module

Models and utilities for fully Bayesian inference.

TODO: move some of this into botorch.

References

[Eriksson2021saasbo]

D. Eriksson, M. Jankowiak. High-Dimensional Bayesian Optimization with Sparse Axis-Aligned Subspaces. Proceedings of the Thirty-Seventh Conference on Uncertainty in Artificial Intelligence, 2021.

[Eriksson2021nas]

D. Eriksson, P. Chuang, S. Daulton, et al. Latency-Aware Neural Architecture Search with Multi-Objective Bayesian Optimization. ICML AutoML Workshop, 2021.

class ax.models.torch.fully_bayesian.FullyBayesianBotorchModel(model_constructor: ~typing.Callable[[~typing.List[~torch.Tensor], ~typing.List[~torch.Tensor], ~typing.List[~torch.Tensor], ~typing.List[int], ~typing.List[int], ~typing.List[str], ~typing.Optional[~typing.Dict[str, ~torch.Tensor]], ~typing.Any], ~botorch.models.model.Model] = <function get_and_fit_model_mcmc>, model_predictor: ~typing.Callable[[~botorch.models.model.Model, ~torch.Tensor], ~typing.Tuple[~torch.Tensor, ~torch.Tensor]] = <function predict_from_model_mcmc>, acqf_constructor: ~ax.models.torch.botorch_defaults.TAcqfConstructor = <function get_fully_bayesian_acqf>, acqf_optimizer: ~typing.Callable[[~botorch.acquisition.acquisition.AcquisitionFunction, ~torch.Tensor, int, ~typing.Optional[~typing.List[~typing.Tuple[~torch.Tensor, ~torch.Tensor, float]]], ~typing.Optional[~typing.List[~typing.Tuple[~torch.Tensor, ~torch.Tensor, float]]], ~typing.Optional[~typing.Dict[int, float]], ~typing.Optional[~typing.Callable[[~torch.Tensor], ~torch.Tensor]], ~typing.Any], ~typing.Tuple[~torch.Tensor, ~torch.Tensor]] = <function scipy_optimizer>, best_point_recommender: ~typing.Callable[[~ax.models.torch_base.TorchModel, ~typing.List[~typing.Tuple[float, float]], ~torch.Tensor, ~typing.Optional[~typing.Tuple[~torch.Tensor, ~torch.Tensor]], ~typing.Optional[~typing.Tuple[~torch.Tensor, ~torch.Tensor]], ~typing.Optional[~typing.Dict[int, float]], ~typing.Optional[~typing.Dict[str, ~typing.Optional[~typing.Union[int, float, str, ~botorch.acquisition.acquisition.AcquisitionFunction, ~typing.Dict[int, ~typing.Any], ~typing.Dict[str, ~typing.Any], ~ax.core.optimization_config.OptimizationConfig, ~ax.models.winsorization_config.WinsorizationConfig]]]], ~typing.Optional[~typing.Dict[int, float]]], ~typing.Optional[~torch.Tensor]] = <function recommend_best_observed_point>, refit_on_cv: bool = False, refit_on_update: bool = True, warm_start_refitting: bool = True, use_input_warping: bool = False, use_saas: 
~typing.Optional[bool] = None, num_samples: int = 256, warmup_steps: int = 512, thinning: int = 16, max_tree_depth: int = 6, disable_progbar: bool = False, gp_kernel: str = 'matern', verbose: bool = False, jit_compile: bool = False, **kwargs: ~typing.Any)[source]

Bases: FullyBayesianBotorchModelMixin, BotorchModel

Fully Bayesian Model that uses NUTS to sample from hyperparameter posterior.

This includes support for using sparse axis-aligned subspace priors (SAAS). See [Eriksson2021saasbo] for details.

class ax.models.torch.fully_bayesian.FullyBayesianBotorchModelMixin[source]

Bases: object

feature_importances() ndarray[source]
class ax.models.torch.fully_bayesian.FullyBayesianMOOBotorchModel(model_constructor: ~typing.Callable[[~typing.List[~torch.Tensor], ~typing.List[~torch.Tensor], ~typing.List[~torch.Tensor], ~typing.List[int], ~typing.List[int], ~typing.List[str], ~typing.Optional[~typing.Dict[str, ~torch.Tensor]], ~typing.Any], ~botorch.models.model.Model] = <function get_and_fit_model_mcmc>, model_predictor: ~typing.Callable[[~botorch.models.model.Model, ~torch.Tensor], ~typing.Tuple[~torch.Tensor, ~torch.Tensor]] = <function predict_from_model_mcmc>, acqf_constructor: ~ax.models.torch.botorch_defaults.TAcqfConstructor = <function get_fully_bayesian_acqf_nehvi>, acqf_optimizer: ~typing.Callable[[~botorch.acquisition.acquisition.AcquisitionFunction, ~torch.Tensor, int, ~typing.Optional[~typing.List[~typing.Tuple[~torch.Tensor, ~torch.Tensor, float]]], ~typing.Optional[~typing.List[~typing.Tuple[~torch.Tensor, ~torch.Tensor, float]]], ~typing.Optional[~typing.Dict[int, float]], ~typing.Optional[~typing.Callable[[~torch.Tensor], ~torch.Tensor]], ~typing.Any], ~typing.Tuple[~torch.Tensor, ~torch.Tensor]] = <function scipy_optimizer>, best_point_recommender: ~typing.Callable[[~ax.models.torch_base.TorchModel, ~typing.List[~typing.Tuple[float, float]], ~torch.Tensor, ~typing.Optional[~typing.Tuple[~torch.Tensor, ~torch.Tensor]], ~typing.Optional[~typing.Tuple[~torch.Tensor, ~torch.Tensor]], ~typing.Optional[~typing.Dict[int, float]], ~typing.Optional[~typing.Dict[str, ~typing.Optional[~typing.Union[int, float, str, ~botorch.acquisition.acquisition.AcquisitionFunction, ~typing.Dict[int, ~typing.Any], ~typing.Dict[str, ~typing.Any], ~ax.core.optimization_config.OptimizationConfig, ~ax.models.winsorization_config.WinsorizationConfig]]]], ~typing.Optional[~typing.Dict[int, float]]], ~typing.Optional[~torch.Tensor]] = <function recommend_best_observed_point>, frontier_evaluator: ~typing.Callable[[~ax.models.torch_base.TorchModel, ~torch.Tensor, ~typing.Optional[~torch.Tensor], 
~typing.Optional[~torch.Tensor], ~typing.Optional[~torch.Tensor], ~typing.Optional[~torch.Tensor], ~typing.Optional[~typing.Tuple[~torch.Tensor, ~torch.Tensor]]], ~typing.Tuple[~torch.Tensor, ~torch.Tensor, ~torch.Tensor]] = <function pareto_frontier_evaluator>, refit_on_cv: bool = False, refit_on_update: bool = True, warm_start_refitting: bool = False, use_input_warping: bool = False, num_samples: int = 256, warmup_steps: int = 512, thinning: int = 16, max_tree_depth: int = 6, use_saas: ~typing.Optional[bool] = None, disable_progbar: bool = False, gp_kernel: str = 'matern', verbose: bool = False, jit_compile: bool = False, **kwargs: ~typing.Any)[source]

Bases: FullyBayesianBotorchModelMixin, MultiObjectiveBotorchModel

Fully Bayesian Model that uses qNEHVI.

This includes support for using qNEHVI + SAASBO as in [Eriksson2021nas].

ax.models.torch.fully_bayesian.compute_dists(X: Tensor, Z: Tensor, lengthscale: Tensor) Tensor[source]

Compute kernel distances.

ax.models.torch.fully_bayesian.get_and_fit_model_mcmc(Xs: List[Tensor], Ys: List[Tensor], Yvars: List[Tensor], task_features: List[int], fidelity_features: List[int], metric_names: List[str], state_dict: Optional[Dict[str, Tensor]] = None, refit_model: bool = True, use_input_warping: bool = False, use_loocv_pseudo_likelihood: bool = False, num_samples: int = 256, warmup_steps: int = 512, thinning: int = 16, max_tree_depth: int = 6, disable_progbar: bool = False, gp_kernel: str = 'matern', verbose: bool = False, jit_compile: bool = False, **kwargs: Any) GPyTorchModel[source]

Instantiates a batched GPyTorchModel (ModelListGP) based on the given data and fits the model using MCMC in Pyro. The batch dimension corresponds to sampled hyperparameters from MCMC.

ax.models.torch.fully_bayesian.get_fully_bayesian_acqf(model: Model, objective_weights: Tensor, outcome_constraints: Optional[Tuple[Tensor, Tensor]] = None, X_observed: Optional[Tensor] = None, X_pending: Optional[Tensor] = None, **kwargs: Any) AcquisitionFunction[source]

NOTE: An acqf_constructor with which the underlying acquisition function is constructed is optionally extracted from kwargs and defaults to NEI.

We did not add acqf_constructor directly to the argument list of get_fully_bayesian_acqf so that it satisfies the TAcqfConstructor Protocol that is shared by all other legacy Ax acquisition function constructors.

ax.models.torch.fully_bayesian.get_fully_bayesian_acqf_nehvi(model: Model, objective_weights: Tensor, outcome_constraints: Optional[Tuple[Tensor, Tensor]] = None, X_observed: Optional[Tensor] = None, X_pending: Optional[Tensor] = None, **kwargs: Any) AcquisitionFunction[source]
ax.models.torch.fully_bayesian.matern_kernel(X: Tensor, Z: Tensor, lengthscale: Tensor, nu: float = 2.5) Tensor[source]

Scaled Matern kernel.

ax.models.torch.fully_bayesian.predict_from_model_mcmc(model: Model, X: Tensor) Tuple[Tensor, Tensor][source]

Predicts outcomes given a model and input tensor.

This method integrates over the hyperparameter posterior.

Parameters:
  • model – A batched botorch Model where the batch dimension corresponds to sampled hyperparameters.

  • X – A n x d tensor of input parameters.

Returns:

A two-element tuple containing the predicted posterior mean as an n x o-dim tensor and the predicted posterior covariance as an n x o x o-dim tensor.

Return type:

Tuple[Tensor, Tensor]
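Integrating over the hyperparameter posterior can be approximated by treating the MCMC samples as an equally weighted mixture. The sketch below shows one standard way to do this for a single point and a single outcome, using the law of total variance; it is an illustration under that assumption, and integrate_over_samples is a hypothetical name.

```python
from typing import List, Tuple


def integrate_over_samples(means: List[float], variances: List[float]) -> Tuple[float, float]:
    """Combine per-sample predictive means/variances into mixture moments."""
    num_samples = len(means)
    # Mixture mean: average of the per-sample means.
    mean = sum(means) / num_samples
    # Law of total variance: E[Var] + Var[E] over the hyperparameter samples.
    expected_variance = sum(variances) / num_samples
    variance_of_means = sum((m - mean) ** 2 for m in means) / num_samples
    return mean, expected_variance + variance_of_means


# Two hyperparameter samples that disagree about the mean inflate the variance.
print(integrate_over_samples([1.0, 3.0], [0.5, 0.5]))
```

The second term captures the extra uncertainty that comes from not knowing the hyperparameters, which a single point estimate would ignore.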

ax.models.torch.fully_bayesian.rbf_kernel(X: Tensor, Z: Tensor, lengthscale: Tensor) Tensor[source]

Scaled RBF kernel.
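As a rough picture of how compute_dists and rbf_kernel relate, the sketch below computes a lengthscale-scaled Euclidean distance and feeds it into the textbook RBF form. The function names are hypothetical, and reading "scaled" as per-dimension division by the lengthscale is an assumption of this sketch.

```python
import math
from typing import List, Sequence


def scaled_dist(x: Sequence[float], z: Sequence[float], lengthscale: Sequence[float]) -> float:
    """Euclidean distance after dividing each dimension by its lengthscale."""
    return math.sqrt(sum(((a - b) / l) ** 2 for a, b, l in zip(x, z, lengthscale)))


def rbf(x: Sequence[float], z: Sequence[float], lengthscale: Sequence[float]) -> float:
    """Textbook RBF kernel on the scaled distance."""
    return math.exp(-0.5 * scaled_dist(x, z, lengthscale) ** 2)


def rbf_matrix(
    X: Sequence[Sequence[float]], Z: Sequence[Sequence[float]], lengthscale: Sequence[float]
) -> List[List[float]]:
    """Kernel matrix with one row per point in X and one column per point in Z."""
    return [[rbf(x, z, lengthscale) for z in Z] for x in X]


print(rbf_matrix([[0.0], [2.0]], [[0.0]], [2.0]))
```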

ax.models.torch.fully_bayesian.run_inference(pyro_model: Callable, X: Tensor, Y: Tensor, Yvar: Tensor, num_samples: int = 256, warmup_steps: int = 512, thinning: int = 16, use_input_warping: bool = False, max_tree_depth: int = 6, disable_progbar: bool = False, gp_kernel: str = 'matern', verbose: bool = False, task_feature: Optional[int] = None, rank: Optional[int] = None, jit_compile: bool = False) Dict[str, Tensor][source]
ax.models.torch.fully_bayesian.single_task_pyro_model(X: Tensor, Y: Tensor, Yvar: Tensor, use_input_warping: bool = False, eps: float = 1e-07, gp_kernel: str = 'matern', task_feature: Optional[int] = None, rank: Optional[int] = None) None[source]

Instantiates a single task pyro model for running fully bayesian inference.

Parameters:
  • X – A n x d tensor of input parameters.

  • Y – A n x 1 tensor of output.

  • Yvar – A n x 1 tensor of observed noise.

  • use_input_warping – A boolean indicating whether to use input warping.

  • task_feature – Column index of task feature in X.

  • gp_kernel – Kernel name. Currently only two kernels are supported: “matern” for the Matern kernel and “rbf” for the RBF kernel.

  • rank – Number of latent task features to learn for the task covariance.

ax.models.torch.fully_bayesian_model_utils module

ax.models.torch.fully_bayesian_model_utils.load_mcmc_samples_to_model(model: GPyTorchModel, mcmc_samples: Dict) None[source]

Load MCMC samples into GPyTorchModel.

ax.models.torch.fully_bayesian_model_utils.pyro_sample_input_warping(dim: int, **tkwargs: Any) Tuple[Tensor, Tensor][source]
ax.models.torch.fully_bayesian_model_utils.pyro_sample_mean(**tkwargs: Any) Tensor[source]
ax.models.torch.fully_bayesian_model_utils.pyro_sample_noise(**tkwargs: Any) Tensor[source]
ax.models.torch.fully_bayesian_model_utils.pyro_sample_outputscale(concentration: float = 2.0, rate: float = 0.15, **tkwargs: Any) Tensor[source]
ax.models.torch.fully_bayesian_model_utils.pyro_sample_saas_lengthscales(dim: int, alpha: float = 0.1, **tkwargs: Any) Tensor[source]

ax.models.torch.posterior_mean module

ax.models.torch.posterior_mean.get_PosteriorMean(model: Model, objective_weights: Tensor, outcome_constraints: Optional[Tuple[Tensor, Tensor]] = None, X_observed: Optional[Tensor] = None, X_pending: Optional[Tensor] = None, **kwargs: Any) AcquisitionFunction[source]

Instantiates a PosteriorMean acquisition function.

Note: If no outcome constraints are given, an analytic acquisition function is returned. This requires {optimizer_kwargs: {joint_optimization: True}} or an optimizer that does not assume pending point support.

Parameters:
  • objective_weights – The objective is to maximize a weighted sum of the columns of f(x). These are the weights.

  • outcome_constraints – A tuple of (A, b). For k outcome constraints and m outputs at f(x), A is (k x m) and b is (k x 1) such that A f(x) <= b. (Not used by single task models)

  • X_observed – A tensor containing points observed for all objective outcomes and outcomes that appear in the outcome constraints (if there are any).

  • X_pending – A tensor containing points whose evaluation is pending (i.e. that have been submitted for evaluation) present for all objective outcomes and outcomes that appear in the outcome constraints (if there are any).

Returns:

The instantiated acquisition function.

Return type:

PosteriorMean
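The (A, b) convention for outcome_constraints can be checked by hand for a single vector of outputs. The following sketch uses plain lists in place of tensors and a flat list for b; satisfies_outcome_constraints is a hypothetical helper introduced only for illustration.

```python
from typing import Sequence


def satisfies_outcome_constraints(
    A: Sequence[Sequence[float]],
    b: Sequence[float],
    f_x: Sequence[float],
    tol: float = 1e-9,
) -> bool:
    """Return True if A f(x) <= b holds row by row (k constraints, m outputs)."""
    for row, bound in zip(A, b):
        if sum(a * f for a, f in zip(row, f_x)) > bound + tol:
            return False
    return True


# Two constraints on two outputs: f0 <= 2 and f1 >= 1 (written as -f1 <= -1).
A = [[1.0, 0.0], [0.0, -1.0]]
b = [2.0, -1.0]
print(satisfies_outcome_constraints(A, b, [1.5, 1.2]))
```

A lower bound on an output is expressed by negating both the row of A and the entry of b, as in the second constraint.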

ax.models.torch.rembo module

class ax.models.torch.rembo.REMBO(A: Tensor, initial_X_d: Tensor, bounds_d: List[Tuple[float, float]], **kwargs: Any)[source]

Bases: BotorchModel

Implements REMBO (Bayesian optimization in a linear subspace).

The (D x d) projection matrix A must be provided, and must be the one used for the initialization. In the original REMBO paper, the entries of A are drawn from N(0, 1). Box bounds in the low-d space must also be provided; in the REMBO paper these are [-sqrt(d), sqrt(d)]^d.

Function evaluations happen in the high-D space, and so the arms on the experiment will also be tracked in the high-D space. This class maintains a list of points in the low-d space that have been launched, so we can match arms in high-D space back to their low-d point on update.

Parameters:
  • A – (D x d) projection matrix.

  • initial_X_d – Points in low-d space for initial data.

  • bounds_d – Box bounds in the low-d space.

  • kwargs – kwargs for BotorchModel init
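The inputs described above can be set up as in the following sketch, which uses only the standard library. It is an illustration under stated assumptions: make_rembo_inputs and project_up_sketch are hypothetical names, nested lists stand in for tensors, and the example only shows the shapes and the linear map from the low-d space to the high-D space.

```python
import math
import random
from typing import List, Tuple


def make_rembo_inputs(
    D: int, d: int, seed: int = 0
) -> Tuple[List[List[float]], List[Tuple[float, float]]]:
    """Draw a (D x d) matrix with N(0, 1) entries and the low-d box bounds."""
    rng = random.Random(seed)
    A = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(D)]
    bounds_d = [(-math.sqrt(d), math.sqrt(d)) for _ in range(d)]
    return A, bounds_d


def project_up_sketch(A: List[List[float]], x_d: List[float]) -> List[float]:
    """Map a low-d point to the high-D space via x_D = A x_d."""
    return [sum(a * x for a, x in zip(row, x_d)) for row in A]


A, bounds_d = make_rembo_inputs(D=5, d=2)
print(len(A), len(A[0]), bounds_d)
```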

Xs: List[Tensor]
Ys: List[Tensor]
Yvars: List[Tensor]
best_point(search_space_digest: SearchSpaceDigest, torch_opt_config: TorchOptConfig) Optional[Tensor][source]

Identify the current best point that satisfies the constraints, which are given in the same format as for gen.

Return None if no such point can be identified.

Parameters:
  • search_space_digest – A SearchSpaceDigest object containing metadata about the search space (e.g. bounds, parameter types).

  • torch_opt_config – A TorchOptConfig object containing optimization arguments (e.g., objective weights, constraints).

Returns:

d-tensor of the best point.

cross_validate(datasets: List[SupervisedDataset], X_test: Tensor, **kwargs: Any) Tuple[Tensor, Tensor][source]

Do cross validation with the given training and test sets.

The training set is given in the same format as for fit. The test set is given in the same format as for predict.

Parameters:
  • datasets – A list of SupervisedDataset containers, each corresponding to the data of one metric (outcome).

  • X_test – (j x d) tensor of the j points at which to make predictions.

  • search_space_digest – A SearchSpaceDigest object containing metadata on the features in X.

Returns:

2-element tuple containing

  • (j x m) tensor of outcome predictions at X_test.

  • (j x m x m) tensor of predictive covariances at X_test. cov[j, m1, m2] is Cov[m1@j, m2@j].

fidelity_features: List[int]
fit(datasets: List[SupervisedDataset], search_space_digest: SearchSpaceDigest, candidate_metadata: Optional[List[List[Optional[Dict[str, Any]]]]] = None) None[source]

Fit model to m outcomes.

Parameters:
  • datasets – A list of SupervisedDataset containers, each corresponding to the data of one metric (outcome).

  • search_space_digest – A SearchSpaceDigest object containing metadata on the features in the datasets.

  • candidate_metadata – Model-produced metadata for candidates, in the order corresponding to the Xs.

from_01(X_d01: Tensor) Tensor[source]

Map points from [0, 1] to bounds_d.

Parameters:

X_d01 – Tensor in [0, 1]

Returns:

Tensor in bounds_d.
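
As a rough illustration of this rescaling and its inverse (to_01), the sketch below assumes a per-coordinate affine map between [0, 1] and the (lower, upper) pairs in bounds_d. It works on a single point stored as a plain list, and the function bodies are assumptions rather than the Ax source.

```python
def from_01(x_01, bounds):
    """Affinely map each coordinate from [0, 1] to its (lower, upper) bound."""
    return [lo + u * (hi - lo) for u, (lo, hi) in zip(x_01, bounds)]


def to_01(x, bounds):
    """Inverse map: rescale each coordinate from its bounds back to [0, 1]."""
    return [(v - lo) / (hi - lo) for v, (lo, hi) in zip(x, bounds)]


bounds_d = [(-2.0, 2.0), (-1.0, 3.0)]
x_d = from_01([0.0, 0.5], bounds_d)   # [-2.0, 1.0]
x_01 = to_01(x_d, bounds_d)           # [0.0, 0.5]
```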

gen(n: int, search_space_digest: SearchSpaceDigest, torch_opt_config: TorchOptConfig) TorchGenResults[source]

Generate new candidates.

Parameters:
  • n – Number of candidates to generate.

  • search_space_digest – A SearchSpaceDigest object containing metadata about the search space (e.g. bounds, parameter types).

  • torch_opt_config – A TorchOptConfig object containing optimization arguments (e.g., objective weights, constraints).

Returns:

A TorchGenResults container.

metric_names: List[str]
predict(X: Tensor) Tuple[Tensor, Tensor][source]

Predict outcomes at the given points.

Parameters:

X – (j x d) tensor of the j points at which to make predictions.

Returns:

2-element tuple containing

  • (j x m) tensor of outcome predictions at X.

  • (j x m x m) tensor of predictive covariances at X. cov[j, m1, m2] is Cov[m1@j, m2@j].
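
The layout of the covariance output can be illustrated with nested lists. The sketch below assumes the m outcomes are modeled independently, so each (m x m) block is diagonal; that independence and the helper name diag_covariance are assumptions made for illustration.

```python
def diag_covariance(variances):
    """Turn a (j x m) table of variances into a (j x m x m) covariance array."""
    return [
        [[v if a == b else 0.0 for b in range(len(row))] for a, v in enumerate(row)]
        for row in variances
    ]


# j = 3 points, m = 2 outcomes
variances = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
cov = diag_covariance(variances)

# cov[1][0][0] is the variance of outcome 0 at point 1;
# cov[1][0][1] is the covariance between outcomes 0 and 1 at point 1.
```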

project_down(X_D: Tensor) Tensor[source]

Map points in the high-D space to the low-d space by looking them up in self.X_d.

We assume that X_D = self.project_up(self.X_d), except possibly with rows shuffled. If a value in X_d cannot be found for each row in X_D, an error will be raised.

This is quite fast relative to model fitting, so we do it in O(n^2) time and don’t worry about it.

Parameters:

X_D – Tensor in high-D space.

Returns:

X_d, the tensor in low-d space.

Return type:

Tensor
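
The lookup described above can be sketched as a nested loop over rows. This is a dependency-free illustration on plain lists, not the Ax code; the tolerance tol, the toy matrix A, and the helper up are assumptions.

```python
def project_down(X_D, X_d, project_up, tol=1e-9):
    """For each row of X_D, find the low-d point whose projection matches it."""
    projected = [project_up(x) for x in X_d]
    result = []
    for row in X_D:                      # O(n^2) comparison of rows
        for x, p in zip(X_d, projected):
            if all(abs(a - b) <= tol for a, b in zip(row, p)):
                result.append(x)
                break
        else:
            raise ValueError(f"No low-d point found for row {row}")
    return result


# Toy (3 x 2) projection matrix, used only for this example.
A = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]


def up(x):
    return [sum(a * v for a, v in zip(row, x)) for row in A]


X_d = [[0.1, 0.2], [0.3, -0.4]]
X_D = [up(X_d[1]), up(X_d[0])]           # same points, rows shuffled
recovered = project_down(X_D, X_d, up)   # [[0.3, -0.4], [0.1, 0.2]]
```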

project_up(X: Tensor) Tensor[source]

Project to high-dimensional space.

task_features: List[int]
to_01(X_d: Tensor) Tensor[source]

Map points from bounds_d to [0, 1].

Parameters:

X_d – Tensor in bounds_d

Returns:

Tensor in [0, 1].

ax.models.torch.utils module

class ax.models.torch.utils.SubsetModelData(model: botorch.models.model.Model, objective_weights: torch.Tensor, outcome_constraints: Optional[Tuple[torch.Tensor, torch.Tensor]], objective_thresholds: Optional[torch.Tensor], indices: torch.Tensor)[source]

Bases: object

indices: Tensor
model: Model
objective_thresholds: Optional[Tensor]
objective_weights: Tensor
outcome_constraints: Optional[Tuple[Tensor, Tensor]]
ax.models.torch.utils.get_botorch_objective_and_transform(botorch_acqf_class: Type[AcquisitionFunction], model: Model, objective_weights: Tensor, outcome_constraints: Optional[Tuple[Tensor, Tensor]] = None, X_observed: Optional[Tensor] = None, risk_measure: Optional[RiskMeasureMCObjective] = None) Tuple[Optional[MCAcquisitionObjective], Optional[PosteriorTransform]][source]

Constructs a BoTorch AcquisitionObjective object.

Parameters:
  • botorch_acqf_class – The acquisition function class the objective and posterior transform are to be used with. This is mainly used to determine whether to construct a multi-output or a single-output objective.

  • model – A BoTorch Model.

  • objective_weights – The objective is to maximize a weighted sum of the columns of f(x). These are the weights.

  • outcome_constraints – A tuple of (A, b). For k outcome constraints and m outputs at f(x), A is (k x m) and b is (k x 1) such that A f(x) <= b. (Not used by single task models)

  • X_observed – Observed points that are feasible and appear in the objective or the constraints. None if there are no such points.

  • risk_measure – An optional risk measure for robust optimization.

Returns:

A two-tuple containing (optionally) an MCAcquisitionObjective and (optionally) a PosteriorTransform.
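
To make the objective_weights and outcome_constraints conventions concrete, here is a small sketch on plain lists. It only evaluates the weighted sum and the inequality A f(x) <= b for a single outcome vector f(x); it does not construct any BoTorch objects. The helper names are hypothetical, and b is flattened to a list of k entries for simplicity.

```python
def weighted_objective(f_x, objective_weights):
    """Weighted sum of the m model outputs at a single point."""
    return sum(w * y for w, y in zip(objective_weights, f_x))


def is_feasible(f_x, A, b):
    """Check A f(x) <= b row by row; A is (k x m) and b has k entries."""
    return all(
        sum(a * y for a, y in zip(row, f_x)) <= bound
        for row, bound in zip(A, b)
    )


f_x = [2.0, 0.5, 3.0]                  # m = 3 outputs
objective_weights = [1.0, 0.0, -0.5]   # maximize output 0, minimize output 2
A = [[0.0, 1.0, 0.0]]                  # k = 1 constraint on output 1
b = [1.0]                              # output 1 <= 1.0

value = weighted_objective(f_x, objective_weights)   # 2.0 - 1.5 = 0.5
feasible = is_feasible(f_x, A, b)                    # 0.5 <= 1.0
```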

ax.models.torch.utils.get_out_of_sample_best_point_acqf(model: Model, Xs: List[Tensor], X_observed: Tensor, objective_weights: Tensor, mc_samples: int = 512, fixed_features: Optional[Dict[int, float]] = None, fidelity_features: Optional[List[int]] = None, target_fidelities: Optional[Dict[int, float]] = None, outcome_constraints: Optional[Tuple[Tensor, Tensor]] = None, seed_inner: Optional[int] = None, qmc: bool = True, risk_measure: Optional[RiskMeasureMCObjective] = None, **kwargs: Any) Tuple[AcquisitionFunction, Optional[List[int]]][source]

Picks an appropriate acquisition function to find the best out-of-sample (predicted by the given surrogate model) point and instantiates it.

NOTE: Typically the appropriate function is the posterior mean, but can differ to account for fidelities etc.

ax.models.torch.utils.is_noiseless(model: Model) bool[source]

Check if a given (single-task) botorch model is noiseless.

ax.models.torch.utils.normalize_indices(indices: List[int], d: int) List[int][source]

Normalize a list of indices to ensure that they are non-negative.

Parameters:
  • indices – A list of indices (may contain negative indices for indexing “from the back”).

  • d – The dimension of the tensor to index.

Returns:

A normalized list of indices such that each index is between 0 and d-1.
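
A minimal sketch of this normalization, written from the description above rather than copied from the library. The function name normalize_indices_sketch is hypothetical, and raising ValueError for out-of-range indices is an assumption.

```python
def normalize_indices_sketch(indices, d):
    """Map possibly negative indices into the range 0..d-1."""
    normalized = []
    for i in indices:
        j = i + d if i < 0 else i   # -1 refers to the last of d entries
        if not 0 <= j < d:
            raise ValueError(f"Index {i} is out of bounds for dimension {d}")
        normalized.append(j)
    return normalized


result = normalize_indices_sketch([0, -1, 2, -3], d=4)   # [0, 3, 2, 1]
```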

ax.models.torch.utils.pick_best_out_of_sample_point_acqf_class(outcome_constraints: Optional[Tuple[Tensor, Tensor]] = None, mc_samples: int = 512, qmc: bool = True, seed_inner: Optional[int] = None, risk_measure: Optional[RiskMeasureMCObjective] = None) Tuple[Type[AcquisitionFunction], Dict[str, Any]][source]
ax.models.torch.utils.predict_from_model(model: Model, X: Tensor) Tuple[Tensor, Tensor][source]

Predicts outcomes given a model and input tensor.

For a GaussianMixturePosterior we currently use a Gaussian approximation where we compute the mean and variance of the Gaussian mixture. This should ideally be changed to compute quantiles instead when Ax supports non-Gaussian distributions.

Parameters:
  • model – A botorch Model.

  • X – An n x d tensor of input parameters.

Returns:

2-element tuple containing

  • The predicted posterior mean as an n x o-dim tensor.

  • The predicted posterior covariance as an n x o x o-dim tensor.

Return type:

Tuple[Tensor, Tensor]
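
The Gaussian approximation of a mixture mentioned above amounts to moment matching. The sketch below computes the mean and variance of a mixture of one-dimensional Gaussians for a single point and a single outcome. Equal component weights are an assumption of this sketch, and the helper name is hypothetical.

```python
def mixture_mean_and_variance(means, variances):
    """Moments of an equally weighted mixture of 1-d Gaussians."""
    k = len(means)
    mean = sum(means) / k
    # Law of total variance: average within-component variance
    # plus the variance of the component means.
    var = sum(variances) / k + sum((mu - mean) ** 2 for mu in means) / k
    return mean, var


mean, var = mixture_mean_and_variance([0.0, 2.0], [1.0, 1.0])
# mean is 1.0; var is 1.0 + 1.0 = 2.0
```

The extra term in the variance is why the approximating Gaussian is wider than any single component when the component means disagree.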

ax.models.torch.utils.randomize_objective_weights(objective_weights: Tensor, random_scalarization_distribution: str = 'simplex') Tensor[source]

Generate a random weighting based on acquisition function settings.

Parameters:
  • objective_weights – Base weights to multiply by random values.

  • random_scalarization_distribution – “simplex” or “hypersphere”.

Returns:

A tensor of randomized objective weights, obtained by multiplying the base weights by the randomly sampled values.
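
As an illustration of the “simplex” option, the sketch below samples a point uniformly from the probability simplex (via normalized exponential draws) and multiplies the base weights by it elementwise. It uses only the standard library; the helper names are hypothetical, and the “hypersphere” option is not shown.

```python
import random


def sample_simplex_sketch(m, rng):
    """Uniform sample on the probability simplex via normalized exponentials."""
    draws = [rng.expovariate(1.0) for _ in range(m)]
    total = sum(draws)
    return [x / total for x in draws]


def randomize_weights_sketch(objective_weights, rng):
    """Multiply the base weights elementwise by a random simplex sample."""
    random_weights = sample_simplex_sketch(len(objective_weights), rng)
    return [w * r for w, r in zip(objective_weights, random_weights)]


rng = random.Random(0)
simplex_point = sample_simplex_sketch(3, rng)
base = [1.0, -1.0, 0.0]
randomized = randomize_weights_sketch(base, rng)
```

Because the sampled values are non-negative, the sign of each base weight (and hence the direction of optimization for each outcome) is preserved.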

ax.models.torch.utils.subset_model(model: Model, objective_weights: Tensor, outcome_constraints: Optional[Tuple[Tensor, Tensor]] = None, objective_thresholds: Optional[Tensor] = None) SubsetModelData[source]

Subset a botorch model to the outputs used in the optimization.

Parameters:
  • model – A BoTorch Model. If the model does not implement the subset_outputs method, this function is a no-op and returns the input arguments.

  • objective_weights – The objective is to maximize a weighted sum of the columns of f(x). These are the weights.

  • objective_thresholds – The m-dim tensor of objective thresholds. There is one for each modeled metric.

  • outcome_constraints – A tuple of (A, b). For k outcome constraints and m outputs at f(x), A is (k x m) and b is (k x 1) such that A f(x) <= b. (Not used by single task models)

Returns:

A SubsetModelData dataclass containing the model, objective_weights, outcome_constraints, and objective_thresholds, each subset to only those outputs that appear in either the objective weights or the outcome constraints, along with the indices of those outputs.
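
The index selection described above can be sketched on plain lists: an output is kept if it has a nonzero objective weight or a nonzero coefficient in any outcome constraint. This is an illustrative reading of the description, not the Ax implementation, and the helper name used_output_indices is hypothetical.

```python
def used_output_indices(objective_weights, outcome_constraints=None):
    """Indices of outputs that appear in the objective or in any constraint."""
    m = len(objective_weights)
    used = [w != 0 for w in objective_weights]
    if outcome_constraints is not None:
        A, _ = outcome_constraints
        for row in A:
            for j in range(m):
                used[j] = used[j] or row[j] != 0
    return [j for j in range(m) if used[j]]


weights = [1.0, 0.0, 0.0, 0.0]    # only output 0 is in the objective
A = [[0.0, 0.0, 1.0, 0.0]]        # output 2 is constrained
b = [[5.0]]

idcs = used_output_indices(weights, (A, b))      # [0, 2]
sub_weights = [weights[j] for j in idcs]         # [1.0, 0.0]
sub_A = [[row[j] for j in idcs] for row in A]    # [[0.0, 1.0]]
```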

ax.models.torch.utils.tensor_callable_to_array_callable(tensor_func: Callable[[Tensor], Tensor], device: device) Callable[[ndarray], ndarray][source]

Transform a tensor callable into an array callable.