lf2i.simulator package#

Submodules#

lf2i.simulator.gaussian module#

class lf2i.simulator.gaussian.GaussianMean(likelihood_cov: float | Tensor, prior: str, poi_space_bounds: Dict[str, float], poi_grid_size: int, poi_dim: int, data_dim: int, batch_size: int, prior_kwargs: Dict[str, float | Tensor] | None = None)[source]#

Bases: Simulator

Gaussian simulator with fixed covariance structure. Supports any parameter dimensionality and batch size. Assumes diagonal covariance matrix.

Parameter of interest: mean.

Parameters:
  • likelihood_cov (Union[float, torch.Tensor]) – Covariance structure of the likelihood. If float or Tensor with only one value, it is interpreted as the (equal) variance for each component. If Tensor with poi_dim values, the i-th one is the variance of the i-th component.

  • prior (str) – Either ‘gaussian’ or ‘uniform’.

  • poi_space_bounds (Dict[str, float]) – Bounds of the space of parameters of interest. Used to construct the parameter grid, which contains the evaluation points for the confidence regions. Must contain low and high. Assumes that each dimension of the parameter has the same bounds.

  • poi_grid_size (int) – Number of points in the parameter grid. If (poi_grid_size)**(1/poi_dim) is not an integer, the closest larger number is chosen. E.g., if poi_grid_size == 1000 and poi_dim == 2, then the grid will have 32 x 32 = 1024 points.

  • poi_dim (int) – Dimensionality of the parameter of interest.

  • data_dim (int) – Dimensionality of the data.

  • batch_size (int) – Size of each batch of samples generated from a specific parameter value.

  • prior_kwargs (Optional[Dict[str, Union[float, torch.Tensor]]], optional) – If prior == ‘gaussian’, must contain loc and cov. These can be scalars or tensors, as specified for likelihood_cov. If prior == ‘uniform’, must contain low and high. Assumes that each dimension of the parameter has the same bounds. If None, poi_space_bounds is used.

simulate_for_test_statistic(size: int, estimation_method: str) Tuple[Tensor][source]#

Simulate a training set used to estimate the test statistic.

Parameters:
  • size (int) – Number of simulations.

  • estimation_method (str) – The method with which the test statistic is estimated. If likelihood-based test statistics are used, such as ACORE and BFF, then ‘likelihood’. If prediction/posterior-based test statistics are used, such as WALDO, then ‘prediction’ or ‘posterior’.

Returns:

Y, parameters, samples (depending on the specific needs of the test statistic).

Return type:

Tuple[Union[np.ndarray, torch.Tensor]]
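To make the two return shapes concrete, here is a minimal self-contained sketch (plain NumPy, not the lf2i implementation; the labeled-sample construction for the ‘likelihood’ branch is an assumption based on how ACORE/BFF-style classifiers are typically trained):

```python
import numpy as np

def simulate_for_test_statistic(size, estimation_method, p=0.5):
    """Illustrative sketch of the two output shapes.
    Likelihood-based statistics (ACORE/BFF) train a classifier on labeled
    (Y, parameters, samples) triples; prediction/posterior-based ones
    (WALDO) only need (parameters, samples) pairs."""
    rng = np.random.default_rng(0)
    theta = rng.uniform(-1, 1, size=(size, 1))  # POI drawn from the prior
    if estimation_method == 'likelihood':
        # Y == 1: X drawn from the likelihood at theta;
        # Y == 0: X drawn from the marginal (a fresh theta), so that a
        # classifier on (Y | theta, X) can estimate the likelihood ratio.
        Y = rng.binomial(1, p, size=(size, 1))
        theta_gen = np.where(Y == 1, theta, rng.uniform(-1, 1, size=(size, 1)))
        X = rng.normal(loc=theta_gen, scale=0.3)
        return Y, theta, X
    # 'prediction' or 'posterior': plain (parameters, samples) pairs.
    X = rng.normal(loc=theta, scale=0.3)
    return theta, X

Y, theta, X = simulate_for_test_statistic(500, 'likelihood')
theta2, X2 = simulate_for_test_statistic(500, 'prediction')
```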

simulate_for_critical_values(size: int) Tuple[Tensor][source]#

Simulate a training set used to estimate the critical values via quantile regression.

Parameters:

size (int) – Number of simulations. Note that each simulation will be a batch with dimensions (batch_size, data_dim).

Returns:

Parameters, samples.

Return type:

Tuple[Union[np.ndarray, torch.Tensor], Union[np.ndarray, torch.Tensor]]

simulate_for_diagnostics(size: int) Tuple[Tensor][source]#

Simulate a training set used to estimate conditional coverage via the diagnostics branch.

Parameters:

size (int) – Number of simulations. Note that each simulation will be a batch with dimensions (batch_size, data_dim).

Returns:

Parameters, samples.

Return type:

Tuple[Union[np.ndarray, torch.Tensor], Union[np.ndarray, torch.Tensor]]
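The (parameters, samples) contract and the (batch_size, data_dim) batching can be illustrated with a small self-contained sketch (plain NumPy, with illustrative defaults; this mimics GaussianMean with a uniform prior and diagonal Gaussian likelihood but is not lf2i's code):

```python
import numpy as np

def simulate_for_critical_values(size, poi_dim=1, data_dim=1, batch_size=5,
                                 low=-1.0, high=1.0, likelihood_var=0.1):
    """Each simulation is one POI value plus a (batch_size, data_dim) batch
    of draws from a diagonal Gaussian centered at that POI (poi_dim ==
    data_dim assumed here for simplicity)."""
    rng = np.random.default_rng(0)
    # One POI value per simulation, drawn from a uniform prior.
    parameters = rng.uniform(low, high, size=(size, poi_dim))
    # Broadcast each POI over its batch of samples.
    samples = rng.normal(
        loc=parameters[:, None, :],
        scale=np.sqrt(likelihood_var),
        size=(size, batch_size, data_dim),
    )
    return parameters, samples

params, samples = simulate_for_critical_values(size=100)
```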

lf2i.simulator.hep module#

class lf2i.simulator.hep.OnOff(poi_grid_size: int, poi_space_bounds: Dict[str, float] | None = None, nuisance_space_bounds: Dict[str, float] | None = None, s: float | None = None, b: float | None = None, tau: float | None = None)[source]#

Bases: Simulator

Poisson counting experiment as detailed in https://arxiv.org/abs/2107.03920.

POI is signal strength mu. Nuisance is background scaling factor nu. In addition, the following are treated as fixed hyperparameters:

  • Nominally expected signal and background counts s and b.

  • Relationship in measurement time between the two processes tau.

property param_space_bounds: Dict[str, List[float]]#

simulate_for_test_statistic(size: int, estimation_method: str, p: float = 0.5) Tuple[Tensor][source]#

Simulate a training set used to estimate the test statistic.

Parameters:
  • size (int) – Number of simulations.

  • estimation_method (str) – The method with which the test statistic is estimated. If likelihood-based test statistics are used, such as ACORE and BFF, then ‘likelihood’. If prediction/posterior-based test statistics are used, such as WALDO, then ‘prediction’ or ‘posterior’.

Returns:

Y, parameters, samples (depending on the specific needs of the test statistic).

Return type:

Tuple[Union[np.ndarray, torch.Tensor]]

simulate_for_critical_values(size: int) Tuple[Tensor][source]#

Simulate a training set used to estimate the critical values via quantile regression.

Parameters:

size (int) – Number of simulations. Note that each simulation will be a batch with dimensions (batch_size, data_dim).

Returns:

Parameters, samples.

Return type:

Tuple[Union[np.ndarray, torch.Tensor], Union[np.ndarray, torch.Tensor]]

simulate_for_diagnostics(size: int) Tuple[Tensor][source]#

Simulate a training set used to estimate conditional coverage via the diagnostics branch.

Parameters:

size (int) – Number of simulations. Note that each simulation will be a batch with dimensions (batch_size, data_dim).

Returns:

Parameters, samples.

Return type:

Tuple[Union[np.ndarray, torch.Tensor], Union[np.ndarray, torch.Tensor]]
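A self-contained sketch of the counting experiment (plain NumPy, not lf2i's code). The exact parameterization used by lf2i is in the paper linked above; this sketch assumes the standard on/off model, with illustrative hyperparameter values:

```python
import numpy as np

def simulate_on_off(size, s=15.0, b=70.0, tau=1.0,
                    mu_bounds=(0.0, 5.0), nu_bounds=(0.8, 1.2)):
    """Standard on/off counting model (parameterization assumed):
      n ~ Poisson(mu * s + nu * b)   counts in the signal region
      m ~ Poisson(tau * nu * b)      counts in the control region
    """
    rng = np.random.default_rng(0)
    mu = rng.uniform(*mu_bounds, size=size)   # POI: signal strength
    nu = rng.uniform(*nu_bounds, size=size)   # nuisance: background scaling
    n = rng.poisson(mu * s + nu * b)          # signal-region counts
    m = rng.poisson(tau * nu * b)             # control-region counts
    parameters = np.column_stack([mu, nu])
    samples = np.column_stack([n, m])
    return parameters, samples

params, samples = simulate_on_off(size=1000)
```

The control region constrains nu independently of mu, which is what makes profiling (or marginalizing) the nuisance parameter possible downstream.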

Module contents#

class lf2i.simulator.Simulator(poi_dim: int, data_dim: int, batch_size: int, nuisance_dim: int | None = None)[source]#

Bases: ABC

Base class for simulators. This is a template from which every simulator should inherit.

Parameters:
  • poi_dim (int) – Dimensionality of the space of parameters of interest.

  • data_dim (int) – Dimensionality of a single datapoint X.

  • batch_size (int) – Size of data batches from a specific parameter configuration. Must be the same for observations and simulations. A simulated/observed sample batch from a specific parameter configuration will have dimensions (batch_size, data_dim).

  • nuisance_dim (Optional[int], optional) – Dimensionality of the space of nuisance parameters (systematics), by default None, in which case it is treated as 0.

abstract simulate_for_test_statistic(size: int, estimation_method: str) Tuple[ndarray | Tensor][source]#

Simulate a training set used to estimate the test statistic.

Parameters:
  • size (int) – Number of simulations.

  • estimation_method (str) – The method with which the test statistic is estimated. If likelihood-based test statistics are used, such as ACORE and BFF, then ‘likelihood’. If prediction/posterior-based test statistics are used, such as WALDO, then ‘prediction’ or ‘posterior’.

Returns:

Y, parameters, samples (depending on the specific needs of the test statistic).

Return type:

Tuple[Union[np.ndarray, torch.Tensor]]

abstract simulate_for_critical_values(size: int) Tuple[ndarray | Tensor, ndarray | Tensor][source]#

Simulate a training set used to estimate the critical values via quantile regression.

Parameters:

size (int) – Number of simulations. Note that each simulation will be a batch with dimensions (batch_size, data_dim).

Returns:

Parameters, samples.

Return type:

Tuple[Union[np.ndarray, torch.Tensor], Union[np.ndarray, torch.Tensor]]

abstract simulate_for_diagnostics(size: int) Tuple[ndarray | Tensor, ndarray | Tensor][source]#

Simulate a training set used to estimate conditional coverage via the diagnostics branch.

Parameters:

size (int) – Number of simulations. Note that each simulation will be a batch with dimensions (batch_size, data_dim).

Returns:

Parameters, samples.

Return type:

Tuple[Union[np.ndarray, torch.Tensor], Union[np.ndarray, torch.Tensor]]
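A minimal concrete subclass makes the template clear. The sketch below is self-contained: the Simulator class here is a stand-in that mirrors the documented interface rather than an import from lf2i, and the toy UniformNoise simulator is purely illustrative.

```python
from abc import ABC, abstractmethod
import numpy as np

class Simulator(ABC):
    """Stand-in mirroring the documented lf2i.simulator.Simulator interface."""
    def __init__(self, poi_dim, data_dim, batch_size, nuisance_dim=None):
        self.poi_dim, self.data_dim = poi_dim, data_dim
        self.batch_size = batch_size
        self.nuisance_dim = nuisance_dim or 0

    @abstractmethod
    def simulate_for_test_statistic(self, size, estimation_method): ...
    @abstractmethod
    def simulate_for_critical_values(self, size): ...
    @abstractmethod
    def simulate_for_diagnostics(self, size): ...

class UniformNoise(Simulator):
    """Toy simulator: data uniform in [theta - 0.5, theta + 0.5]."""
    def __init__(self):
        super().__init__(poi_dim=1, data_dim=1, batch_size=1)
        self._rng = np.random.default_rng(0)

    def _simulate(self, size):
        theta = self._rng.uniform(-1, 1, size=(size, self.poi_dim))
        x = self._rng.uniform(theta[:, None, :] - 0.5, theta[:, None, :] + 0.5,
                              size=(size, self.batch_size, self.data_dim))
        return theta, x

    def simulate_for_test_statistic(self, size, estimation_method):
        # For a prediction-based statistic, (parameters, samples) suffices.
        return self._simulate(size)

    def simulate_for_critical_values(self, size):
        return self._simulate(size)

    def simulate_for_diagnostics(self, size):
        return self._simulate(size)

theta, x = UniformNoise().simulate_for_critical_values(size=10)
```

All three abstract methods must be implemented before the class can be instantiated; the three training sets may share one sampling routine, as here, or differ when the test statistic needs extra structure (e.g. labels for likelihood-based methods).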