lf2i.inference package

Module contents

class lf2i.inference.LF2I(test_statistic: str | TestStatistic, **test_statistic_kwargs: Any)

Bases: object

High-level entry point for inference with LF2I (https://arxiv.org/abs/2107.03920). It allows one to quickly construct confidence regions for parameters of interest in an SBI setting, leveraging an arbitrary estimator

- of the likelihood, using for example the ACORE or BFF test statistics (https://arxiv.org/pdf/2002.10399.pdf, https://arxiv.org/abs/2107.03920);

- of the posterior, using for example the Waldo test statistic (https://arxiv.org/abs/2205.15680);

- of point estimates, i.e. a general prediction algorithm, again using the Waldo test statistic.

Alternatively, one can define a custom TestStatistic appropriate for the problem at hand.

NOTE: although this entry point exposes all the main LF2I functionality, using the individual components directly (test statistics, critical values, Neyman inversion) provides more flexibility and gives control over every hyperparameter.

Parameters:

test_statistic (Union[str, TestStatistic]) – One of ‘acore’, ‘bff’, ‘waldo’, or an instance of a custom lf2i.test_statistics._base.TestStatistic
test_statistic_kwargs (Any) – Arguments specific to the chosen test statistic, used if test_statistic is one of ‘acore’, ‘bff’, ‘waldo’. See the dedicated documentation for each of them in lf2i/test_statistics/
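
A minimal usage sketch, built only from the signatures documented on this page. The helper name `run_lf2i` is hypothetical, and the keyword arguments required by each built-in test statistic are not listed here, so the sketch forwards them untouched rather than guessing at them.

```python
from typing import Any, List


def run_lf2i(x: Any, evaluation_grid: Any, simulator: Any,
             b: int, b_prime: int, **test_statistic_kwargs: Any) -> List[Any]:
    """Hypothetical helper: build an LF2I object and return one confidence region per sample in x."""
    # Imported lazily so this sketch can be loaded without the package installed.
    from lf2i.inference import LF2I

    # 'waldo' is one of the documented string identifiers; its keyword arguments
    # are described in lf2i/test_statistics/ and are simply forwarded here.
    lf2i = LF2I(test_statistic="waldo", **test_statistic_kwargs)

    return lf2i.inference(
        x=x,
        evaluation_grid=evaluation_grid,
        confidence_level=0.95,
        simulator=simulator,  # needed because T and T_prime are not supplied
        b=b,                  # simulations for the test statistic
        b_prime=b_prime,      # simulations for the critical values
    )
```

Swapping `"waldo"` for `"acore"` or `"bff"`, or for a custom TestStatistic instance, should follow the same pattern, with the keyword arguments changing accordingly.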

inference(x: ndarray | Tensor, evaluation_grid: ndarray | Tensor, confidence_level: float, quantile_regressor: str | Any = 'gb', quantile_regressor_kwargs: Dict = {}, T: Tuple[ndarray | Tensor] | None = None, T_prime: Tuple[ndarray | Tensor] | None = None, simulator: Simulator | None = None, b: int | None = None, b_prime: int | None = None, re_estimate_test_statistics: bool = False, re_estimate_critical_values: bool = False) → List[ndarray]

Estimate test statistic and critical values, and construct a confidence region for all observations in x.

Parameters:

x (Union[np.ndarray, torch.Tensor]) – Observed sample(s).
evaluation_grid (Union[np.ndarray, torch.Tensor]) – Grid of points over the parameter space over which to invert hypothesis tests and construct the confidence regions. Each confidence set will be a subset of this grid.
confidence_level (float) – Desired confidence level, must be in (0, 1).
quantile_regressor (Union[str, Any], optional) – If str, an identifier for the quantile regressor to use, by default ‘gb’. Currently available identifiers: [‘gb’, ‘nn’]. Otherwise, must be a quantile regressor object with .fit(X=…, y=…) and .predict(X=…) methods.
quantile_regressor_kwargs (Dict, optional) – Settings for the chosen quantile regressor, by default {}
T (Tuple[Union[np.ndarray, torch.Tensor]], optional) –
Simulated dataset to train the estimator for the test statistic. Must adhere to the following specifications:
- if using ACORE or BFF, must be a tuple of arrays or tensors (Y, theta, X) in this order as described by Algorithm 3 in https://arxiv.org/abs/2107.03920.
- if using Waldo, must be a tuple of arrays or tensors (theta, X) in this order as described by Algorithm 1 in https://arxiv.org/pdf/2205.15680.pdf.
- if using a custom test statistic, then an arbitrary tuple of arrays or tensors is expected.
If not given, must supply a simulator.
T_prime (Tuple[Union[np.ndarray, torch.Tensor]], optional) – Simulated dataset to train the quantile regressor to estimate critical values. Must be a tuple of arrays or tensors (theta, X). If not given, must supply a simulator.
simulator (Simulator, optional) – If T and T_prime are not given, must pass an instance of lf2i.simulator.Simulator.
b (int, optional) – Number of simulations used to estimate the test statistic. Used only if simulator is provided.
b_prime (int, optional) – Number of simulations used to estimate the critical values. Used only if simulator is provided.
re_estimate_test_statistics (bool, optional) – Whether to re-estimate the test statistics if a previous call to .inference() was made, by default False.
re_estimate_critical_values (bool, optional) – Whether to re-estimate the critical values if a previous call to .inference() was made, by default False.
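
To make the expected layout of T and T_prime concrete, here is a toy sketch in plain Python. Everything in it is an assumption for illustration: the Gaussian simulator, the wide Gaussian reference distribution, the uniform draw over [-5, 5], and the 50/50 class labels. Algorithm 3 in https://arxiv.org/abs/2107.03920 remains the authoritative recipe for the labelled sample.

```python
import random

rng = random.Random(0)


def simulate_likelihood(theta: float) -> float:
    """Toy simulator (assumption): one draw from N(theta, 1)."""
    return rng.gauss(theta, 1.0)


def simulate_reference() -> float:
    """Toy reference distribution (assumption): N(0, 3^2), independent of theta."""
    return rng.gauss(0.0, 3.0)


def make_T(b: int):
    """Labelled sample (Y, theta, X), in the order expected for ACORE or BFF."""
    Y, theta, X = [], [], []
    for _ in range(b):
        t = rng.uniform(-5.0, 5.0)           # parameter drawn over the parameter space
        y = 1 if rng.random() < 0.5 else 0   # class label
        x = simulate_likelihood(t) if y == 1 else simulate_reference()
        Y.append(y)
        theta.append(t)
        X.append(x)
    return Y, theta, X


def make_T_prime(b_prime: int):
    """Unlabelled sample (theta, X) for training the quantile regressor."""
    theta = [rng.uniform(-5.0, 5.0) for _ in range(b_prime)]
    X = [simulate_likelihood(t) for t in theta]
    return theta, X


T = make_T(1000)
T_prime = make_T_prime(500)
```

The lists would still need converting to arrays or tensors (for example with `np.asarray` or `torch.as_tensor`) before being passed as T and T_prime.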

Returns:

The i-th element is a confidence region for the i-th sample in x.

Return type:

List[np.ndarray]
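
The idea that each confidence set is a subset of evaluation_grid can be illustrated with a toy Neyman inversion. This is a sketch under strong assumptions, not the library's implementation: the data are a single draw from N(theta, 1), the statistic is a Wald-type squared distance, and the critical value is the known chi-square quantile instead of one estimated by quantile regression. The function name `confidence_set` is hypothetical.

```python
def confidence_set(x_obs: float, grid: list, critical_value: float, sigma: float = 1.0) -> list:
    """Keep every grid point whose hypothesis test is not rejected for x_obs."""
    region = []
    for theta0 in grid:
        # Wald-type statistic for H0: theta = theta0 given one N(theta, sigma^2) draw.
        statistic = (x_obs - theta0) ** 2 / sigma ** 2
        if statistic <= critical_value:  # large values are evidence against theta0
            region.append(theta0)
    return region


# Grid over the parameter space [-5, 5] in steps of 0.01.
evaluation_grid = [round(-5.0 + 0.01 * i, 2) for i in range(1001)]

# 0.95 quantile of a chi-square distribution with 1 degree of freedom (1.959964 squared).
critical_value_95 = 1.959964 ** 2

# One confidence region per observation, mirroring the List[np.ndarray] return value.
observations = [0.3, -2.0]
regions = [confidence_set(x, evaluation_grid, critical_value_95) for x in observations]

for x, region in zip(observations, regions):
    print(f"x = {x}: {len(region)} grid points, from {min(region)} to {max(region)}")
```

A finer grid gives a sharper boundary at the cost of more test evaluations per observation.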

diagnostics(region_type: str, coverage_estimator: str = 'splines', coverage_estimator_kwargs: Dict = {}, T_double_prime: Tuple[ndarray | Tensor] | None = None, simulator: Simulator | None = None, b_double_prime: int | None = None, new_parameters: ndarray | None = None, indicators: ndarray | None = None, parameters: ndarray | None = None, posterior_estimator: Any | None = None, evaluation_grid: ndarray | Tensor | None = None, confidence_level: float | None = None, num_p_levels: int | None = 10000) → Tuple[Any, ndarray, ndarray, ndarray, ndarray]

Independent diagnostics check for the empirical coverage of a desired uncertainty quantification method across the whole parameter space. It estimates the coverage probability at all parameter values and provides 2-sigma prediction intervals around these estimates.

NOTE: this can be applied to any parameter region, even if it has not been constructed via LF2I.

Parameters:

region_type (Union[str, None]) –
Type of parameter region to be checked:
- confidence regions from LF2I (‘lf2i’);
- credible regions from a posterior distribution (‘posterior’);
- Gaussian prediction intervals centered around predictions (‘prediction’). For this, self.test_statistic must be Waldo.
If none of the above, indicators and parameters must be provided.
coverage_estimator (str, optional) – Probabilistic classifier to use to estimate coverage probabilities, by default ‘splines’
coverage_estimator_kwargs (Dict, optional) – Settings for the probabilistic classifier, by default {}
T_double_prime (Tuple[Union[np.ndarray, torch.Tensor]], optional) – Simulated dataset to learn the coverage probability via probabilistic classification. Must be a tuple of arrays or tensors (theta, X). If not given, must supply a simulator.
simulator (Simulator, optional) – If T_double_prime is not given, must pass an instance of lf2i.simulator.Simulator.
b_double_prime (int, optional) – Number of simulations used to estimate the coverage probability across the parameter space. Used only if simulator is provided.
new_parameters (Optional[np.ndarray], optional) – If provided, coverage probabilities are estimated conditional on these parameters, by default None. If None, parameters simulated uniformly over the parameter space are used.
indicators (Optional[np.ndarray], optional) – Pre-computed indicators (0-1) that mark whether the corresponding value in parameters is included or not in the target parameter region, by default None
parameters (Optional[np.ndarray], optional) – Array of parameters for which the corresponding indicators have been pre-computed, by default None
posterior_estimator (Any, optional) – If region_type == ‘posterior’ and indicators are not provided, then a trained posterior estimator which implements the log_prob(…) method must be given.
evaluation_grid (Union[np.ndarray, torch.Tensor]) – If region_type in [‘posterior’, ‘prediction’] and indicators are not provided, grid of points over the parameter space over which to construct a high-posterior-density credible region or a Gaussian interval centered around predictions.
confidence_level (float) – If region_type in [‘posterior’, ‘prediction’] and indicators are not provided, must give the confidence level to construct credible regions or prediction intervals and compute indicators. If region_type == ‘lf2i’, this information is already embedded in self.quantile_regressor. Must be in (0, 1).
num_p_levels (int, optional) – If region_type == ‘posterior’ and indicators are not provided, number of level sets to consider to construct the high-posterior-density credible region, by default 10_000.

Returns:

Diagnostics estimator, evaluated parameters, and estimated conditional coverage probabilities (mean, upper 2-sigma bound, lower 2-sigma bound).

Return type:

Tuple[Any, np.ndarray, np.ndarray, np.ndarray, np.ndarray]

Raises:

ValueError – If region_type is not among those supported and indicators is None
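
The role of the pre-computed indicators and parameters, and the shape of the returned coverage estimates, can be sketched as follows. This is an illustration under stated assumptions, not the library's estimator: the regions being checked are the toy intervals x ± 1.96 for data from N(theta, 1), coverage is estimated by simple binning instead of a probabilistic classifier such as splines, and the 2-sigma bounds use the binomial standard error. The function name `binned_coverage` is hypothetical.

```python
import math
import random

rng = random.Random(1)
Z = 1.959964  # half-width of the nominal 95% interval being checked

# Simulate (theta, x) pairs and record whether the interval x +/- Z covers theta.
parameters, indicators = [], []
for _ in range(4000):
    theta = rng.uniform(-5.0, 5.0)
    x = rng.gauss(theta, 1.0)
    parameters.append(theta)
    indicators.append(1 if abs(x - theta) <= Z else 0)


def binned_coverage(params, inds, edges):
    """Per-bin (centre, mean, upper, lower): a crude stand-in for a fitted classifier."""
    rows = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        hits = [w for p, w in zip(params, inds) if lo <= p < hi]
        n = len(hits)
        mean = sum(hits) / n
        se = math.sqrt(mean * (1.0 - mean) / n)  # binomial standard error
        rows.append(((lo + hi) / 2, mean, min(1.0, mean + 2 * se), max(0.0, mean - 2 * se)))
    return rows


rows = binned_coverage(parameters, indicators, [-5.0, -2.5, 0.0, 2.5, 5.0])
for centre, mean, upper, lower in rows:
    print(f"theta near {centre:+.2f}: coverage {mean:.3f} [{lower:.3f}, {upper:.3f}]")
```

If the regions have correct conditional coverage, the nominal level should fall inside the band at every parameter value; a band that sits below it signals undercoverage in that part of the parameter space.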