SyntheticBeta
-
class
pycalib.benchmark.
SyntheticBeta
(run_dir, cal_methods, cal_method_names, beta_params, miscal_functions, miscal_function_names, size, marginal_probs=None, n_splits=10, test_size=0.9, train_size=None, random_state=None)
Bases:
pycalib.benchmark.Benchmark
Model evaluation using synthetic data sampled from a Beta distribution.
Implements a data generation method that returns new evaluation data. Maximum posterior probabilities are sampled from a rescaled Beta distribution, \(\hat{p}_{\max} \sim (1-\frac{1}{K})\text{Beta}(\alpha, \beta)+\frac{1}{K}\), and the corresponding class labels are sampled from a Bernoulli distribution with parameter \(f(\hat{p}_{\max})\), where \(f : [\frac{1}{n_\text{classes}},1] \rightarrow [\frac{1}{n_\text{classes}},1]\) is a miscalibration function. Here \(K = n_\text{classes}\) denotes the number of classes.
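The sampling scheme described above can be sketched with NumPy alone. This is a minimal illustration, not the library's implementation: the helper names `sample_max_confidence` and `sample_correctness` are hypothetical, and the Bernoulli draw is interpreted here as an indicator of whether the top prediction is correct, which is an assumption about the intended meaning.

```python
import numpy as np


def sample_max_confidence(alpha, beta, n_classes, size, rng):
    """Draw p_max ~ (1 - 1/K) * Beta(alpha, beta) + 1/K (hypothetical helper)."""
    k_inv = 1.0 / n_classes
    return (1.0 - k_inv) * rng.beta(alpha, beta, size=size) + k_inv


def sample_correctness(p_max, miscal_func, rng):
    """Draw Bernoulli indicators with success probability f(p_max).

    Assumption: the indicator marks whether the top prediction is correct.
    """
    accuracy = miscal_func(p_max)
    return rng.random(p_max.shape) < accuracy


rng = np.random.default_rng(0)
p_max = sample_max_confidence(alpha=2.0, beta=1.0, n_classes=4, size=1000, rng=rng)
# The identity function corresponds to perfectly calibrated confidences.
correct = sample_correctness(p_max, lambda p: p, rng)
```

With the identity as miscalibration function, the empirical accuracy in any confidence bin should approach the mean confidence in that bin as the sample grows.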
- Parameters
run_dir (str) – Directory to run benchmarking in and save output and logs to.
cal_methods (list) – Calibration methods to benchmark.
cal_method_names (list) – Names of calibration methods.
beta_params (tuple or list, shape=(n,2)) – Parameters \((\alpha, \beta)\) of the Beta distribution.
miscal_functions (function or list) – Miscalibration function(s) \(f : [0,1] \rightarrow [0,1]\). If a function differs from the identity, the data generated with it is miscalibrated. Each function is automatically rescaled to \(f : [\frac{1}{n_\text{classes}},1] \rightarrow [\frac{1}{n_\text{classes}},1]\).
miscal_function_names (str or list) – Names of miscalibration functions.
marginal_probs (float or list, default=None) – Marginal class probabilities.
n_splits (int, default=10) – Number of splits for cross validation.
test_size (float, default=0.9) – Size of test set.
train_size (float, default=None) – Size of calibration set.
random_state (int, RandomState instance or None, optional (default=None)) – If int, random_state is the seed used by the random number generator; If RandomState instance, random_state is the random number generator; If None, the random number generator is the RandomState instance used by np.random.
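The rescaling of a miscalibration function from \([0,1]\) to \([\frac{1}{n_\text{classes}},1]\) mentioned for miscal_functions can be pictured as an affine change of variables. The sketch below shows one natural choice; the helper `rescale_miscal_function` and the example function `overconfident` are hypothetical, and the library's actual rescaling may differ.

```python
import numpy as np


def rescale_miscal_function(f, n_classes):
    """Map f: [0, 1] -> [0, 1] to g: [1/K, 1] -> [1/K, 1] (assumed affine rescaling)."""
    lo = 1.0 / n_classes
    span = 1.0 - lo  # requires n_classes >= 2

    def g(p):
        p = np.asarray(p, dtype=float)
        # Shift p into [0, 1], apply f, then map the result back to [1/K, 1].
        return lo + span * f((p - lo) / span)

    return g


def overconfident(t):
    """Hypothetical miscalibration function: accuracy lies below confidence."""
    return t ** 2


g = rescale_miscal_function(overconfident, n_classes=4)
identity = rescale_miscal_function(lambda t: t, n_classes=4)
```

Under this choice the endpoints are preserved, so a confidence of 1/K still maps to an accuracy of 1/K, and the rescaled identity remains the identity.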
Methods Summary
data_gen() – Returns the full dataset or a generator of datasets.
plot(**kwargs) – Plots the result of the benchmark experiment.
plot_miscal_function([function_names]) – Plots the miscalibration functions.
run([n_jobs]) – Train all models, evaluate on test data and save the results.
sample_miscal_data(alpha, beta, miscal_func, …) – Sample a synthetic data set based on the Beta distribution and a miscalibration function.
Methods Documentation
-
data_gen
() Returns the full dataset or a generator of datasets.
- Returns
X, y – Uncalibrated predictions and corresponding classes.
-
plot
(**kwargs) Plots the result of the benchmark experiment.
- Parameters
**kwargs – Additional arguments passed on to matplotlib.plot().
-
plot_miscal_function
(function_names=None, **kwargs) Plots the miscalibration functions.
- Parameters
function_names (list) – Names of the miscalibration functions to plot.
**kwargs – Additional arguments passed on to matplotlib.plot().
-
run
(n_jobs=None) Train all models, evaluate on test data and save the results.
-
static
sample_miscal_data
(alpha, beta, miscal_func, miscal_func_name, size, marginal_probs, random_state=None) Sample a synthetic data set based on the Beta distribution and a miscalibration function.
- Parameters
alpha (float) – Parameter \(\alpha\) of the Beta distribution.
beta (float) – Parameter \(\beta\) of the Beta distribution.
miscal_func (function) – Function \(f : [\frac{1}{n_\text{classes}},1] \rightarrow [\frac{1}{n_\text{classes}},1]\) giving the accuracy for a given confidence. If this function differs from the identity, the data generated with it is miscalibrated.
miscal_func_name (str) – Name of the miscalibration function.
size (int) – Size of data set.
marginal_probs (np.ndarray, size=(n_classes,)) – Marginal class probabilities.
random_state (int, RandomState instance or None, optional (default=None)) – If int, random_state is the seed used by the random number generator; If RandomState instance, random_state is the random number generator; If None, the random number generator is the RandomState instance used by np.random.
- Returns
X, y – Uncalibrated predictions and corresponding classes.
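To make the shape of the returned X, y concrete, the following sketch assembles probability vectors and labels from sampled confidences. It is an illustration under stated assumptions rather than the method's actual logic: it treats marginal_probs as the distribution of the predicted class, spreads the remaining probability mass uniformly over the other classes, and picks a wrong label uniformly at random when the prediction is incorrect. All variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
marginal_probs = np.array([0.5, 0.3, 0.2])
n_classes = len(marginal_probs)
size = 500

# Confidence of the top prediction, rescaled to [1/K, 1].
p_max = (1 - 1 / n_classes) * rng.beta(2.0, 1.0, size=size) + 1 / n_classes
# Identity miscalibration function: correct with probability p_max.
correct = rng.random(size) < p_max

# Assumption: the predicted class follows the marginal class probabilities.
y_pred = rng.choice(n_classes, size=size, p=marginal_probs)

# Assumption: remaining mass is spread uniformly over the other classes.
X = np.repeat(((1 - p_max) / (n_classes - 1))[:, None], n_classes, axis=1)
X[np.arange(size), y_pred] = p_max

# True label equals the prediction if correct, otherwise a different class.
offset = rng.integers(1, n_classes, size=size)
y = np.where(correct, y_pred, (y_pred + offset) % n_classes)
```

Each row of X sums to one and its largest entry is the sampled confidence, so the arg-max prediction matches y exactly where the Bernoulli indicator is true.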