Benchmark

class pycalib.benchmark.Benchmark(run_dir, cal_methods, cal_method_names, cross_validator, random_state=None)[source]

Bases: object

A benchmarking class for calibration methods.

Parameters
  • run_dir (str) – Directory to run benchmarking in and save output and logs to.

  • cal_methods (list) – Calibration methods to benchmark.

  • cal_method_names (list) – Names of calibration methods.

  • cross_validator (int, cross-validation generator or an iterable, optional) – Determines the cross-validation splitting strategy. Possible inputs for cross_validator are:

      - None, to use the default 3-fold cross-validation,
      - an integer, to specify the number of folds in a (Stratified)KFold,
      - a CV splitter,
      - an iterable yielding (train, test) splits as arrays of indices.

    For integer/None inputs, if the estimator is a classifier and y is either binary or multi-class, StratifiedKFold is used. In all other cases, KFold is used.

  • random_state (int, RandomState instance or None, optional (default=None)) – If int, random_state is the seed used by the random number generator; if a RandomState instance, random_state is the random number generator itself; if None, the random number generator is the RandomState instance used by np.random.
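The last input form accepted by cross_validator, an iterable yielding (train, test) index splits, can be illustrated without any library. The following is a minimal plain-Python sketch; the helper name k_fold_indices is hypothetical and not part of pycalib, and in practice a scikit-learn splitter would normally be passed instead.

```python
def k_fold_indices(n_samples, n_splits):
    """Yield (train, test) index lists for contiguous, unshuffled folds.

    Hypothetical helper for illustration only; not part of pycalib.
    """
    if not 2 <= n_splits <= n_samples:
        raise ValueError("n_splits must be between 2 and n_samples")
    indices = list(range(n_samples))
    # Spread the remainder over the first folds so fold sizes differ by at most one.
    base, extra = divmod(n_samples, n_splits)
    start = 0
    for fold in range(n_splits):
        stop = start + base + (1 if fold < extra else 0)
        test = indices[start:stop]
        train = indices[:start] + indices[stop:]
        yield train, test
        start = stop


# Materialise the splits so they can be inspected or reused.
splits = list(k_fold_indices(n_samples=7, n_splits=3))
for train, test in splits:
    print("train:", train, "test:", test)
```

Every sample index appears in exactly one test fold, which is the property an iterable of splits needs for each sample to be evaluated once.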

Methods Summary

data_gen()

Returns the full dataset or a generator of datasets.

plot(out_file, results_file, score, methods)

Plot results from benchmark experiments as an error bar plot.

run([n_jobs])

Train all models, evaluate on test data and save the results.

Methods Documentation

data_gen()[source]

Returns the full dataset or a generator of datasets.

Returns

X, y giving uncalibrated predictions and corresponding classes.
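To make the return contract more concrete, here is a minimal sketch of a generator that yields (X, y) pairs. It is an assumption-laden stand-in, not pycalib code: it treats "uncalibrated predictions" as per-class probability vectors, and the function name, its parameters, and the random data are all invented for illustration.

```python
import random


def data_gen(n_datasets=2, n_samples=4, n_classes=3, seed=0):
    """Yield (X, y) pairs of made-up predictions and class labels.

    Hypothetical stand-in for illustration; not the pycalib implementation.
    """
    rng = random.Random(seed)
    for _ in range(n_datasets):
        X, y = [], []
        for _ in range(n_samples):
            # Random positive weights, normalised so each row sums to one.
            raw = [rng.random() + 1e-9 for _ in range(n_classes)]
            total = sum(raw)
            X.append([p / total for p in raw])
            # Class label drawn independently of the predictions.
            y.append(rng.randrange(n_classes))
        yield X, y


datasets = list(data_gen())
for X, y in datasets:
    print(len(X), "predictions,", len(y), "labels")
```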

static plot(out_file, results_file, score, methods, classifiers='all', width=5.0, height=2.5)[source]

Plot results from benchmark experiments as an error bar plot.

Parameters
  • out_file (str) – File location for the output plot.

  • results_file (str) – The location of the CSV file containing experiment results.

  • score (str) – Type of score to plot.

  • methods (list) – Calibration methods to plot.

  • classifiers (list or "all") – List of classifiers for which to show results.

  • width (float, default=5.) – Width of the plot.

  • height (float, default=2.5) – Height of the plot.
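An error bar plot shows a central value and a spread for each method. The sketch below computes such a summary with the standard library only. The layout of the results file is not documented here, so the column names method, fold and score_a, the method names, and the numbers are all made up; whether the bars in plot represent a standard deviation, a standard error or something else is also not stated, and the sample standard deviation is used purely as an example.

```python
import csv
import io
import statistics
from collections import defaultdict

# Made-up results in a made-up layout: one row per method and CV fold.
results_csv = io.StringIO(
    "method,fold,score_a\n"
    "method_a,0,0.10\n"
    "method_a,1,0.14\n"
    "method_b,0,0.04\n"
    "method_b,1,0.06\n"
)


def summarise(csv_file, score, methods):
    """Return {method: (mean, sample standard deviation)} for one score column."""
    scores = defaultdict(list)
    for row in csv.DictReader(csv_file):
        if row["method"] in methods:
            scores[row["method"]].append(float(row[score]))
    return {
        name: (statistics.mean(values), statistics.stdev(values))
        for name, values in scores.items()
    }


summary = summarise(results_csv, score="score_a", methods=["method_a", "method_b"])
for name, (mean, spread) in summary.items():
    print(f"{name}: {mean:.3f} +/- {spread:.3f}")
```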

run(n_jobs=None)[source]

Train all models, evaluate on test data and save the results.

Parameters
  • n_jobs (int or None, optional (default=None)) – The number of CPUs to use to do the computation. None means 1 unless in a joblib.parallel_backend context. -1 means using all processors.
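The n_jobs convention described above can be spelled out as a small function. This is only a sketch of the documented cases: resolve_n_jobs is a hypothetical helper, not a pycalib or joblib function, it ignores any active joblib.parallel_backend context, and it deliberately rejects values other than None, -1 and positive integers.

```python
import os


def resolve_n_jobs(n_jobs=None):
    """Map an n_jobs value to a worker count (hypothetical helper)."""
    cpu_count = os.cpu_count() or 1  # os.cpu_count() may return None
    if n_jobs is None:
        # Documented default outside a joblib.parallel_backend context.
        return 1
    if n_jobs == -1:
        # Use all available processors.
        return cpu_count
    if n_jobs < 1:
        raise ValueError("this sketch only handles None, -1 and positive integers")
    return n_jobs


print(resolve_n_jobs(None), resolve_n_jobs(-1), resolve_n_jobs(2))
```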