CIFARData

class pycalib.benchmark.CIFARData(run_dir, clf_output_dir, classifier_names, cal_methods, cal_method_names, use_logits=False, n_splits=10, test_size=9000, train_size=None, random_state=None)[source]

Bases: pycalib.benchmark.Benchmark

Model evaluation using the benchmark vision dataset CIFAR-100.

Implements a data generation method returning a new evaluation data set for each scoring round. Pre-trained CNNs are available at [1].

Parameters
  • run_dir (str) – Directory to run benchmarking in and save output and logs to.

  • clf_output_dir (str) – Directory containing calibration data obtained from CIFAR-100 classification.

  • classifier_names (list) – Names of classifiers to be calibrated. Classification results on CIFAR-100 must be contained in data_dir.

  • cal_methods (list) – Calibration methods to benchmark.

  • cal_method_names (list) – Names of calibration methods.

  • use_logits (bool, default=False) – Whether the calibration methods operate on logits (True) or on classification probabilities (False).

  • n_splits (int, default=10) – Number of splits for cross validation.

  • test_size (int or float, default=9000) – Size of the test set. An int is interpreted as an absolute number of samples, a float as a proportion of the dataset.

  • train_size (int or float, default=None) – Size of the calibration set.

  • random_state (int, RandomState instance or None, optional (default=None)) – If int, random_state is the seed used by the random number generator; If RandomState instance, random_state is the random number generator; If None, the random number generator is the RandomState instance used by np.random.
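The split parameters follow the scikit-learn convention of repeated shuffled splits of the 10,000-image CIFAR-100 evaluation set into a calibration set and a test set. A minimal, numpy-only sketch of these semantics (an illustration of the convention, not pycalib's actual splitting code):

```python
import numpy as np

def split_indices(n_samples, n_splits, test_size, train_size=None, random_state=None):
    """Yield (calibration, test) index arrays, sketching the assumed
    n_splits/test_size/train_size semantics."""
    rng = np.random.RandomState(random_state)
    n_test = test_size if isinstance(test_size, int) else int(test_size * n_samples)
    n_train = (n_samples - n_test) if train_size is None else (
        train_size if isinstance(train_size, int) else int(train_size * n_samples))
    for _ in range(n_splits):
        perm = rng.permutation(n_samples)
        yield perm[n_test:n_test + n_train], perm[:n_test]

# CIFAR-100 evaluation set: 10,000 images, default test_size=9000
cal, test = next(split_indices(10000, n_splits=10, test_size=9000, random_state=0))
print(len(cal), len(test))  # 1000 9000
```

With the defaults, each of the 10 scoring rounds thus calibrates on 1,000 images and evaluates on the remaining 9,000.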

References

[1] https://github.com/bearpaw/pytorch-classification

Methods Summary

classify_val_data(dataset_folder, clf_name)

Classify the CIFAR-100 evaluation data set with a given model.

data_gen()

Returns the full dataset or a generator of datasets.

load_pretrained_model(clf_name, checkpoint_dir)

Load pretrained CNN from checkpoints.

plot(out_file, results_file, score, methods)

Plot results from benchmark experiments as an error bar plot.

run([n_jobs])

Train all models, evaluate on test data and save the results.

Methods Documentation

static classify_val_data(dataset_folder, clf_name, n_classes=100, download_test_data=False, batch_size=100, data_folder='data', checkpoint_folder='pretrained_networks/', output_folder='clf_output')[source]

Classify the CIFAR-100 evaluation data set with a given model.

Parameters
  • dataset_folder (str) – Directory containing the CIFAR-100 evaluation data. A subfolder named ‘val’ is searched for images. Output from the model is saved in the given directory.

  • clf_name (str) – Name of classifier (CNN architecture) to classify data with.

  • n_classes (int, default=100) – Number of classes of the pre-trained model on CIFAR-100.

  • download_test_data (bool, default=False) – Should the CIFAR-100 test data be downloaded to data_folder?

  • batch_size (int, default=100) – Batch size for classification of the test set.

  • data_folder (str, default='data') – Folder containing the validation images. Images must be contained in a folder named after their class.

  • checkpoint_folder (str, default='pretrained_networks/') – Folder containing pre-trained models (i.e. checkpoints).

  • output_folder (str, default='clf_output') – Folder where output is stored.
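The classifier output written to output_folder is, per evaluation image, one row of scores over the n_classes classes. A hedged sketch of the expected shapes, converting logits to probabilities with a numerically stable softmax (the array layout is an assumption, not pycalib's documented file format):

```python
import numpy as np

rng = np.random.RandomState(0)
logits = rng.randn(10000, 100)  # one row per CIFAR-100 test image,
                                # one column per class (n_classes=100)

# stable softmax: the probabilities calibration methods consume when use_logits=False
z = logits - logits.max(axis=1, keepdims=True)
probs = np.exp(z) / np.exp(z).sum(axis=1, keepdims=True)

print(probs.shape)                          # (10000, 100)
print(np.allclose(probs.sum(axis=1), 1.0))  # True
```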

data_gen()[source]

Returns the full dataset or a generator of datasets.

Returns

X, y – Uncalibrated predictions and corresponding classes.
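A hypothetical stand-in illustrating the generator contract: one (X, y) pair per scoring round, where X holds the uncalibrated class probabilities and y the true labels (shapes and the use of a Dirichlet distribution are purely illustrative):

```python
import numpy as np

def data_gen_sketch(n_samples=1000, n_classes=100, n_splits=3, seed=0):
    """Hypothetical stand-in for data_gen(): yields (X, y) with X the
    uncalibrated class probabilities and y the corresponding labels."""
    rng = np.random.RandomState(seed)
    for _ in range(n_splits):
        X = rng.dirichlet(np.ones(n_classes), size=n_samples)  # rows sum to 1
        y = rng.randint(0, n_classes, size=n_samples)
        yield X, y

for X, y in data_gen_sketch():
    print(X.shape, y.shape)  # (1000, 100) (1000,)
```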

static load_pretrained_model(clf_name, checkpoint_dir, n_classes=100)[source]

Load pretrained CNN from checkpoints.

Parameters
  • clf_name (str) – Name of pretrained network to load.

  • checkpoint_dir (str) – Directory containing checkpoints of network, i.e. parameters of the pretrained network.

  • n_classes (int) – Number of classes of the classification problem.

Returns

pretrained_model – The pre-trained CNN loaded from checkpoint_dir.
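Checkpoints in the referenced repository are serialized dicts holding the model parameters under a 'state_dict' key. A framework-free sketch of that load pattern, with pickle standing in for torch.load (file name, keys, and contents here are assumptions for illustration):

```python
import os
import pickle
import tempfile

# Toy "checkpoint" in the shape training scripts commonly save:
# model parameters under 'state_dict' (convention assumed from [1]).
checkpoint = {"state_dict": {"fc.weight": [[0.0] * 100]}, "epoch": 164}

with tempfile.TemporaryDirectory() as checkpoint_dir:
    path = os.path.join(checkpoint_dir, "resnet-110.ckpt")
    with open(path, "wb") as f:
        pickle.dump(checkpoint, f)

    with open(path, "rb") as f:        # torch.load in the real code path
        loaded = pickle.load(f)
    state_dict = loaded["state_dict"]  # parameters to push into the CNN

print(sorted(state_dict))  # ['fc.weight']
```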

static plot(out_file, results_file, score, methods, classifiers='all', width=5.0, height=2.5)

Plot results from benchmark experiments as an error bar plot.

Parameters
  • out_file (str) – File location for the output plot.

  • results_file (str) – The location of the csv files containing experiment results.

  • score (str) – Type of score to plot.

  • methods (list) – Calibration methods to plot.

  • classifiers (list or "all") – List of classifiers for which to show results.

  • width (float, default=5.0) – Width of the plot.

  • height (float, default=2.5) – Height of the plot.
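An error bar plot of this kind typically shows, per calibration method, the mean score with its spread across the n_splits cross-validation rounds. A stdlib-only sketch of that aggregation step (the row layout and method names are hypothetical, not the actual results-file schema):

```python
import statistics

# Hypothetical rows from a results CSV: (classifier, method, score for one split)
rows = [
    ("resnet-110", "NoCalibration", 0.182), ("resnet-110", "NoCalibration", 0.176),
    ("resnet-110", "TemperatureScaling", 0.041), ("resnet-110", "TemperatureScaling", 0.047),
]

# Group scores by (classifier, method), then reduce to mean and spread.
agg = {}
for clf, method, score in rows:
    agg.setdefault((clf, method), []).append(score)

for key, scores in sorted(agg.items()):
    mean = statistics.mean(scores)   # bar height
    err = statistics.stdev(scores)   # error bar half-length
    print(key, round(mean, 3), round(err, 4))
```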

run(n_jobs=None)

Train all models, evaluate on test data and save the results.

Parameters

n_jobs (int or None, optional (default=None)) – The number of CPUs to use to do the computation. None means 1 unless in a joblib.parallel_backend context. -1 means using all processors.
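The n_jobs argument follows the joblib/scikit-learn convention: None means one worker (unless a parallel backend is active), and negative values count back from the number of processors. A small stdlib sketch of that resolution rule (the helper name is hypothetical, not part of pycalib):

```python
import os

def resolve_n_jobs(n_jobs=None):
    """Map a joblib-style n_jobs argument to a concrete worker count
    (sketch of the convention, not pycalib's implementation)."""
    cpus = os.cpu_count() or 1
    if n_jobs is None:
        return 1                          # None -> single worker
    if n_jobs < 0:
        return max(1, cpus + 1 + n_jobs)  # -1 -> all CPUs, -2 -> all but one
    return n_jobs

print(resolve_n_jobs(None), resolve_n_jobs(-1) >= 1)  # 1 True
```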