PlattScaling

class pycalib.calibration_methods.PlattScaling(regularization=1e-12, random_state=None)

Bases: pycalib.calibration_methods.CalibrationMethod

Probability calibration using Platt scaling

Platt scaling [1, 2] is a parametric method that produces calibrated posterior probabilities from the outputs of (non-probabilistic) binary classifiers. It was originally introduced in the context of SVMs. It works by fitting a logistic regression model to the classifier's output, using the negative log-likelihood as the loss function.
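As a rough illustration of the idea described above, the following dependency-free sketch fits the two parameters of a sigmoid map by gradient descent on the mean negative log-likelihood. It is not pycalib's implementation: the helper names `fit_platt` and `calibrate`, the learning rate, the iteration count, and the toy data are all assumptions made for this example, and it works on one-dimensional scores rather than the two-column probability arrays this class expects.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_platt(scores, labels, lr=0.1, n_iter=5000):
    """Fit p(y=1 | s) = sigmoid(a * s + b) by gradient descent on the mean NLL.

    Hypothetical helper for illustration only; no regularization is applied.
    """
    a, b = 1.0, 0.0
    n = len(scores)
    for _ in range(n_iter):
        grad_a = grad_b = 0.0
        for s, y in zip(scores, labels):
            err = sigmoid(a * s + b) - y  # derivative of the NLL w.r.t. the logit
            grad_a += err * s
            grad_b += err
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b


def calibrate(scores, a, b):
    """Map raw scores to two-column probabilities [P(y=0), P(y=1)]."""
    return [[1.0 - sigmoid(a * s + b), sigmoid(a * s + b)] for s in scores]


# Toy data with overlapping classes, so the maximum-likelihood fit is finite.
scores = [-2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0, 2.5]
labels = [0, 0, 0, 1, 0, 1, 0, 1, 1, 1]

a, b = fit_platt(scores, labels)
probs = calibrate(scores, a, b)
```

Because the fitted slope is positive on this data, the calibrated probability of the positive class increases monotonically with the raw score; only the confidence attached to each score changes.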

Parameters

regularization (float, default=1e-12) – Regularization constant, determining the degree of regularization in the logistic regression.
random_state (int, RandomState instance or None, optional (default=None)) – The seed of the pseudo-random number generator to use when shuffling the data. If int, random_state is the seed used by the random number generator; if RandomState instance, random_state is the random number generator; if None, the random number generator is the RandomState instance used by np.random.
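To make the role of the regularization constant concrete, one common form of the regularized objective is written out below. This is an assumption for illustration, not a statement of how pycalib implements it: the symbols $a$ (slope), $b$ (intercept), $s_i$ (classifier output for sample $i$) and $\lambda$ (the regularization constant) are introduced here, and the L2 penalty on the slope is one plausible choice among several.

```latex
p_i = \frac{1}{1 + \exp\bigl(-(a\,s_i + b)\bigr)}, \qquad
\min_{a,\,b} \; -\sum_{i=1}^{n} \Bigl[ y_i \log p_i + (1 - y_i) \log (1 - p_i) \Bigr] + \frac{\lambda}{2}\, a^2
```

Under this reading, a value as small as the default 1e-12 makes the penalty term negligible, so the fit is effectively the unregularized maximum-likelihood solution.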

References

1: Platt, J. C. Probabilistic Outputs for Support Vector Machines and Comparisons to Regularized Likelihood Methods. In Advances in Large-Margin Classifiers (MIT Press, 1999).
2: Lin, H.-T., Lin, C.-J. & Weng, R. C. A note on Platt’s probabilistic outputs for support vector machines. Machine Learning 68, 267–276 (2007).

Methods Summary

`fit`(X, y[, n_jobs])	Fit the calibration method based on the given uncalibrated class probabilities X and ground truth labels y.
`get_params`([deep])	Get parameters for this estimator.
`plot`(filename[, xlim])	Plot the calibration map.
`predict`(X)	Predict the class of new samples after scaling.
`predict_proba`(X)	Compute calibrated posterior probabilities for a given array of posterior probabilities from an arbitrary classifier.
`set_params`(**params)	Set the parameters of this estimator.

Methods Documentation

fit(X, y, n_jobs=None)

Fit the calibration method based on the given uncalibrated class probabilities X and ground truth labels y.

Parameters

X (array-like, shape (n_samples, n_classes)) – Training data, i.e. predicted probabilities of the base classifier on the calibration set.
y (array-like, shape (n_samples,)) – Target classes.
n_jobs (int or None, optional (default=None)) – The number of jobs to use for the computation. None means 1 unless in a joblib.parallel_backend context. -1 means using all processors.

Returns

self – Returns an instance of self.

Return type

object

get_params(deep=True)

Get parameters for this estimator.

Parameters: deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
Returns: params – Parameter names mapped to their values.
Return type: mapping of string to any

plot(filename, xlim=[0, 1], **kwargs)

Plot the calibration map.

Parameters

xlim (array-like) – Range of inputs of the calibration map to be plotted.
**kwargs – Additional arguments passed on to matplotlib.plot().

predict(X)

Predict the class of new samples after scaling. Predictions are identical to the ones from the uncalibrated classifier.

Parameters: X (array-like, shape (n_samples, n_classes)) – The uncalibrated posterior probabilities.
Returns: C – The predicted classes.
Return type: array, shape (n_samples,)

predict_proba(X)

Compute calibrated posterior probabilities for a given array of posterior probabilities from an arbitrary classifier.

Parameters: X (array-like, shape (n_samples, n_classes)) – The uncalibrated posterior probabilities.
Returns: P – The predicted probabilities.
Return type: array, shape (n_samples, n_classes)

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). Nested objects have parameters of the form <component>__<parameter>, which makes it possible to update each component of a nested object.

Parameters: **params (dict) – Estimator parameters.
Returns: self – Estimator instance.
Return type: object
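The `<component>__<parameter>` naming convention used by set_params can be illustrated with a small standalone sketch that groups parameter names by their component prefix. The helper `split_nested_params` and the component name `calibrator` are hypothetical and exist only for this example; they are not part of pycalib.

```python
def split_nested_params(params):
    """Group 'component__parameter' keys by component.

    Keys without a double underscore are top-level parameters and are
    collected under the empty-string key.
    """
    grouped = {}
    for key, value in params.items():
        # Split on the first double underscore only, so deeper nesting
        # (e.g. 'a__b__c') stays attached to the sub-parameter name.
        component, sep, sub_key = key.partition("__")
        if sep:
            grouped.setdefault(component, {})[sub_key] = value
        else:
            grouped.setdefault("", {})[key] = value
    return grouped


params = {
    "regularization": 1e-6,              # top-level parameter
    "calibrator__regularization": 1e-3,  # parameter of a nested component
    "calibrator__random_state": 0,
}
grouped = split_nested_params(params)
```

Here `grouped[""]` holds the parameters addressed to the estimator itself, while `grouped["calibrator"]` holds those that would be forwarded to the nested component.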