Isotonic Regression

Isotonic regression fits a free-form, non-decreasing function to data. A notable application is non-metric multidimensional scaling, where a low-dimensional embedding of data points is sought such that the order of distances between points in the embedding matches the order of dissimilarities between the points; there, isotonic regression is applied iteratively to fit ideal distances that preserve the relative order of the dissimilarities.

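For instance, scikit-learn's non-metric MDS (MDS with metric=False) uses isotonic regression internally in exactly this way. The sketch below is illustrative only, with made-up random data: it embeds a 5-dimensional point cloud in the plane while preserving the rank order of the pairwise dissimilarities.

import numpy as np
from sklearn.manifold import MDS
from sklearn.metrics import pairwise_distances

rng = np.random.RandomState(0)
X = rng.rand(20, 5)                    # 20 illustrative points in 5 dimensions
D = pairwise_distances(X)              # pairwise dissimilarity matrix

# metric=False requests non-metric MDS, which fits target distances
# with isotonic regression at each iteration
nmds = MDS(n_components=2, metric=False, dissimilarity='precomputed',
           random_state=0)
embedding = nmds.fit_transform(D)      # 2-d embedding preserving rank order
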
Isotonic regression finds a non-decreasing approximation of a function while minimizing the mean squared error on the training data. The benefit of such a model is that it does not assume any particular form for the target function, such as linearity: it finds the best least-squares fit to a set of points subject only to the constraint that the fit be a non-decreasing function. For comparison, a linear regression fit is also presented.

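To make the fitting procedure concrete, here is a minimal sketch of the pool-adjacent-violators algorithm (PAVA), the classic solver for this problem. Unit weights are assumed, and the helper name pava is ours, not scikit-learn's.

import numpy as np

def pava(y):
    """Return the non-decreasing sequence closest to y in least squares."""
    # Each block pools consecutive observations and stores [total, count];
    # a new value that breaks monotonicity is merged (averaged) leftwards.
    blocks = []
    for v in np.asarray(y, dtype=float):
        blocks.append([v, 1])
        while len(blocks) > 1 and (blocks[-2][0] / blocks[-2][1]
                                   > blocks[-1][0] / blocks[-1][1]):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    # Expand each block's mean back to its original length
    return np.concatenate([[total / count] * count for total, count in blocks])

print(pava([1, 3, 2, 4, 2, 5]))   # [1.  2.5 2.5 3.  3.  5. ]
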
The isotonic estimator, $g^*$, minimizes the weighted least-squares criterion

$$\min_{g \in \mathcal{A}} \sum_{i=1}^{n} w_i \,\big(g(x_i) - f(x_i)\big)^2,$$

where $\mathcal{A}$ is the set of non-decreasing functions, $f(x_i)$ are the observed values, and the $w_i$ are non-negative observation weights.

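In scikit-learn, the weights $w_i$ correspond to the sample_weight argument of IsotonicRegression.fit; a small sketch with made-up toy values:

import numpy as np
from sklearn.isotonic import IsotonicRegression

x = np.array([1., 2., 3., 4., 5.])
y = np.array([1., 3., 2., 5., 4.])
w = np.array([1., 1., 10., 1., 1.])   # up-weight the third observation (toy values)

ir = IsotonicRegression()
y_fit = ir.fit_transform(x, y, sample_weight=w)   # weighted isotonic fit
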
In [70]:
# Imports (only those actually used below)
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.collections import LineCollection
from sklearn.linear_model import LinearRegression
from sklearn.isotonic import IsotonicRegression
from sklearn.utils import check_random_state
In [71]:
n = 100
x = np.arange(n)
rs = check_random_state(0)
# Noisy, increasing-on-average data: uniform integer noise around a log curve
y = rs.randint(-50, 50, size=(n,)) + 50. * np.log(1 + np.arange(n))
In [72]:
# Fit IsotonicRegression and LinearRegression models

ir = IsotonicRegression()

y_ = ir.fit_transform(x, y)  # fitted non-decreasing values at the training points

lr = LinearRegression()
lr.fit(x[:, np.newaxis], y)  # x needs to be 2d for LinearRegression
Out[72]:
LinearRegression(copy_X=True, fit_intercept=True, n_jobs=1, normalize=False)
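As a quick check, both fitted models can be evaluated at unseen inputs; IsotonicRegression interpolates linearly between its training points, and its behaviour outside the training range is controlled by its out_of_bounds parameter.
In [ ]:
# Evaluate both models at new inputs (illustrative addition)
x_new = np.linspace(0, n - 1, 25)
print(ir.predict(x_new)[:5])                  # piecewise-linear interpolation
print(lr.predict(x_new[:, np.newaxis])[:5])   # straight-line predictions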
In [73]:
# Plot result

# Vertical segments connecting each observation to its isotonic fit value
segments = [[[i, y[i]], [i, y_[i]]] for i in range(n)]
lc = LineCollection(segments, zorder=0)
lc.set_array(np.ones(len(y)))
lc.set_linewidths(0.5 * np.ones(n))
In [74]:
fig = plt.figure(figsize=(16,4))
plt.plot(x, y, 'r.', markersize=12)
plt.plot(x, y_, 'g.-', markersize=12)
plt.plot(x, lr.predict(x[:, np.newaxis]), 'b-')
plt.gca().add_collection(lc)
plt.legend(('Data', 'Isotonic Fit', 'Linear Fit'), loc='lower right')
plt.title('Isotonic regression')
plt.show()
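A natural follow-up is to quantify the visual comparison with each fit's training mean squared error:
In [ ]:
# Compare training MSE of the isotonic and linear fits
from sklearn.metrics import mean_squared_error
print('Isotonic MSE:', mean_squared_error(y, y_))
print('Linear MSE:  ', mean_squared_error(y, lr.predict(x[:, np.newaxis])))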