Suppose we have a set of points in the plane where the X-coordinate is drawn uniformly from an interval and the Y-coordinate is a linear function of X plus noise, with the noise coming from a distribution that typically generates serious outliers.
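The setup above can be sketched in Python as follows. The interval, the slope and intercept, and the choice of a standard Cauchy distribution for the noise are all assumptions made for illustration; the original values are not recoverable from this text. The helper name sample_points is hypothetical.

```python
import math
import random

def sample_points(n, slope, intercept, x_min, x_max, seed=None):
    """Return n (x, y) pairs: x uniform on [x_min, x_max], y linear in x plus noise."""
    rng = random.Random(seed)
    points = []
    for _ in range(n):
        x = rng.uniform(x_min, x_max)
        # Assumed noise model: a standard Cauchy draw via the inverse CDF,
        # tan(pi * (u - 1/2)). Its heavy tails produce occasional huge outliers.
        noise = math.tan(math.pi * (rng.random() - 0.5))
        points.append((x, slope * x + intercept + noise))
    return points

# All parameter values below are illustrative assumptions.
points = sample_points(n=100, slope=2.0, intercept=1.0, x_min=0.0, x_max=10.0, seed=1)
```

Fixing the seed makes the sample reproducible, which helps when comparing estimators on the same data.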
We shrink the view a little, so that we can see more of what's going on.
Now we would like to recover the underlying model from the data. Using standard (least-squares) linear regression, we get an unsatisfactory fit:
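For reference, a least-squares line fit can be written in closed form. The sketch below is a plain Python illustration, not the worksheet's own command; the function name least_squares_fit and the sample data are assumptions.

```python
def least_squares_fit(points):
    """Ordinary least-squares fit of y = slope * x + intercept to (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    # Sum of squared deviations of x, and sum of cross deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data: points on y = 2x + 1, then the same data with one gross outlier.
clean = [(float(x), 2.0 * x + 1.0) for x in range(10)]
dirty = clean[:-1] + [(9.0, 100.0)]
print(least_squares_fit(clean))
print(least_squares_fit(dirty))
```

Because every residual enters the objective squared, a single extreme point can pull both coefficients noticeably, which is why the fit degrades on heavy-tailed noise.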
The repeated median estimator, however, deals well with the errors.
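A minimal sketch of a repeated median fit is given below, assuming the usual definition: for each point take the median of the slopes to all other points, then take the median of those per-point medians. The intercept here is the median of y - slope * x, which is one common convention and an assumption on our part; the function name repeated_median_fit is hypothetical.

```python
from statistics import median

def repeated_median_fit(points):
    """Repeated median line fit; needs at least two distinct x values."""
    per_point_slopes = []
    for i, (xi, yi) in enumerate(points):
        # Slopes from point i to every other point with a different x.
        pairwise = [
            (yj - yi) / (xj - xi)
            for j, (xj, yj) in enumerate(points)
            if j != i and xj != xi
        ]
        if pairwise:
            per_point_slopes.append(median(pairwise))
    slope = median(per_point_slopes)
    # Assumed intercept convention: median of the residual offsets.
    intercept = median([y - slope * x for x, y in points])
    return slope, intercept

# Hypothetical data: points on y = 2x + 1 with one gross outlier.
data = [(float(x), 2.0 * x + 1.0) for x in range(9)] + [(9.0, 100.0)]
print(repeated_median_fit(data))
```

This direct version does O(n^2) work, which is fine for a few hundred points. The nested medians are what give the estimator its robustness: an outlier affects only one slope in each inner median and only one value in the outer median.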
In the following example, we have one outlier in the data.
Once again, we compare the least squares and repeated median estimators.
In this case, the difference is less dramatic than in the first example. It is clear, though, that the outlier has very little influence on the repeated median estimator, and some influence on the least squares fit.
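One way to make the notion of influence concrete is to fit each estimator with and without the outlier and compare the slopes. The sketch below does this on made-up data; the data values and the helper names ls_fit and rm_fit are assumptions, and both estimators are restated compactly so that the block runs on its own.

```python
from statistics import mean, median

def ls_fit(pts):
    """Least-squares line fit."""
    mx = mean([x for x, _ in pts])
    my = mean([y for _, y in pts])
    slope = (sum((x - mx) * (y - my) for x, y in pts)
             / sum((x - mx) ** 2 for x, _ in pts))
    return slope, my - slope * mx

def rm_fit(pts):
    """Repeated median line fit; assumes all x values are distinct."""
    slope = median([
        median([(yj - yi) / (xj - xi) for xj, yj in pts if xj != xi])
        for xi, yi in pts
    ])
    return slope, median([y - slope * x for x, y in pts])

# Hypothetical data: 20 points on y = 0.5x + 3, then one point replaced by an outlier.
clean = [(float(x), 0.5 * x + 3.0) for x in range(20)]
dirty = clean[:-1] + [(19.0, 40.0)]

for name, fit in (("least squares", ls_fit), ("repeated median", rm_fit)):
    slope_clean, _ = fit(clean)
    slope_dirty, _ = fit(dirty)
    print(f"{name}: slope shift = {abs(slope_dirty - slope_clean):.4f}")
```

On data like this, the least-squares slope moves visibly while the repeated median slope stays essentially where it was, matching the qualitative observation above.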