# Robust Regression via Majorization–Minimization (MM)

## Background & Motivation: 

In real-world regression tasks, outliers can severely skew ordinary least squares (OLS) estimates. For example, in housing price prediction, an unusually overpriced or underpriced home can distort a linear model’s fit. To make the model more resistant to outliers, we minimize the sum of absolute errors (L1 loss) instead of the sum of squared errors. This yields least absolute deviations (LAD) regression, which is more robust to outliers but harder to optimize directly, because the L1 objective is not differentiable wherever a residual equals zero. We will apply a Majorization–Minimization (MM) algorithm that solves LAD regression by iteratively solving easier weighted least-squares problems. MM is a general framework: at each iteration it minimizes a surrogate function that majorizes the original objective (it lies on or above the objective everywhere and touches it at the current iterate), which guarantees that the objective never increases from one iteration to the next. (The classic Expectation–Maximization algorithm is a special case of MM.)
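To make the surrogate idea concrete, here is a sketch of the standard quadratic majorizer for the absolute value. The notation ($\beta$, $x_i$, $y_i$, $r_i^{(k)}$, $w_i^{(k)}$) is introduced here for illustration only, and the derivation assumes every current residual $r_i^{(k)} = y_i - x_i^\top \beta^{(k)}$ is nonzero.

$$
f(\beta) = \sum_{i=1}^{n} \left| y_i - x_i^\top \beta \right|
$$

Since $\left(|r| - |r^{(k)}|\right)^2 \ge 0$, each term satisfies

$$
|r| \;\le\; \frac{r^2}{2\,|r^{(k)}|} + \frac{|r^{(k)}|}{2},
\qquad \text{with equality when } |r| = |r^{(k)}| ,
$$

so the surrogate

$$
g\!\left(\beta \mid \beta^{(k)}\right) = \sum_{i=1}^{n} \left[ \frac{\left(y_i - x_i^\top \beta\right)^2}{2\,|r_i^{(k)}|} + \frac{|r_i^{(k)}|}{2} \right]
$$

majorizes $f$ and touches it at $\beta^{(k)}$. Minimizing $g$ is a weighted least-squares problem with weights $w_i^{(k)} = 1/|r_i^{(k)}|$:

$$
\beta^{(k+1)} = \left(X^\top W^{(k)} X\right)^{-1} X^\top W^{(k)} y,
\qquad W^{(k)} = \operatorname{diag}\!\left(w_1^{(k)}, \dots, w_n^{(k)}\right),
$$

and the descent property follows from $f(\beta^{(k+1)}) \le g(\beta^{(k+1)} \mid \beta^{(k)}) \le g(\beta^{(k)} \mid \beta^{(k)}) = f(\beta^{(k)})$. In practice a small constant is usually added to the denominator so the weights stay finite when a residual is close to zero.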

## Data:

The Boston Housing dataset has 506 instances and 13 features; the outcome is the median home value (`MEDV`). The dataset, originally hosted by the UCI Machine Learning Repository and loaded below from OpenML through scikit-learn's `fetch_openml`, contains housing prices and various attributes (crime rate, number of rooms, etc.) for Boston suburbs. Some observations are known to have unusually high or capped target values (e.g. a cluster at the maximum value 50.0), which can act as outliers.


```python
from sklearn.datasets import fetch_openml
import pandas as pd

# Load Boston housing dataset from OpenML
boston = fetch_openml(name="Boston", version=1, as_frame=True)
df_boston = boston.frame  # returns a pandas DataFrame

# Features and target
X = df_boston.drop(columns="MEDV")  # MEDV = Median value of owner-occupied homes
y = df_boston["MEDV"]

print(X.shape, y.shape)
print(df_boston.head())
```

```
(506, 13) (506,)
      CRIM    ZN  INDUS CHAS    NOX     RM   AGE     DIS RAD    TAX  PTRATIO  \
0  0.00632  18.0   2.31    0  0.538  6.575  65.2  4.0900   1  296.0     15.3
1  0.02731   0.0   7.07    0  0.469  6.421  78.9  4.9671   2  242.0     17.8
2  0.02729   0.0   7.07    0  0.469  7.185  61.1  4.9671   2  242.0     17.8
3  0.03237   0.0   2.18    0  0.458  6.998  45.8  6.0622   3  222.0     18.7
4  0.06905   0.0   2.18    0  0.458  7.147  54.2  6.0622   3  222.0     18.7

        B  LSTAT  MEDV
0  396.90   4.98  24.0
1  396.90   9.14  21.6
2  392.83   4.03  34.7
3  394.63   2.94  33.4
4  396.90   5.33  36.2
```
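Before fitting, the features usually need to be turned into a plain floating-point matrix, and it can be useful to check how many targets sit at the cap. The sketch below is one way to do both. The helper names `to_design_matrix` and `count_at_cap` are hypothetical, and it assumes that some columns (for example `CHAS` and `RAD`) may arrive with a categorical dtype whose categories are numeric strings. Treating such columns as plain numbers is a modeling simplification, not a recommendation.

```python
import numpy as np
import pandas as pd


def to_design_matrix(X_df, add_intercept=True):
    """Cast every column to float and optionally prepend a column of ones.

    Assumes all columns are numeric or categorical with numeric-looking labels.
    """
    X_num = X_df.astype(float).to_numpy()
    if add_intercept:
        X_num = np.column_stack([np.ones(len(X_num)), X_num])
    return X_num


def count_at_cap(y, cap=None):
    """Count targets equal to `cap` (defaults to the observed maximum)."""
    y_arr = np.asarray(y, dtype=float)
    if cap is None:
        cap = y_arr.max()
    return int(np.isclose(y_arr, cap).sum())


# Tiny illustrative frame (made-up values, not the housing data).
toy = pd.DataFrame({"a": [0.5, 1.5, 2.5], "b": pd.Categorical(["0", "1", "0"])})
print(to_design_matrix(toy))
print(count_at_cap(pd.Series([50.0, 24.0, 50.0])))
```

With the data loaded above, the same helpers could be called as `to_design_matrix(X)` and `count_at_cap(y)`.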


## Tasks:

- Write out the least absolute deviations (LAD) regression as an optimization problem, and lay out the details of using the MM algorithm to solve it.

- Implement the MM algorithm in Python and fit it to the housing price dataset. Do not rely on existing fitting functions; write your own code for the iterative algorithm.

- Compare the LAD result with the ordinary least squares (OLS) estimates.
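As a starting point for these tasks, here is a minimal sketch of one possible MM loop for LAD, written as iteratively reweighted least squares and compared against OLS. It is not the only way to set this up. The function names (`fit_ols`, `l1_loss`, `fit_lad_mm`) and the parameters `eps`, `tol`, and `max_iter` are hypothetical, and the sketch assumes a smoothed objective $\sum_i \sqrt{r_i^2 + \varepsilon}$ so that the weights stay finite at zero residuals. It runs on synthetic data with injected outliers rather than on the housing data.

```python
import numpy as np


def fit_ols(X, y):
    """Ordinary least squares via a least-squares solver."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta


def l1_loss(X, y, beta):
    """LAD objective: sum of absolute residuals."""
    return float(np.abs(y - X @ beta).sum())


def fit_lad_mm(X, y, eps=1e-8, tol=1e-10, max_iter=1000):
    """MM sketch for LAD on the smoothed objective sum(sqrt(r_i**2 + eps)).

    Returns the coefficients and the smoothed objective at every iterate.
    """
    beta = fit_ols(X, y)  # starting point (an arbitrary choice)
    r = y - X @ beta
    history = [float(np.sqrt(r**2 + eps).sum())]
    for _ in range(max_iter):
        # Majorization: quadratic surrogate with weights 1 / sqrt(r^2 + eps).
        sqrt_w = (r**2 + eps) ** -0.25  # square roots of those weights
        # Minimization: weighted least squares, solved as OLS on rescaled rows.
        beta, *_ = np.linalg.lstsq(X * sqrt_w[:, None], y * sqrt_w, rcond=None)
        r = y - X @ beta
        history.append(float(np.sqrt(r**2 + eps).sum()))
        # Stop once the relative decrease of the objective is negligible.
        if history[-2] - history[-1] <= tol * history[-2]:
            break
    return beta, history


# Synthetic check: a linear model with 10% of the responses shifted upward.
rng = np.random.default_rng(0)
n = 200
X_demo = np.column_stack([np.ones(n), rng.normal(size=n)])
beta_true = np.array([1.0, 2.0])
y_demo = X_demo @ beta_true + rng.normal(scale=0.5, size=n)
y_demo[:20] += 30.0

beta_ols = fit_ols(X_demo, y_demo)
beta_lad, history = fit_lad_mm(X_demo, y_demo)
print("OLS coefficients:", beta_ols, "L1 loss:", l1_loss(X_demo, y_demo, beta_ols))
print("LAD coefficients:", beta_lad, "L1 loss:", l1_loss(X_demo, y_demo, beta_lad))
print("MM iterations:", len(history) - 1)
```

The `history` list gives a direct check of the descent property: its values should never increase. For the housing data, the same functions would take a float design matrix with an intercept column and the target as a NumPy array.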