# Fitting an Advertising Spend Response Model (Quasi-Newton BFGS)

## Background and Motivation

Understanding how advertising expenditure translates into sales is a classic problem in economics and marketing. Intuitively, advertising exhibits diminishing returns: the first dollars spent yield large gains in sales, but beyond a certain point additional spend has progressively smaller impact. In other words, the relationship between ad spend and sales is concave: initial spending is very effective, and later spending saturates the market. To make informed budgeting decisions, a company would like to model this response curve. In this exercise, we use a real advertising dataset to fit a nonlinear response function that captures saturation effects. By formulating the fit as an optimization problem, we can apply quasi-Newton methods (such as BFGS) to estimate the model parameters. Such a model can then be used to estimate the return on investment (ROI) of advertising and to find the spending level beyond which additional spend yields too little extra sales to be worthwhile.

## Dataset Description

We use the Advertising dataset from the textbook *An Introduction to Statistical Learning*. It covers 200 markets and records the advertising budgets spent on TV, radio, and newspaper together with the corresponding product sales in each market. The data provide pairs $(\mathbf{x}_i, y_i)$ for $i=1,\dots,200$, where $\mathbf{x}_i \in \mathbb{R}^3$ is the vector of TV, radio, and newspaper advertising budgets in market $i$ (in thousands of dollars) and $y_i$ is the product sales (in thousands of units). The simplest version of the problem uses a single channel (TV), which is the scalar special case of this setup. Empirical analysis suggests a positive but flattening relationship between budget and sales: initially, increasing TV spend strongly boosts sales, but eventually additional TV ads yield only minor increases. Our goal is to fit a curve that captures this pattern.

## Problem Formulation

Rather than a simple linear model, we choose a nonlinear saturation model for the relationship between advertising spend $\mathbf{x}$ and expected sales $f(\mathbf{x})$. A reasonable choice is an exponential diminishing-returns function:
$$
f(\mathbf{x}_i; a, \mathbf{b}) = a \left(1 - e^{- \mathbf{x}_i^\top \mathbf{b}} \right),
$$
where $a$ and $\mathbf{b}$ are parameters to be estimated from the data. Here $a$ represents the asymptotic maximum sales (the plateau value as $\mathbf{x}^\top\mathbf{b} \to \infty$), and $\mathbf{b}$ controls the rate of saturation (a larger component $b_j$ means the curve rises quickly in that channel's budget and then saturates early). For small budgets, $f(\mathbf{x}) \approx a\,\mathbf{x}^\top\mathbf{b}$ (approximately linear growth), and as the budgets grow large, $f(\mathbf{x}) \to a$ (sales approach a ceiling of $a$, measured in thousands of units). In the single-channel case this reduces to $f(x) = a\left(1 - e^{-bx}\right)$, with $f(x) \approx abx$ for small $x$. This form aligns with the idea that the first advertising dollars have the highest impact, and returns diminish with higher spend.
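The two limiting regimes described above can be checked numerically. The sketch below is illustrative only: the function name `saturation_model` and the parameter values `a_demo` and `b_demo` are made up for the demonstration and are not fitted estimates.

```python
import numpy as np

def saturation_model(X, a, b):
    """Evaluate f(x; a, b) = a * (1 - exp(-x^T b)) for each row of X."""
    X = np.atleast_2d(np.asarray(X, dtype=float))
    b = np.asarray(b, dtype=float)
    return a * (1.0 - np.exp(-X @ b))

# Hypothetical parameter values, chosen only to illustrate the shape of the curve.
a_demo = 25.0
b_demo = np.array([0.01, 0.02, 0.001])

x_small = np.array([0.1, 0.1, 0.1])   # tiny budgets: near-linear regime
x_large = np.array([1e4, 1e4, 1e4])   # huge budgets: saturated regime

f_small = saturation_model(x_small, a_demo, b_demo)[0]
f_large = saturation_model(x_large, a_demo, b_demo)[0]
linear_approx = a_demo * (x_small @ b_demo)   # a * x^T b

print(f"small budgets: f = {f_small:.5f}, linear approximation = {linear_approx:.5f}")
print(f"large budgets: f = {f_large:.5f}, plateau a = {a_demo}")
```

For small budgets the model value and the linear approximation nearly coincide, and for very large budgets the prediction is indistinguishable from the plateau $a$.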

**Objective**: We will fit this model by finding the parameters $a$ and $\mathbf{b}$ that minimize the sum of squared errors between predicted and actual sales. If $y_i$ is the observed sales for market $i$ and $\mathbf{x}_i$ the budgets, our optimization problem is:
$$
\min_{a,\mathbf{b}} \; F(a,\mathbf{b}) = \frac{1}{2}\sum_{i=1}^{200} \big( f(\mathbf{x}_i; a,\mathbf{b}) - y_i \big)^2,
$$
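where $f(\mathbf{x}_i; a,\mathbf{b})$ is defined as above. This is an unconstrained nonlinear least-squares problem in four scalar unknowns ($a$ and the three components of $\mathbf{b}$), or two unknowns in the single-channel case. (In principle, we expect $a>0$ and every component of $\mathbf{b}$ to be positive, since negative values would violate the intended meaning.)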
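Any gradient-based method needs $\nabla F$. Writing $r_i = f(\mathbf{x}_i; a,\mathbf{b}) - y_i$ and $e_i = e^{-\mathbf{x}_i^\top \mathbf{b}}$, the chain rule gives $\partial F/\partial a = \sum_i r_i (1 - e_i)$ and $\nabla_{\mathbf{b}} F = \sum_i r_i\, a\, e_i\, \mathbf{x}_i$. The sketch below codes these formulas and compares them with central finite differences. The parameter packing `theta = [a, b_1, b_2, b_3]`, the function name, and the synthetic data are assumptions made for this check, not part of the assignment.

```python
import numpy as np

def objective_and_gradient(theta, X, y):
    """Return F(a, b) and its gradient for theta = [a, b_1, ..., b_p]."""
    a, b = theta[0], theta[1:]
    e = np.exp(-X @ b)                 # e_i = exp(-x_i^T b)
    r = a * (1.0 - e) - y              # residuals r_i = f(x_i) - y_i
    F = 0.5 * np.sum(r**2)
    grad_a = np.sum(r * (1.0 - e))     # dF/da
    grad_b = X.T @ (r * a * e)         # dF/db
    return F, np.concatenate(([grad_a], grad_b))

# Synthetic inputs, made up purely to test the derivative formulas.
rng = np.random.default_rng(0)
X_demo = rng.uniform(0.0, 3.0, size=(20, 3))
y_demo = rng.uniform(5.0, 25.0, size=20)
theta_demo = np.array([20.0, 0.3, 0.2, 0.1])

F_val, g_analytic = objective_and_gradient(theta_demo, X_demo, y_demo)

# Central finite differences, one coordinate at a time.
h = 1e-6
g_numeric = np.zeros_like(theta_demo)
for k in range(theta_demo.size):
    step = np.zeros_like(theta_demo)
    step[k] = h
    F_plus, _ = objective_and_gradient(theta_demo + step, X_demo, y_demo)
    F_minus, _ = objective_and_gradient(theta_demo - step, X_demo, y_demo)
    g_numeric[k] = (F_plus - F_minus) / (2 * h)

max_rel_err = np.max(np.abs(g_analytic - g_numeric) / (1.0 + np.abs(g_numeric)))
print("largest relative gradient error:", max_rel_err)
```

A check of this kind is worth running before any optimizer is attached, since a sign error in the gradient is otherwise hard to distinguish from slow convergence.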

The function $F(a,\mathbf{b})$ is differentiable but not globally convex in the parameters: the exponential makes the residuals nonlinear in $\mathbf{b}$, so the Hessian can be indefinite away from the solution and local minima or flat regions are possible (in practice this particular form is well-behaved for positive $a$ and $\mathbf{b}$). There is no closed-form solution, so we must resort to iterative numerical optimization.
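A small example makes the lack of global convexity concrete. Take a single made-up observation $(x, y) = (1, 1)$ with scalar $b$, so that $F(a,b) = \tfrac12\big(a(1 - e^{-b}) - 1\big)^2$. At $(a,b) = (0,0)$ both first derivatives of the residual vanish, the residual equals $-1$, and the Hessian of $F$ is $\begin{pmatrix} 0 & -1 \\ -1 & 0 \end{pmatrix}$, whose eigenvalues are $\pm 1$. The sketch below confirms this with a finite-difference Hessian; the helper names are illustrative. The point lies on the boundary of the region $a, b > 0$, so it shows only that convexity cannot hold everywhere, not that the fit is badly behaved near sensible parameter values.

```python
import numpy as np

def F_single(theta):
    """Objective for one made-up observation (x, y) = (1, 1) and scalar b."""
    a, b = theta
    return 0.5 * (a * (1.0 - np.exp(-b)) - 1.0) ** 2

def numerical_hessian(fun, theta, h=1e-4):
    """Central-difference approximation of the Hessian of fun at theta."""
    theta = np.asarray(theta, dtype=float)
    n = theta.size
    H = np.zeros((n, n))
    for i in range(n):
        ei = np.zeros(n)
        ei[i] = h
        for j in range(n):
            ej = np.zeros(n)
            ej[j] = h
            H[i, j] = (fun(theta + ei + ej) - fun(theta + ei - ej)
                       - fun(theta - ei + ej) + fun(theta - ei - ej)) / (4 * h * h)
    return H

H0 = numerical_hessian(F_single, [0.0, 0.0])
eigenvalues = np.linalg.eigvalsh(H0)   # sorted in ascending order
print("Hessian at (0, 0):")
print(H0)
print("eigenvalues:", eigenvalues)
```

One negative and one positive eigenvalue means the Hessian is indefinite at that point, which a convex function does not allow.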


```python
import pandas as pd

url = "https://raw.githubusercontent.com/JWarmenhoven/ISLR-python/master/Notebooks/Data/Advertising.csv"
df = pd.read_csv(url)
df
```

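| | Unnamed: 0 | TV | Radio | Newspaper | Sales |
|---|---|---|---|---|---|
| 0 | 1 | 230.1 | 37.8 | 69.2 | 22.1 |
| 1 | 2 | 44.5 | 39.3 | 45.1 | 10.4 |
| 2 | 3 | 17.2 | 45.9 | 69.3 | 9.3 |
| 3 | 4 | 151.5 | 41.3 | 58.5 | 18.5 |
| 4 | 5 | 180.8 | 10.8 | 58.4 | 12.9 |
| ... | ... | ... | ... | ... | ... |
| 195 | 196 | 38.2 | 3.7 | 13.8 | 7.6 |
| 196 | 197 | 94.2 | 4.9 | 8.1 | 9.7 |
| 197 | 198 | 177.0 | 9.3 | 6.4 | 12.8 |
| 198 | 199 | 283.6 | 42.0 | 66.2 | 25.5 |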
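Before fitting, the budgets and sales have to be pulled out of the data frame as numeric arrays. The sketch below does this on the first five rows displayed above, typed in by hand so that it runs without network access. The names `df_head`, `features`, `X`, and `y` are illustrative, and the same two extraction lines should apply unchanged to the full `df`.

```python
import pandas as pd

# First five rows of the displayed output, entered manually for an offline example.
df_head = pd.DataFrame({
    "Unnamed: 0": [1, 2, 3, 4, 5],
    "TV": [230.1, 44.5, 17.2, 151.5, 180.8],
    "Radio": [37.8, 39.3, 45.9, 41.3, 10.8],
    "Newspaper": [69.2, 45.1, 69.3, 58.5, 58.4],
    "Sales": [22.1, 10.4, 9.3, 18.5, 12.9],
})

# The "Unnamed: 0" column is just a row counter from the CSV, so it is left out.
features = ["TV", "Radio", "Newspaper"]
X = df_head[features].to_numpy(dtype=float)   # shape (n_markets, 3), thousands of dollars
y = df_head["Sales"].to_numpy(dtype=float)    # shape (n_markets,), thousands of units

print(X.shape, y.shape)
```

Because budgets run into the hundreds while the components of $\mathbf{b}$ multiply them inside an exponential, rescaling the columns of `X` (for example dividing by 100) may make the optimization better conditioned; that is a modeling choice rather than something the dataset prescribes.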


## Tasks:

- Apply a quasi-Newton method (BFGS or DFP rank-2 updates) to the problem and derive the detailed steps.

- Implement the quasi-Newton method in Python (write your own code), fit it to the advertising data, and validate and interpret your results.
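As a possible starting point for the implementation task, here is a minimal sketch of BFGS with the standard inverse-Hessian update $H_{k+1} = (I - \rho_k s_k y_k^\top) H_k (I - \rho_k y_k s_k^\top) + \rho_k s_k s_k^\top$, where $\rho_k = 1/(y_k^\top s_k)$, combined with Armijo backtracking. Everything here is an assumption made for illustration: the function names, the tolerances, the first-step normalization, and the noiseless synthetic data generated from `theta_true`. The first trial step is normalized because $\nabla_{\mathbf{b}} F$ carries the factor $a\,e^{-\mathbf{x}^\top \mathbf{b}}$, so an overly long step can land where that factor is nearly zero and the iteration stalls on a flat region.

```python
import numpy as np

def objective_and_gradient(theta, X, y):
    """F(a, b) and its gradient for theta = [a, b_1, ..., b_p]."""
    a, b = theta[0], theta[1:]
    e = np.exp(-X @ b)
    r = a * (1.0 - e) - y
    grad = np.concatenate(([np.sum(r * (1.0 - e))], X.T @ (r * a * e)))
    return 0.5 * np.sum(r**2), grad

def bfgs(fun, theta0, args=(), tol=1e-6, max_iter=500, c1=1e-4):
    """Sketch of BFGS with Armijo backtracking; fun returns (value, gradient)."""
    theta = np.asarray(theta0, dtype=float)
    n = theta.size
    H = np.eye(n)                          # inverse-Hessian approximation
    F, g = fun(theta, *args)
    for k in range(max_iter):
        if np.linalg.norm(g) < tol:
            break
        p = -H @ g                         # quasi-Newton search direction
        slope = g @ p
        if slope >= 0:                     # safeguard: fall back to steepest descent
            H, p = np.eye(n), -g
            slope = g @ p
        # Normalize the very first trial step (assumption, see lead-in).
        t = min(1.0, 1.0 / np.linalg.norm(p)) if k == 0 else 1.0
        for _ in range(60):                # backtracking on the Armijo condition
            with np.errstate(over="ignore", invalid="ignore"):
                F_new, g_new = fun(theta + t * p, *args)
            if F_new <= F + c1 * t * slope:
                break
            t *= 0.5
        else:
            break                          # no acceptable step found
        s, yk = t * p, g_new - g
        sy = s @ yk
        if sy > 1e-10 * np.linalg.norm(s) * np.linalg.norm(yk):   # curvature condition
            if k == 0:
                H = (sy / (yk @ yk)) * np.eye(n)   # common initial scaling heuristic
            rho = 1.0 / sy
            V = np.eye(n) - rho * np.outer(s, yk)
            H = V @ H @ V.T + rho * np.outer(s, s)
        theta, F, g = theta + s, F_new, g_new
    return theta, F

# Noiseless synthetic data from made-up parameters, used only to exercise the code.
rng = np.random.default_rng(0)
X_syn = rng.uniform(0.0, 3.0, size=(50, 3))
theta_true = np.array([20.0, 0.5, 0.3, 0.1])
y_syn = theta_true[0] * (1.0 - np.exp(-X_syn @ theta_true[1:]))

theta0 = np.array([15.0, 0.2, 0.2, 0.2])
F0, _ = objective_and_gradient(theta0, X_syn, y_syn)
theta_hat, F_hat = bfgs(objective_and_gradient, theta0, args=(X_syn, y_syn))
print("initial F:", F0, "final F:", F_hat)
print("estimate:", theta_hat, "generating values:", theta_true)
```

Whether the iterates reach the generating parameters depends on the starting point and the scaling of the inputs, so comparing `theta_hat` with `theta_true` on synthetic data is a useful validation step before moving to the real dataset. The update is skipped whenever $y_k^\top s_k$ is not safely positive, which keeps $H_k$ positive definite even though Armijo backtracking alone does not enforce the Wolfe curvature condition.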