Multiple regression refers to a model in which a dependent variable, Yi, is a linear function of a set of p independent variables, X1, X2, ..., Xp. The population model is written as:

$$Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \cdots + \beta_p X_{pi} + e_i$$

The Y variable is sometimes called the response variable, while the X's are referred to as the predictor variables.
The sample regression line is written as:

$$\hat{Y}_i = b_0 + b_1 X_{1i} + b_2 X_{2i} + \cdots + b_p X_{pi}$$

The goal of multiple regression is to develop a linear model that can be used to explain variation in or to predict values of the Y variable. We assume that:
- The data, {Yi,X1i,X2i,...,Xpi}, i=1,...,N is a random sample from the population
- The residual e has a mean of zero
- The residual e has a normal distribution with a variance, σ², that is constant
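
As a rough illustration of these assumptions, the sketch below draws a random sample from a population model with normal, mean-zero, constant-variance errors. The function name `simulate_sample`, the coefficient values, and the uniform distribution used for the predictors are all assumptions made for this example, not part of the model itself.

```python
import numpy as np

def simulate_sample(n, beta, sigma, seed=0):
    """Draw (X, y) from Y_i = beta_0 + beta_1*X_1i + ... + beta_p*X_pi + e_i."""
    rng = np.random.default_rng(seed)
    beta = np.asarray(beta, dtype=float)
    p = len(beta) - 1
    # Predictor values are drawn uniformly on [0, 10]; this choice is arbitrary.
    X = rng.uniform(0.0, 10.0, size=(n, p))
    # Errors are independent, normal, mean zero, with constant variance sigma**2.
    e = rng.normal(loc=0.0, scale=sigma, size=n)
    y = beta[0] + X @ beta[1:] + e
    return X, y

# Hypothetical population: beta_0 = 1, beta_1 = 2, beta_2 = -0.5, sigma = 1.5.
X_sim, y_sim = simulate_sample(n=200, beta=[1.0, 2.0, -0.5], sigma=1.5)
```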
Least squares estimation
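We can write the set of X variables in matrix form, with a leading column of ones for the intercept:

$$X = \begin{bmatrix} 1 & X_{11} & X_{21} & \cdots & X_{p1} \\ 1 & X_{12} & X_{22} & \cdots & X_{p2} \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & X_{1N} & X_{2N} & \cdots & X_{pN} \end{bmatrix}$$

The X matrix is N x (p+1). Then, with Y denoting the N x 1 vector of observations on the dependent variable and b the (p+1) x 1 vector of coefficient estimates, the least squares estimates of the coefficients are given by:

$$b = (X'X)^{-1} X'Y$$
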
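The least squares formula can be sketched in a few lines of NumPy. This is a minimal illustration, not a production routine: the function name `ols_coefficients` and the data are made up, and the code solves the normal equations (X'X)b = X'Y directly rather than forming the inverse explicitly.

```python
import numpy as np

def ols_coefficients(X_raw, y):
    """Return the least squares estimates b = (X'X)^{-1} X'y, intercept first."""
    X_raw = np.asarray(X_raw, dtype=float)
    y = np.asarray(y, dtype=float)
    # Prepend a column of ones so the first coefficient is the intercept b0.
    X = np.column_stack([np.ones(len(y)), X_raw])
    # Solve the normal equations (X'X) b = X'y instead of inverting X'X.
    return np.linalg.solve(X.T @ X, X.T @ y)

# Made-up data generated exactly from y = 1 + 2*x1 + 3*x2 (no noise),
# so the estimates should recover the coefficients 1, 2 and 3.
X_demo = np.array([[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0], [5.0, 6.0]])
y_demo = 1.0 + 2.0 * X_demo[:, 0] + 3.0 * X_demo[:, 1]
b_demo = ols_coefficients(X_demo, y_demo)
```

When the columns of X are nearly collinear, a routine such as `np.linalg.lstsq` is usually a safer choice than solving the normal equations.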
ANOVA
The analysis of variance (ANOVA) table summarizes several quantities associated with the regression equation:
| Source | DF | SS | MS | F |
|---|---|---|---|---|
| Regression | p | SSR | MSR | MSR/MSE |
| Error | N-p-1 | SSE | MSE | |
| Total | N-1 | SST | | |
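Here MSR = SSR/p and MSE = SSE/(N-p-1). The sums of squares are defined as follows, where Ȳ is the sample mean of Y and Ŷi is the fitted value from the sample regression line:

$$\text{SST} = \text{Sum of Squares Total} = \sum_{i=1}^{N} (Y_i - \bar{Y})^2$$

$$\text{SSE} = \text{Sum of Squares Error} = \sum_{i=1}^{N} (Y_i - \hat{Y}_i)^2$$

$$\text{SSR} = \text{Sum of Squares Regression} = \sum_{i=1}^{N} (\hat{Y}_i - \bar{Y})^2$$
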
Note that SST = SSE + SSR and DFT = DFE + DFR.
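
The entries of the ANOVA table can be computed directly from the fitted values. The sketch below is one way to do so with NumPy; the function name `anova_quantities`, the dictionary keys, and the six observations are made up for illustration.

```python
import numpy as np

def anova_quantities(X_raw, y):
    """Compute the ANOVA table entries for a regression with an intercept."""
    X_raw = np.asarray(X_raw, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X_raw.shape
    X = np.column_stack([np.ones(n), X_raw])   # N x (p+1) design matrix
    b, *_ = np.linalg.lstsq(X, y, rcond=None)  # least squares estimates
    y_hat = X @ b                              # fitted values
    y_bar = y.mean()
    sst = np.sum((y - y_bar) ** 2)             # total sum of squares
    sse = np.sum((y - y_hat) ** 2)             # error sum of squares
    ssr = np.sum((y_hat - y_bar) ** 2)         # regression sum of squares
    msr = ssr / p
    mse = sse / (n - p - 1)
    return {
        "SSR": ssr, "SSE": sse, "SST": sst,
        "DF_regression": p, "DF_error": n - p - 1, "DF_total": n - 1,
        "MSR": msr, "MSE": mse, "F": msr / mse,
    }

# Made-up data: six observations on two predictors.
X_demo = [[1, 2], [2, 1], [3, 4], [4, 3], [5, 6], [6, 5]]
y_demo = [8.5, 8.0, 19.4, 17.6, 29.1, 27.9]
anova_demo = anova_quantities(X_demo, y_demo)
```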
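The F-statistic, F = MSR/MSE, is used to test the null hypothesis that all of the population slope coefficients are equal to zero against the alternative that at least one of the population slope coefficients is not equal to zero:

$$H_0: \beta_1 = \beta_2 = \cdots = \beta_p = 0$$

$$H_1: \text{at least one } \beta_j \neq 0, \quad j = 1, \ldots, p$$

Under the null hypothesis and the assumptions listed above, this statistic has an F-distribution with p and N-p-1 degrees of freedom.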
If we reject the null hypothesis, we conclude that there is at least one significant slope coefficient. Another way to state this conclusion is that variation in at least one of the independent variables helps to explain variation in the dependent variable.
The R² of the regression, R² = SSR/SST, measures the proportion of the variation in Y that is explained by the X variables. Note that R² = 1 - SSE/SST.
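
As a small worked example, the F-statistic and R² follow from the sums of squares and the degrees of freedom alone. The function name and the numbers below (SSR = 80, SSE = 20, N = 23, p = 2) are made up for illustration.

```python
def f_statistic_and_r2(ssr, sse, n, p):
    """Return (F, R^2) from the regression and error sums of squares."""
    msr = ssr / p              # mean square regression
    mse = sse / (n - p - 1)    # mean square error
    f_stat = msr / mse
    r2 = ssr / (ssr + sse)     # uses SST = SSR + SSE
    return f_stat, r2

# MSR = 80/2 = 40 and MSE = 20/20 = 1, so F = 40; R^2 = 80/100 = 0.8.
f_demo, r2_demo = f_statistic_and_r2(ssr=80.0, sse=20.0, n=23, p=2)
```

The p-value for the test would come from comparing F with an F-distribution with p and N-p-1 degrees of freedom, which this sketch does not compute.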
Interpretation
The regression coefficient for independent variable Xji measures the effect of a one-unit change in Xji on the average Yi, holding the values of the other independent variables constant. For example, suppose we estimate a regression of the price of a new home (Price) on the square footage of the house (SqFt) and the number of rooms (Number):
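Price = $20,000 + $150 (SqFt) + $1,500 (Number)
Holding the number of rooms in a house constant, an increase in the size of the house by one square foot is associated with a $150 increase in the average price of a house. Likewise, holding square footage constant, one additional room is associated with a $1,500 increase in the average price.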
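
The interpretation can be checked numerically by evaluating the estimated equation at two points that differ in only one variable. The function name `predicted_price` and the house sizes used here are hypothetical.

```python
def predicted_price(sqft, rooms):
    """Average price implied by the estimated equation in the example."""
    return 20_000 + 150 * sqft + 1_500 * rooms

# Hypothetical house: 2,000 square feet and 7 rooms.
base = predicted_price(2000, 7)

# One more square foot with the number of rooms held constant.
effect_sqft = predicted_price(2001, 7) - base   # 150

# One more room with square footage held constant.
effect_room = predicted_price(2000, 8) - base   # 1500
```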
Significance
Testing the significance of the coefficients proceeds along the same lines as in simple regression. To test the significance of a regression coefficient, βj, the test statistic is equal to the coefficient estimate, bj, divided by the standard error of this coefficient, SEbj:

$$t = \frac{b_j}{SE_{b_j}}$$

Under the null hypothesis that βj = 0, this statistic has a t-distribution with N-p-1 degrees of freedom.
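
The t-statistics can be sketched with NumPy as follows. The text above does not say how SEbj is computed; this example assumes the standard formula, the square root of MSE times the j-th diagonal element of (X'X)^{-1}. The function name `coefficient_t_stats` and the data are made up for illustration.

```python
import numpy as np

def coefficient_t_stats(X_raw, y):
    """Return (b, se, t, df) for a regression with an intercept."""
    X_raw = np.asarray(X_raw, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X_raw.shape
    X = np.column_stack([np.ones(n), X_raw])
    xtx_inv = np.linalg.inv(X.T @ X)
    b = xtx_inv @ X.T @ y                 # least squares estimates
    resid = y - X @ b
    df = n - p - 1                        # error degrees of freedom
    mse = resid @ resid / df              # estimate of the error variance
    # Assumed standard formula: SE_bj = sqrt(MSE * [(X'X)^{-1}]_jj).
    se = np.sqrt(mse * np.diag(xtx_inv))
    return b, se, b / se, df

# Made-up data: six observations on two predictors.
X_demo = [[1, 2], [2, 1], [3, 4], [4, 3], [5, 6], [6, 5]]
y_demo = [8.5, 8.0, 19.4, 17.6, 29.1, 27.9]
b_demo, se_demo, t_demo, df_demo = coefficient_t_stats(X_demo, y_demo)
```

Each element of `t_demo` would then be compared with a critical value from the t-distribution with `df_demo` degrees of freedom.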
