Logistic

The logistic regression model predicts the probability that an event occurs. The response variable is the natural log of the odds of the event. The explanatory variables can be quantitative variables (variables with continuous measurements) or indicator variables.

More specifically, we have a variable Y that takes the value 1 if an event occurs and 0 otherwise. For example, Y might be an indicator variable that tells us whether a particular customer purchased a suit: Y = 1 if the customer purchased a suit and Y = 0 if the customer did not. We also have data on each customer's income and gender.

Probability distributions

We have a data set of N observations on the purchase decision (Y), income (X1) and gender (X2). Each observation of Y has a Bernoulli distribution (a binomial distribution with a single trial) with success probability p. The number of successes in N independent observations has a binomial distribution. For large N, the sample mean of Y is approximately normally distributed with mean p.
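The distributional claims above can be checked with a small simulation. The following is a minimal sketch, not part of the original analysis: the function names, the sample size of 10,000, the probability 0.3 and the seed are all assumptions chosen for illustration. It draws 0/1 outcomes, counts the successes, and computes the sample mean with its usual standard error sqrt(p(1-p)/N).

```python
import math
import random


def simulate_purchases(n, p, seed=0):
    """Draw n independent 0/1 outcomes, each equal to 1 with probability p."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [1 if rng.random() < p else 0 for _ in range(n)]


def sample_mean_and_se(y):
    """Return the sample proportion of 1s and its estimated standard error."""
    n = len(y)
    p_hat = sum(y) / n
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat, se


# Hypothetical values, chosen only for illustration.
y = simulate_purchases(n=10_000, p=0.3)
successes = sum(y)  # the count of successes is binomially distributed
p_hat, se = sample_mean_and_se(y)
print(f"successes = {successes}, sample mean = {p_hat:.4f}, standard error = {se:.4f}")
```

With a large N the sample mean should land within a few standard errors of the true p, which is what the normal approximation predicts.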

Logistic model

For technical reasons, the dependent variable in a logistic regression is the log of the odds of success (Y=1):

ln(p/(1-p)) = β0 + β1 X1 + β2 X2

where β0, β1 and β2 are the parameters to be estimated.

Note that the quantity p/(1-p) is also known as the odds of success. Of course, we can have more than two independent variables.

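We can use statistical software to estimate the parameters of the logistic regression model:

predicted ln(p/(1-p)) = b0 + b1 X1 + b2 X2

where b0, b1 and b2 denote the estimated values of the parameters.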

Substitution of values for X1 and X2 into this equation gives a predicted value for the log of the odds. To convert this to a prediction for the odds, calculate the exponential of the prediction for the log odds. Because the odds equal p/(1-p), solving for p gives p = odds/(1 + odds), which converts the predicted odds into a prediction for the probability. This is illustrated in the example.
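The conversion from log odds to odds to probability can be sketched in a few lines. This is an illustrative sketch only: the function name and the coefficient values b0, b1 and b2 are hypothetical and are not estimates from any data in this article.

```python
import math


def predict(b0, b1, b2, x1, x2):
    """Return (log odds, odds, probability) for given coefficients and inputs."""
    log_odds = b0 + b1 * x1 + b2 * x2  # linear predictor on the log-odds scale
    odds = math.exp(log_odds)          # exponentiate to get the odds
    probability = odds / (1.0 + odds)  # solve odds = p / (1 - p) for p
    return log_odds, odds, probability


# Hypothetical coefficients and inputs, chosen only to show the arithmetic:
# income x1 measured in thousands, gender x2 coded 1 = female, 0 = male.
log_odds, odds, probability = predict(b0=-1.0, b1=0.02, b2=0.5, x1=50, x2=1)
print(f"log odds = {log_odds:.3f}, odds = {odds:.3f}, probability = {probability:.3f}")
```

A log odds of 0 corresponds to odds of 1 and a probability of 0.5; positive log odds give probabilities above 0.5 and negative log odds give probabilities below it.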

Example

Consider the following simple example. We know whether each customer purchased a suit (Yt = 1 if purchased, Yt = 0 if not purchased) and the gender of the customer (Xt = 1 if female, Xt = 0 if male). We summarize the data in the following table:

                   Male   Female
Purchased            40       60
Did not purchase     25       50

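The proportion of men who purchased suits is 40/65 = 0.615. The odds that a man purchased a suit are 0.615/(1-0.615) = 40/25 = 1.60. (Rounding the proportion to 0.62 before dividing gives 1.63, which overstates the odds slightly.)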
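The same arithmetic can be carried through for both groups. The sketch below is an illustration rather than the original author's calculation; the variable names are assumptions. It relies on a standard fact: with a single indicator variable, the fitted logistic regression reproduces the group proportions, so the intercept equals the log odds for men (Xt = 0) and the slope equals the log of the odds ratio of women to men.

```python
import math

# Counts taken from the table above.
counts = {
    "male": {"purchased": 40, "not_purchased": 25},
    "female": {"purchased": 60, "not_purchased": 50},
}


def proportion(group):
    """Share of customers in the group who purchased a suit."""
    c = counts[group]
    return c["purchased"] / (c["purchased"] + c["not_purchased"])


def odds(group):
    """Odds of purchase: purchasers divided by non-purchasers."""
    c = counts[group]
    return c["purchased"] / c["not_purchased"]


odds_male = odds("male")              # 40 / 25
odds_female = odds("female")          # 60 / 50
odds_ratio = odds_female / odds_male  # women relative to men

# Implied coefficients for log odds = b0 + b1 * Xt, with Xt = 1 for women.
b0 = math.log(odds_male)
b1 = math.log(odds_ratio)

# Converting the predicted log odds for women back to a probability
# recovers the observed proportion of women who purchased.
p_female = math.exp(b0 + b1) / (1 + math.exp(b0 + b1))

print(f"odds (men) = {odds_male:.2f}, odds (women) = {odds_female:.2f}")
print(f"odds ratio = {odds_ratio:.2f}, b0 = {b0:.3f}, b1 = {b1:.3f}")
print(f"predicted p (women) = {p_female:.3f}")
```

The negative slope reflects that, in this table, the odds of purchase are lower for women than for men.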
