Everything You Need To Know About Logistic Regression — Part 1
Logistic regression is used to estimate the probability of occurrence of an event based on multiple factors.
Need for Logistic Regression:
Linear regression is used to predict the values of a dependent variable from the independent variables, so we can explain the impact of a change in the independent variables on the dependent variable.
Problem : In some cases the dependent variable is binary [0 & 1], limited to just 2 classes, whereas the independent variables may take any number of values. Linear regression is designed to solve its problem by minimizing MSE, which is not a good fit for this problem.
Why Logistic Regression and not Logistic Classification ?
Logistic Regression returns a continuous value as output for the input values, which is later converted into 0 or 1 based on a threshold value.
Prerequisite Terminology for Logistic Regression:
Probability :
It is the ratio of the favourable outcomes to the total outcomes and returns the chances of occurrence of the favourable outcomes out of 1.
Probability = Favourable Outcomes / Total Outcomes
Probabilities range only from 0 to 1, but a linear model's predictions can fall outside that range.
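As a quick sketch of the probability formula above (the die is just an illustrative example):

```python
# Probability = favourable outcomes / total outcomes.
# Example: probability of rolling an even number on a fair six-sided die.
favourable = 3          # outcomes 2, 4, 6
total = 6
probability = favourable / total
print(probability)      # 0.5
```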
Odds :
It is the ratio of probability of occurrence of an event to the probability of non occurrence of an event.
Odds = p(occurrence) / (1 - p(occurrence))
The range of odds is awkward, as it goes from 0 to infinity: any value below 1 means odds against the event, while values from 1 towards infinity mean better and better odds in favour of it.
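A small sketch of the conversion from probability to odds:

```python
# Odds = p / (1 - p): how likely the event is relative to it not happening.
def odds(p: float) -> float:
    return p / (1 - p)

print(odds(0.5))    # 1.0 -> "even" odds
print(odds(0.8))    # roughly 4-to-1 in favour
print(odds(0.2))    # roughly 1-to-4 against
```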
Log Odds :
Log odds is the natural log of the odds: log odds = ln(p / (1 - p)). It solves the problem of fitting a linear model to probabilities. A related quantity is the odds ratio, the ratio of two odds being compared:
Odds Ratio = Odds Y / Odds X
Example -
If OR > 1, i.e. O(Y) > O(X) → group Y has better odds than group X.
If OR < 1, i.e. O(Y) < O(X) → group X has better odds than group Y.
If OR = 1, i.e. O(Y) = O(X) → both are the same.
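The comparison above can be sketched with hypothetical group probabilities (the 0.8 and 0.5 values are made up for illustration):

```python
# Odds ratio between two groups: OR = odds(Y) / odds(X).
def odds(p: float) -> float:
    return p / (1 - p)

p_y = 0.8   # hypothetical probability of the event in group Y
p_x = 0.5   # hypothetical probability of the event in group X

OR = odds(p_y) / odds(p_x)   # OR > 1 here, so group Y has the better odds
print(OR)
```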
Why we use Log odds instead of Probability or Odds?
The problem with using probability is that it does not represent a constant effect of X (the independent variables), so we move to odds; but the range of odds is asymmetric, with 0 to 1 covering odds against and 1 to infinity covering odds in favour. Hence we use log odds, whose range is symmetric and linear and can easily represent the effect of X.
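The symmetry argument can be checked numerically: odds squeeze "against" into (0, 1) but stretch "in favour" over (1, infinity), while log odds are symmetric about 0.

```python
import math

# Log odds (the logit): ln(p / (1 - p)).
def log_odds(p: float) -> float:
    return math.log(p / (1 - p))

print(log_odds(0.5))                 # 0.0: fifty-fifty sits at the centre
print(log_odds(0.9), log_odds(0.1))  # same magnitude, opposite signs
```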
LOGISTIC REGRESSION :
It is used to analyze the relationship between a dichotomous dependent variable and categorical or numeric independent variables. Logistic Regression combines all the independent variables to estimate the probability that an event will occur.
Logistic Regression follows the Sigmoid function:
sigmoid(z) = 1 / (1 + e^(-z))
For a single feature:
z = b0 + b1 * x
For multiple features:
z = b0 + b1*x1 + b2*x2 + ... + bn*xn
which can also be written as:
z = W^T X + b
where W^T X is the weight vector multiplied by all the independent variables, added to the intercept b.
This is then passed to the sigmoid function, which gives the final prediction as a probability.
Then we set a threshold: data above it is classified as 1 and data below it as 0.
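The whole prediction pipeline can be sketched end to end; the weights, intercept, and feature values below are hypothetical, chosen only to show the mechanics.

```python
import math

def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))

w = [0.8, -0.4]   # hypothetical weights
b = 0.1           # hypothetical intercept
x = [2.0, 1.0]    # hypothetical feature values

z = sum(wi * xi for wi, xi in zip(w, x)) + b   # z = W^T X + b
p = sigmoid(z)                                 # probability between 0 and 1
label = 1 if p >= 0.5 else 0                   # apply the 0.5 threshold
```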
COST FUNCTION OF LOGISTIC REGRESSION
- It represents the error of a model in machine learning.
- It shows how well the model's predictions match the given dataset.
As the cost function decreases, accuracy increases.
Minimizing MSE works for linear regression, as the output values are continuous, but in logistic regression the output variable can only be 0 or 1.
If we plug the sigmoid into the MSE cost, the resulting function is non-convex: trying to minimize it, we can end up in a local minimum and never reach the global minimum.
Therefore we use the log loss (also called binary cross-entropy) to calculate the cost for logistic regression:
Cost(p, y) = -[y * log(p) + (1 - y) * log(1 - p)]
where y is the true label (0 or 1) and p is the predicted probability.
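Assuming the cost meant here is the standard log loss (binary cross-entropy), a minimal sketch for a single example:

```python
import math

# Log loss for one example: -[y*log(p) + (1-y)*log(1-p)].
def log_loss(y: int, p: float) -> float:
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

# A confident correct prediction costs little;
# a confident wrong prediction costs a lot.
print(log_loss(1, 0.9))   # small cost
print(log_loss(1, 0.1))   # large cost
```

This asymmetry is what drives the optimizer toward probabilities that match the true labels, and unlike MSE with a sigmoid, this cost is convex in the weights.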