Logistic Regression: Decision Boundary


In order to get our discrete 0 or 1 classification, we can translate the output of the hypothesis function as follows:

$$\begin{aligned} h_\theta(x) \geq 0.5 &\;\Rightarrow\; y = 1 \\ h_\theta(x) < 0.5 &\;\Rightarrow\; y = 0 \end{aligned}$$

Recall that for the logistic function $g(z) = \dfrac{1}{1 + e^{-z}}$, we have $g(z) \geq 0.5$ when $z \geq 0$.

Therefore, $h_\theta(x) = g(\theta^T x) \geq 0.5$ when $\theta^T x \geq 0$:

$$\begin{aligned} \theta^T x \geq 0 &\;\Rightarrow\; y = 1 \\ \theta^T x < 0 &\;\Rightarrow\; y = 0 \end{aligned}$$
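The rule above can be checked numerically. The following is a minimal sketch, assuming NumPy and a design matrix whose first column is the intercept term $x_0 = 1$; the function names, the parameter vector `theta`, and the sample rows are hypothetical and chosen only for illustration. It shows that thresholding $h_\theta(x)$ at 0.5 and thresholding $\theta^T x$ at 0 give the same labels.

```python
import numpy as np

def sigmoid(z):
    """Logistic function g(z) = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + np.exp(-z))

def predict_from_probability(theta, X):
    """Predict y = 1 where h_theta(x) = g(theta^T x) >= 0.5."""
    return (sigmoid(X @ theta) >= 0.5).astype(int)

def predict_from_score(theta, X):
    """Predict y = 1 where theta^T x >= 0, without evaluating the sigmoid."""
    return (X @ theta >= 0).astype(int)

# Hypothetical parameters; the first column of X is the intercept x_0 = 1.
theta = np.array([-3.0, 1.0, 1.0])
X = np.array([
    [1.0, 1.0, 1.0],  # theta^T x = -1
    [1.0, 2.0, 1.0],  # theta^T x =  0
    [1.0, 3.0, 2.0],  # theta^T x =  2
])

labels_prob = predict_from_probability(theta, X)
labels_score = predict_from_score(theta, X)
print(labels_prob, labels_score)
```

Because the two rules agree, a classifier only needs the sign of $\theta^T x$ to assign a label; the sigmoid matters when the probability itself is wanted.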

The decision boundary is the line that separates the region where y = 0 from the region where y = 1. It is created by our hypothesis function using the equation:

$$\begin{aligned} h_\theta(x) &= 0.5 \\ \text{or,} \quad \theta^T x &= 0 \end{aligned}$$
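As a worked illustration, take two features plus an intercept and assume the hypothetical parameter vector $\theta = \begin{bmatrix} 5 & -1 & 0 \end{bmatrix}^T$ (these values are not fitted to any data). Then:

$$\begin{aligned} y = 1 \quad &\text{if} \quad 5 + (-1)\,x_1 + 0 \cdot x_2 \geq 0 \\ &\Leftrightarrow \quad 5 - x_1 \geq 0 \\ &\Leftrightarrow \quad x_1 \leq 5 \end{aligned}$$

In this case the decision boundary is the vertical line $x_1 = 5$: everything to its left is classified as y = 1 and everything to its right as y = 0.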

The input $\theta^T x$ to the sigmoid function $g(z)$ doesn't need to be linear, and could be a function that describes a circle (e.g. $z = \theta_0 + \theta_1 x_1^2 + \theta_2 x_2^2$) or any other shape to fit our data.
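To make the circular case concrete, here is a small sketch assuming NumPy and the hypothetical parameters $\theta_0 = -1$, $\theta_1 = \theta_2 = 1$, so that $z = 0$ is the unit circle $x_1^2 + x_2^2 = 1$. The helper name `circle_score` and the sample points are illustrative only.

```python
import numpy as np

def circle_score(theta, x1, x2):
    """z = theta_0 + theta_1 * x1^2 + theta_2 * x2^2 (quadratic features)."""
    return theta[0] + theta[1] * x1**2 + theta[2] * x2**2

# Hypothetical parameters: z = -1 + x1^2 + x2^2, so z = 0 on the unit circle.
theta_circle = np.array([-1.0, 1.0, 1.0])

points = np.array([
    [0.0, 0.0],  # inside the circle,  z = -1
    [1.0, 0.0],  # on the boundary,    z =  0
    [1.0, 1.0],  # outside the circle, z =  1
    [0.5, 0.5],  # inside the circle,  z = -0.5
])

z = circle_score(theta_circle, points[:, 0], points[:, 1])
circle_labels = (z >= 0).astype(int)  # y = 1 on or outside the circle
print(circle_labels)
```

With these parameters, points inside the unit circle are labelled y = 0 and points on or outside it are labelled y = 1. If $\theta_1$ and $\theta_2$ are unequal (and of the same sign as each other), the boundary becomes an ellipse rather than a circle.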