Perceptron Algorithm
- Perceptron Algorithm
- Binary Classification Problem
- Perceptron Algorithm (Primal Form)
- Stop in Finite steps
- Perceptron Algorithm (Dual Form)
- Python Code
Perceptron Algorithm
- These are my study notes from the NTCU open course on Machine Learning; I record the key points here for my own reference.
Binary Classification Problem
Given a training dataset
$$S=\{(x_i,y_i)\mid x_i\in\mathbb{R}^n,\ y_i\in\{-1,1\},\ i=1,2,\dots,l\}$$

$$x_i\in A^+ \iff y_i=1, \qquad x_i\in A^- \iff y_i=-1$$

Main Goal
- Predict the class label of unseen new data
- Find a function $f:\mathbb{R}^n\rightarrow\mathbb{R}$ by learning from the data such that the label of a point $x$ is predicted by $\operatorname{sgn}(f(x))$
Here the simplest such function is linear:

$$f(x)=w^{T}x+b$$

Perceptron Algorithm (Primal Form)
An on-line and mistake-driven procedure:
Given a training set $S$, initialize $w_0=0$, $b_0=0$, $k=0$, and let $R=\max\limits_{1\leq i\leq l}\Vert x_i\Vert$.

Repeat
for i = 1 to l
if $y_i(\langle w_k,x_i\rangle+b_k)\leq 0$, then
$w_{k+1}=w_k+\eta y_i x_i,\quad b_{k+1}=b_k+\eta y_i R^2,\quad k=k+1$
end if
Until no mistakes made within the for loop
return: $k,(w_k,b_k)$
Stop in Finite steps
Theorem (Novikoff): Let $S$ be a non-trivial training set, and let

$$R=\max\limits_{1\leq i\leq l}\Vert x_i\Vert$$

Suppose that there exists a vector $w_{opt}$ such that $\Vert w_{opt}\Vert =1$ and

$$y_i(\langle w_{opt},x_i \rangle +b_{opt}) \geq r \quad \text{for} \quad 1\leq i \leq l$$

Then the number of mistakes made by the on-line perceptron algorithm on $S$ is at most $\left(\frac{2R}{r}\right)^2$.
Remark:
- The value of the learning rate $\eta$ is irrelevant: it does not affect the mistake bound.
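As a quick sanity check of the theorem, the sketch below runs the primal perceptron (with $\eta=1$ and the bias update $b \leftarrow b + y_i R^2$) on a small hand-made separable dataset and compares the mistake count against the bound. The dataset and the choice of $w_{opt}$ (with $b_{opt}=0$) are illustrative assumptions of mine, not from the course.

```python
import numpy as np

# Toy linearly separable dataset (assumed for illustration)
X = np.array([[1.0, 1.0], [2.0, 1.0], [-1.0, -1.0], [-2.0, -1.0]])
y = np.array([1, 1, -1, -1])

R = np.max(np.linalg.norm(X, axis=1))      # R = max ||x_i||

# Functional margin r of the unit normal w_opt = (1,1)/sqrt(2), b_opt = 0
w_opt = np.array([1.0, 1.0]) / np.sqrt(2)
r = np.min(y * (X @ w_opt))

bound = (2 * R / r) ** 2                   # Novikoff mistake bound

# Run the primal perceptron and count mistakes
w, b, k = np.zeros(2), 0.0, 0
while True:
    mistakes_this_pass = 0
    for i in range(len(X)):
        if y[i] * (w @ X[i] + b) <= 0:     # mistake on (x_i, y_i)
            w = w + y[i] * X[i]            # eta = 1
            b = b + y[i] * R ** 2
            k += 1
            mistakes_this_pass += 1
    if mistakes_this_pass == 0:
        break

print(k, bound)                            # mistakes vs. theoretical bound
assert k <= bound
```

On this toy set the algorithm stops after very few updates, well inside the bound, as the theorem guarantees for any linearly separable data.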
Perceptron Algorithm(Dual Form)
w_i=\sum\limits_{i=1}^l \alpha_i y_i x_iGiven a linearly seperable training set S and \alpha=0, \alpha \in \mathbb{R}^l, b=0 and R=\max\limits_{1\leq i \leq l }\Vert x_i \Vert
y_i(\langle w_{k},x_i \rangle +b_{k})=y_i\left(\sum\limits_{i=1}^l \alpha_i y_i \langle w_i,x_i \rangle +b_k \right)Repeat
for i= 1 to l
if y_i\left(\sum\limits_{i=1}^l \alpha_i y_i \langle w_i,x_i \rangle +b_k \right) \leq 0, then
end if
Until no mistakes made within the for loop
Return :(\alpha, b)
Remark:
- The number of updates satisfies
$$\sum\limits_{i=1}^l \alpha_i=\Vert \alpha \Vert_1 \leq \left( \frac{2R}{r} \right)^2$$
where $\alpha_i >0$ means that $(x_i,y_i)$ has been misclassified at least once.
- Training data appear in the algorithm only through the entries of the Gram matrix $G_{ij}=\langle x_i,x_j \rangle$.
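A minimal sketch of the dual form, on a toy dataset of my own (the data and variable names are illustrative assumptions): the Gram matrix is computed once up front, and every decision afterwards touches only its entries, never the raw points.

```python
import numpy as np

# Toy linearly separable dataset (assumed for illustration)
X = np.array([[1.0, 1.0], [2.0, 1.0], [-1.0, -1.0], [-2.0, -1.0]])
y = np.array([1, 1, -1, -1])

l = len(X)
G = X @ X.T                                 # Gram matrix: G[i, j] = <x_i, x_j>
R = np.max(np.linalg.norm(X, axis=1))

alpha = np.zeros(l)
b = 0.0

while True:
    mistakes = 0
    for i in range(l):
        # decision value uses only Gram-matrix entries
        if y[i] * (np.sum(alpha * y * G[:, i]) + b) <= 0:
            alpha[i] += 1                   # dual update (eta = 1)
            b += y[i] * R ** 2
            mistakes += 1
    if mistakes == 0:
        break

print(alpha, b)
```

Because only $G$ is needed, replacing $\langle x_j,x_i\rangle$ by a kernel $K(x_j,x_i)$ gives a nonlinear classifier with no other change to the loop.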
Python Code
The following code is adapted from a website.
- Import data
```python
from sklearn import datasets
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.lines as mlines

np.random.seed(10)

# Load data: first two features of the first two iris classes
iris = datasets.load_iris()
X = iris.data[0:99, :2]
y = iris.target[0:99]
```
- Plot figure
```python
# Plot figure
plt.plot(X[:50, 0], X[:50, 1], 'o', color='blue', label='0')
plt.plot(X[50:99, 0], X[50:99, 1], 'o', color='orange', label='1')
plt.xlabel('sepal length')
plt.ylabel('sepal width')
plt.legend()
```
```python
# Update y into -1 and 1
y = np.array([1 if i == 1 else -1 for i in y])
```
- Perceptron Algorithm (Primal Form)
```python
#########
# Perceptron updates (primal form)
#########

# Initialize parameters
w = np.ones((X.shape[1], 1))
b = 1.0
learning_rate = 0.1
Round = 0
All_Correct = False

# Loop until a full pass over the data makes no mistakes
while not All_Correct:
    misclassified_count = 0
    for i in range(X.shape[0]):
        XX = X[i, :]
        yy = y[i]
        if yy * (np.dot(w.T, XX.T) + b) <= 0:   # mistake: wrong side or on the boundary
            w += learning_rate * (yy * XX).reshape(2, 1)
            b += learning_rate * yy
            misclassified_count += 1
    All_Correct = (misclassified_count == 0)
    Round += 1

print(Round)
print(w)
print(b)
```
- Plot the resulting hyperplane
```python
# Separating line: w[0]*x + w[1]*y + b = 0
x_points = np.linspace(4, 7, 10)
y_ = -(w[0] * x_points + b) / w[1]
plt.plot(x_points, y_)
plt.plot(X[:50, 0], X[:50, 1], 'o', color='blue', label='0')
plt.plot(X[50:99, 0], X[50:99, 1], 'o', color='orange', label='1')
plt.xlabel('sepal length')
plt.ylabel('sepal width')
plt.legend()
```