I'm trying to put together a Python implementation of a single-layer Perceptron classifier. I've found the example in Sebastian Raschka's book 'Python Machine Learning' very useful, but I have a question about one small part of his implementation. This is the code:
import numpy as np


class Perceptron(object):
    """Perceptron classifier.

    Parameters
    ------------
    eta : float
        Learning rate (between 0.0 and 1.0)
    n_iter : int
        Passes over the training dataset.

    Attributes
    -----------
    w_ : 1d-array
        Weights after fitting.
    errors_ : list
        Number of misclassifications in every epoch.
    """

    def __init__(self, eta=0.01, n_iter=10):
        self.eta = eta
        self.n_iter = n_iter

    def fit(self, X, y):
        """Fit training data.

        Parameters
        ----------
        X : {array-like}, shape = [n_samples, n_features]
            Training vectors, where n_samples
            is the number of samples and
            n_features is the number of features.
        y : array-like, shape = [n_samples]
            Target values.

        Returns
        -------
        self : object
        """
        self.w_ = np.zeros(1 + X.shape[1])
        self.errors_ = []
        for _ in range(self.n_iter):
            errors = 0
            for xi, target in zip(X, y):
                update = self.eta * (target - self.predict(xi))
                self.w_[1:] += update * xi
                self.w_[0] += update
                errors += int(update != 0.0)
            self.errors_.append(errors)
        return self

    def net_input(self, X):
        """Calculate net input"""
        return np.dot(X, self.w_[1:]) + self.w_[0]

    def predict(self, X):
        """Return class label after unit step"""
        return np.where(self.net_input(X) >= 0.0, 1, -1)
The part I can't get my head around is why we define net_input() and predict() to take an array X rather than just a vector. Everything works out, since we're only passing the vector xi to predict() in the fit() function (and so therefore also only passing a vector to net_input()), but what is the logic behind defining the functions to take an array? If I understand the model correctly, we are only taking one sample at a time, calculating the dot product of the weights vector and the feature vector associated with the sample, and we never need to pass an entire array to net_input() or predict().
Your concern seems to be why X in net_input and predict is defined as an array rather than a vector (I'm assuming your definitions are what I mentioned in the comment above, although I would say there is no real distinction in this context). What gives you the impression that X is an 'array' as opposed to a 'vector'?
The type here is determined by what you pass to the function, so if you pass it a vector, X is a vector (Python uses what's called duck typing). So to answer the question 'why are net_input and predict defined to take an array as opposed to a vector?': they're not. They are simply defined to take a parameter X, which has whatever type you pass in. Because np.dot and np.where accept both, the same two functions work for a single sample (a 1-D array) and for a whole set of samples (a 2-D array).
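As a rough illustration of that point, here is a minimal sketch in which the net_input and predict expressions from the question are called once with a single sample and once with several samples. The weight values and the sample data are made up for the example; only the two expressions come from the original code.

```python
import numpy as np

# Hypothetical weights for illustration: w[0] is the bias, w[1:] are the feature weights.
w = np.array([0.5, 1.0, -2.0])

def net_input(X):
    # Same expression as Perceptron.net_input in the question.
    return np.dot(X, w[1:]) + w[0]

def predict(X):
    # Same expression as Perceptron.predict in the question.
    return np.where(net_input(X) >= 0.0, 1, -1)

xi = np.array([1.0, 1.0])        # one sample, shape (2,)
X = np.array([[1.0, 1.0],
              [3.0, 0.0],
              [0.0, 2.0]])       # three samples, shape (3, 2)

single = predict(xi)   # a single label (0-d array)
batch = predict(X)     # one label per row, shape (3,)

print(single, batch)
```

Nothing in the function bodies had to change between the two calls; X simply is whatever was passed in.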
Maybe you are confused by his reuse of the variable name X: it is a 2-D array of training data in fit but a single sample in the other functions. They share a name, but they are distinct from each other because they live in different scopes.
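To make the scoping point concrete, here is a small sketch with two stripped-down, hypothetical functions (not the book's code) that both name their parameter X. Each one only reports the shape of what it received.

```python
import numpy as np

def predict(X):
    # Inside predict, X is whatever the caller passed; below, that is one row.
    return X.shape

def fit(X):
    # Inside fit, X is the full 2-D training array.
    seen = [predict(xi) for xi in X]   # each xi is a 1-D row of fit's X
    return X.shape, seen

outer, inner = fit(np.zeros((3, 2)))
print(outer)   # shape seen by fit
print(inner)   # shapes seen by predict, one per row
```

The X in fit is the (3, 2) array, while the X in predict is a (2,) row each time it is called, even though both parameters have the same name.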