Understanding the mechanics of machine learning models by building them from scratch is an invaluable exercise for any aspiring data scientist or machine learning engineer. This blog post guides you through creating a Multi-Layer Perceptron (MLP) using Python and NumPy, inspired by the Machine-Learning-from-Scratch repository.
What is a Multi-Layer Perceptron?
An MLP is a type of neural network consisting of at least three layers: an input layer, one or more hidden layers, and an output layer. Each neuron in one layer is connected to every neuron in the next layer. The MLP learns by adjusting the weights of these connections to minimize the error of its predictions, using a process called backpropagation.
Key Components of an MLP
1. Layers and Nodes:
- Input Layer: Receives input data.
- Hidden Layers: Perform intermediate computations.
- Output Layer: Produces the final prediction.
2. Activation Functions:
- Sigmoid: Squashes values into the range (0, 1); used here for the hidden layer.
- ReLU: Passes positive inputs through unchanged and zeroes out negatives; a common alternative for hidden layers.
- Softmax: Converts the output layer's raw scores into a probability distribution over classes.
3. Loss Function:
- Cross-Entropy Loss: Measures how far the predicted class probabilities are from the true labels (a short NumPy sketch of this loss and ReLU follows this list).
4. Optimization:
- Gradient Descent: Updates weights to minimize the loss function.
- Backpropagation: Computes gradients of the loss function with respect to each weight.
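Sigmoid and softmax are implemented as methods of the MLP class in Step 1 below, so the quick sketch here only covers ReLU and the cross-entropy loss. These standalone helpers are for illustration and are not used by the class that follows.

```python
import numpy as np

def relu(x):
    # ReLU: pass positive values through, zero out negatives.
    return np.maximum(0, x)

def cross_entropy(y_true, y_pred, eps=1e-9):
    # Average negative log-likelihood of the true classes.
    # y_true is one-hot encoded; y_pred holds predicted probabilities.
    # The small eps guards against log(0).
    return -np.sum(y_true * np.log(y_pred + eps)) / y_true.shape[0]
```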
Step-by-Step Implementation
Step 1: Initialize Network Parameters
```python
import numpy as np

class MLP:
    def __init__(self, input_size, hidden_size, output_size):
        # Randomly initialize weights; biases start at zero.
        self.weights_input_hidden = np.random.randn(input_size, hidden_size)
        self.weights_hidden_output = np.random.randn(hidden_size, output_size)
        self.bias_hidden = np.zeros((1, hidden_size))
        self.bias_output = np.zeros((1, output_size))

    def sigmoid(self, x):
        # Squash values into (0, 1) for the hidden layer activations.
        return 1 / (1 + np.exp(-x))

    def softmax(self, x):
        # Subtract the row-wise max for numerical stability, then normalize
        # so each row forms a probability distribution.
        exp_x = np.exp(x - np.max(x, axis=1, keepdims=True))
        return exp_x / exp_x.sum(axis=1, keepdims=True)
```
Explanation:
- Initialize weights and biases: Weights are drawn from a standard normal distribution and biases start at zero. The two weight matrices have shapes (input_size, hidden_size) and (hidden_size, output_size), so every neuron in one layer connects to every neuron in the next.
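With the constructor in place, creating a network is a single call. The layer sizes here are placeholders for illustration; in practice they should match your data. (One optional refinement: the raw np.random.randn weights are often scaled down, for example multiplied by 0.01, to keep early activations from saturating the sigmoid, but the small example below works without it.)

```python
# Hypothetical sizes: 4 input features, 16 hidden units, 3 output classes.
mlp = MLP(input_size=4, hidden_size=16, output_size=3)
```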
Step 2: Forward Propagation
```python
    def forward(self, X):
        # Hidden layer: linear transform followed by sigmoid activation.
        self.hidden_input = np.dot(X, self.weights_input_hidden) + self.bias_hidden
        self.hidden_output = self.sigmoid(self.hidden_input)
        # Output layer: linear transform followed by softmax to get class probabilities.
        self.final_input = np.dot(self.hidden_output, self.weights_hidden_output) + self.bias_output
        self.final_output = self.softmax(self.final_input)
        return self.final_output
```
Explanation:
- Compute the layer activations: Each layer applies a linear transformation (matrix multiplication plus bias) followed by its activation function, sigmoid for the hidden layer and softmax for the output layer. The intermediate results are cached on self because the backward pass needs them.
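A quick sanity check of the forward pass (assuming the illustrative mlp instance from Step 1) is to push a small random batch through it and confirm the output shape and that each row sums to one:

```python
# Five random samples with 4 features each, matching the illustrative sizes above.
X_demo = np.random.randn(5, 4)
probs = mlp.forward(X_demo)
print(probs.shape)        # (5, 3): one probability per class for each sample
print(probs.sum(axis=1))  # every row sums to 1 because of the softmax
```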
Step 3: Backward Propagation
```python
    def backward(self, X, y, output, learning_rate):
        # Gradient of the cross-entropy loss w.r.t. the softmax input simplifies to (output - y).
        output_error = output - y
        # Propagate the error back through the hidden layer (sigmoid derivative is s * (1 - s)).
        hidden_error = np.dot(output_error, self.weights_hidden_output.T) * self.hidden_output * (1 - self.hidden_output)
        # Gradient descent updates for weights and biases.
        self.weights_hidden_output -= learning_rate * np.dot(self.hidden_output.T, output_error)
        self.bias_output -= learning_rate * np.sum(output_error, axis=0, keepdims=True)
        self.weights_input_hidden -= learning_rate * np.dot(X.T, hidden_error)
        self.bias_hidden -= learning_rate * np.sum(hidden_error, axis=0, keepdims=True)
```
Explanation:
- Compute errors and update parameters: The output error (output - y) is the combined derivative of the softmax and cross-entropy loss. It is propagated back to the hidden layer with the chain rule (multiplying by the sigmoid derivative), and each weight and bias is nudged in the direction that reduces the loss, scaled by the learning rate.
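If you want to convince yourself the analytic gradients are right, a finite-difference check is a handy debugging sketch (loss_fn and numerical_gradient below are helper names introduced here, not part of the class). Note that the backward pass above sums gradients over the batch rather than averaging, so its update direction corresponds to batch_size times the derivative of the averaged loss; the learning rate simply absorbs that constant.

```python
def loss_fn(mlp, X, y, eps=1e-9):
    # Average cross-entropy loss, the same quantity printed during training.
    out = mlp.forward(X)
    return -np.sum(y * np.log(out + eps)) / X.shape[0]

def numerical_gradient(mlp, X, y, i=0, j=0, eps=1e-5):
    # Central-difference estimate of dLoss/dW for one entry of weights_input_hidden.
    original = mlp.weights_input_hidden[i, j]
    mlp.weights_input_hidden[i, j] = original + eps
    loss_plus = loss_fn(mlp, X, y)
    mlp.weights_input_hidden[i, j] = original - eps
    loss_minus = loss_fn(mlp, X, y)
    mlp.weights_input_hidden[i, j] = original  # restore the weight
    return (loss_plus - loss_minus) / (2 * eps)
```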
Step 4: Training the Model
```python
    def train(self, X, y, epochs, learning_rate):
        for epoch in range(epochs):
            # One full pass: forward to get predictions, backward to update parameters.
            output = self.forward(X)
            self.backward(X, y, output, learning_rate)
            if (epoch + 1) % 100 == 0:
                # Average cross-entropy loss (the small constant avoids log(0)).
                loss = -np.sum(y * np.log(output + 1e-9)) / X.shape[0]
                print(f'Epoch {epoch + 1}, Loss: {loss:.4f}')
```
Explanation:
- Train the network: Run forward and backward propagation for the specified number of epochs, printing the average cross-entropy loss every 100 epochs to monitor convergence. Note that y must be one-hot encoded so it lines up with the softmax output (see the usage sketch below).
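As a usage sketch (the dataset and layer sizes are made up for illustration), the integer class labels are one-hot encoded before calling train:

```python
# Tiny synthetic dataset: 150 samples, 4 features, 3 classes.
# Random data won't produce a meaningful model; it only demonstrates the call signature.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(150, 4))
labels = rng.integers(0, 3, size=150)

# One-hot encode the integer labels to match the softmax output.
y_train = np.zeros((150, 3))
y_train[np.arange(150), labels] = 1

mlp = MLP(input_size=4, hidden_size=16, output_size=3)
mlp.train(X_train, y_train, epochs=1000, learning_rate=0.01)
```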
Step 5: Predicting
```python
    def predict(self, X):
        # Run a forward pass and pick the class with the highest probability.
        output = self.forward(X)
        return np.argmax(output, axis=1)
```
Explanation:
- Make predictions: Run a forward pass and return the index of the highest-probability class for each sample.
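Continuing the usage sketch from Step 4, predict returns integer class indices that can be compared directly against the original labels:

```python
predictions = mlp.predict(X_train)
accuracy = np.mean(predictions == labels)
print(f'Training accuracy: {accuracy:.2%}')
```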
Conclusion
Building an MLP from scratch using NumPy helps demystify neural networks and provides a deeper understanding of their mechanics. This exercise not only enhances your appreciation for the complexity of machine learning models but also equips you with practical skills for developing and troubleshooting them. For more details and the full implementation, check out the Machine-Learning-from-Scratch repository. Happy coding!