Building a Multi-Layer Perceptron from Scratch with NumPy

Kassem
4 min read · Jul 10, 2024



Understanding the mechanics of machine learning models by building them from scratch is an invaluable exercise for any aspiring data scientist or machine learning engineer. This blog post guides you through creating a Multi-Layer Perceptron (MLP) using Python and NumPy, inspired by the Machine-Learning-from-Scratch repository.

Diagram of a Multi-Layer Perceptron (MLP) architecture.

What is a Multi-Layer Perceptron?

An MLP is a type of neural network consisting of at least three layers: an input layer, one or more hidden layers, and an output layer. Each neuron in one layer is connected to every neuron in the next layer. The MLP learns by adjusting the weights of these connections to minimize the error of its predictions, using a process called backpropagation.

Illustration of neural network layers.
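
To make the "every neuron is connected to every neuron" idea concrete, here is a minimal sketch (the layer sizes are made up purely for illustration) that counts the learnable parameters implied by that full connectivity:

layer_sizes = [3, 4, 2]  # input, hidden, output -- illustrative sizes only

n_params = 0
for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
    # each pair of adjacent layers is joined by a dense weight matrix,
    # plus one bias per neuron in the receiving layer
    n_params += n_in * n_out + n_out

print(n_params)  # (3*4 + 4) + (4*2 + 2) = 26 learnable parameters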

Key Components of MLP

  1. Layers and Nodes:
  • Input Layer: Receives the input data.
  • Hidden Layers: Perform intermediate computations.
  • Output Layer: Produces the final prediction.

2. Activation Functions (see the NumPy sketch after this list):

  • Sigmoid: 1 / (1 + e^(-x)), which squashes values into the range (0, 1).
  • ReLU: max(0, x), which passes positive values through and zeroes out negatives.
  • Softmax: Turns a vector of raw scores into a probability distribution that sums to 1.

3. Loss Function:

  • Cross-Entropy Loss: Measures how well the predicted class probabilities match the true labels.

4. Optimization:

  • Gradient Descent: Updates the weights in the direction that reduces the loss.
  • Backpropagation: Computes the gradients of the loss function with respect to each weight, layer by layer.
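
Before wiring these pieces into a class, here is a minimal standalone NumPy sketch of the activation functions and the cross-entropy loss listed above. The class in the steps below defines its own sigmoid and softmax methods, so this version is just for reference:

import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))    # squashes values into (0, 1)

def relu(x):
    return np.maximum(0, x)        # zeroes out negative values

def softmax(x):
    # subtract the per-row max for numerical stability
    exp_x = np.exp(x - np.max(x, axis=1, keepdims=True))
    return exp_x / exp_x.sum(axis=1, keepdims=True)

def cross_entropy(y_true, y_pred):
    # y_true: one-hot labels, y_pred: predicted probabilities
    return -np.sum(y_true * np.log(y_pred + 1e-12)) / y_true.shape[0]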

Step-by-Step Implementation

Step 1: Initialize Network Parameters

import numpy as np

class MLP:
    def __init__(self, input_size, hidden_size, output_size):
        # weights start from a standard normal distribution, biases at zero
        self.weights_input_hidden = np.random.randn(input_size, hidden_size)
        self.weights_hidden_output = np.random.randn(hidden_size, output_size)
        self.bias_hidden = np.zeros((1, hidden_size))
        self.bias_output = np.zeros((1, output_size))

    def sigmoid(self, x):
        return 1 / (1 + np.exp(-x))

    def softmax(self, x):
        # subtract the per-row max for numerical stability
        exp_x = np.exp(x - np.max(x, axis=1, keepdims=True))
        return exp_x / exp_x.sum(axis=1, keepdims=True)

Explanation:

  • Initialize weights and biases: The weights are drawn from a standard normal distribution and the biases start at zero; their shapes are determined by the layer sizes passed to the constructor.
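
As a quick sanity check, instantiating the class (the sizes below are arbitrary) shows the parameter shapes that the forward pass in the next step expects:

mlp = MLP(input_size=4, hidden_size=8, output_size=3)  # illustrative sizes
print(mlp.weights_input_hidden.shape)    # (4, 8)
print(mlp.weights_hidden_output.shape)   # (8, 3)
print(mlp.bias_hidden.shape)             # (1, 8)
print(mlp.bias_output.shape)             # (1, 3)

One common refinement, not used here, is to scale the random weights (for example by 0.01, or with Xavier initialization) so that the sigmoid activations do not start out saturated.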

Step 2: Forward Propagation

    def forward(self, X):
        # hidden layer: linear transformation followed by sigmoid
        self.hidden_input = np.dot(X, self.weights_input_hidden) + self.bias_hidden
        self.hidden_output = self.sigmoid(self.hidden_input)
        # output layer: linear transformation followed by softmax (class probabilities)
        self.final_input = np.dot(self.hidden_output, self.weights_hidden_output) + self.bias_output
        self.final_output = self.softmax(self.final_input)
        return self.final_output

Explanation:

  • Compute hidden and output activations: Each layer applies a linear transformation (weights plus bias) followed by its activation function, sigmoid for the hidden layer and softmax for the output layer, which turns the final scores into class probabilities.
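
A quick way to exercise the forward pass is to feed it some made-up data and confirm that each row of the output is a valid probability distribution (this continues the toy mlp instance from Step 1):

X = np.random.randn(5, 4)   # 5 samples, 4 features -- made-up data
probs = mlp.forward(X)
print(probs.shape)          # (5, 3): one probability per class for each sample
print(probs.sum(axis=1))    # each row sums to 1 (up to floating-point error)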

Step 3: Backward Propagation

    def backward(self, X, y, output, learning_rate):
        # gradient of softmax + cross-entropy with respect to the output logits
        output_error = output - y
        # backpropagate through the sigmoid: its derivative is sigmoid * (1 - sigmoid)
        hidden_error = np.dot(output_error, self.weights_hidden_output.T) * self.hidden_output * (1 - self.hidden_output)

        # gradient descent updates (gradients are summed over the batch)
        self.weights_hidden_output -= learning_rate * np.dot(self.hidden_output.T, output_error)
        self.bias_output -= learning_rate * np.sum(output_error, axis=0, keepdims=True)
        self.weights_input_hidden -= learning_rate * np.dot(X.T, hidden_error)
        self.bias_hidden -= learning_rate * np.sum(hidden_error, axis=0, keepdims=True)

Explanation:

  • Compute errors and update weights and biases: Because the output layer combines softmax with cross-entropy loss, the gradient at the output simplifies to output - y. That error is propagated back through the sigmoid (whose derivative is sigmoid * (1 - sigmoid)), and all weights and biases are then adjusted with gradient descent.
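
Because the updates are applied in place, an easy sanity check is to run a single backward step on toy data and confirm that the loss drops (continuing the mlp and X from the earlier sketches; the labels here are random and purely illustrative):

y = np.eye(3)[np.random.randint(0, 3, size=5)]  # random one-hot labels for 5 samples
out_before = mlp.forward(X)
loss_before = -np.sum(y * np.log(out_before + 1e-12)) / X.shape[0]
mlp.backward(X, y, out_before, learning_rate=0.01)
out_after = mlp.forward(X)
loss_after = -np.sum(y * np.log(out_after + 1e-12)) / X.shape[0]
print(loss_before, loss_after)  # the second number should be slightly smaller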

Step 4: Training the Model

    def train(self, X, y, epochs, learning_rate):
        for epoch in range(epochs):
            output = self.forward(X)
            self.backward(X, y, output, learning_rate)
            if (epoch + 1) % 100 == 0:
                # average cross-entropy loss; the small epsilon guards against log(0)
                loss = -np.sum(y * np.log(output + 1e-12)) / X.shape[0]
                print(f'Epoch {epoch+1}, Loss: {loss:.4f}')

Explanation:

  • Train the network: Perform forward and backward propagation for a specified number of epochs and print the loss periodically.
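
One thing to keep in mind is that train (and the loss above) expects y to be one-hot encoded. A small helper for integer class labels, added here for convenience and not part of the original class, could look like this:

def one_hot(labels, num_classes):
    # labels: integer class ids with shape (n_samples,)
    encoded = np.zeros((labels.size, num_classes))
    encoded[np.arange(labels.size), labels] = 1
    return encoded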

Step 5: Predicting

    def predict(self, X):
        output = self.forward(X)
        # return the index of the highest-probability class for each sample
        return np.argmax(output, axis=1)

Explanation:

  • Make predictions: Compute the output and return the predicted class labels.
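
Putting everything together, here is one possible end-to-end run on a synthetic two-class dataset. The data, network sizes, and hyperparameters are all made up for illustration and may need tuning, since the gradients are summed over the whole batch:

np.random.seed(0)

# two Gaussian blobs -> an easy binary classification problem
X0 = np.random.randn(50, 2) + np.array([2, 2])
X1 = np.random.randn(50, 2) + np.array([-2, -2])
X = np.vstack([X0, X1])
labels = np.array([0] * 50 + [1] * 50)
y = one_hot(labels, num_classes=2)   # helper from Step 4

mlp = MLP(input_size=2, hidden_size=8, output_size=2)
mlp.train(X, y, epochs=1000, learning_rate=0.01)

preds = mlp.predict(X)
print('Training accuracy:', np.mean(preds == labels))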

Conclusion

Building an MLP from scratch using NumPy helps demystify neural networks and provides a deeper understanding of their mechanics. This exercise not only enhances your appreciation for the complexity of machine learning models but also equips you with practical skills for developing and troubleshooting them.
