Code a Neural Network with Numpy

Introduction

Vectorized operations were covered in the previous post, Maths in a Neural Network: Vectorization. This post focuses on implementing those equations with numpy.

Equations derived in the previous posts [1] [2]:

Note that this network takes one input sample at a time; I'll discuss batch prediction/training in a later post.

Feed-forward:

Z^{(l)} = W^{(l)T} A^{(l-1)}

A^{(l)} = f^{(l)}(Z^{(l)})
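The feed-forward equations translate almost directly into numpy. Here is a minimal sketch for a 2-3-3-2 network; the sigmoid activation and the random initialization are my assumptions, not necessarily what the original notebook uses.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Assumed layout: W[l] has shape (units in layer l-1, units in layer l),
# so that Z^(l) = W^(l)T A^(l-1) for column-vector activations.
rng = np.random.default_rng(0)
W = [rng.standard_normal(s) for s in [(2, 3), (3, 3), (3, 2)]]

# One sample as a column vector; A[0] is the input layer.
x = np.array([[1.0], [0.0]])
A, Z = [x], []
for Wl in W:
    Z.append(Wl.T @ A[-1])     # Z^(l) = W^(l)T A^(l-1)
    A.append(sigmoid(Z[-1]))   # A^(l) = f^(l)(Z^(l))
```

Keeping every `A` and `Z` around (rather than only the last activation) pays off during backpropagation, which needs them all.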

Weight update:

Output layer:

\delta^{(L)} = E'(A^{(L)}) \diamond f^{(L)'}(Z^{(L)}) \\ W^{(L)'} = W^{(L)} - \alpha A^{(L-1)} \delta^{(L)T}

Hidden layer:

\delta^{(l)} = f^{(l)'}(Z^{(l)}) \diamond (W^{(l+1)} \delta^{(l+1)}) \\ W^{(l)'} = W^{(l)} - \alpha A^{(l-1)} \delta^{(l)T}
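The update equations can be sketched as a single backward loop. This assumes a mean-square error E = ½‖A⁽ᴸ⁾ − y‖², so E'(A⁽ᴸ⁾) = A⁽ᴸ⁾ − y, and a sigmoid activation; the learning rate is also an assumption.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_prime(z):
    s = sigmoid(z)
    return s * (1.0 - s)

rng = np.random.default_rng(0)
W = [rng.standard_normal(s) for s in [(2, 3), (3, 3), (3, 2)]]

x = np.array([[1.0], [0.0]])
y = np.array([[0.0], [1.0]])

# Forward pass, keeping A and Z for backprop (A[0] is the input).
A, Z = [x], []
for Wl in W:
    Z.append(Wl.T @ A[-1])
    A.append(sigmoid(Z[-1]))

# Backward pass: with MSE, E'(A^(L)) = A^(L) - y.
alpha = 0.5                                   # learning rate (assumed)
delta = (A[-1] - y) * sigmoid_prime(Z[-1])    # output-layer delta
for l in range(len(W) - 1, -1, -1):
    grad = A[l] @ delta.T                     # A^(l-1) delta^T, same shape as W[l]
    if l > 0:
        # Hidden-layer delta, computed before W[l] is overwritten.
        delta = sigmoid_prime(Z[l - 1]) * (W[l] @ delta)
    W[l] = W[l] - alpha * grad
```

Note that the next layer's delta is computed before `W[l]` is updated, so each delta uses the weights that produced the forward pass.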

Notebook:

The following code implements a 2-3-3-2 network for the XOR problem.


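Since the embedded notebook no longer renders, here is a minimal stand-in training loop for the 2-3-3-2 XOR network, built directly from the equations above. The sigmoid activation, learning rate, random seed, epoch count, and one-hot target encoding are my assumptions; the original notebook may differ (for instance, it may include bias terms, which the equations above omit).

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_prime(z):
    s = sigmoid(z)
    return s * (1.0 - s)

# XOR dataset: column-vector inputs, two-class one-hot targets.
X = [np.array([[0.], [0.]]), np.array([[0.], [1.]]),
     np.array([[1.], [0.]]), np.array([[1.], [1.]])]
Y = [np.array([[1.], [0.]]), np.array([[0.], [1.]]),
     np.array([[0.], [1.]]), np.array([[1.], [0.]])]

rng = np.random.default_rng(42)
# Layer sizes 2-3-3-2; W[l] has shape (units in, units out), so Z = W.T @ A.
W = [rng.standard_normal(s) for s in [(2, 3), (3, 3), (3, 2)]]
alpha = 0.5  # learning rate (assumed)

mses = []
for epoch in range(2000):
    mse = 0.0
    for x, y in zip(X, Y):
        # Forward pass (A[0] is the input sample).
        A, Z = [x], []
        for Wl in W:
            Z.append(Wl.T @ A[-1])
            A.append(sigmoid(Z[-1]))
        mse += float(np.mean((A[-1] - y) ** 2))
        # Backward pass and per-sample weight update.
        delta = (A[-1] - y) * sigmoid_prime(Z[-1])
        for l in range(len(W) - 1, -1, -1):
            grad = A[l] @ delta.T
            if l > 0:
                delta = sigmoid_prime(Z[l - 1]) * (W[l] @ delta)
            W[l] = W[l] - alpha * grad
    mses.append(mse / len(X))
```

Tracking the per-epoch mean squared error in `mses` makes it easy to check that the loss actually falls over training.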

In the end, the mean squared error per epoch converges to a low value, which indicates that training went well.

Next

  1. Maths in a Neural Network: Element-wise
  2. Maths in a Neural Network: Vectorization
  3. Code a Neural Network with Numpy
  4. Maths in a Neural Network: Batch Training
