Code a Neural Network with Numpy

Introduction

Vectorized operations were covered in the previous post, Maths in a Neural Network: Vectorization. This post focuses on implementing those equations with numpy.

These are the equations derived in the previous posts [1] [2]:

Note that this network takes one sample input at a time; batch prediction and training will be discussed in a later post.

Feed-forward:

Z^{(l)} = W^{(l)T} A^{(l-1)}

A^{(l)} = f^{(l)}(Z^{(l)})
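
These two lines map almost directly to numpy. A minimal sketch of the forward pass, assuming sigmoid activations and column-vector samples (the notebook's exact code may differ):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(weights, a):
    # weights[l] has shape (n_{l-1}, n_l); a is a column vector of shape (n_0, 1)
    zs, activations = [], [a]
    for W in weights:
        z = W.T @ a         # Z^(l) = W^(l)T A^(l-1)
        a = sigmoid(z)      # A^(l) = f^(l)(Z^(l))
        zs.append(z)
        activations.append(a)
    return zs, activations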

Weight update:

Output layer:

\delta^{(L)} = E'(A^{(L)}) \diamond f^{(L)'}(Z^{(L)})

W^{(L)'} = W^{(L)} - \alpha A^{(L-1)} (\delta^{(L)})^T

Hidden layer:

\delta^{(l)} = f^{(l)'}(Z^{(l)}) \diamond W^{(l+1)} \delta^{(l+1)}

W^{(l)'} = W^{(l)} - \alpha A^{(l-1)} (\delta^{(l)})^T
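
Both the output-layer and hidden-layer updates translate into a short backward pass. A sketch assuming a squared-error cost, so that E'(A^{(L)}) = A^{(L)} - Y, and reusing sigmoid and forward from the sketch above:

def sigmoid_prime(z):
    # derivative of the sigmoid defined above
    s = sigmoid(z)
    return s * (1.0 - s)

def backward(weights, zs, activations, y, alpha):
    L = len(weights)
    deltas = [None] * L
    # output layer: delta^(L) = E'(A^(L)) * f^(L)'(Z^(L)), element-wise
    deltas[-1] = (activations[-1] - y) * sigmoid_prime(zs[-1])
    # hidden layers, walking backwards: delta^(l) = f^(l)'(Z^(l)) * (W^(l+1) delta^(l+1))
    for l in range(L - 2, -1, -1):
        deltas[l] = sigmoid_prime(zs[l]) * (weights[l + 1] @ deltas[l + 1])
    # weight update: W^(l) <- W^(l) - alpha A^(l-1) (delta^(l))^T
    # all deltas use the pre-update weights, so updates happen only after the loop above
    for l in range(L):
        weights[l] -= alpha * activations[l] @ deltas[l].T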

Notebook:

The following code implements a 2-3-3-2 network for the XOR problem, processing one sample at a time as noted above.
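
A minimal sketch of such a per-sample training loop, reusing forward and backward from the sketches above. The 2-3-3-2 layout, the XOR task, and one-sample-at-a-time updates come from this post; the one-hot target encoding, learning rate, epoch count, and weight initialization are assumptions:

rng = np.random.default_rng(0)

# 2-3-3-2: W^(l) has shape (n_{l-1}, n_l); no bias terms, matching the equations above
sizes = [2, 3, 3, 2]
weights = [rng.standard_normal((m, n)) for m, n in zip(sizes[:-1], sizes[1:])]

# XOR inputs; targets one-hot encode the two output classes (assumed encoding)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
Y = np.array([[1, 0], [0, 1], [0, 1], [1, 0]], dtype=float)

alpha = 0.5                      # assumed learning rate
for epoch in range(10000):       # assumed epoch count
    mse = 0.0
    for x, y in zip(X, Y):
        a, t = x.reshape(-1, 1), y.reshape(-1, 1)
        zs, activations = forward(weights, a)
        mse += float(np.mean((activations[-1] - t) ** 2))
        backward(weights, zs, activations, t, alpha)
    if epoch % 1000 == 0:
        print(f"epoch {epoch}: MSE {mse / len(X):.5f}")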

Over the epochs, the per-epoch mean squared error converges to a low value, indicating that training succeeded.

Next

  1. Maths in a Neural Network: Element-wise
  2. Maths in a Neural Network: Vectorization
  3. Code a Neural Network with Numpy
  4. Maths in a Neural Network: Batch Training
