Tuesday, 31 May 2016

A brief introduction to ANNs - part 3

In the previous post about ANNs we looked at the linear neuron and the perceptron.

Perceptrons have been used in neural networks for decades, but they are not the only type of neuron in use today. When they were first invented, they seemed capable of learning almost anything.

However, in 1969, Minsky and Papert published their book 'Perceptrons' which showed that a single perceptron could never be trained to perform the XOR function. You'll see in the next post why this is so (and why it's not a huge problem), but for now, let's look at three other common neuron models.

Like the linear neuron and perceptron, these start by calculating the weighted sum of their inputs. Recall that you can implement the linear neuron like this:

ln←{⍺+.×⍵}

The sigmoid neuron calculates the same weighted sum of inputs, but then it applies the sigmoid function to the result. Wikipedia defines the sigmoid function as S(x) = 1/(1 + e^-x).

Here's how you define that function in APL:

sigmoid←{÷1+*-⍵}

That definition says 'take the reciprocal of 1 plus e to the power minus ⍵', where ⍵ is the argument to the function.

You can implement a sigmoid neuron by combining the sigmoid function with a linear neuron. (Writing two functions side by side like this creates an APL function 'train': ln is applied first, then sigmoid.)

sn ← sigmoid ln

You can test it like this:

inputs←0.2 0.3 0.1
weights ← 1 2 0.5
inputs sn weights
0.7005671425
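If you'd like to check that arithmetic outside APL, here's a rough Python equivalent (the function names mirror the APL ones, but the sketch is mine, not part of the original session):

```python
import math

def sigmoid(z):
    # 1 / (1 + e^(-z)), matching ÷1+*-⍵
    return 1 / (1 + math.exp(-z))

def sn(inputs, weights):
    # weighted sum of inputs, then the sigmoid, like sigmoid applied to ln
    return sigmoid(sum(i * w for i, w in zip(inputs, weights)))

print(sn([0.2, 0.3, 0.1], [1, 2, 0.5]))  # roughly 0.7005671425
```

The weighted sum here is 0.85, and the sigmoid of 0.85 gives the value the APL session printed.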

As the name suggests, the sigmoid function is S-shaped. Here is a graph of the function, plotted using Dyalog APL's SharpPlot library:

As you can see, the sigmoid function's value is close to zero for large negative arguments; it has the value 0.5 when its input is zero; and it rises towards one as its input grows larger.

Another commonly used function, with a similar shape, is the tanh function.

APL has implementations of all common trigonometry-related functions. Sin is 1○⍵, and Cos is 2○⍵.  You can find a complete list here.

The definition you need is just

tanh←{7○⍵}

Here is its graph:

As you can see, tanh ranges from -1 for large negative arguments to +1 for large positive arguments. Its value at zero is zero.
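Python's standard library includes tanh too, so you can confirm those properties yourself; a quick sketch:

```python
import math

# tanh is zero at zero and approaches -1 and +1 for large arguments
print(math.tanh(0))    # 0.0
print(math.tanh(10))   # very close to 1
print(math.tanh(-10))  # very close to -1
```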

The last neuron we'll consider in this post is the Rectified Linear Neuron or RLN.

The transfer function for this neuron is zero for inputs that are negative or zero, and is equal to the input for inputs that are positive.

Here is the APL definition:

rln←{0⌈⍵}

The symbol ⌈ (max) returns the maximum of its arguments. Here's a plot of the RLN function:
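The same function is easy to sketch in Python (the name rln mirrors the APL definition; otherwise this is my own illustration):

```python
def rln(z):
    # zero for non-positive inputs, identity for positive ones: 0⌈⍵
    return max(0, z)

print([rln(z) for z in (-2, -0.5, 0, 0.5, 2)])  # [0, 0, 0, 0.5, 2]
```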

I mentioned earlier that the perceptron has some limitations, but why are these other functions popular? A future post will cover back-propagation - one of the most widely used techniques for training a network - and the functions you've been looking at work well for that purpose.

Before then, you'll take another look at the perceptron: you'll see how to train it, review its limitations, and look at ways of avoiding them.

Friday, 27 May 2016

Student? Expert Problem Solver? Win $2000 and a free trip to Glasgow

If you like coding and solving problems, and are a full-time student, you could win up to $2000 and an expenses-paid trip to a conference in Glasgow later this year.

All you need is a computer and some free software. The computer could be a Raspberry Pi (any model) or a laptop running Windows, OS X or Linux. I'll tell you where to get the APL software further down this post.

First, though, a warning. If you enter this competition it could change your life!

I’m serious. Just under fifty years ago I had a chance to learn APL.

I did, and it shaped my whole career. I'm still using APL today.

Now, if you want, it’s your turn.

The Dyalog APL 2016 problem solving competition

Dyalog have just announced their annual APL problem-solving competition. They want to introduce more people to this extraordinary, powerful language.

If you are a full-time student you could win a big cash prize (up to $2000) and an expenses-paid trip to Glasgow later this year.

If you’re not a student, you can still enter, stretch your mental muscles, and have fun.

In a minute I’ll explain how you can enter, and how you can start getting familiar with the language. Before that I’d like to show you a little of what APL can do and why it’s so powerful.

Meet APL: the most productive programming language I know

Suppose you’re a scientist, or an engineer, or an entrepreneur and you need to crunch some numbers. Perhaps you’ve just done an experiment or got some sales figures in. Whatever the background, you have two sets of data:

expected ← 10 15 13 27 30
actual ← 9 12 15 25 28

How much do the actual figures differ from what you expected?

difference←expected-actual
difference
1 3 ¯2 2 2

What’s the total difference?

+/difference
6

If you’re into statistics, you might ask for the average difference. You could start by writing a program to calculate averages:

average ← {(+/⍵)÷⍴⍵} ⍝ divide sum by number of elements
average difference
1.2
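For comparison, here's the same calculation sketched in Python; the variable names simply mirror the APL session, and this block isn't part of the original post:

```python
expected = [10, 15, 13, 27, 30]
actual = [9, 12, 15, 25, 28]

# element-wise difference, like expected-actual in APL
difference = [e - a for e, a in zip(expected, actual)]
total = sum(difference)            # +/difference
average = total / len(difference)  # (+/⍵)÷⍴⍵

print(difference, total, average)  # [1, 3, -2, 2, 2] 6 1.2
```

Notice how much of the bookkeeping (the zip, the comprehension) APL's array operations do implicitly.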

As you can see, APL is powerful and concise. If you want to find out more, and maybe win a cash prize, you should enter the competition today.

5 Steps to enter

• Register for the competition here and click the purple ‘Start the competition’ button. You should get a confirmation email within 10 minutes. If you don’t, check your spam folder. If the email is not there, notify support@sqore.com
• Install Dyalog APL on your own machine. You can use a Windows, Linux or OS X laptop or a Raspberry Pi.
If you're using a Raspberry Pi you can find out how to install the free software here.

If you're using a laptop you will need to get a licence from Dyalog; students can get a free educational licence, and anyone else can get a personal licence for a small charge.
• Work on the problems for phase 1 and phase 2. There's more information here.
Then wait to hear the results. Make sure you keep October 9-13, 2016 clear in case you win a trip to the Glasgow conference!

Free tips by email

I’ll be sending out a few tips about getting started (but no solutions!) over the next few days. If you want them, sign up below. I won’t spam you, and you can unsubscribe at any time.

I’m also working on an extra bonus which I hope to offer in one of the emails later this month.

Thursday, 19 May 2016

A new Raspberry Pi robot joins the family

Yesterday saw the arrival of a Raspberry Pi robot kit from The Pi Hut, and I'm finding it hard not to drop everything and have a play.

The Pi Hut has close links with CamJam. CamJam is, I think, the first Raspberry Jam, based in the Cambridge area.

Working with The Pi Hut, they have created three excellent EduKits: inexpensive, fun kits which introduce Raspberry Pi owners of all ages to the fun of physical computing.

The earlier kits came with excellent instructions and the Robot kit does too. I'm sure I will succumb to temptation and start exploring the kit in the next day or two. Expect a progress report soon.

My immediate priority is more urgent. I'm talking at the BAA meeting tomorrow, and I need to make sure I'm properly prepared.

Dyalog Visit

I nearly blew it earlier this week. I went along to visit my friends at Dyalog to talk about my neural network research and show them APL running on the new Pi zero.

I thought I had taken everything I needed, but I forgot to take a USB hub. I won't repeat that mistake tomorrow, as I expect there will be a lot of interest in the newest member of the Pi family.

BAA meeting tomorrow - 19th May

If you're an APLer, current or lapsed, and can get to central London tomorrow, do come along to the meeting. I think there are a few places left. Go here to book.

Monday, 16 May 2016

The new Raspberry Pi zero is here - and it's snappy!

 Spot the difference!
The new Raspberry Pi zero is out and it has a camera connector.

The picture on the right compares the new zero with its predecessor. They are very, very similar, but the clever folks at Pi Towers have re-routed the board to make room for a camera connector while keeping the size of the board unchanged.

I've had a chance to play with the new Pi for a few days now and I love it. You can read my plans below but the main thing is that the new feature has been added without sacrificing the zero's already awesome capabilities.

As you'd expect, existing software runs just as it did before.

The new zero is currently in stock at several dealers in the UK and the USA. Details are on the Raspberry Pi website. Dealer info is at the bottom of their post.

A camera has been one of the most-requested features for the zero. It opens up a huge range of new, exciting projects. There will be a huge demand for the new zero. Let's hope the stocks hold out for a while!


The new Pi zero as a mobile eye

If you want to give your mobile robot vision, the new zero is a great solution. I can see it being used in wheeled robots, submarines and drones. Drones will need some fail-safe method of operator control for legal reasons, but wheeled robots and subs can be completely independent if their software is smart enough.

Computer vision and neural networks

The zero has enough memory and processing power to run OpenCV. I'm working on experiments to add visual input to my neural network software. I'll post about the project as it progresses.

If you're interested in neural networks I'm writing a tutorial series for beginners. Start here.

Friday, 13 May 2016

A brief introduction to ANNs - part 2

The previous example of a neuron was a bit far-fetched. Its activation function doubled the weighted sum of its inputs. A simpler (and more useful) variant just returns the sum of its weighted inputs. This is known as a linear neuron.

The linear neuron

In APL, you could implement the linear neuron like this:
ln←{+/⍺×⍵}
and use it like this:
1 0.5 1 ln 0.1 1 0.3
0.9

Inner product

However, there's a neater and more idiomatic way to define it in APL. A mathematician would call the ln function the dot product or inner product of ⍺ and ⍵, and in APL you can write it as
ln←{⍺+.×⍵}

There are several reasons to use the inner product notation. It's concise, it's fast to execute, and (as we'll see later) it allows us to handle more than one neuron without having to write looping code.
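For readers more at home in Python, the inner product can be sketched like this (the name ln just mirrors the APL function; this is my own illustration):

```python
def ln(alpha, omega):
    # the inner product ⍺+.×⍵: multiply pairwise, then sum
    return sum(a * w for a, w in zip(alpha, omega))

print(ln([1, 0.5, 1], [0.1, 1, 0.3]))  # roughly 0.9
```

Where APL's +.× generalises automatically to matrices, the Python version would need extra looping code - which is exactly the point made above.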

Linear neurons are sometimes used in real-world applications; another type of neuron you're likely to come across is the perceptron.

The Perceptron

The output of a linear neuron is a real number. The output of a perceptron is binary: a 0 or a 1. This is useful in classification networks, where the output of a perceptron might be used to indicate the presence or absence of a particular feature. An output of zero would mean the feature was absent, while an output of 1 would mean that the feature was present.
Let's look at a concrete application, which we'll come back to later in this series.

An example - handwritten number recognition

Let's imagine that you want to construct a neural network to recognise handwritten digits. The input to your network might be a 28 x 28 matrix of pixels. A matrix of real numbers might represent how bright each pixel is.
You might have ten perceptrons corresponding to the digits 0 to 9. When the image of a handwritten digit is input, the relevant perceptron should fire. In other words, its output should be 1.

The perceptron firing rule

A perceptron calculates its output by looking at the value of the weighted sum of its inputs, just like a linear neuron. However, a perceptron outputs 0 if the sum is zero or negative, and 1 if the sum is positive.

Implementing the Perceptron in APL

You could define the perceptron like this:
p←{0<⍺+.×⍵}
and test it like this:
0.1 1 0.3  p 1 0.5 1
1
0.1 1 0.3  p 1 0.5 ¯1
1
0.1 1 0.3 p 1 ¯0.5 ¯1
0

Notice that you represent negative numbers in APL using the high minus '¯' rather than '-', which is the symbol for subtraction and negation.
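Here's a rough Python equivalent of the perceptron, for comparison (a sketch of the same firing rule, not part of the original post):

```python
def p(alpha, omega):
    # fire (output 1) only if the weighted sum is positive: 0<⍺+.×⍵
    s = sum(a * w for a, w in zip(alpha, omega))
    return 1 if s > 0 else 0

print(p([0.1, 1, 0.3], [1, 0.5, 1]))    # 1
print(p([0.1, 1, 0.3], [1, 0.5, -1]))   # 1
print(p([0.1, 1, 0.3], [1, -0.5, -1]))  # 0
```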

The bias input to the perceptron

If you look at a typical article on the perceptron, you will see that the algorithm it gives often includes an extra term called the bias b. The Wikipedia article defines the output as f(x) = 1 if w·x + b > 0, and f(x) = 0 otherwise.
The bias is just like the other inputs except that its contribution is not weighted. To keep your code simple, you can do what many neural networks do. You can treat the bias term as the first element in the input vector, and prefix the weights by the constant 1.
Now when you calculate the inner product of the extended inputs and weights, the bias is added to the weighted inputs.
Here's the code:
p2←{0<⍺+.×1,⍵}
In APL the comma (called catenate) is the symbol you use to concatenate two arrays together. In the definition of p2, the 1 is added to the beginning of the vector of weights ⍵. The bias b is the first element of the input vector ⍺.
You can test p2 like this:
0.1 1 0.3  p2 0.5 1
1
0.1 1 0.3  p2 ¯0.5 ¯1
0
Here the bias is 0.1 and the other inputs are 1 and 0.3. The weights of the two inputs are 0.5 and 1.
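The bias trick translates directly to Python, if that helps make it concrete (again, a hedged sketch rather than the post's own code):

```python
def p2(alpha, omega):
    # prefix the weights with a constant 1 (the APL 1,⍵),
    # so that alpha[0], the bias, passes through unweighted
    weights = [1] + list(omega)
    s = sum(a * w for a, w in zip(alpha, weights))
    return 1 if s > 0 else 0

print(p2([0.1, 1, 0.3], [0.5, 1]))    # 1
print(p2([0.1, 1, 0.3], [-0.5, -1]))  # 0
```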
Implementing multiple perceptrons
A single perceptron can't do much on its own. A useful application is likely to have lots of neurons. These are often grouped into layers, and in many cases the layer is fully-connected.
A fully-connected layer of neurons is a collection of neurons in which
1. each neuron is of the same type
2. each neuron has its own set of weights
3. each neuron receives all of the inputs to the layer
The layer has a vector of inputs. The inputs of every neuron are the elements of that vector.
The layer has a vector of outputs. Each element in the output vector is the output from a single neuron.
Using matrices in APL
Since there are multiple neurons, each of which has its own weights, you can represent the weights as a matrix. Column i of the matrix should contain the weights for neuron i.
In APL you can create a matrix by using the reshape function, which is represented by the Greek letter rho (⍴).
Create a matrix like this:
mat←2 4⍴20
mat
20 20 20 20
20 20 20 20

You've created a matrix with two rows and four columns. Each entry in the matrix is the number 20.

Creating random test data

You can create test data with a bit more variety like this:

mat←?2 4⍴20
mat
14 11  7  4
10  3 13 15

The ? function (called roll) rolls a 20-sided die for each element in its argument. Your expression created a 2 by 4 array of random numbers.

They are not really random, of course, but they will be different each time you evaluate that expression. So don't be surprised when the values in your version of mat are different from mine!

Even that set of values can be improved on. Try this:

mat←0.1×¯10+?2 4⍴20
mat
0   ¯0.9  0.4  0.6
¯0.4  0.1 ¯0.6 ¯0.9

Remember how APL parses expressions? You can read that first line as 'multiply by 0.1 the result of adding minus ten to a 2 by 4 matrix of numbers in the range 1 to 20'.

The APL is a bit more concise :)
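If you'd like to see the same idea spelled out, here's a Python sketch using the standard random module (the names are mine):

```python
import random

# ?2 4⍴20: roll a 20-sided die for each element of a 2 by 4 matrix
rolls = [[random.randint(1, 20) for _ in range(4)] for _ in range(2)]

# 0.1×¯10+... : subtract 10 from each roll, then scale by 0.1,
# giving values between ¯0.9 and 1.0
mat = [[0.1 * (roll - 10) for roll in row] for row in rolls]
print(mat)
```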

Testing time

Time to try out your layer. Each neuron takes the same vector of inputs. In your case, that vector should have three elements: the bias followed by two ordinary inputs. Create a vector of inputs like this:

inputs ← 0.2 0.3 0.1

See what happens if you try

inputs p2 mat
LENGTH ERROR
p2[0] p2←{0<⍺+.×1,⍵}

Oops! Something has gone wrong. APL's error message has told you the sort of problem it encountered, and what it was executing when the problem occurred.

To understand the source of the problem, try evaluating
1,mat
1  0   ¯0.9  0.4  0.6
1 ¯0.4  0.1 ¯0.6 ¯0.9

Aha! APL has concatenated a 1 at the start of each row of the matrix. That was right when you had a vector argument, but not for a matrix. Try this:

1⍪mat
1    1    1    1
0   ¯0.9  0.4  0.6
¯0.4  0.1 ¯0.6 ¯0.9
That comma-with-a-bar (⍪) means catenate along the first axis - it adds a new row of 1s - and it works for both vectors and matrices. Change the definition of p2 to be

p2←{0<⍺+.×1⍪⍵}

and run inputs p2 mat again.

It works! With a minor change, you now have code that works equally well for one or many neurons.
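To see what the final version is doing, here's the whole layer sketched in plain Python, with the loops that APL's matrix inner product hides (p2_layer is my own name, not from the post):

```python
def p2_layer(inputs, weight_matrix):
    # weight_matrix[i][j] is the weight on input i for neuron j;
    # prefix a row of 1s so inputs[0] acts as the bias (1⍪⍵ in APL)
    extended = [[1] * len(weight_matrix[0])] + weight_matrix
    outputs = []
    for j in range(len(extended[0])):          # one output per neuron
        s = sum(inputs[i] * extended[i][j] for i in range(len(inputs)))
        outputs.append(1 if s > 0 else 0)      # perceptron firing rule
    return outputs

mat = [[0, -0.9, 0.4, 0.6],
       [-0.4, 0.1, -0.6, -0.9]]
print(p2_layer([0.2, 0.3, 0.1], mat))  # [1, 0, 1, 1]
```

In APL all of that looping collapses into the single expression {0<⍺+.×1⍪⍵}.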

That's enough for now. In the next article, we'll look at what the perceptron can and cannot do. We'll also look at other types of neuron.

Wednesday, 11 May 2016

A brief introduction to ANNs - part 1

ANNs (Artificial Neural Networks) are systems that can process information using connected components called neurons. ANNs are inspired by real (biological) neural networks like the brain.

ANNs are widely used for real-world information processing tasks. In the image below (courtesy of Google Street View) you can see that car number plates have been blurred. Google hides them to protect privacy, and the software that recognises what to blur is a Neural Network.

The software

As I said yesterday, I developed the software in APL on a Raspberry Pi. You’ll find instructions on how you can run it further down.

Neurons

An ANN is made up of neurons. Neurons are usually grouped into one or more layers. Many types of neuron have been proposed, but they all have certain characteristics in common.

A neuron has one or more inputs and a single output. Associated with each input is a weight. Almost all neurons determine their output by adding together their weighted inputs and applying an activation function.

Here’s a diagram of a single neuron:

The values x0, x1, x2 are the inputs and w0, w1 and w2 are the weights. The neuron multiplies each input by the corresponding weight, adds them up, and applies the activation function f to determine the output. We often refer to the sequence of inputs as the vector x, and the sequence of weights as the vector w.

Running the code in APL

If you want to run the APL code as you read this article, you can use the tryapl.org website or a copy of Dyalog APL. If you are just going to use APL for non-commercial purposes you can get a free copy of Dyalog for the Raspberry Pi, and you can get low-cost licences for Windows and Linux computers.

If you’re going to use tryapl.org, you can copy individual lines of APL below and paste them into the coloured bar on the webpage. If you want to try different expressions, you can click on the APL keyboard button on the top right of the tryapl.org web page. This will allow you to enter special APL symbols, like ← and ×.

If you’re going to use Dyalog APL on the Raspberry Pi, start with the reference guide. It will tell you how to install and start APL and point you at other reference materials.

Let’s get started!

An example

You’ll start by calculating the output of a typical neuron.

Assume the neuron’s inputs are 3.5 2 0.7 and 1.2.
Assume also that the associated weights are  0.5 0.3 0.1 0.2.
Finally, take the activation function f to be a function that doubles its argument.

In APL, create the variables x and w. (You can enter a vector by separating its elements with spaces.) In APL you assign a value using the leftward arrow ←, so enter and execute these two lines of code

x ← 3.5 2.0 0.7 1.2
w ← 0.5 0.3 0.1 0.2

Now multiply x by w. APL uses the × sign for multiplication. Take care to distinguish the letter x (which you’ve used as a variable containing the input vector) from × (the multiplication sign).

Enter

x × w

APL should respond by typing the result:

1.75 0.6 0.07 0.24

Notice that APL has iterated through the vectors, element by element, without you having to write a loop.

Note for the curious: You might like to predict, and then try, what would happen if x and w had different lengths.

Reduction

The output of your neuron should be the sum of those numbers, doubled because you’re using the ‘times two’ function for this neuron.
How can you add up the elements of the vector? If you’ve used languages like Python, Clojure or Haskell, you’ve probably come across reduce or fold. These can be used to repeatedly apply a given function between the elements of a vector.

APL has a similar feature. The plus reduction of a vector will calculate its sum.
You write the plus reduction of x like this:
+/x
The result should be 7.4 but of course you don’t want to sum x. You want to sum x times w.

Enter +/ x×w and you should see 2.66

You haven’t yet applied f, the ‘times two’ function that I specified. Here’s the complete calculation, followed by the result that APL should display

2×+/x×w
5.32
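Since the post mentions Python's reduce, here's how that same calculation might look in Python (a sketch for comparison, not part of the original):

```python
from functools import reduce
import operator

x = [3.5, 2.0, 0.7, 1.2]
w = [0.5, 0.3, 0.1, 0.2]

products = [a * b for a, b in zip(x, w)]  # x×w
total = reduce(operator.add, products)    # +/ is a reduction, like reduce
output = 2 * total                        # the 'times two' activation
print(output)                             # roughly 5.32
```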

Order of execution
You may be curious about the order in which APL carries out its calculations.

Most of us learned a set of rules when learning arithmetic at school: brackets first, then division and multiplication, then addition and subtraction. APL takes a simpler approach. Here’s the reason why.

If you’ve programmed in a language like Python, C or Java, you will know that each has its own complex set of rules to work out the order of execution. These are hard to remember, and they are a frequent source of subtle programming errors.

APL has lots and lots of useful primitive functions - so many that any precedence rules would be very hard to remember. So APL has no special precedence. You can read the expression above as double the sum of x times w, and that’s exactly what APL does.

User-defined functions

You’ve now seen how to calculate the output of our hypothetical neuron, but it would be impractical to type in that code every time you wanted to know the output of a particular neuron.

Instead, you can define an APL function which will take the inputs and weights as arguments and calculate the result.

Enter the definition of our example neuron calculator like this:

eg ← {2×+/⍺×⍵}

This creates a function called eg which takes two arguments, ⍺ and ⍵. It multiplies them together element by element, sums the products, and doubles the result. Here’s how to test it:

3.5 2.0 0.7 1.2 eg 0.5 0.3 0.1 0.2

If you want to add two numbers, you put the plus sign between them (using so-called infix notation). If you define your own APL function like eg which takes two arguments, you use the same syntax - hence the code above.
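A Python analogue of eg would take its two arguments in the usual prefix style rather than APL's infix style (this sketch is mine, just to show the contrast):

```python
def eg(alpha, omega):
    # double the sum of the element-wise products: {2×+/⍺×⍵}
    return 2 * sum(a * b for a, b in zip(alpha, omega))

# prefix call, where APL writes the function between its arguments
print(eg([3.5, 2.0, 0.7, 1.2], [0.5, 0.3, 0.1, 0.2]))  # roughly 5.32
```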

That’s probably as much as you can take in for now, and it’s certainly as much as I can write :)

In the next post you’ll take a look at  the code used for several common types of neurons, and you’ll also see how to calculate the output of several neurons at once.