Sunday, 14 July 2019

Training ANNs on the Raspberry Pi 4 and Jetson Nano


There have been several benchmarks published comparing the performance of the Raspberry Pi and the Jetson Nano. They suggest there is little to choose between them when running Deep Learning tasks.

I'm sure the results have been accurately reported, but I found them surprising.

Don't get me wrong. I love the Pi 4, and am very happy with the two I've been using. The Pi 4 is significantly faster than its predecessors, but...

The Jetson Nano has a powerful GPU that's optimised for many of the operations used by Artificial Neural Networks (ANNs).

I'd expect the Nano to significantly outperform the Pi running ANNs.


How can this be?


I think I've identified reasons for the surprising results.

At least one benchmark appears to have tested the Nano in 5 W power mode. I'm not 100% certain, as the author has not responded to several enquiries, but the article talks about the difficulty in finding a 4 A USB supply. That suggests that the author is not entirely familiar with the Nano and its documentation.

The docs make it clear that the Nano will only run at full speed if 10 W mode is selected, and 10 W mode requires a PSU with a barrel jack rather than a USB connector. You can see the barrel jack in the picture of the Nano above.
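If you want to check or change the mode on your own board, NVIDIA's nvpmodel tool does the job; on the Nano, mode 0 is the 10 W (MAXN) profile and mode 1 is 5 W:

sudo nvpmodel -q       # report the current power mode
sudo nvpmodel -m 0     # select the 10 W (MAXN) mode
sudo jetson_clocks     # optionally lock the clocks at their maximums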

Another common factor is that most of the benchmarks test inference with a pre-trained network rather than training speed. While many applications will use pre-trained nets, training speed still matters; some applications will need training from scratch, and others will require transfer learning.

I've done a couple of quick tests of relative speed with a 4GB Pi 4 and a Nano running training tasks with the MNIST digits and fashion datasets.

The Nano was in 10 W max-power mode. The Pi was cooled using the Pimoroni fan shim, and it did not get throttled by overheating.
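On the Pi, that's easy to verify with the firmware's vcgencmd tool; a reading of throttled=0x0 at the end of a run means the SoC was never throttled:

vcgencmd measure_temp     # current SoC temperature
vcgencmd get_throttled    # throttled=0x0 means no throttling occurred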

Training times using MNIST digits


The first network uses MNIST digits data and looks like this:
import tensorflow as tf

model = tf.keras.models.Sequential([
  tf.keras.layers.Conv2D(32, (3,3), activation='relu', input_shape=(28, 28, 1)),
  tf.keras.layers.MaxPooling2D(2, 2),
  tf.keras.layers.Flatten(),
  tf.keras.layers.Dense(128, activation='relu'),
  tf.keras.layers.Dense(10, activation='softmax')
])
While training, that takes about 50 seconds per epoch on the Nano, and about 140 seconds per epoch on the Pi 4.
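For anyone who wants to reproduce the timings, here is a minimal training harness for the model above; the optimizer, loss and epoch count are my assumptions, as the posts I've seen don't record them:

(train_images, train_labels), _ = tf.keras.datasets.mnist.load_data()
train_images = train_images.reshape(-1, 28, 28, 1) / 255.0   # scale pixels to [0, 1]

model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.fit(train_images, train_labels, epochs=5)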

MNIST Fashion training times with a second, larger CNN layer


The second (borrowed from one of the deeplearning.ai course notebooks) uses the MNIST fashion dataset and looks like this:

model = tf.keras.models.Sequential([
  tf.keras.layers.Conv2D(64, (3,3), activation='relu', input_shape=(28, 28, 1)),
  tf.keras.layers.MaxPooling2D(2, 2),
  tf.keras.layers.Conv2D(64, (3,3), activation='relu'),
  tf.keras.layers.MaxPooling2D(2,2),
  tf.keras.layers.Flatten(),
  tf.keras.layers.Dense(128, activation='relu'),
  tf.keras.layers.Dense(10, activation='softmax')
])
While training, that takes about 59 seconds per epoch on the Nano and about 336 seconds per epoch on the Pi.

Conclusion


These figures suggest that the Nano is between 2.8 and 5.7 times faster than the Pi 4 when training.

That leaves me curious about relative performance when running pre-trained networks. I'll report when I have some more data.

Friday, 28 June 2019

Sambot - MeArm, servos, the Babelboard and Jetson Nano

Jud McCranie CC BY 3.0 via Wikimedia Commons
Way back in 1974 I took Tom Westerdale's Adaptive Systems course as part of my Master's degree. Tom's thesis advisor was John Holland, and a lot of the course covered genetic algorithms. Before that it covered early machine learning applications like Samuel's Checkers Player.

I've wanted to revisit those early AI applications for a while, and I recently decided to put a new spin on an old idea.

I want to build a robot that plays Checkers (that's draughts to us Brits) on a real board, using a robot arm and a Jetson Nano with a Raspberry Pi camera.

The game play could be done using a variant of Samuel's approach, a neural network, or a combination of the two. If AlphaGo can master Go by playing against itself, it shouldn't be too hard for a pair of Machine Learning programs to master draughts!

First, though, I need to build a controllable arm that can pick up and move the pieces, and a computer vision system that can recognise the location of the pieces on the board.

I'm up to my eyes in projects and work at the moment, but I have justified making a start on the robot arm because it ties in nicely with a current priority. It's a chance to explore and explain how to do Physical Computing with the Jetson Nano.

Jetson Nano
One of the many things I love about the Nano is its use of the Raspberry Pi header layout. The Nano's heatsink (needed to cool its GPU brain) means that Pi hats cannot sit over the Nano, but the Pi header has been rotated through 180 degrees, allowing you to use Raspberry Pi hats unmodified.

Recently I've been making use of the Grove system from Seeed Studio, and I've built a series of low-cost boards to let me use the Grove components with the Pi, the Nano and Adafruit Feathers. I call them babelboards.

They are the hardware hacker's equivalent of Douglas Adams's babelfish. They allow lots of different hardware components to talk to each other using the I2C protocol.

Nano with minimal babelboard
The simplest Babelboard is a small piece of stripboard with a couple of connectors. On the right you can see it plugged into a Nano, with a Grove 4-wire cable plugged in, driving a Grove 16-channel servo controller. That's overkill, as I think my Checkers player will only need four servos, but I happen to have one to hand.

I've had a MeArm robotic arm waiting for me to assemble it for ages, and today I made a start. I'll talk more about that in a later post, but today I will focus on getting the Nano to control the MeArm's four servos.

The Grove servo controller uses the PCA9685. That's the same chip that Adafruit use in their servo controllers, and they have a Python library to control it.

Better still, Adafruit have ported their Blinka library to the Pi and the Nano, so you can use their family of CircuitPython libraries on both, as well as on Adafruit's CircuitPython boards. Awesome!

I installed the software following the instructions on the Adafruit website by invoking:


pip3 install adafruit-circuitpython-servokit --user
Since some of the Adafruit code uses GPIO as well as I2C, I made sure I could access both from my user account, and then rebooted to make the changes take effect:

sudo groupadd -f -r gpio
sudo usermod -a -G gpio $USER
sudo usermod -a -G i2c $USER
sudo cp /opt/nvidia/jetson-gpio/etc/99-gpio.rules /etc/udev/rules.d/
sudo reboot
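After the reboot, it's worth confirming that the controller is visible on the I2C bus. Here's a quick scan using Blinka; the PCA9685's default address is 0x40, though your board may be jumpered differently:

import board
import busio

i2c = busio.I2C(board.SCL, board.SDA)
while not i2c.try_lock():                      # the bus must be locked before scanning
    pass
print([hex(device) for device in i2c.scan()])  # expect 0x40 in the list
i2c.unlock()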


Next I connected a servo to the servo controller, and ran some sample code from the Adafruit website:

import time
from adafruit_servokit import ServoKit

kit = ServoKit(channels=16)   # the PCA9685 board provides 16 channels
kit.servo[0].angle = 180      # swing the servo on channel 0 to 180 degrees
time.sleep(1)
kit.servo[0].angle = 0        # and back again

And lo - the servo moved!
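A single jump from 0 to 180 degrees is fairly violent. For the MeArm I'll want gentler motion, so here's a simple sweep; the channel number and timing are my choices rather than anything from the Adafruit docs:

import time
from adafruit_servokit import ServoKit

kit = ServoKit(channels=16)
for angle in range(0, 181, 5):    # 0 to 180 degrees in 5 degree steps
    kit.servo[0].angle = angle
    time.sleep(0.05)              # pause briefly so the movement looks smooth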

In the next post I'll give more details about the Babelboards and how to build them.

If you want to keep an eye on what I'm up to, I'm @rareblog on Twitter.

Monday, 24 June 2019

An excellent course for Jetson Nano owners

Jetson Nano
Regular readers will know that I'm a keen Jetson Nano owner.

Recently I posted a series about how to get started with the computer, but NVIDIA have now published an excellent course, 'Getting Started with the Jetson Nano', which is free for members of the NVIDIA developers' program.

The course comes with a pre-built image which can run the Nano in headless mode. That's very useful - I had to buy a new monitor to get going, as none of my old monitors had native HDMI support.

The image provided with the course just needs a Nano and a laptop or desktop computer with a USB port.

The course is a great introduction to deep learning with a GPU. Once you've completed it you may want to delve deeper; there are lots of excellent Deep Learning courses available on-line, and many of them use Google's Colab for practical sessions.

Google Colab gives you free access to top-of-the-range NVIDIA hardware, and if you want to run your trained models locally it's easy to move them onto the Nano. Of course the Nano is small enough to use for mobile robotics!
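As a sketch of that workflow (the file name here is just an illustration): save the trained model in Colab, copy the file across, and load it on the Nano:

# In Colab, after training:
model.save('my_model.h5')

# On the Nano, after copying the file over:
import tensorflow as tf
model = tf.keras.models.load_model('my_model.h5')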

I'm designing an autonomous robot with the Nano, and I have been delighted with Adafruit's Nano port of their Blinka library. It makes it really easy to write Physical Computing code that runs without change on the Nano, the Raspberry Pi and Adafruit's CircuitPython-enabled boards.

Come and see the Nano


Tomorrow I'll be at the London Raspberry Pint meet-up showing a Nano driving Adafruit, Grove and SparkFun peripherals using a simple interface board (the babelboard) and some straightforward Python code.

If you're in or near London, do come along. The meet-up starts at 7 PM. It's at CodeNode, and you'll save a lot of time if you pre-register at the CodeNode site.

If you can't join us, a video of the talk will be available in a few days and I will be blogging more about the babelboard later this week.