Showing posts from July, 2019

Benchmarking process for TF-TRT, and a workaround for the Coral USB Accelerator

A couple of days ago I published some benchmarking results running a TF-TRT model on the Pi and Jetson Nano. I said I'd write up the benchmarking process. You'll find the details below. The code I used is on GitHub. I've also managed to get a Coral USB Accelerator running with a Raspberry Pi 4. I encountered a minor problem, and I have explained my simple but very hacky workaround at the end of the post.

TensorFlow and TF-TRT benchmarks

Setup

The process was based on this excellent article, written by Chengwei Zhang. On my workstation I started by following Chengwei Zhang's recipe. I trained the model and then copied trt_graph.pb from my workstation to the Pi 4. On the Raspberry Pi 4 I used a virtual environment created with pipenv, and installed jupyter and pillow. I downloaded and installed this unofficial wheel. I tried to run step2.ipynb but encountered an import error. This turned out to be an old TensorFl
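The timing loop at the heart of a benchmark like this is simple. The sketch below is not the exact code from the GitHub repo; it's a minimal, assumed shape for the measurement: warm up a model-like callable first, then report the median of repeated timed runs. The `predict` callable is a stand-in for whatever actually runs inference (for example, a `sess.run` call on the loaded `trt_graph.pb`).

```python
import statistics
import time


def benchmark(predict, warmup=10, runs=50):
    """Time a zero-argument inference callable; return median latency in seconds.

    A warm-up phase runs first so one-off costs (graph optimisation,
    memory allocation, cache population) don't distort the measurements.
    The median is reported because it is robust to occasional slow runs.
    """
    for _ in range(warmup):
        predict()
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        predict()
        timings.append(time.perf_counter() - start)
    return statistics.median(timings)


# Stand-in workload; a real benchmark would call the model here.
median_s = benchmark(lambda: sum(i * i for i in range(10_000)))
print(f"median inference time: {median_s * 1000:.2f} ms")
```

Reporting a median over many runs, after a warm-up, matters particularly for TF-TRT: the first inference can trigger engine building and is much slower than steady-state.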

Benchmarking TF-TRT on the Raspberry Pi and Jetson Nano

Trying to choose between the Pi 4B and the Jetson Nano for a Deep Learning project? I recently posted some results from benchmarks I ran training and running TensorFlow networks on the Raspberry Pi 4 and Jetson Nano. They generated a lot of interest, but some readers questioned their relevance. They weren't interested in training networks on edge devices. Most people expect to train on higher-power hardware and then deploy the trained networks on the Pi and Nano. If they use TensorFlow for training, they have several choices for deployment:

Standard TensorFlow
TensorFlow Lite
TF-TRT (a TensorFlow wrapper around NVIDIA's TensorRT, or TRT)
Raw TensorRT

In this post I'll focus on timing Standard TensorFlow and TF-TRT. In a later post I plan to cover TensorFlow Lite on the Pi with and without accelerators like the Coral Edge TPU coprocessor and the Intel Compute Stick. I've run a number of benchmarks, and the results have been much as I expected. I d
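For the TF-TRT deployment option, conversion in the TensorFlow 1.x era (which these posts use) went through `tensorflow.contrib.tensorrt`. The sketch below is an assumed outline, not the exact code from the posts: the `frozen_graph.pb` path and the `Softmax` output node name are placeholders, and the import is guarded so the script degrades gracefully on machines without TensorFlow/TensorRT installed.

```python
def convert_to_trt(frozen_graph_path="frozen_graph.pb", outputs=("Softmax",)):
    """Sketch of TF 1.x-era TF-TRT conversion; paths and node names are
    placeholder assumptions, not values taken from the post."""
    try:
        import tensorflow as tf
        import tensorflow.contrib.tensorrt as trt  # TF 1.x contrib API
    except ImportError:
        print("TensorFlow/TF-TRT not available; skipping conversion")
        return None

    # Load the frozen (trained, constant-folded) graph produced on the workstation.
    with tf.gfile.GFile(frozen_graph_path, "rb") as f:
        frozen_graph = tf.GraphDef()
        frozen_graph.ParseFromString(f.read())

    # Replace supported subgraphs with TensorRT-optimised ops.
    trt_graph = trt.create_inference_graph(
        input_graph_def=frozen_graph,
        outputs=list(outputs),
        max_batch_size=1,
        max_workspace_size_bytes=1 << 25,
        precision_mode="FP16",  # FP16 suits the Nano's GPU
    )

    # Serialise the optimised graph for deployment (e.g. as trt_graph.pb).
    with tf.gfile.GFile("trt_graph.pb", "wb") as f:
        f.write(trt_graph.SerializeToString())
    return trt_graph


result = convert_to_trt()
```

The resulting `trt_graph.pb` is what gets copied to the target device and loaded like any other frozen TensorFlow graph, which is why TF-TRT is an attractive middle ground between standard TensorFlow and raw TensorRT.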

Training ANNs on the Raspberry Pi 4 and Jetson Nano

There have been several benchmarks published comparing performance of the Raspberry Pi and Jetson Nano. They suggest there is little to choose between them when running Deep Learning tasks. I'm sure the results have been accurately reported, but I found them surprising. Don't get me wrong. I love the Pi 4, and am very happy with the two I've been using. The Pi 4 is significantly faster than its predecessors, but... The Jetson Nano has a powerful GPU that's optimised for many of the operations used by Artificial Neural Networks (ANNs). I'd expect the Nano to significantly outperform the Pi running ANNs. How can this be? I think I've identified reasons for the surprising results. At least one benchmark appears to have tested the Nano in 5W power mode. I'm not 100% certain, as the author has not responded to several enquiries, but the article talks about the difficulty in finding a 4A USB supply. That suggests that the author is not entirely