Thursday, 25 August 2022

How to write reliable tests for Python MQTT applications

More and more IoT applications use MQTT. It's a simple and very useful messaging protocol that runs on small boards like the Raspberry Pi Pico as well as on systems running Linux, macOS and Windows.

I recently decided to add some extra functionality to Lazydoro using MQTT. The code seemed to work when run manually but I had a lot of trouble getting my automated tests working. It took quite a while to understand the problem, but the fix was simple.

Intermittently failing tests are bad

In the end-to-end test that was causing the problem, the code simulated the start of a pomodoro session and then checked that the correct MQTT message had been sent. The test usually failed but sometimes passed. When I manually ran a separate client that subscribed to the message stream, I could see that the right messages were being sent.

Intermittently failing (or passing) tests are a nuisance. They do nothing to build confidence that the application under test is working reliably, and they are no help when you're refactoring. You can never be sure whether a test failed because you made a mistake in the refactoring or because it was just having one of its hissy fits.

Solving timing problems

Intermittent failures like this are often due to timing issues. It's tempting to solve them by adding delays to the testing code, but this is prone to problems. Too short a delay, and the tests still fail from time to time; too long a delay, and the tests become burdensome to run.

The solution is simple: write your test so that it polls to see if the expected condition is true, and set a timeout long enough that it will only expire if the test is going to fail.
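As a minimal sketch of the idea, independent of MQTT, the helper below polls a condition until it holds or a deadline passes. The name `wait_until` and its parameters are assumptions made for illustration; they are not part of Lazydoro. A timer thread stands in for something that happens a little later, such as a message arriving.

```python
import threading
import time


def wait_until(condition, timeout=1.0, interval=0.01):
    """Poll condition() until it returns True or timeout seconds elapse."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if condition():
            return True
        time.sleep(interval)
    # One last check, so a condition that became true during the final sleep still counts.
    return condition()


# Simulate an event that occurs 50 ms from now.
arrived = threading.Event()
threading.Timer(0.05, arrived.set).start()

start = time.monotonic()
ok = wait_until(arrived.is_set, timeout=2.0)
elapsed = time.monotonic() - start
```

The wait ends shortly after the event is set, even though the timeout is two seconds. The full timeout is only spent when the condition never becomes true, which is the case where the test should fail anyway.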

Before the test checks that the correct message has been sent, it waits until there is a message to check.

Here's the code that waits:

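    def wait_for_message(self, tries=100, interval=0.01):
        # Poll for a message; give up after tries * interval seconds (1 second by default).
        for _ in range(tries):
            if len(self.messages()) > 0:
                return  # a message has arrived, so the test can check it
            sleep(interval)  # requires: from time import sleep
        raise ValueError('waiting for message - timed out')

Now the tests run reliably.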
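To show where a method like this might sit, here is a self-contained sketch that needs no broker. `FakeTestClient`, `on_message`, and the topic and payload are hypothetical stand-ins, not the real Test Client. In a real test client the callback would be invoked by the MQTT library when a subscribed message arrives; a timer thread plays that role here.

```python
import threading
from time import sleep


class FakeTestClient:
    """Hypothetical stand-in for a test client that records received messages."""

    def __init__(self):
        self._messages = []
        self._lock = threading.Lock()

    def on_message(self, topic, payload):
        # A real client would have this called from the MQTT library's network thread.
        with self._lock:
            self._messages.append((topic, payload))

    def messages(self):
        # Return a copy so callers never see the list change underneath them.
        with self._lock:
            return list(self._messages)

    def wait_for_message(self, tries=100, interval=0.01):
        for _ in range(tries):
            if len(self.messages()) > 0:
                return
            sleep(interval)
        raise ValueError('waiting for message - timed out')


client = FakeTestClient()
# Deliver a message 50 ms from now; the topic and payload are made up for this sketch.
threading.Timer(0.05, client.on_message, args=('lazydoro/status', 'started')).start()

client.wait_for_message()      # returns once the message has arrived
received = client.messages()   # now safe to check the contents
```

The test only inspects `received` after `wait_for_message` returns, so it no longer races against message delivery. If nothing arrives within the timeout, the `ValueError` makes the failure explicit.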

You can see the entire Test Client code here.
