
Thursday, 23 June 2022

Technologists: one thing you must know if you use logseq with GitHub


Like many technologists, I use Personal Knowledge Management (PKM) tools to manage information overload.

I've been using technology to help me keep track of complex technical material since the 1980s. These days, my favourite tool is logseq. You can use logseq to capture, connect and create technical information. Over the last couple of years I've built up a large, heavily linked knowledge graph - a second brain filled with information and ideas.

It's worked well for me but this week I hit a pitfall.

Using logseq with GitHub

If you use logseq as your PKM system you may be using GitHub to back up and version your knowledge graphs.

Logseq has great GitHub integration. Because of that, many users have adopted GitHub as a way of making sure their second brain is secure and easy to access from anywhere they want.

If you are using logseq with GitHub, beware - there's a potential pitfall lurking!

A trap to avoid

Logseq has great support for PDFs and mp4 videos.

You can embed your mp4 video files using drag and drop. You can also drag a PDF into the logseq window, open it and highlight items of interest in the PDF. You can even copy your highlights into your notes.

But when logseq tries to commit your changes, GitHub may object!

Watch out for large files

If you use logseq to capture large files you're likely to encounter GitHub's file size limits.

Git will allow you to commit large files locally but GitHub won't accept them when you push to the remote repository. GitHub warns you if you push files larger than 50 MB, and it rejects any push containing a file larger than 100 MB.

If you try, you'll see an error message refusing your push.

 error: GH001: Large files detected
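
If you'd like to spot the culprits before you push, a few lines of Python will do it. This is only a sketch: the function name find_large_files is my own, and treating the limits as 50 and 100 MiB is an assumption about how GitHub counts.

```python
from pathlib import Path

WARN_BYTES = 50 * 1024 * 1024     # assumed size above which GitHub warns
REJECT_BYTES = 100 * 1024 * 1024  # assumed size above which GitHub rejects


def find_large_files(root, threshold=WARN_BYTES):
    """Return (path, size_in_bytes) pairs for files larger than threshold."""
    root = Path(root)
    large = []
    for path in root.rglob("*"):
        # Skip Git's own object store; only the working files matter here.
        if ".git" in path.relative_to(root).parts or not path.is_file():
            continue
        size = path.stat().st_size
        if size > threshold:
            large.append((path, size))
    # Largest first, so the worst offenders are at the top.
    return sorted(large, key=lambda item: item[1], reverse=True)


# Typical use, run from the root of your graph:
#     for path, size in find_large_files("."):
#         print(f"{size / 1024 / 1024:8.1f} MiB  {path}")
```

The file extensions that turn up in the results are the ones you'll want LFS to look after.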

Fortunately there's an easy solution: Git LFS.

LFS to the rescue

Git LFS (Large File Storage) allows you to version and push files which are larger than GitHub's normal 100 MB limit.

It's easy to add LFS to an existing repository if there are no large files. You'll find detailed instructions here.

What if you've added large files to your graph and GitHub has refused to let you push it?

There's good news about that too.

LFS migration

There's a tutorial on GitHub which tells you how to migrate large files that have been committed locally and need to be moved to LFS. After I'd enabled LFS, all I had to do to migrate my existing mp4 files and PDFs was to run

git lfs migrate import --include="*.mp4"

git lfs migrate import --include="*.pdf"

and I could then commit and push in the usual way.

LFS costs

LFS won't break your budget. Every account gets a free 1 GB storage allowance, and you can pay just $5/month to add a 50 GB data pack. There's also a bandwidth limit, but you're less likely to be constrained by that.

Summary

GitHub and logseq make a great partnership, but if you're going to store videos or large PDFs you'll want to add LFS support to your GitHub repository.

Friday, 10 June 2022

Three strategies to manage Technology Overload

If you're reading this you're probably a knowledge worker.

Your value lies in the knowledge at your disposal and your ability to apply it. There are daily advances in every field of technology, and you are subjected to a flood of new knowledge competing for your attention. It's easy to feel overwhelmed.

In this article, you'll

  • read a brief introduction to the flood of information overload and its causes.

  • see three strategies you can use to cope with that flood.

  • see how to avoid getting overwhelmed by the range of tools - otherwise the cure will be worse than the disease!

But first - what's the problem?

What's the problem?

The Explosion of Information

Edholm's Law states that Internet traffic now follows the same pace of growth as Moore's Law.



That poses a huge problem for knowledge workers.

Here's a picture of the problem:

The knowledge gap

If that worries you, you're not alone.

Professor Eddie Obeng explores the Knowledge Explosion

In his delightful and provocative TED video, Eddie Obeng warns us about what has happened to us in the 21st century.

"Somebody or something has changed the rules about how our world works....

I think what's happened, perhaps, is that we've not noticed that change...

What we do know is that the world has accelerated."

He goes on to confirm that the rate at which knowledge is generated has grown faster than our ability to absorb it.

This has profound implications for leadership, companies, organizations and countries which Eddie explores in his writing and his work at Pentangle.

Azeem Azhar agrees

In his book Exponential, Azeem Azhar points out that

as technology accelerates, the human mind struggles to keep up
- and our companies, workplaces, and democracies get left behind.
This is the exponential gap.

If you're interested in finding out more about Azeem Azhar's perspective, you can subscribe to his Exponential View newsletter.

So how can you cope?

Three proven strategies

These three strategies can dramatically improve your ability to cope with the flood of new information:

  1. PKM Tools
    1. Mind Mapping
    2. Clipping
    3. Note-taking
  2. Power Learning
    1. Learning How to Learn
    2. Learn Like a Pro
    3. Ultralearning
  3. Harnessing Collective Intelligence
    1. Focussed Internet communities
    2. Collaborative Software
    3. Using AI as part of collective intelligence

PKM tools

According to Wikipedia, PKM (Personal Knowledge Management) is a process of collecting information that a person uses to gather, classify, store, search, retrieve and share knowledge.

I'd add the ability to connect and enhance the items of knowledge in your store.

You may already be using PKM tools to increase your productivity, but it might be time to update your toolbox. This is a fast-developing field!

Mind Mapping

Mind Maps aren't a modern invention. They have been used by knowledge workers for hundreds of years. Here's a modern version of the Porphyrian tree, which was used by medieval educators and was adapted by Linnaeus in the 18th century to illustrate the relationships between species.

Widespread adoption started when Tony Buzan introduced the term Mind Map in 1974.

Buzan described hand-drawn maps, and they are still very useful. A hand-drawn map is an intensely individual creation, and can be a thing of beauty.

The main disadvantages of hand-drawn maps are that they require photography to back them up, and they are difficult to share and search.

Many Mind Mappers now use software to create and publish their maps. Here's an example: a MindMap of my new book on Giving Memorable Technical Talks.



Mind mapping software started to appear in the 1980s. These days there are dozens, if not hundreds, of books about mind mapping and dozens of software tools.

You'll find great lists/reviews of Mind Mapping tools, and lots of advice, on Chuck Frey's Mind Mapping Software Blog.

I've used a number of MindMapping tools, but for the last few years I have relied on Freeplane. It's free, open source, well documented, and it's supported by an active user community.

Freeplane stores maps in XML which is easy to transform into and out of other formats. If you're into Python coding, you might find a use for fm2md, a library that can convert a Freemind or Freeplane Mind Map into a set of markdown documents ready for publication on Leanpub.
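
To give a feel for how approachable the format is, here's a minimal sketch that turns a map into a nested Markdown outline using only the standard library. It is not how fm2md works; the function names are my own, and the sample map is hand-written and much simpler than a real Freeplane file, which carries extra attributes and child elements.

```python
import xml.etree.ElementTree as ET

# Hand-written, simplified sample: nested <node> elements with a TEXT attribute.
SAMPLE_MAP = """<map version="freeplane 1.9.0">
  <node TEXT="Talks">
    <node TEXT="Prepare">
      <node TEXT="Know your audience"/>
    </node>
    <node TEXT="Deliver"/>
  </node>
</map>"""


def outline(node, depth=0):
    """Yield one Markdown bullet per node, indented two spaces per level."""
    yield "  " * depth + "- " + node.get("TEXT", "")
    for child in node.findall("node"):
        yield from outline(child, depth + 1)


def map_to_markdown(xml_text):
    """Convert the XML text of a map into a Markdown bullet outline."""
    root_node = ET.fromstring(xml_text).find("node")
    return "\n".join(outline(root_node))


print(map_to_markdown(SAMPLE_MAP))
# - Talks
#   - Prepare
#     - Know your audience
#   - Deliver
```

Nodes that hold rich text rather than a plain TEXT attribute would come out as empty bullets here, so treat this as a starting point.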

Freeplane works very well, but it's not designed for collaboration. I'll mention some alternatives in the section on collaborative software.

Mind Maps provide a rich visual experience, but they suffer from one major limitation; each map represents knowledge as a single tree rooted in a single root node. While it's possible to make connections between branches, these can rapidly get confusing.

You'll find information about Note-taking tools that support networks of connections below.

Clipping

Clipping tools allow you to save a URL, an entire page or selected highlights. The two I use are Evernote and Pocket.

Evernote

Evernote offers all three possibilities, and has a wealth of additional capabilities including audio note-taking and Optical Character Recognition (OCR).

Pocket

Pocket was originally called 'Read it Later', and that explains just what Pocket lets you do.

You can save content to read later. Pocket will supplement the content with links to other articles that are likely to be of interest. You can tag links as you save them, and the paid version of Pocket will suggest tags for you to use.

Kindle

But surely Amazon's Kindle is an eBook reader?

It is, but it also allows you to highlight passages in, and add notes to, the Kindle books you own.

You can read your Kindle highlights and notes online, and applications like Readwise can collect them for you. Some can even import them into Note-taking apps, as you'll see in the next section.

Note-taking

The tools below allow you to create your own notes, and more recent tools help you to build linked networks of notes.

It's possible to create collections of notes without links, but the connections between ideas are often as valuable as the ideas themselves. For that reason, this article will focus on note-taking apps with linking capabilities.

Linked Note-taking apps

TiddlyWiki is the grandfather of personal note-taking apps. Derived from Ward Cunningham's wiki concept, TiddlyWiki offers a serverless, self-contained wiki stored in a single html file.

I first started using TiddlyWiki in 2005 and continued to use it for over a decade, along with Freeplane for Mind Maps.

TiddlyWiki has an engaged and helpful community along with a rich ecosystem of plug-ins. Its main weakness is that it relies on the ability to save the html file from within a browser, and that has become harder and harder as browsers have tightened their security.

A worry

It's still possible to save files locally, but my worry is that one day a browser update will prevent me from accessing a PKM tool that I would normally rely on many times each day.

There are workarounds, but they rely on third party plugins which need to be updated if there are significant changes to the supporting browser.

Many TiddlyWiki users have migrated to more recent software.

Roam, a linked note-taking app from Roam Research, has taken off dramatically over the last couple of years. In 2017 it was a prototype with a single user; by 2021 it had over 60,000.

Roam supports collaboration, and it has an attractive and ergonomic user interface.

I started to use Roam daily in late 2019, and my graph (network) now links over 1800 pages.

Some users, including me, find Roam's $15/month price tag onerous, and dislike the fact that Roam keeps all your data in the cloud. You can download backups, but there are three different backup formats and each has limitations.

It's a remarkable product which continues to develop, but it has at least two serious competitors.

Obsidian implements a similar concept but with a different interface. It's free, and it stores your data in your local filesystem. Like Roam, it is closed source, but it has an open plug-in API.

logseq has most of Roam's features and adds some of its own. It's free, it's open source and it stores the text and assets in your notes as local files. It's beta software, but it's easy to back up. It is not designed for collaboration; if that's a major requirement Roam might be a better alternative.

Concerns about cost, privacy and ownership led me to migrate to logseq. I've been using it for a couple of weeks and I am happy with the switch.

With Readwise it's easy to automate the import of your Kindle highlights into both Roam and Obsidian. That's not yet directly supported in logseq, but there is a workaround. Install Obsidian alongside logseq!

Readwise will create or update markdown notes for Obsidian, and logseq will see them and incorporate them into your logseq graph.

There's great advice on selecting a note-taking app in how to choose the right note-taking app on the Ness Labs website, and in overview of note-taking styles on Forte Labs.

You can make good use of PKM tools to support power learning.

Power Learning

These days you can enjoy a dramatic improvement in your ability to learn and recall information.

PKM tools can help tremendously, as can recent research in Psychology and Neuroscience. You can learn how to learn from inexpensive books and free MOOCs (Massive Open On-line Courses).

Here are some favourites that will help you learn much more effectively:

MOOCs and Books

Learning How to Learn

Over three million students have taken Learning How to Learn by Barbara Oakley and Terry Sejnowski. It's a great course, and very thorough.

The authors have written a book based on the course that's targeted at kids and teens.

Learn like a Pro

Shorter, and recently updated, Learn like a Pro covers similar ground at a faster pace. There's also a book version of that for adults.

Ultralearning

I like Scott Young's Ultralearning.

From the book's blurb:

Faced with tumultuous economic times and rapid technological change, staying ahead in your career depends on continual learning - a lifelong mastery of new ideas, subjects and skills. If you want to accomplish more and stand apart from everyone else, you need to become an ultralearner

Online, Scott tells the story of an experiment in which he mastered MIT’s 4-year undergraduate computer science curriculum in 12 months, without taking any classes.

Scott's experiment is an example of Learning in Public. It's a great way to add value to your learning efforts for yourself and others.

What else can you use to cope with tech overwhelm?

Collective Intelligence

The third strategy is to use the power of collective intelligence.

In The Wisdom of Crowds, James Surowiecki suggests that groups can often make better decisions than could have been made by any single member of the group. That's not always true, of course, as political history demonstrates, but there's another way in which groups can surpass their individual members.

They can combine their knowledge and collectively make connections that no individual could see.

Common-Interest Communities

Usenet brought together groups of people with shared interests from the very earliest days of the Internet. In the late 1990s and early 2000s many of us graduated to Yahoo Groups and Google Groups. Today social media offer multiple ways to discover people with relevant interests and knowledge, to ask them questions, and to share opinions and resources.

Often, though, you'll want to work together with others on creating shared resources.

Collaborative software tools

COVID forced many of us to work from home. One consequence has been an explosion of web-based and desktop software tools to help remote workers to collaborate.

Google Docs and Google Slides have been around for a while and they both offer excellent support for collaborative development.

Slack, Zoom, Miro and GoToMeeting have all become household names.

There's a fast-growing group of integrated collaboration tools that combine calendar management, contact management, document management, project management and task management. An online search for team collaboration tools will throw up lots of articles comparing current offerings; the market is changing so rapidly that you'll need to update your search results regularly if you want to keep up.

Knowledge Management Systems

Earlier you read about Personal Knowledge Management. Within a community or organisation, you may need to widen your scope to address a communal KMS (Knowledge Management System).


From Design Knowledge Management System


This is a huge topic in its own right. There is an International Standard (ISO 30401) that addresses the subject of KMS. You'll find a good introduction to the Standard and its implementation in Design Knowledge Management System by Santosh Shekar.

AI and collective intelligence

The very technologies that cause the knowledge explosion have given us tools to mitigate the explosion.

You can harness AI as a partner in collective intelligence communities.

The MIT Collective Intelligence Design Lab is a trailblazer in that area. It's working on a methodology called supermind design. You can read an overview in their free Supermind Design Primer.

Conclusion

It's tempting to experiment with every new tool and technique, but that will dilute your focus and worsen the very problem you're trying to solve.

These days I restrict myself to trying a single tool at a time, and I allow myself enough time to reach a level of competence that allows me to make an informed judgement about adopting it. That's typically somewhere between a week and a month, though I may decide to drop an unsatisfactory tool immediately.

The exponential explosion of technology poses a challenge for knowledge workers, but it's also provided us with an amazing range of tools and techniques to help us cope.

How about you?

Do you have a favourite tool or technique? If so, do share it in the comments.

Friday, 6 May 2022

APL and Python go head-to-head

Markdown is great, but...

I've encountered a problem.

I use Markdown a lot. Since it's pure text-based markup, it's handled well by Git and GitHub. That helps me keep track of versions, and it simplifies text merges if I'm collaborating on a writing project.

A lot of my writing features Python code, and I like to work on the code and the article in the same editor.

Fortunately there's great support for editing and displaying Markdown in both PyCharm and VS Code.

Markdown is supported by lots of FOSS tools which make it easy to convert between Markdown and other formats.

I regularly use pandoc to turn Markdown into PDFs and to publish on Leanpub, and I use the markdown package to convert Markdown files into HTML which I can paste into Blogger.

(Pandoc can create HTML but the output is not directly usable in Blogger.)

So I have a problem.

Much of markdown is standardised but the pandoc and markdown programs handle code blocks differently.

In pandoc, Markdown code blocks are enclosed in triple backticks, with an optional description of the code language.

Out of the box, the markdown program expects code blocks to be indented by four spaces with no surrounding backticks.

I often want to take a Markdown document and create both HTML for the blog and a pdf for people to download, but that requires two different formats for the source document.

I could make the changes manually but that is tedious and error-prone. I decided to write some code to convert between the two formats.

I'd normally write that sort of code in Python, but I also use APL. I wondered how the two approaches would compare.

I first met APL in (cough) 1967 or 1968, and the version I learned then lacks many of the modern features in Dyalog APL.

Luckily there are some very competent and helpful young developers in the APL Orchard community. If you post a problem there you'll often get an immediate solution, so I can easily improve on my dinosaur-like approach to APL problems.

Today I am going to try to find the best solution I can in APL and compare it with a Python version. I'm not worried about performance, since I know each approach is more than capable of converting my documents faster than the eye can see.

I'm more interested in the different approaches. APL is a functional array-oriented language; Python supports functional programming, but most of us use a mixture of procedural and object-oriented code.

I created a Python solution fairly quickly.

from typing import List


class Gulper:
    def __init__(self):
        self.is_reading_markdown = True
        self.result = None

    def gulp(self, line: str):
        if self.is_reading_markdown:
            self.read_markdown(line)
        else:
            self.read_code(line)

    def read_markdown(self, line):
        if line.startswith('```'):
            self.is_reading_markdown = False
            return
        self.result.append(line)

    def read_code(self, line):
        if line.startswith('```'):
            self.is_reading_markdown = True
            return
        self.result.append('    %s' % line)

    def convert(self, lines: List[str]):
        self.result = []
        for line in lines:
            self.gulp(line)
        return self.result

It's pretty straightforward; it's essentially a state machine which switches between reading text and reading code whenever it encounters a line with three back-ticks.
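
To make the behaviour concrete, here's the same state machine squeezed into a single function, with a small example. It's a sketch rather than a replacement for the class above, and the name fenced_to_indented is my own.

```python
def fenced_to_indented(lines):
    """Convert ```-fenced code blocks into four-space indented blocks."""
    in_code = False
    result = []
    for line in lines:
        if line.startswith("```"):
            in_code = not in_code   # a fence toggles the state and is dropped
            continue
        result.append("    " + line if in_code else line)
    return result


sample = [
    "Some text",
    "```python",
    "print('hi')",
    "```",
    "More text",
]
print(fenced_to_indented(sample))
# ['Some text', "    print('hi')", 'More text']
```

The fence lines disappear, the code line gains four leading spaces, and the surrounding text passes through untouched.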

Here's the APL:

conv←{t←⊣/'```'⍷3↑⍤1⊢⍵ ⋄ n←2|+\t ⋄ (~t)⌿(¯4×t<n)⌽⍵,⍤1⊢4⍴' '}

I've broken the function down into simpler parts and explained it line by line here.

Thursday, 28 April 2022

Let the computer test your Python GUI application


In this article you’ll see how easy it is to write automated tests for Graphical User Interfaces (GUIs) written using the brilliant guizero Python library.

I’ll start with a horror story which explains why I’m so keen on automated GUI tests.

Next I’ll describe an application that I’m using as an example. The code for the app and the test are available on GitHub; the link is in the resources section at the end of this post.

After that, I’ll show how the tests are built up and describe how they enabled me to find and fix a bug.

A personal horror story

A couple of years ago I presented a Quiz Runner application to an audience of Digital Makers.

The Quiz Runner application used a workstation to manage the Quiz.

Quiz Contestants used micro:bits to ‘buzz’ in when they thought they knew an answer.

The micro:bits communicated via radio using the micro:bit’s built-in radio capability, and everything (workstation and micro:bits) was programmed in Python.

The Quiz Runner application had a simple Graphical User Interface (GUI) to control the quiz and keep scores, and a micro:bit connected via its serial interface to interact with the contestants’ micro:bits.

The demo started really well. Then something went wrong with the GUI and I had to abandon the demo. I was annoyed and embarrassed.

Software craftsmanship

My grandfather was a carpenter, as were his father and grandfather. They were craftsmen in wood. I like to think of myself as a craftsman in software, but I felt I’d just made a door that would not open.

When I had time to explore the problem I found it very hard to reproduce. I needed to hop between the QuizRunner App and the four team micro:bits, clicking and pressing the right things at the right time for dozens of permutations of behaviour.

I gave up.

The first time that you manually test a GUI application, it feels like fun.

The tenth time? Not so much.

The downsides of manual testing

Manual testing has its place, but it can be boring and error-prone. Worse still, there’s no automatic record of what was tested, or what worked.

Because it’s boring, many developers avoid it as far as possible. That can mean that edge cases get tested in QA or production rather than in development. That’s expensive - the later a bug is detected, the greater the cost of fixing it.

So how can you create automated tests for GUI-based applications?

How can you use automated tests with GUIs?

There are GUI-testing libraries available, but the commercial products are expensive and most of the open-source tools I’ve found are cumbersome.

There is good news, though, if you use Python’s excellent guizero library.

guizero was written by Laura Sach and Martin O’Hanlon.

They are experienced educators, and they work for the Raspberry Pi Foundation.

guizero is easy to use, it has great documentation and there’s a super Book of Examples!

The book is called Create Graphical User Interfaces with Python, and it’s available from The MagPi website.

I’m a big fan of guizero. It ticks all the boxes in my Python library checklist, and I use it a lot. The library has lots of automated tests, but the book is aimed at beginners, so it recommends manual testing.

To keep the code simple, the book also makes use of global variables. I’m happy with that in a book for beginners, but experienced software developers try to avoid globals in their code.

I wondered how easy it would be for me to refactor to eliminate the globals, and to remove some code duplication.

Refactoring the original code

Refactoring is a technique that you can use to improve the design of existing code without changing its external behaviour.

You may have come across Martin Fowler’s book on the subject. It’s a classic, and I refer to it a lot.

I refactored one of my favourite examples from Create Graphical User Interfaces with Python. It’s a game of Noughts and Crosses (or Tic-tac-toe if you’re reading this in North America).

I ended up with code that had no globals and tests that exercised the system thoroughly.

How do the tests work?

Add the magic code that pushes buttons

The most important code is this short fragment:

from guizero import PushButton

def push(button: PushButton) -> None:
    button.tk.invoke()

It allows you to write code in your test that has the same effect as a user pressing a button in the GUI.

I found it buried in the unit tests for the guizero library.

Set up the test fixtures

You create unit tests by writing Test Cases.

You set up the environment for your tests by creating test fixtures.

Opening a GUI application takes time, so you want to do it once per Test Case.

You do that by writing a class method called setUpClass.

import unittest
from tictactoe import TicTacToeApp

class TicTacToeTestCase(unittest.TestCase):
    @classmethod
    def setUpClass(cls) -> None:
        cls.app = TicTacToeApp()

You write individual tests by creating Test Case methods whose names start with test.

unittest runs these in alphabetical order of method name, not in the order you wrote them, so you need to make sure that your tests don’t interfere with each other.

You do that by writing a setUp method which will reset the game before each test method is run.

def setUp(self) -> None:
    self.app.reset_board()

This calls the reset_board method in the application:

def reset_board(self):
    for x in range(3):
        for y in range(3):
            self.square(x, y).text = " "
            self.square(x, y).enable()
    self.winner = None
    self.current_player = 'X'
    self.message.value = 'It is your turn, X'

Write the tests

Next you write tests to check that the game is working correctly.

Each test simulates a player making a move by clicking on a free cell on the board.

The tests also check whose turn it is before making the move.

The tests use a couple of helper methods to make the tests more readable.

There’s an excellent discussion of test readability in Clean Code. (See the resources at the end of the article.)

Use helper methods to make tests more readable

Here are the helper methods:

def message_value(self):
    return self.app.message.value

def play(self, x, y, player):
    self.assertEqual(self.message_value(), 'It is your turn, %s' % player)
    self.push(x, y)

The message_value method is just a concise way of finding the text of the last message sent by the game.

The play method checks that the last message tells the current player it’s their turn to play, and then clicks on the button that is specified by the x and y coordinates.
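
Notice that play calls a push method that takes x and y, while the fragment shown earlier takes a button. One plausible way to bridge the two is sketched below. The FakeSquare, FakeBoard and BoardDriver classes are hypothetical stand-ins so that the example runs without guizero; only the idea of looking up the square and invoking it comes from the article.

```python
from types import SimpleNamespace


def push(button):
    """Same idea as the earlier fragment: trigger the button's command."""
    button.tk.invoke()


class FakeSquare:
    """Hypothetical stand-in for a guizero PushButton."""
    def __init__(self, board, x, y):
        # Mimic the one attribute the helper relies on: .tk.invoke()
        self.tk = SimpleNamespace(invoke=lambda: board.clicked.append((x, y)))


class FakeBoard:
    """Hypothetical stand-in for the app: a 3x3 grid of squares."""
    def __init__(self):
        self.clicked = []
        self.squares = {(x, y): FakeSquare(self, x, y)
                        for x in range(3) for y in range(3)}

    def square(self, x, y):
        return self.squares[(x, y)]


class BoardDriver:
    """Plays the role of the test case's push(x, y) method."""
    def __init__(self, app):
        self.app = app

    def push(self, x, y):
        push(self.app.square(x, y))   # look up the square, then press it


board = FakeBoard()
BoardDriver(board).push(1, 2)
print(board.clicked)   # [(1, 2)]
```

In the real test case the push(x, y) method would live alongside message_value and play.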

Write the first test

The first test just checks that the player changes after a move.

def test_turn_changes_after_player_moves(self):
    self.play(0, 0, 'X')
    self.assertEqual(self.message_value(), 'It is your turn, O')

That test passes. That’s good news. It tells you that the refactoring hasn’t broken that behaviour.

Test a game that X wins

Next write a test to check that the game knows when X has won.

def test_knows_if_x_has_won(self):
    self.play(0, 0, 'X')
    self.play(0, 1, 'O')
    self.play(1, 0, 'X')
    self.play(0, 2, 'O')
    self.play(2, 0, 'X')
    self.assertEqual(self.message_value(), 'X wins!')

That passes. You’re on a roll!

Test a win for O

Here’s a game that O wins.

def test_knows_if_o_has_won(self):
    self.play(0, 0, 'X')
    self.play(0, 1, 'O')
    self.play(1, 0, 'X')
    self.play(1, 1, 'O')
    self.play(1, 2, 'X')
    self.play(2, 1, 'O')
    self.assertEqual(self.message_value(), 'O wins!')

Check for a drawn game

If the last square is filled without either player winning, the game is drawn.

Here’s a test for that:

def test_recognises_draw(self):
    self.play(0, 0, 'X')
    self.play(1, 1, 'O')
    self.play(2, 2, 'X')
    self.play(0, 1, 'O')
    self.play(2, 1, 'X')
    self.play(2, 0, 'O')
    self.play(0, 2, 'X')
    self.play(1, 2, 'O')
    self.play(1, 0, 'X')
    self.assertEqual(self.message_value(), "It's a draw")

So far so good. But…

Finding and fixing a bug

When I was writing one of the tests I saw some strange behaviour. When I played the original version of the game I confirmed that it has a bug.

You can carry on making moves after the game has been won!

When you find a bug, you need to do four things.

  1. Write a test that demonstrates the bug by failing.
  2. Fix the bug
  3. Verify that the test now passes
  4. Check in your code!

Verify the bug

Here’s the test that demonstrates the bug. When you run it on an unfixed application it fails.

def test_game_stops_when_someone_wins(self):
    self.play(0, 0, 'X')
    self.play(0, 1, 'O')
    self.play(1, 0, 'X')
    self.play(1, 1, 'O')
    self.play(1, 2, 'X')
    self.play(2, 1, 'O')
    # O wins!
    self.push(0, 2) # should be ignored
    self.push(2, 0) # should be ignored
    self.push(2, 2) # should be ignored
    self.assertEqual(self.message_value(), 'O wins!')

Fix the bug

Here’s the application code that fixes the bug:

def disable_all_squares(self):
    for i in range(3):
        for j in range(3):
            self.square(i, j).disable()

The application needs to invoke that method when a game has been won.
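
Here's a sketch of where that call might go. It uses a tiny model with stand-in squares instead of real guizero buttons, and the names Square, Game, LINES and choose_square are my own assumptions rather than the application's actual code. As I understand it, a disabled Tk button ignores invoke(), which is what the early return imitates.

```python
# The eight winning lines: three rows, three columns, two diagonals.
LINES = (
    [[(x, y) for x in range(3)] for y in range(3)]
    + [[(x, y) for y in range(3)] for x in range(3)]
    + [[(i, i) for i in range(3)], [(i, 2 - i) for i in range(3)]]
)


class Square:
    """Hypothetical stand-in for a guizero PushButton."""
    def __init__(self):
        self.text = " "
        self.enabled = True

    def disable(self):
        self.enabled = False


class Game:
    def __init__(self):
        self.squares = {(x, y): Square() for x in range(3) for y in range(3)}
        self.current_player = "X"
        self.winner = None

    def square(self, x, y):
        return self.squares[(x, y)]

    def disable_all_squares(self):
        for i in range(3):
            for j in range(3):
                self.square(i, j).disable()

    def has_won(self, player):
        return any(all(self.square(x, y).text == player for x, y in line)
                   for line in LINES)

    def choose_square(self, x, y):
        square = self.square(x, y)
        if not square.enabled:
            return                      # a disabled button ignores clicks
        square.text = self.current_player
        square.disable()
        if self.has_won(self.current_player):
            self.winner = self.current_player
            self.disable_all_squares()  # the fix: no more moves after a win
        else:
            self.current_player = "O" if self.current_player == "X" else "X"


game = Game()
for x, y in [(0, 0), (0, 1), (1, 0), (1, 1), (1, 2), (2, 1)]:
    game.choose_square(x, y)            # O completes a line on the last move
game.choose_square(0, 2)                # ignored: every square is now disabled
print(game.winner, repr(game.square(0, 2).text))   # O ' '
```

The moves are the same ones used in the failing test, so the final click leaves the board unchanged.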

Verify the bug is fixed

If you now run the tests they all pass, so it’s safe to check in your changes.

Success! You now have a working, tested application.

Resources

The code for this article is on GitHub.

guizero is available on GitHub.

You can install it via pip.

pip3 install guizero

Documentation is available here.

The book ‘Create Graphical User Interfaces with Python’ is available from the MagPi website.

I mentioned two other books:

Refactoring by Martin Fowler, and Clean Code by Robert C. Martin.

Questions? Ask in a comment, or tweet me at @RAREblog.

Image credits:

Micro:bit images courtesy of https://microbit.org

Radio beacon: https://en.wikipedia.org/wiki/File:Wireless_tower.svg

Monday, 25 April 2022

Choosing a Python library

You’re working on a Python project, and you realise the next thing to do is a bit tricky. You don’t want to reinvent the wheel if you don’t have to. You wonder: has someone solved this problem before?

The first place to look is the Python Standard Library. One of Python’s great strengths is that it comes with batteries included; there are well-documented, tried and tested libraries to do all sorts of useful things.

No luck? Turn to GitHub for help - it usually can! Most of the libraries I use are hosted on GitHub.

Sometimes you’ll just find one candidate library; sometimes there will be more than one. You’ll need to decide if any fit the bill, and which looks best.

As we’ll see, GitHub can tell you a lot about the quality of the project.

Here are the things I like to ask about a library I’m considering. I’ve illustrated the checklist using the guizero project as an example, since I use it a lot and it ticks all the boxes.

Does it have good documentation?

Is the intended use clear, and does it match your requirements?

I’ve sometimes been caught by a library that looks as if it does the job but actually does something different.

And if it says it’s doing what you want, do the docs show you how to install and use the library?

Is the project active and well supported?

GitHub is your friend here. It’s easy to check how “alive” the project is, right from the project’s home page. You can use it to find out

  • When was the last update?
  • How many issues are there?
    • How many unresolved issues are there?
    • How long have they been around?
    • Are pull requests dealt with quickly?
  • How popular is the project?
    • How many people are watching it?
    • Is it often starred?
    • Has it often been forked?
      • Forking is a process for creating a personal copy which you can use for modification, or to suggest fixes or improvements via pull requests.
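
If you're comparing several candidates, you can pull some of those numbers from the GitHub REST API rather than clicking through each project page. Here's a minimal sketch. The function names are my own; the field names are the ones I'd expect in the API's repository response, and unauthenticated requests are rate-limited.

```python
import json
from urllib.request import urlopen


def fetch_repo(owner, name):
    """Fetch public repository metadata from the GitHub REST API."""
    with urlopen(f"https://api.github.com/repos/{owner}/{name}") as response:
        return json.load(response)


def health_summary(repo):
    """Pick out the fields that map onto the checklist above."""
    return {
        "last push": repo["pushed_at"],
        # GitHub counts open pull requests in this figure too.
        "open issues and PRs": repo["open_issues_count"],
        "watchers": repo["subscribers_count"],
        "stars": repo["stargazers_count"],
        "forks": repo["forks_count"],
    }


# Typical use (needs network access; substitute a real owner and repository):
#     print(health_summary(fetch_repo("<owner>", "<repo>")))
```

The numbers won't tell you how long issues have been open or how quickly pull requests are handled, so you'll still want to browse the issue list yourself.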

What is the quality of the code?

Does it support Python 3? Most libraries do these days, but if it doesn’t that is a show stopper.

Is the code readable? I look for good naming of functions, classes, methods, and variables, and I like the use of type hints where these help.

Is the code sufficiently commented?

Is it well structured: simple and short?

Is it sufficiently performant? This may not matter, but if it does, it might matter a lot.

Does it have appropriate licensing?

In my case I look for a licence that's compatible with the MIT licence but your choice will depend on your intended use and your context. If you’re a solo developer you will be able to decide for yourself but in some situations you may need to check Corporate Policy, and may even need to talk to your employer’s legal department.

Does it have a large, supportive community?

If the library has a Slack or Discord channel you may be able to dip in and quickly get the feel of the community. Is it friendly, respectful and helpful?

Is the library easy to learn?

Are there links to Tutorials, Books or Courses? Ideally you should be able to see how to get going straight from the README. (There is a README, isn’t there?)

Is it written by authors I know and trust?

That’s not essential, but it’s very reassuring.

Does it have a sensible, consistent API?

A good API satisfies the principle of least surprise: if you guess how to use it you’ll probably be right.

Does the author specifically encourage pull requests to add features or fix bugs?

Check the language of the documentation (especially the README) to see if the author wants to hear if you find a problem. It often indicates their mindset when writing it - do they intend for it to be used in a collaborative fashion or is it a “one man show”?

Are there automated tests?

I can write very simple test-free code, but as soon as things get at all complicated I know I need tests to keep me on track. If I am using other people’s code, plenty of automated tests reassure me that the code is likely to work. They also indicate that I can refactor the code or extend it safely if I need to.

Summary

A few minutes’ search on GitHub can tell you a lot about whether you should use a third-party Python library.

I hope you find this checklist helpful, and I welcome constructive feedback.

Thanks to Ben Nuttall (@ben_nuttall), Michael Horne (@recantha) and @BrianLinuxing for their helpful suggestions.