APL and Python go head-to-head

Markdown is great, but...

I've encountered a problem.

I use Markdown a lot. Since it's pure text-based markup, it's handled well by Git and GitHub. That helps me keep track of versions, and it simplifies text merges if I'm collaborating on a writing project.

A lot of my writing features Python code, and I like to work on the code and the article in the same editor.

Fortunately there's great support for editing and displaying Markdown in both PyCharm and VS Code.

Markdown is supported by lots of FOSS tools which make it easy to convert between Markdown and other formats.

I regularly use pandoc to turn Markdown in to pdfs and to publish on Leanpub, and I use the markdown package to convert Markdown files into HTML which I can paste into Blogger.

(Pandoc can create HTML but the output is not directly usable in Blogger.)

So I have a problem.

Much of markdown is standardised but the pandoc and markdown programs handle code blocks differently.

In pandoc, Markdown code blocks are enclosed in triple backticks, with an optional description of the code language.

The markdown program expects code blocks to be indented by four spaces with no surrounding backticks.

I often want to take a Markdown document and create both HTML for the blog and a pdf for people to download, but that requires two different formats for the source document.

I could make the changes manually but that is tedious and error-prone. I decided to write some code to convert between the two formats.

I'd normally write that sort of code in Python, but I also use APL. I wondered how the two approaches would compare.

I first met APL in (cough) 1967 or 1968, and the version I learned then lacks many of the modern features in Dyalog APL.

Luckily there are some very competent and helpful young developers in the APL Orchard community. If you post a problem there you'll often get an immediate solution, so I can easily improve on my dinosaur-like approach to APL problems.

Today I am going to try to find the best solution I can in APL and compare it with a Python version. I'm not worried about performance, since I know each approach is more than capable of converting my documents faster than the eye can see.

I'm more interested in the different approaches. APL is a functional array-oriented language; Python supports functional programming, but most of us use a mixture of procedural and Object-oriented code.

I created a Python solution fairly quickly.

from typing import List


class Gulper:
    def __init__(self):
        self.is_reading_markdown = True
        self.result = None

    def gulp(self,line: str):
        if self.is_reading_markdown:
            self.read_markdown(line)
        else: self.read_code(line)

    def read_markdown(self, line):
        if line.startswith('```'):
            self.is_reading_markdown = False
            return
        self.result.append(line)

    def read_code(self, line):
        if line.startswith('```'):
            self.is_reading_markdown = True
            return
        self.result.append('    %s' % line)

    def convert(self, lines: List[str]):
        self.result = []
        for line in lines:
            self.gulp(line)
        return self.result

It's pretty straightforward; it's essentially a state machine which switches between reading text and reading code whenever it encounters a line with three back-ticks.

Here's the APL:

conv←{t←⊣/'```'⍷3↑⍤1⊢⍵ ⋄ n←2|+\t ⋄ (~t)⌿(¯4×t<n)⌽⍵,⍤1⊢4⍴' '}

I've broken the function down into simpler parts and explained each line by line here.

Comments

Popular posts from this blog

Controlling a Raspberry Pi Pico remotely using PySerial

Five steps to connect Jetson Nano and Arduino

Raspberry Pi Pico project 2 - MCP3008