A concise solution to a fiddly coding problem

My pomodoro timer is coming on nicely. I'll post about progress in the next day or so.

For now, here is a short story about a problem I hit while working on the project, and the happy solution that I came up with.

For some time I've kept an online journal for each project I'm working on. Like some blogs, the journals used to have the latest entries at the top.

Here's a sample journal file:

# Project journal for zero-web
 

## Thursday 07 February 2019
 
I added 2 new pages.

## Monday 04 February 2019

I've created a homepage.
I'll serve it with websocketd.

I found that order confusing.

I decided that I'd prefer the posts ordered as they would be in a paper diary, with the latest posts last. That way I could read the project history like a book.

The problem: I have a lot of project journals, and I really didn't want to edit them all by hand.

I decided to write a program to do it.

The problem was simple to solve using APL. Regular readers will know that APL is one of my favourite tools for data manipulation and analysis. APL is an array-oriented functional language with a huge range of primitive functions and operators, and APL programs tend to be very concise.

APL uses lots of special characters, along with many of the arithmetic symbols you learned at school, so I'll explain what the program does step by step.

If this piques your curiosity, there's a link at the end of this post to help you find out more about APL.

I already had a tiny APL function called read which reads a file as a vector of character vectors. The result is a little hard to parse visually so I've used a function called show to display the vector in column format.

Here's the first step:

     show file ← read 'journal.md'

┌──────────────────────────────┐
│# Project journal for zero-web│
├──────────────────────────────┤
│                              │
├──────────────────────────────┤
│## Thursday 07 February 2019  │
├──────────────────────────────┤
│                              │
├──────────────────────────────┤
│I added 2 new pages.          │
├──────────────────────────────┤
│                              │
├──────────────────────────────┤
│## Monday 04 February 2019    │
├──────────────────────────────┤
│                              │
├──────────────────────────────┤
│I've created a homepage.      │
├──────────────────────────────┤
│I'll serve it with websocketd.│
└──────────────────────────────┘ 

Next I found out which lines correspond to the beginning of a post; they start with ## followed by a space.

     mask ← '## '∘≡¨3↑¨file
     mask
0 0 1 0 0 0 1 0 0 0

The 3↑¨ reads 'three take each'; it takes the first three characters of each line of the file. The '## '∘≡¨ checks whether the start of each line matches the string '## '. The result is mask, a vector of boolean values that told me which lines in the file were the starts of dated journal entries.
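For readers who don't read APL, the same step can be sketched in Python. This is an illustrative analogue rather than the code used for the job; the list `lines` is a stand-in for the APL variable file, and the name `mask` simply mirrors the APL.

```python
# Illustrative Python analogue of the APL mask step (not the original code).
# `lines` stands in for the APL variable `file`.
lines = [
    "# Project journal for zero-web",
    "",
    "## Thursday 07 February 2019",
    "",
    "I added 2 new pages.",
    "",
    "## Monday 04 February 2019",
    "",
    "I've created a homepage.",
    "I'll serve it with websocketd.",
]

# line[:3] plays the role of 3↑ (take the first three characters);
# comparing with "## " plays the role of '## '∘≡ (match).
mask = [int(line[:3] == "## ") for line in lines]

print(*mask)  # 0 0 1 0 0 0 1 0 0 0
```

One difference worth knowing: APL's take pads a short line with spaces up to three characters, while a Python slice just returns whatever is there. The comparison gives the same answer either way.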

That's almost what I wanted for the next step. I was about to snip the file into segments corresponding to posts, but I wanted to keep the first bit of the file as well. In other words, I wanted the first element of the mask to be a 1, not a 0. I achieved that by dropping the first element and then sticking a 1 on the front.

In APL that was easy:

      mask ← 1, 1↓mask
      mask
1 0 1 0 0 0 1 0 0 0
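In Python terms, and purely as a sketch of the same idea, drop-then-prepend looks like this. The starting `mask` is typed in by hand here so that the snippet runs on its own.

```python
# Illustrative Python analogue of: mask ← 1, 1↓mask
mask = [0, 0, 1, 0, 0, 0, 1, 0, 0, 0]

# mask[1:] mirrors 1↓mask (drop the first element);
# [1] + ... mirrors 1, (stick a one on the front).
mask = [1] + mask[1:]

print(*mask)  # 1 0 1 0 0 0 1 0 0 0
```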


Now I needed to chop the contents of the file into posts so I could modify their order. I used an APL function called partitioned enclose to do just that.

Here's the code and its result:

      show posts ← mask ⊂ file
┌─────────────────────────────────────────────────────────────────────────────────────┐
│┌──────────────────────────────┬┐                                                    │
││# Project journal for zero-web││                                                    │
│└──────────────────────────────┴┘                                                    │
├─────────────────────────────────────────────────────────────────────────────────────┤
│┌────────────────────────────┬┬────────────────────┬┐                                │
││## Thursday 07 February 2019││I added 2 new pages.││                                │
│└────────────────────────────┴┴────────────────────┴┘                                │
├─────────────────────────────────────────────────────────────────────────────────────┤
│┌──────────────────────────┬┬────────────────────────┬──────────────────────────────┐│
││## Monday 04 February 2019││I've created a homepage.│I'll serve it with websocketd.││
│└──────────────────────────┴┴────────────────────────┴──────────────────────────────┘│
└─────────────────────────────────────────────────────────────────────────────────────┘
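Python has no built-in equivalent of partitioned enclose, so here is a minimal sketch of how a boolean mask can cut a list into groups. The function name `partition` is made up for illustration, and it only covers the boolean case used in this post.

```python
def partition(mask, items):
    """Start a new group at every 1 in mask; later items join the current group."""
    groups = []
    for flag, item in zip(mask, items):
        if flag:
            groups.append([item])      # a 1 opens a new group
        elif groups:
            groups[-1].append(item)    # a 0 extends the group already open
        # a 0 before the first 1 is dropped, as partitioned enclose does
    return groups


lines = [
    "# Project journal for zero-web",
    "",
    "## Thursday 07 February 2019",
    "",
    "I added 2 new pages.",
    "",
    "## Monday 04 February 2019",
    "",
    "I've created a homepage.",
    "I'll serve it with websocketd.",
]
mask = [1, 0, 1, 0, 0, 0, 1, 0, 0, 0]

posts = partition(mask, lines)
for post in posts:
    print(post)
```

Forcing the first element of the mask to 1 is what keeps the journal's title in the result; without it, the lines before the first dated entry would be thrown away.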

Now I wanted to take the first snippet, and stick it in front of the remaining snippets in reverse order:

      show reordered ← (1↑posts),⌽1↓posts
┌─────────────────────────────────────────────────────────────────────────────────────┐
│┌──────────────────────────────┬┐                                                    │
││# Project journal for zero-web││                                                    │
│└──────────────────────────────┴┘                                                    │
├─────────────────────────────────────────────────────────────────────────────────────┤
│┌──────────────────────────┬┬────────────────────────┬──────────────────────────────┐│
││## Monday 04 February 2019││I've created a homepage.│I'll serve it with websocketd.││
│└──────────────────────────┴┴────────────────────────┴──────────────────────────────┘│
├─────────────────────────────────────────────────────────────────────────────────────┤
│┌────────────────────────────┬┬────────────────────┬┐                                │
││## Thursday 07 February 2019││I added 2 new pages.││                                │
│└────────────────────────────┴┴────────────────────┴┘                                │
└─────────────────────────────────────────────────────────────────────────────────────┘
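The same reordering can be sketched with Python slices. Again this is only an analogue; `posts` is typed in by hand to match the partitioned result above.

```python
# Illustrative Python analogue of: reordered ← (1↑posts),⌽1↓posts
posts = [
    ["# Project journal for zero-web", ""],
    ["## Thursday 07 February 2019", "", "I added 2 new pages.", ""],
    ["## Monday 04 February 2019", "", "I've created a homepage.",
     "I'll serve it with websocketd."],
]

# posts[:1] mirrors 1↑posts (keep the header group),
# posts[1:] mirrors 1↓posts, and [::-1] mirrors ⌽ (reverse).
reordered = posts[:1] + posts[1:][::-1]

print([post[0] for post in reordered])
```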

Finally, I wanted to stick them all back together again. I used ',' which is APL's catenate function. It appends the value on its right to the value on its left.

/ is APL's reduce operator, so ,/ is a catenate reduction.

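Applied to a vector, it sticks all its elements together using catenate. The reduction returns its result wrapped up as a single enclosed item, so I used ⊃ to unwrap it and get back a plain vector of lines.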

Here we go:

      show ⊃,/reordered
┌──────────────────────────────┐
│# Project journal for zero-web│
├──────────────────────────────┤
│                              │
├──────────────────────────────┤
│## Monday 04 February 2019    │
├──────────────────────────────┤
│                              │
├──────────────────────────────┤
│I've created a homepage.      │
├──────────────────────────────┤
│I'll serve it with websocketd.│
├──────────────────────────────┤
│## Thursday 07 February 2019  │
├──────────────────────────────┤
│                              │
├──────────────────────────────┤
│I added 2 new pages.          │
├──────────────────────────────┤
│                              │
└──────────────────────────────┘
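A catenate reduction has a close Python cousin in `functools.reduce` with list addition. The sketch below is an analogue, not the original code, and `reordered` is typed in by hand so that it runs on its own.

```python
from functools import reduce
from operator import add

# Illustrative Python analogue of: ⊃,/reordered
reordered = [
    ["# Project journal for zero-web", ""],
    ["## Monday 04 February 2019", "", "I've created a homepage.",
     "I'll serve it with websocketd."],
    ["## Thursday 07 February 2019", "", "I added 2 new pages.", ""],
]

# reduce(add, ...) joins the groups pairwise with list concatenation.
# Python folds from the left and APL from the right, but concatenation
# gives the same answer in either direction.
flat = reduce(add, reordered)

for line in flat:
    print(line)
```

Python needs no counterpart to ⊃ here, because `reduce` already returns a flat list of lines.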

Of course it would have been tedious to type all that in for each file I wanted to process, but I could combine all the steps into a single APL function.

      munge ← {⊃,/(1↑p),⌽1↓p←⍵⊂⍨1,1↓'## '∘≡¨3↑¨⍵}
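For comparison, here is one way the whole of munge might look in Python. It is a sketch under the same assumptions as the APL (entries start with '## ' and everything before the first entry is the header); it is not a translation that was used on the real journals.

```python
def munge(lines):
    """Return lines with the dated entries reversed and the header kept first."""
    # Indices where a group starts; index 0 is always included, like 1,1↓mask.
    starts = [0] + [i for i, line in enumerate(lines)
                    if i > 0 and line.startswith("## ")]
    # Cut the file into groups, each running up to the next start index.
    ends = starts[1:] + [len(lines)]
    posts = [lines[a:b] for a, b in zip(starts, ends)]
    # Header group first, then the remaining groups from last to first.
    reordered = posts[:1] + posts[:0:-1]
    # Flatten the groups back into a single list of lines.
    return [line for post in reordered for line in post]


journal = [
    "# Project journal for zero-web",
    "",
    "## Thursday 07 February 2019",
    "",
    "I added 2 new pages.",
    "",
    "## Monday 04 February 2019",
    "",
    "I've created a homepage.",
    "I'll serve it with websocketd.",
]

for line in munge(journal):
    print(line)
```

The Python version is several lines long where the APL fits on one, which is a fair illustration of how concise APL can be.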

Now I had a vector of vectors, ready to write out. I used a function I wrote earlier, called update, which creates a backup file and overwrites the original.

I combined the read, munge and update functions together into a function called fix.

     fix←{⍵ update munge read ⍵}

I had one last thing to code. I wanted to find all the project files called 'journal.md' located under the directory where I keep active projects. I could code that in APL, but Unix's find command does that very well. Fortunately, APL lets me invoke a Unix command and capture the result as an array. The function find did just that.

      find←{⎕SH'find ',⍵,' -name ''journal.md'''}
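If you wanted the same search without shelling out, Python's standard `pathlib` can walk the tree. This is a hedged sketch: `find_journals` is a made-up name, and unlike `find -name` it skips directories that happen to be called journal.md. The demo builds a throwaway directory so that it runs anywhere.

```python
import tempfile
from pathlib import Path


def find_journals(root):
    """Return every file named journal.md below root, as sorted path strings."""
    base = Path(root).expanduser()  # expand a leading ~ as the shell would
    return sorted(str(p) for p in base.rglob("journal.md") if p.is_file())


# Demo on a temporary directory tree rather than a real projects folder.
with tempfile.TemporaryDirectory() as tmp:
    project = Path(tmp) / "zero-web"
    project.mkdir()
    (project / "journal.md").write_text("# Project journal for zero-web\n")
    (project / "notes.md").write_text("not a journal\n")

    found = find_journals(tmp)
    relative = [Path(p).relative_to(tmp).as_posix() for p in found]

print(relative)
```

On a real machine the call would be something like `find_journals("~/git/active/")`.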

So now I just needed to type

      fix ¨ find '~/git/active/'

and the job was done!

To recap, here's all the code I wrote and ran to solve my problem:

      munge ← {⊃,/(1↑p),⌽1↓p←⍵⊂⍨1,1↓'## '∘≡¨3↑¨⍵}
      fix←{⍵ update munge read ⍵}
      find←{⎕SH'find ',⍵,' -name ''journal.md'''}
      fix ¨ find '~/git/active/'

Now I'm going back to working on the Pomodoro timer :)

If you're curious you can learn more about this incredibly powerful language for data manipulation on Dyalog's website.

Comments

  1. I often comment that it's a question which function key is most valuable in CoSy : F6 which executes the line under the cursor , or F11 which puts down a timestamp .

    My regular log is kept in the variable ` text . This diary is just text with days simply separated by " daylns " which can be inserted by F12 or in bulk by a simple expression .

    A very commonly used line I keep in ` state . is , eg :
    ` text Dv@ daylncut >t0> s" sols" con
    which retrieves ` text , splits it on daylns , and here found just one day containing "sols" :

    | ======================== | Fri.Dec,20181221 | ======================== |
    ... | 1050 | NL 20181221.1053 | 1403 |
    1523 | Solstice | day 7:14 am - 4:40 pm Total:09:27
    | 2033 |

    If a bunch of days are returned , or I simply want my entire text in reverse order , that would be :

    ` text Dv@ daylncut reverse

    Replies
    1. Neat, and very practical. CoSy's blend of Forth and APL is very interesting.
