Static site migration – we have working comments with Isso!

One “biggie” that was holding up this blog’s migration to a static site was getting a comments system up and running, followed by importing the existing comments. I had picked Isso a while back as it allows for easy import of existing comments from WordPress. I really didn’t want to depend on a third party comment hosting service like Disqus. I also didn’t want to use Staticman, mainly because it has dependencies on other services like Github or Gitlab. So Isso it was, as it allows me to host everything on my own server.

Setting up Isso on FreeBSD

The Isso documentation recommends setting up Isso via virtualenv. As the server is only used as a web server (and I’m your typical “lazy” developer), I opted to ignore that part. This worked fine; the only hiccup I ran into was that the version of Isso I tried to install via pip had an API disagreement with the version of Werkzeug it pulled in by default. That took a little time to figure out, and I resolved it by pinning Werkzeug to a specific version (0.16.1): pip install werkzeug==0.16.1.
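For the record, the install sequence I ended up with boils down to two commands – a sketch assuming a system-wide pip install, which, again, is not what the Isso docs recommend:

# Install Isso, then pin Werkzeug to a version whose API still matches
# (assumption: installing outside a virtualenv, straight into the system Python)
pip install isso
pip install werkzeug==0.16.1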

Once that was out of the way, it was a simple matter of installing and configuring Isso following the documentation. I opted for the sub-URI configuration as that makes it easier to migrate to the server URL later. I’m not going to detail the configuration here – the Isso docs are good and it’s easy to configure.
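For illustration only, though, a minimal isso.cfg for a sub-URI setup might look something like the sketch below – the paths and URLs are placeholders, not my actual configuration:

[general]
# SQLite database holding the comments (placeholder path)
dbpath = /var/lib/isso/comments.db
# the site the comments belong to (placeholder URL)
host = https://example.com/

[server]
# listen locally; the web server proxies the sub-URI (e.g. /isso/) here
listen = http://localhost:8080/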

Getting Hugo to play nicely with Isso

Ah yes, the fun part. I think I’ve mentioned before that I’m not really a web front end developer, so this took a little more figuring out. Fortunately the theme already has support for Disqus via the default Hugo integration, so between that, the Isso docs and the theme’s Staticman integration, I was able to figure this part out surprisingly quickly. In the end, it boiled down to adding another section to the site’s config.toml and adding the following code block to the theme’s layouts/_default/comments.html:


{{/* Add support for ISSO comment system */}}
{{ else if .Site.Params.isso.enabled }}
<article class="post">
  <script data-isso="{{ .Site.Params.isso.data }}"
          data-isso-require-author="{{ .Site.Params.isso.requireAuthor }}"
          data-isso-require-email="{{ .Site.Params.isso.requireEmail }}"
          data-isso-reply-notifications="{{ .Site.Params.isso.replyNotification }}"
          src="{{ .Site.Params.isso.jsLocation }}"></script>
  <noscript>Please enable JavaScript to view the comments powered by <a href="https://posativ.org/isso/">Isso</a>.</noscript>
  <div>
    <section id="isso-thread"></section>
  </div>
</article>
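The config.toml side of the change is a matter of defining the parameters the template reads from .Site.Params.isso. The keys follow directly from the template above; the values below are placeholders rather than my actual setup:

[params.isso]
  enabled           = true
  data              = "https://example.com/isso/"                  # Isso endpoint (placeholder)
  jsLocation        = "https://example.com/isso/js/embed.min.js"   # embed script (placeholder)
  requireAuthor     = true
  requireEmail      = true
  replyNotification = false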

The complete template is here in my Github repository. I’m planning to clean up the change and submit it back to the original theme’s author. With the above change and some more debugging done, I had working comments. I had already imported an older version of the comments so I could quickly check that everything was working as expected.

As usual, all of this needs a bit more tweaking but it’s functional.

What’s left to do?

To switch over from the current WordPress site to this one, not much. A server with a little bit more oomph, probably :). I also need to make sure that I can back up the comment database. That’s pretty easy and just a matter of rsync-ing the database file.
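In practice that backup is a one-liner along these lines – the paths are placeholders and assume whatever dbpath is set to in isso.cfg:

# Copy the SQLite comment database off the web server (placeholder paths)
rsync -a webserver:/var/lib/isso/comments.db /backups/isso/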

In the longer run, I still want to address some of the performance issues that Pagespeed Insights highlighted:

  • The theme uses FontAwesome, which is, err, awesome. However, that means the initial page load requires ~250 kB of glyphs, of which the site only uses a handful. It looks like it might be possible to reduce that font load by switching to SVGs.
  • The theme also uses jQuery, mainly because the theme supports fancybox and fancybox requires jQuery. I don’t use fancybox, so it would be nice to get rid of the jQuery dependency entirely.
  • Some general tweaks, mostly visual, but in general I’m pretty happy with the way the site looks now.
  • Some sort of analytics. Google Analytics would be the easy button, but I’m also looking into some of the self-hosted options (spot a theme here?). Right now I use goaccess for basic analytics.

All in all, I think I’m pretty close to switching over to the static site. I’ll probably give it a shot next weekend and see how things work out.

Note: Comments on the static site are still transient. I expect to do one last reload of the WordPress comments before I switch over, so comments made only on the static site will likely disappear. Sorry, the static site is still beta.

Setting up enchant for use with flyspell-mode on macOS

I have a few more loose ends to tidy up before switching to the static version of the blog. One of the important tasks was to make sure I had a spell checker available. Back in the dim and distant past I had set up flyspell-mode with hunspell, but I wanted to check if there was something better available these days. Enter enchant, which acts as a front end to multiple different spell checkers. I like that Emacs has included support for enchant since version 26, and that one of the backends enchant supports is AppleSpell. In other words, when running on macOS, flyspell can make use of the OS’s built-in spell checker and dictionaries.

Instructions on how to actually set up enchant on macOS are a bit thin on the ground, so I decided to put together a quick write-up.
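As a small preview of that write-up: getting enchant itself onto the machine is the easy part, assuming you use a package manager like Homebrew. enchant-lsmod-2 then lists the backends that were built in, which should hopefully include AppleSpell:

# Install enchant (assumes Homebrew; MacPorts packages it as well)
brew install enchant
# List the available spell checker backends - AppleSpell should be among them
enchant-lsmod-2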

Read More

Static site migration – starting the optimisation, already

Now that I’ve got the static site up and running, it’s obviously time to switch over immediately, right? Not so fast. After QA’ing my deployment process in production, it was time to check how the two compared from a performance perspective. I like to use several different tests, starting with Pingdom, then using PageSpeed Insights for more details.

The Pingdom speed test gave it a thumbs up, but they’re not running the currently dominant search engine. Fortunately said search engine also offers performance check tooling. This wasn’t quite the thumbs up I had hoped for, though. While the mobile performance is similar – in other words, equally unimpressive – the desktop performance is pretty good for both sites. The WordPress site still has a slight advantage, but after some initial tweaks like disabling highlight.js (the static site uses the basic Hugo highlighter), the static site is pretty close.

[Image: PageSpeed Insights mobile comparison of the static (Hugo) site vs the old WordPress site – both score in the low 50s.]
[Image: Side-by-side PageSpeed Insights desktop comparison of the new static site vs the existing WordPress site.]

Clearly, in both cases the static site needs work. Admittedly I have neglected the mobile performance of both sites somewhat. It’s not like anybody ever looks at sites like this on their phones or tablets, right? Oh, wait. Might want to do something about that.

I got to the numbers above after already making a few tweaks to the theme – adding a canonical link and adding support for Google and Bing site verification. Neither should affect performance. The only performance tweak so far was disabling the use of highlight.js – I use Hugo’s built-in highlighter and had overlooked that the theme’s default config.toml from its exampleSite also enables highlight.js.

The PageSpeed analysis indicates to me that the two big issues are the font and JavaScript downloads, with the fonts being the larger and harder to fix issue. The servers – well, VMs – I am running these on are also not the same size and type. Unsurprisingly the WordPress VM has more cores and more RAM compared to the static site.

While I’m no front-end web developer by any stretch of the imagination, trying to improve this theme for my needs might prove a useful learning experience. I apologise in advance for more meta blogging posts.

Static site should be fixed now

Ah yes, the guy who used to wear the “I don’t often test my code, but if I do, I do it in production” T-shirt in an ironic way followed his own advice, unironically.

The deployment script was ultra efficient and mainly removed the static site when updating it. Think about all the bandwidth this conserved!

Anyway, the static (beta) Hugo site should now be available and accessible. Feedback is still appreciated :).

Moving this blog to a static site – this time I’m serious (because org-mode)

I have been toying with the idea of migrating this blog to a static site to simplify its maintenance for some time. While WordPress is a great tool, this blog is a side project and any time I have to spend maintaining WordPress gets deducted from the time I have to write for the blog. Keep in mind that I’m self-hosting this blog and it’s actually running on a Linux VM that only handles the blog. That is yet another server that I need to administer, and it’s the odd one out, too, as all of the others are FreeBSD or OpenBSD servers.

Oh, and one of the big advantages of using a static site generator is that the whole site can be put in version control, archived there and also quickly re-generated from there should I manage to spectacularly mess up the host. A WordPress restore is a fair amount more work, as I found out about two years ago.

The only parts that don’t back up that well are the comments, especially if I use a self-hosted system for them.

Migration is fine, but what to?

For that reason, I had experimented with moving the blog to Jekyll for quite a while. In fact, I have maintained a parallel Jekyll blog for years now. For a smooth switch-over, the devil was in the details though, and I was never happy enough with the result to actually migrate and shut down the WordPress site.

After another patchapalooza, I revisited the idea of converting the blog to a static site and looked at several other static site generators. I also decided to resurrect an experiment I had started with Hugo. My first attempt hadn’t been very successful, so I had stuck with Jekyll. This second attempt, however, was a lot more successful, as it addressed a couple of the issues that kept me from being happy with Jekyll:

  • Hugo builds the site much faster than Jekyll.
  • I found a theme that I’m really happy with.
  • It looks like some of the details that I found hard to get right in Jekyll are already handled nicely in Hugo. This is mostly around the RSS feed generation – Hugo by default generates category feeds in addition to the regular “full” feed. It also looks like the category feeds are very similar path-wise to what WordPress uses, so hopefully the RSS feeds should continue working with the couple of aggregators that picked up this blog.

Fortunately, this time I succeeded in importing the site from Jekyll into Hugo. It took some work to clean up some of the artifacts that originated in the WP-to-Jekyll conversion, and then I had to convert the Liquid template code to Hugo shortcodes. The latter was pretty easy, but a bit tedious. Just not tedious enough to write a script to do it.

Hugo can render org-mode files (and so can Jekyll, apparently)

Another nice feature is that I can keep org-mode in my workflow for when I want to write longer posts. Shorter posts I usually crank out in Markdown, but for the more complex posts it’s nice to have org-mode as an option, especially if the post contains source code.

While the setup I described before for using org-mode to post to WordPress is mostly working, it’s a bit clunky, especially if you’re using more than one computer to work on blog posts. Both Jekyll and Hugo can handle org-mode files directly – in fact, even though this post would have been just as effortless to write in Markdown, what you are currently reading is actually a processed org-mode file. Assuming that you’re reading this on the static site, that is.
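To give you an idea of what that looks like in practice, a Hugo post written in org-mode is just a plain .org file whose keywords double as front matter. A minimal, hypothetical example:

#+TITLE: An example post
#+DATE: 2021-01-10
#+TAGS[]: hugo org-mode

The body is ordinary org-mode, so /emphasis/, =inline code= and source
blocks work just like they do elsewhere in org.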

What’s left to do before the actual cutover?

There are a few items I have to tick off the todo list before I can cut over for good, two of them fairly large.

  • Migrate the comments. Now this blog doesn’t get a ton of comments, but I appreciate all of them and am trying to migrate them over to this blog. As I believe in self-hosting everything, I’m experimenting with Isso, and I need to redo some of the experiments with Hugo as so far, they’ve been completely focussed on Jekyll. Hugo does have built-in support for Disqus, plus this theme has support for Staticman, but neither of the two are my preferred alternatives.
  • QA the site, especially the RSS feeds so that they hopefully keep working as they do right now.
  • There are still some minor tweaks to be done, analytics integrated and all that, but those are probably going to happen slowly after the cutover.
  • Oh, and the last item is all of the draft posts that have accumulated on WordPress that I somehow never got around to finishing.

The beta of the static site is here if you want to have a look. Any feedback is welcome.

Configuring MongoDB Java driver logging from Clojure using timbre

I’ve mentioned in the past how you can configure the MongoDB Java driver output from Java. Most Clojure applications that use MongoDB use a database driver that wraps the official MongoDB Java driver. I personally use monger for a lot of my projects, but have also occasionally written my own wrappers. The methods described in this post should be applicable to other Clojure MongoDB drivers as long as they wrap the official MongoDB Java driver. They should also work if your application uses the MongoDB Java driver directly rather than through a more idiomatic wrapper.

The MongoDB Java driver helpfully logs a lot of information via slf4j under the assumption that your application configures its output to an appropriate place. That’s all very nice if you already use slf4j and have configured it accordingly. However, if you use Clojure (or you’re me) that’s probably not the case. At least I didn’t set up slf4j in any of my Clojure code.

Enter timbre. Timbre is a logging library for Clojure that optionally interoperates with Java logging libraries like slf4j. You can use the slf4j interop via the slf4j-timbre library and, most importantly for me, control the slf4j logging output directly from Clojure.

The timbre configuration (for timbre 5.1) that I use in some of my Clojure projects that interact with MongoDB is as follows:

;; assumes (require '[taoensso.timbre :as timbre])
(timbre/merge-config! {:min-level `[[#{"org.mongodb.*"} :error]]})

This configuration sets a minimum log level of :error for any log output from the org.mongodb namespace, which is the namespace the Java driver uses for logging. Normally you wouldn’t just configure the minimum log level for the MongoDB Java driver, but for several different tools. As a result, the timbre configuration for my projects tends to look more like this:

(timbre/merge-config! {:min-level `[[#{"org.mongodb.*"} :error] [#{"clj-ssh.*"} :warn] [#{"*"} :debug]]})

The configuration above sets a default minimum log output level of :debug. It also sets two additional log levels specific to the org.mongodb and clj-ssh namespaces. Obviously if you use other tools or have other namespaces that require special handling, you would add those to the configuration settings as well.
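A quick way to convince yourself the levels behave as intended is to log something from the REPL – what gets matched against the patterns above is the namespace of the calling code:

;; (with timbre required as above)
;; Logged from your own namespace this matches the catch-all "*" rule and is
;; emitted at :debug and above; the same call made from code living in an
;; org.mongodb.* namespace would be dropped below :error.
(timbre/debug "application-level debug message")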

Note that using timbre and slf4j-timbre requires the following dependencies in project.clj:

:dependencies [[org.clojure/clojure        "1.10.0"]
               [com.taoensso/timbre        "5.1.0"] 
               [com.fzakaria/slf4j-timbre  "0.3.14"] ;; Attempt to send all log output through timbre
               [org.slf4j/jul-to-slf4j     "1.7.14"]]

My Timex 1000/Sinclair ZX81

I bought the first computer I ever wrote a program on

I don’t usually do Happy New Year posts, but given how “well” 2020 went I thought it was appropriate to start 2021 with a whimsical post. This post is probably going to date me since it’s been a few years – OK, decades – since these were current.

Well, it’s not the actual computer, but the same model. I was first exposed to computers during the personal computer heyday of the early 1980s. Back then, my school had two computers, one TRS-80 Model 3 and one Sinclair ZX81. The ZX81 was used to teach pupils rudimentary programming. I wouldn’t be surprised if one of the teachers actually built it from a kit, as that was the cheapest way to get one.

Keep in mind that I grew up in Europe, where computers like the Apple ][ were very expensive and didn’t gain much traction in the educational field. Or with hobbyists, either. Yes, there were some around, but you saw a lot more VIC-20s, C64s or Ataris. A lot of schools, including mine, bought European-manufactured computers like Sinclairs and later, Amstrad/Schneider CPC 464s.

Read More

Setting up rdiff-backup on FreeBSD 12.1

My main PC workstation (as opposed to my Mac Pro) is a dual-boot Windows and Linux machine. While backing up the Windows portion is relatively easy via some cheap-ish commercial backup software, I ended up backing up my Linux home directories only very occasionally. Clearly, Something Had To Be Done ™.

I had a look around for Linux backup software. The one tool I was familiar with was Timeshift, but at least the Manjaro port can’t back up to a remote machine, which made it useless for my purposes. I eventually settled on rdiff-backup as it seemed simple, has been around for a while and also looks very cron-friendly. So far, so good.

Read More

MongoDB tips – How to find documents containing a non-empty array in MongoDB

Why do we need another post cluttering up the Interpipes on how to find a set of documents in a MongoDB collection that contain a non-empty array field? It’s not like we suddenly have a shortage of posts and articles on this topic, after all. Well, it may be a shocking revelation, but not all of these posts and Stack Overflow answers are, well, as correct as I’d like them to be.

Go on, what’s wrong with the typical approach then?

When you search for ‘how to find MongoDB documents with non-empty arrays’, you usually end up finding suggestions like db.test.find({ my_huge_array: { $exists: true, $ne: [] } }) or variations thereof. And this approach works most of the time, especially if you have a fairly rigid schema and a little helping of luck. The only problem with this approach is that, well, it’s almost correct, but not correct enough to be robust. But, I hear you say, you’ve used this find clause in production for years and it just works – what am I talking about?
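To give away part of the punchline: the usual suggestion trips over documents where the field exists but isn’t an array at all – { my_huge_array: null } passes both $exists: true and $ne: [], for example. One commonly suggested, more robust alternative (a sketch, not necessarily the exact query I end up recommending) is to check that the array has a first element:

// Matches only documents whose field is an array with at least one element;
// null values and scalar fields no longer slip through
db.test.find({ "my_huge_array.0": { $exists: true } })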

Read More

Building an OpenBSD WireGuard VPN server part 3 – Unbound DNS filtering

In part 2, I reconfigured my WireGuard VPN to use an Unbound DNS server on the VPN server rather than rely on a third party server I had used for the original quick and dirty configuration. It was important for me to set up a validating DNS server, which I did in that part.

In this part, I’m extending the existing configuration to include some basic block lists for known ad and tracking servers. As I mainly use the VPN while on the road, I want to ensure that anything I do over it is as secure as I can make it with reasonable effort. That makes blocking trackers and malicious ads the next step. That said, I’m not planning to go for a full Pi-Hole-like setup. Initially, all I am trying to do is integrate one known-good blocklist into the Unbound configuration and automate the process. I can get fancy with a more Pi-Hole-like setup later if I want to.

Picking a DNS filter list and using it from Unbound

I started by using a single DNS block list from StevenBlack’s github repo. deadc0de.re’s blog has a good how-to post, including the necessary awk incantations for converting the file into the format Unbound needs. I used that blog post as my starting point.

The process overall is relatively simple. First, pick the flavour of blocklist you want, then download its hosts file. I used wget to download it as it’s already installed on my OpenBSD system, but curl would also work. My assumption is that you’re running these commands in /var/unbound/etc, so the relative paths below resolve to that directory.

wget -O /tmp/blocklist-hosts https://raw.githubusercontent.com/StevenBlack/hosts/master/hosts

This list is in the traditional host file format, so we need to convert it into the format that Unbound needs. Fortunately, deadc0de.re’s blog post contains the necessary code to accomplish that transformation:

cat /tmp/blocklist-hosts | grep '^0\.0\.0\.0' | awk '{print "local-zone: \""$2"\" redirect\nlocal-data: \""$2" A 0.0.0.0\""}' > block-ads.conf

We also need to update our unbound.conf file to include the freshly generated host blocklist. To do that, we have to add an include statement to the server section of the configuration file. The change looks like this:


server:
  ...
  include: /var/unbound/etc/block-ads.conf

I also had to set up unbound-control to ensure that I didn’t have to restart the server every time I updated the block list. Thus, I had to make a slight detour that involved running unbound-control-setup to generate the necessary certificates. I also needed to add the following statement to my unbound.conf:

remote-control:
  control-enable: yes

Note that remote-control is its own section and is not part of the server: section. The default setup only accepts remote control connections on the loopback interface. This is perfect for my setup, so I didn’t need to make any other changes.
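For reference, the whole detour boils down to a few commands (shown for OpenBSD, where rcctl is the service manager):

# Generate the self-signed certificates unbound-control relies on (run once)
unbound-control-setup
# Restart unbound so it picks up the new remote-control: section
rcctl restart unbound
# Sanity check: this should print the server status without complaining
unbound-control status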

After restarting unbound to pick up the configuration changes, I was able to test the VPN with the new DNS blocking, and it proved to be mostly effective on the websites I visited.

Automating the process

Of course I’m a software engineer by background, so clearly I wanted to automate the process. Fortunately this is about as complicated as pasting the commands listed above into a shell script and running it periodically. My script looks like this:

#!/bin/sh

# Refresh the Unbound ad/tracker blocklist from StevenBlack's hosts file
cd /var/unbound/etc
wget -O /tmp/black-hosts https://raw.githubusercontent.com/StevenBlack/hosts/master/hosts
# Convert the hosts file format into Unbound local-zone/local-data statements
cat /tmp/black-hosts | grep '^0\.0\.0\.0' | awk '{print "local-zone: \""$2"\" redirect\nlocal-data: \""$2" A 0.0.0.0\""}' > block-ads.conf
rm /tmp/black-hosts
# Reload via unbound-control so the new list takes effect without a restart
unbound-control reload

I’m running this script via cron once a week, as that seems to roughly correspond to the update frequency of the list. I may change that in the future, especially if I end up combining multiple block lists instead. In the meantime I don’t think it’s necessary to run the cron script more often.
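For completeness, the crontab entry itself is a single line – the script path below is a placeholder for wherever you keep the script:

# Refresh the blocklist early every Sunday morning (placeholder script path)
15 4 * * 0 /var/unbound/etc/update-blocklist.sh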

What’s next?

Right now it looks like the WireGuard VPN server is working as intended and does what I need it to. TuM’Fatig has a great blog post on expanding the approach I’m using to include all of the public block lists that Pi-Hole is using, so I’m probably going to look into that sooner or later. Other than that and keeping the machine up to date, the next project will be to create orchestration scripts to recreate the server from scratch. I found setting up WireGuard on OpenBSD much easier than doing it on a Linux host. Nevertheless, having a Terraform or Ansible script that allowed me to quickly stand up a new instance from scratch if the current one was damaged would likely be a good thing.

Would I go the OpenBSD route again?

I had mentioned at the start of this series that I hadn’t done much with OpenBSD for years. Was it worth choosing OpenBSD over FreeBSD, which I use on a very regular basis? Maybe, maybe not. OpenBSD required less additional fiddling to secure it out of the box. Some of the time saved was taken up by reacquainting myself with the system, so I’d call that a wash. I suspect that setting up the actual WireGuard VPN instance would take the same amount of time on FreeBSD as I’d run the same commands, so no clear advantage there. I do like the newer version of pf that comes with OpenBSD – FreeBSD’s is a few versions behind and is missing some features, and I doubt it’ll ever catch up.

Put that down as a ‘maybe’.

How well does WireGuard work for me?

So far it is proving performant enough to run on one of Vultr’s $5 instances, which aren’t exactly swimming in computing power. Of course this is only a personal VPN and I’m not running a whole company’s worth of traffic across it. That would very likely require more oomph (technical term) on the server, but overall WireGuard appears to be both very lightweight and, more importantly, easy to configure at both the server and client end.

Throughput is good enough to watch YouTube videos in good quality on crappy hotel Wifi without visibly dropping frames. That’s good enough for me as a performance metric because it shows that the VPN server is not the bottleneck. The real use case for the VPN is accessing more sensitive information like financial data, and that is usually interactive anyway.