Highlighting (the Good, the Bad, and the Unencoded)

An approach to dynamic code markup in HighlightJS with Fira Code ligatures for extra flair

Houston Haynes

Flexing with Flexibility

This site shows examples of my work in a variety of domains. Not only does “full stack engineering” require ventures into territories far and wide, but my career has also migrated through several language families over the years. And as you can see on this site, I still return to them on occasion. So along with (insert “SQL + Data Science language of choice” here), the ability to display an array of formats is more than a nice-to-have. I started with HighlightJS “out of the box,” as it is part of the Hugo template I selected, but it still required some tweaks to encode HTML and related formats.

Stage One: Encoding on the Fly

It all started with wanting to have it both ways - meaning I wanted to be able to view/edit the code snippet in the editor and I wanted the markup to show up correctly on the page. A bit of research online revealed this gem in JavaScript:

Encodes special characters within a code block
This gave me the benefit of clarity while editing. Not every code block “lights up” in the editor like the sample below, but the formatting and general layout are preserved. If you’ve ever tried to edit encoded markup inline, you know what a mess it can become if you miss just one encoding character.
VSCode screen grab of HTML from Serverless Contact Page Sidebar

It wouldn’t be impossible to maintain that portion of the page “encoded” if it were the only one. But imagine a code-snippet-heavy site with those little “land mines” sprinkled throughout. Things become brittle, and issues become time-consuming even when the change seems innocuous. So this approach gave me some latitude to see code blocks in a form that’s very close to how they’d appear on the page, which makes maintenance more approachable.

Stage Two: Little Bugs Everywhere

But as usual, there was a catch. Once the “quick fix” was in place, it rippled out to adversely affect other snippets. Honestly, I didn’t recognize the problem as a code issue until I brought in Fira Code to display ligatures.

This R snippet…
hc <- highchart() |>
…looked like this.
hc &lt;- highchart() %&gt;%
And that SQL snippet …
... + datetime >= to_date ...
…looked like that.
... + datetime &gt;= to_date ...

This wrinkle robbed Fira Code of its ability to make a nice, concise impression. At first I thought I had pasted in poorly marked-up code. But then I looked deeper to realize the inserted blocks were fine. After some head-scratching I checked the JavaScript doing the inline-encoding, and it began to dawn on me that it was “too broad” for what I needed.
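To make the “too broad” problem concrete, here is a minimal sketch of what probably happens to a snippet that was never meant to be encoded. It assumes the script reads each block through innerHTML, where the browser has already serialized ">" in text as "&gt;". The helper name encodeEntities and the sample string are hypothetical, chosen only for illustration.

```javascript
// Hypothetical helper mirroring the chained replace() calls; not the site's actual code.
function encodeEntities(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Roughly what innerHTML returns for a block whose visible text is
// "datetime >= to_date": the ">" arrives already escaped.
const serialized = "datetime &gt;= to_date";

// Encoding the serialized markup escapes the ampersand a second time,
// so the page shows the entity text instead of the ">" character.
const doubleEncoded = encodeEntities(serialized);

console.log(doubleEncoded); // datetime &amp;gt;= to_date
```

Once the page displays the literal characters "&gt;=" instead of ">=", Fira Code has no operator left to turn into a ligature.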

In the version above I set exclusions for the code classes that I didn’t want encoded. The “not” indicator seemed to do the trick, until I kept running into more and more exceptions. At that point there were more “not” clauses than actual code that required encoding. So I flipped the script and simply declared which classes I wanted to encode.
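The difference between the two strategies can be sketched by building the selectors themselves. This is only an illustration: the helper names and the excluded class names ("r", "sql", "python") are assumptions, not the list the site actually used.

```javascript
// Hypothetical helpers; the class lists below are placeholders for illustration.

// Exclusion style: start from every code block and carve out exceptions.
function exclusionSelector(skipClasses) {
  return "code" + skipClasses.map((c) => `:not(.${c})`).join("");
}

// Allow-list style: name only the classes that actually need encoding.
function allowListSelector(encodeClasses) {
  return encodeClasses.map((c) => `code.${c}`).join(", ");
}

console.log(exclusionSelector(["r", "sql", "python"]));
// code:not(.r):not(.sql):not(.python)

console.log(allowListSelector(["html", "xml", "django"]));
// code.html, code.xml, code.django
```

The exclusion selector grows with every language added to the site, while the allow-list only changes when a new markup language shows up.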

This is both more logical and more maintainable. I only need to add a class to that list in the declaration if and when I have another language that requires markup. As a side note, I’ll say that I don’t use Django, but here it’s used to highlight the Hugo shortcodes, which use double-braces in the HTML. I spotted that VSCode auto-detected those files as Django - so I used that as a cheat code to highlight the double-braced parameters.

Final version of JavaScript - formatted as JavaScript
document.querySelectorAll("code.html, code.xml, code.django").forEach(function(element) {
    element.innerHTML = element.innerHTML.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;").replace(/"/g, "&quot;").replace(/'/g, "&#039;");
});

HighlightJS is pretty great, but it does struggle with certain things. That’s why I used screenshots of the JavaScript code along with the embed, which tries to encode escaped characters. It just goes to show that even highly regarded, long-lived open-source projects like HighlightJS are also works in progress. But between the screen grabs and the embedded block above, anyone looking to lift this for their site should be able to piece it together.

Stage Three: Exceptions to the Rules

So after that excursion I still found that certain things were not displaying correctly. It’s not that it looked bad per se, but it wasn’t the detailed code block that I wanted to show. If you look at the first image below, you’ll see that you can get the general idea of the HTML that’s returned by the JavaScript snippet. But like the rest of the code showcased on this site, specifics matter. So I made a compromise and simply hand-encoded the HTML blocks and pasted them into the JS to “just make it work” for this case.

JavaScript with inline HTML
Encoded HTML placed inline
Final view of corrected markup

I can live with these small segments of encoded HTML inline within the sample. And honestly, I wouldn’t even know where to begin to make a programmatic solution work in this mixed case. Since the pieces of code are relatively small, and (so far) it’s the only one like this on my site, I’m pretty happy to let it go at that. But the bigger win here is the ability to drop in code snippets un-doctored and keep working. That not only lets me focus on “the fun part” of doing the actual work, but it also relieves some of the friction in choosing which code samples to present, since I no longer have to worry about something being unfriendly when displayed in a code block. Now, pretty much everything is fair game.

Key Value
BuildDateTime 2021-11-07 14:44:45 -0500
LastGitUpdate 2021-11-07 14:13:35 -0500
GitHash 765a5f1
CommitComment capitalization cleanup