Eleventy uses markdown-it by default to render Markdown, but I use remark and rehype. My original motivation was more control. While I’m not sure remark really offers that, I’ve customized it a fair bit at this point.

Either way, code blocks on A Place For My Head are highlighted by rehype-prism. I wanted to split each block into individually wrapped lines, so that:

<pre><code>foo
bar
baz</code></pre>

Becomes:

<pre><code><span class="line">foo</span>
<span class="line">bar</span>
<span class="line">baz</span></code></pre>

However, pre elements can contain any phrasing content, such as spans. If I just split the text based on newlines, the results could be incorrect. This:

<pre><code><span class="something">foo
bar
baz</span></code></pre>

Would become this, in which the first span class="line" is interpreted as wrapping the entire contents of the pre:

<pre><code><span class="line"><span class="something">foo</span>
<span class="line">bar</span>
<span class="line">baz</span></span></code></pre>
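To make the failure concrete, here is a minimal sketch of the naive approach in plain JavaScript, working on the contents of the code element as a raw string. The function name `wrapLinesNaively` is made up for illustration and is not part of rehype or Prism.

```javascript
// Naive approach: wrap every raw line in a span without looking at
// which tags are still open when the line ends.
function wrapLinesNaively(html) {
  return html
    .split("\n")
    .map((line) => `<span class="line">${line}</span>`)
    .join("\n");
}

// The span.something element spans three lines, so the closing tags
// no longer pair up with the opening tags they were meant for.
console.log(wrapLinesNaively('<span class="something">foo\nbar\nbaz</span>'));
```

The printed markup has the same number of opening and closing tags, which is why the problem is easy to miss: the tags are balanced, just paired with the wrong partners.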

Such a scenario might not be likely in practice, since as far as I can tell Prism highlights one line at a time anyway, but I wanted to do better. My solution is a tad convoluted:

  1. Turn the rehype AST into a string with hast-util-to-html.
  2. Split into raw lines, like the naïve approach.
  3. Trim trailing newlines.
  4. Parse each line with a regex (I know, I know… this is a specific scenario where the input is constrained) to get a list of tags left open at the end.
  5. For each line except the last, close the open tags at the end and re-open them on the following line.
  6. Add data-line to individual lines and data-digits to the block to simplify layout and styling later.
  7. Parse this back into an AST with parse5. I originally intended to parse incomplete fragments, which I no longer need, so I might move away from parse5.
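Steps 2 through 5 can be sketched on a raw string without any dependencies. This is a simplified illustration, not the plugin's actual code: `TAG_PATTERN` and `balanceLines` are hypothetical names, and the regex assumes serializer-style input in which every literal `<` in text is escaped, attribute values contain no `>`, and there are no void elements such as `br` (Prism's output is spans).

```javascript
// Matches one opening or closing tag. Group 1 is "/" for a closing
// tag, group 2 is the tag name. Assumes the constrained input
// described above; this is not a general HTML parser.
const TAG_PATTERN = /<(\/?)([a-zA-Z][\w-]*)[^>]*>/g;

function balanceLines(html) {
  // Trim trailing newlines and split into raw lines.
  const lines = html.replace(/\n+$/, "").split("\n");
  // Tags currently left open, outermost first.
  const open = [];

  return lines.map((line) => {
    // Re-open whatever the previous line had to close early.
    const reopened = open.map((tag) => tag.source).join("");

    // Track which tags this line opens and closes.
    for (const [source, slash, name] of line.matchAll(TAG_PATTERN)) {
      if (slash) {
        open.pop();
      } else {
        open.push({ name, source });
      }
    }

    // Close the still-open tags, innermost first.
    const closed = open
      .map((tag) => `</${tag.name}>`)
      .reverse()
      .join("");

    return reopened + line + closed;
  });
}

console.log(balanceLines('<span class="something">foo\nbar\nbaz</span>\n'));
```

Each returned line is a well-formed fragment on its own, so it can be wrapped in a line element without the wrapper's closing tag being captured by something else.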

The final product is even more complicated because I needed a wrapper element within a wrapper element. Here’s an example of the results (which I unfortunately can’t pretty-print since it’s preformatted):

<pre><code data-digits="1"><span class="line" data-line="1"><span class="line-content"><span class="token punctuation">(</span><span class="token defun"><span class="token keyword">defun</span> <span class="token function">aankh/activate-rfc1345</span> <span class="token punctuation">(</span><span class="token arguments"></span><span class="token punctuation">)</span></span>
</span></span><span class="line" data-line="2"><span class="line-content">  <span class="token punctuation">(</span><span class="token car">setq-local</span> default-input-method <span class="token quoted-symbol variable symbol">'rfc1345</span><span class="token punctuation">)</span>
</span></span><span class="line" data-line="3"><span class="line-content">  <span class="token punctuation">(</span><span class="token car">activate-input-method</span> <span class="token quoted-symbol variable symbol">'rfc1345</span><span class="token punctuation">)</span><span class="token punctuation">)</span>
</span></span><span class="line" data-line="4"><span class="line-content">
</span></span><span class="line" data-line="5"><span class="line-content"><span class="token punctuation">(</span><span class="token defun"><span class="token keyword">defun</span> <span class="token function">aankh/activate-input-method-in-insert-state</span> <span class="token punctuation">(</span><span class="token arguments"><span class="token rest-vars"><span class="token lisp-marker">&#x26;rest</span> <span class="token argument variable">args</span></span></span><span class="token punctuation">)</span></span>
</span></span><span class="line" data-line="6"><span class="line-content">  <span class="token punctuation">(</span><span class="token car">activate-input-method</span> default-input-method<span class="token punctuation">)</span><span class="token punctuation">)</span>
</span></span><span class="line" data-line="7"><span class="line-content">
</span></span><span class="line" data-line="8"><span class="line-content"><span class="token punctuation">(</span><span class="token car">advice-add</span> <span class="token quoted-symbol variable symbol">'evil-insert-state</span> <span class="token lisp-property property">:after</span> <span class="token quoted-symbol variable symbol">'aankh/activate-input-method-in-insert-state</span><span class="token punctuation">)</span></span></span></code></pre>
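The nesting in that output can be sketched as a small string-building function. It is reconstructed from the sample markup alone: `wrapLines` is a hypothetical name, the input is assumed to be an array of lines whose tags are already balanced, and treating `data-digits` as the number of digits in the last line number is a guess based on the single example.

```javascript
// Wraps already-balanced lines in the nested structure shown above:
// span.line > span.line-content, with the newline kept inside the
// inner span on every line except the last.
function wrapLines(balancedLines) {
  // Assumption: data-digits is the digit count of the last line number.
  const digits = String(balancedLines.length).length;

  const body = balancedLines
    .map((line, index) => {
      const isLast = index === balancedLines.length - 1;
      const newline = isLast ? "" : "\n";
      return (
        `<span class="line" data-line="${index + 1}">` +
        `<span class="line-content">${line}${newline}</span>` +
        `</span>`
      );
    })
    .join("");

  return `<pre><code data-digits="${digits}">${body}</code></pre>`;
}

console.log(wrapLines(["foo", "bar", "baz"]));
```

Keeping the newline inside the inner span means the text content of the block is unchanged, so copying from the rendered page should still yield the original code.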

The core algorithm should be usable inside any plugin architecture that allows conversion between an AST and a string, or even just working with the raw string. I’ll publish my rehype implementation on GitHub eventually, once I’ve added some documentation and answered some release-related questions. (I’ve already added tests, which are in fact the first on this blog, but more on that another time.)
