Writing an Asciidoc Parser in Rust: Asciidocr

I really only ever make something when I want something to exist that doesn’t already, or when I want something that does exist to more readily suit my (admittedly) idiosyncratic needs or thoughts about how it should exist. For better or worse, I have a lot of wants, and so I make a lot of things (e.g., Two Page Tuesday, or last night’s mostly-successful attempt at tapering a pair of pants I got at Global Thrift, or an early solve for the problem I’m solving here).

So: I wrote an asciidoc parser in Rust. I called it asciidocr because the Command-Line Rust book put an r after all the "clone a UNIX tool" projects, and I liked that convention.

Asciidoc is a lightweight markup language that is, in my opinion, the best one. Why it’s the best one is a separate issue entirely, but we can at least safely assume that it’s a good one, and the one that, for better or worse, I’ve been using to write nearly everything I’ve written for personal or professional use in the last five years or so. While it started as a Python project, it got new life (and a bunch of new features) when it was more or less taken over by the fine Asciidoctor folks, who wrote their converter in Ruby. It works very well, and does a lot of things. But.

It’s in Ruby, a language I have petty beef with and which, more importantly, is an interpreted, not a compiled, language, which means that for every new machine I want to convert asciidoc files on, I need to install Ruby. And there are some other things too, in part pertaining to the way that templates must be written for custom output(s); it’s frankly a little slow; and whatever else.

But mostly it was the "I don’t want to have to write Ruby to extend the thing" that got me thinking. I was dreaming about a text-based writing management tool (like a Scrivener but for folks who use vim), and having already written a tool to make generating PDFs from asciidoc easier, I knew that if I wanted to write this next app in anything but Ruby, I’d need to either (a) subprocess out to the Ruby; (b) rely on the old asciidoc.py project, with its limitations (and also therefore limiting myself to writing in Python, which, like Ruby, means that if I wanted to share my tool, the folks using it would need to be able to install Python); or (c) find or build a converter in a different language. So after getting part of the way through an (a) implementation in Python, I cut my losses and started looking more readily into option (c), for: I was learning Rust and Go(lang).

There is, in fact, a pretty good Go implementation of an asciidoc parser/converter. And there was a hot second when it looked like my company might transition to Go for some backend stuff, so I picked up Powerful Command-Line Applications in Go and got to work. Unfortunately I realized pretty quickly that I am allergic to the following, oft-repeated pattern in the language:

  if err != nil {
    return err
  }

And then it became clear that we weren’t going to be using Go at work, so I dropped it.

Rust, on the other hand: boy-howdy did I love (and still do) working in that. And sure, there wasn’t a very feature-complete asciidoc parser or converter yet, but I liked the language and figured I could learn something: so I asked for some mentorship (thanks big time to Kit Dallege for everything that follows) and got to work.

My background is, of course, very humanities-focused. I mean, sure, there was a math minor in there somewhere, but that was all in service of a brief glimmer of a future doing philosophy of math, so. I’ve written a lot of code, and have been writing some kind of code or other since I was a small kid (thank you, hackable Geocities sites), but I have no "computer science education." Learning how to write a parser seemed like a good way to go.

And instead of relying on a lexing package (e.g., something like pest), where you write a grammar and the thing does it for you, Kit recommended I do the whole thing by hand, since I’d learn more (and potentially it could be faster, or at least a smaller binary).

So that’s more or less what I did. It’s not perfect; it could, of course, be improved; there are some decisions I made early on that I would not make today, knowing what I know now; and I am very fucking proud of it. So we can dig in.

Pretending it’s a Compiler

Googling around got me to a few resources that seemed like they’d be relevant, specifically the Commonmark Spec section about parsing, but what really ended up sticking in my brain was a book called Crafting Interpreters, which I someday would love to go back and really read for its intended purpose. But since I was going to be doing more or less the first half (up to the point where you do something with the tree you’ve created by scanning and parsing the code), I figured this would be a good place to start, and it was! Very well-written, too. So much so that it made sense even though I haven’t pretended to know anything about reading Java in years.

What this meant, anyway, was that I had a clear path forward. Prior to asking for help, I’d written a half-of-a-half implementation that mixed up the lexing and the parsing and the output all together, but this was going to be better: in terms of building it, in terms of architecture, and in terms of being able to do other things with the tree/graph once I had it. So what I would then do was:

  1. Scan the document into tokens

  2. Parse those tokens into a tree

  3. Take that tree and do something with it

Easy enough, right?
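
(To spoil the ending a little: once all three steps exist, the whole pipeline collapses into a couple of lines of library code. The sketch below is essentially the same call chain you’ll see in the Python bindings toward the end of this post; the asciidocr:: module paths are an assumption about how it looks when used as a library.)

use std::path::PathBuf;

// Assumed library-style paths; inside the crate these are crate::scanner, etc.
use asciidocr::backends::htmls::render_htmlbook;
use asciidocr::parser::Parser;
use asciidocr::scanner::Scanner;

fn main() {
    // 1. scan, 2. parse, 3. do something with the tree (here: render HTML);
    // "-" stands in for an origin path when converting a string rather than a file
    let graph = Parser::new(PathBuf::from("-")).parse(Scanner::new("Some *asciidoc* text."));
    println!("{}", render_htmlbook(&graph).expect("conversion failed"));
}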

Scanning, Lexing, Whatever You Want to Call It

I’m still not sure what the difference between "scanning" and "lexing" is, if there is one at all, but anyway I needed to generate some tokens. I don’t plan on going into too much detail about the why/how of this (instead I refer you back to Crafting Interpreters), but there are a few interesting (annoying?) things about asciidoc that I think are worth mentioning here.

Like markdown, asciidoc is essentially a line-based language. The most significant character is therefore the line break, \n, and in some worlds/lights it makes sense to parse asciidoc line-by-line. If I were to go back and do it as a "one-shot" parser (which according to the chatter in the Asciidoc community chat, isn’t possible anyway), I might do it as a line-by-line thing. Instead, however, I did the scanning character-by-character, in part because that’s what the book told me to do, and in part because keeping track of the newline tokens actually made parsing much easier in the end (I think/hope, anyway).

So the scanning.

Maybe the best "new thing I started using a lot" of 2024 was the humble Enum. I started using them in Python for a specific thing, and then started using them more, and one of the things I like best about Rust is that it takes its Enums seriously. So, to wit, the first thing I did was create a big ass TokenType enum:

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TokenType {
    NewLineChar,
    LineContinuation,
    ThematicBreak,
    PageBreak,
    Comment,
    PassthroughBlock, // i.e., "++++"
    SidebarBlock,     // i.e., "****"
    SourceBlock,      // i.e., "----"

    // ...snip

Note: All source can be found in the GitHub repo. I’m going to condense and remove some comments and things in this post as needed to keep it clean.

And then a Struct for each token:

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Token {
    pub token_type: TokenType,
    pub lexeme: String,          // raw string of code
    pub literal: Option<String>, // our literals are only ever strings (or represented as such)
    pub line: usize,
    pub startcol: usize,
    pub endcol: usize,
    /// The file's stack hierarchy if it's an include, otherwise stays empty
    pub file_stack: Vec<String>,
}

There is a draft official schema for how an asciidoc document should be (able to be) represented, and that’s why we’re keeping track of line, startcol, etc. I think if I were to go back and clean this up, we could probably drop the literal attribute, since we don’t really need it (this was inspired/copied from the Crafting Interpreters way of doing things, which has different requirements than what we have, ultimately).

So once we have our Token structs to play with, we can then proceed to actually scanning the document into tokens. We create a Scanner struct to hold some state and the source and things:

#[derive(Debug)]
/// Scans an asciidoc `&str` into [`Token`]s to be consumed by the Parser.
pub struct Scanner<'a> {
    pub source: &'a str,
    start: usize,
    startcol: usize,
    current: usize,
    line: usize,
    file_stack: Vec<String>,
}

And then, because Rust has such good pattern matching, the actual work just becomes a(n admittedly gigantic) match/switch statement:

fn scan_token(&mut self) -> Token {
        let c = self.source.as_bytes()[self.current] as char;
        self.current += 1;

        match c {
            '\n' => self.add_token(TokenType::NewLineChar, false, 1),

            '\'' => {
                if self.starts_repeated_char_line(c, 3) {
                    self.current += 2;
                    self.add_token(TokenType::ThematicBreak, false, 0)
                } else if ['\0', ' ', '\n'].contains(&self.peek_back()) && self.peek() == '`' {
                    self.current += 1;
                    self.add_token(TokenType::OpenSingleQuote, true, 0)
                } else {
                    self.add_text_until_next_markup()
                }
            }
    // ...snip

In order to keep things moving along speedily (because, in addition to being "cool," Rust is also supposed to be "fast"), the actual scanning function is implemented as an Iterator (a "generator" in Python-speak):

impl<'a> Iterator for Scanner<'a> {
    type Item = Token;

    fn next(&mut self) -> Option<Self::Item> {
        if !self.is_at_end() {
            self.start = self.current;
            return Some(self.scan_token());
        }
        None
    }
}

(It was amazing how easy it was to do that, really.)
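
And because Scanner is just an Iterator, the usual adapters come along for free. A minimal sketch of what that buys you, assuming Token and TokenType are in scope:

// Because Scanner implements Iterator, collecting the tokens is a one-liner...
let tokens: Vec<Token> = Scanner::new("== A Section Title\n").collect();

// ...or you can stream tokens straight to whatever wants them
for token in Scanner::new("Some _text_ here.\n") {
    println!("{:?} on line {}", token.token_type, token.line);
}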

Some fun nuances came up because we’re dealing with "text" instead of "code," and the big one ended up being character boundaries. So take something like the humble ellipsis (…) or an emoji: these require multiple bytes to represent. This means that sometimes you might try to do something between the bytes it takes to represent the character, which makes the scanner sad (and die, or in Rust-parlance, panic!).

(It occurs to me now that I should have specified earlier that we’re scanning byte by byte, not character by character; there are some reasons for doing this, to do with the way text is encoded and then handled by Rust, that I don’t feel like explaining, so, just, like, trust me that this was a good way to do it.)

Getting around this means that we just check for character boundaries when we look around to see, based on context, what kind of token we should be producing. And we do a lot of looking around! Here are a few, noting the easy-to-use is_char_boundary() function in there:

    fn peek(&self) -> char {
        if self.is_at_end() || !self.source.is_char_boundary(self.current) {
            return '\0';
        }
        self.source.as_bytes()[self.current] as char
    }

    fn peek_back(&self) -> char {
        if self.start == 0 || !self.source.is_char_boundary(self.start - 1) {
            return '\0';
        }
        self.source.as_bytes()[self.start - 1] as char
    }

    fn peeks_ahead(&self, count: usize) -> &str {
        if self.is_at_end()
            || self.current + count > self.source.len()
            || !self.source.is_char_boundary(self.current + count)
        {
            return "\0";
        }
        &self.source[self.current..self.current + count]
    }

This means that, say, if we get a character -, and know it’s the beginning of a new line (i.e., that self.peek_back() == '\n'), and we can peek ahead to see that self.peeks_ahead(4) == "---\n", we know that we should generate a TokenType::SourceBlock delimiter token. Scanning is essentially that, but, like, a bunch of times with a bunch of edge cases and nuances (e.g., because that four-repeated-characters-before-a-newline is such a common pattern, you write a function that checks that for you).
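
That repeated-character check, for the record, looks something like this sketch (the real helper in scanner/mod.rs may differ in its details, but this is the shape of it):

    fn starts_repeated_char_line(&self, c: char, count: usize) -> bool {
        // only counts if we're at the start of a line (or of the file)...
        if !['\0', '\n'].contains(&self.peek_back()) {
            return false;
        }
        // ...and the next count - 1 characters repeat c, followed by a newline
        let expected = format!("{}\n", c.to_string().repeat(count - 1));
        self.peeks_ahead(count) == expected.as_str()
    }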

This, naturally, segues into unit testing!

There are a lot of tests around the scanner! I haven’t yet gotten around to running coverage on it, but I think it’s pretty good. One thing I don’t like about Rust is that, by convention, you keep unit tests in the same file as the code they’re testing. I see why you’d want to do that, but also my scanner/mod.rs file is a whopping 1932 lines long. Coming from Python-land… ouch! Still: it works, especially if you use my new best friend rstest, which works so analogously to our dear friend pytest that I was able to get up and running in a matter of minutes with it, simplifying the test-cases dramatically:

#[rstest]
#[case("NOTE", TokenType::NotePara)]
#[case("TIP", TokenType::TipPara)]
#[case("IMPORTANT", TokenType::ImportantPara)]
#[case("CAUTION", TokenType::CautionPara)]
#[case("WARNING", TokenType::WarningPara)]
fn inline_admonitions(#[case] markup_check: &str, #[case] expected_token: TokenType) {
    let markup = format!("{}: bar.", markup_check);
    let expected_tokens = vec![
        Token::new_default(
            expected_token,
            format!("{}: ", markup_check),
            Some(format!("{}: ", markup_check)),
            1,
            1,
            markup_check.len() + 2, // account for space
        ),
        Token::new_default(
            TokenType::Text,
            "bar.".to_string(),
            Some("bar.".to_string()),
            1,
            markup_check.len() + 3,
            markup_check.len() + 6,
        ),
    ];
    scan_and_assert_eq(&markup, expected_tokens);
}

Easy, right? So let’s now suppose we scan our document-as-a-&str into a bunch of tokens. We then parse them. Yay!

Parser-ing

…and again we use a big-ass match statement. But before we can really get into that, we need to look at what we’re doing all this parsing into, namely a (mostly) spec-compliant Abstract Syntax Graph.

Trees and Graphs

In "normal" parsing you create a node tree, and then do some traversing of that tree and… actually I didn’t get that far in the book. Because, to be compliant with the "Asciidoc Technology Compatibility Kit (TCK)," you need to produce JSON, I… just figured it would be easier to start there. JSON — and more specifically the objects needed to serialize it — would be an easy enough "intermediate representation" from which to then go on and output HTML and other formats (more on this later).

To be completely honest, the spec is good but not great, and frankly not complete yet. If I had more time and energy I would contribute more readily to the ADRs and discussions and so on, but… I don’t. Yet. Maybe someday. Regardless.

This meant essentially that I could go the "super abstract" route, and create generic "block" and "inline" objects and go from there, or I could just go ahead and make a struct for each kind of thing, since it’s a finite set of things. So I went the latter route.

As you do in Rust, I used serde and serde_json to do the serialization. What this meant, though, was that it was going to be harder to use Traits to create shared functionality (and to make functions accept more generic parameters). I looked at a few crates that ultimately used our old friend the Enum on the backend to make the serialization happen (since you get the serialization more or less for free with an Enum), so I just did that directly. This meant that I had, for example, this hairy-looking thing:

#[derive(Serialize, Clone, Debug)]
#[serde(untagged)]
pub enum Inline {
    InlineSpan(InlineSpan),
    InlineRef(InlineRef),
    InlineLiteral(InlineLiteral),
    InlineBreak(LineBreak),
}

And that I had to do a lot of if let Some(Block::LeafBlock(block)) = foo.last_mut() type stuff, but I’m told that this is part of why my parser is so fast, because enums are so fast, and… if you’re not first you’re last? Anyway, this is one of the design decisions that I’m not sure I would make again (I think it would be more developer-friendly to use Traits), but as (a) I am the only developer and (b) it works, and is fast, it’s fine.
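
For a flavor of what that enum-wrangling looks like in practice, here’s a rough sketch; I’m assuming a LeafBlock that exposes an inlines Vec (roughly what the serialized JSON further down suggests), and the real types in the repo carry more fields:

// Sketch: to push an inline onto the last open block, you first have to
// match your way into the right enum variant.
if let Some(Block::LeafBlock(block)) = self.block_stack.last_mut() {
    block.inlines.push(inline);
} else {
    // no open leaf block yet, so hold the inline until there is one
    self.inline_stack.push_back(inline);
}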

So parsing then becomes a matter of looking at a given Token and deciding what to do with it. Because "what to do with it" is often a matter of context, we build a lot of that context into our Parser:

/// Parses a stream of tokens into an [`Asg`] (Abstract Syntax Graph), returning the graph once all
/// tokens have been parsed.
pub struct Parser {
    /// Where the parsing "starts," i.e., the adoc file passed to the script
    origin_directory: PathBuf,
    /// allows for "what just happened" matching
    last_token_type: TokenType,
    /// optional document header
    document_header: Header,
    /// document-level attributes, used for replacements, etc.
    document_attributes: HashMap<String, String>,
    /// holding ground for graph blocks until it's time to push to the main graph
    block_stack: Vec<Block>,
    /// holding ground for inline elements until it's time to push to the relevant block
    inline_stack: VecDeque<Inline>,
    /// holding ground for includes file names; if inside an include push to stack, popping off
    /// once the file's tokens have been accommodated (this allows for simpler nesting)
    file_stack: Vec<String>,
    /// holding ground for a block title, to be applied to the subsequent block
    block_title: Option<Vec<Inline>>,
    /// holding ground for block metadata, to be applied to the subsequent block
    metadata: Option<ElementMetadata>,
    /// counts in/out delimited blocks by line reference; allows us to warn/error if they are
    /// unclosed at the end of the document
    open_delimited_block_lines: Vec<usize>,
    /// appends text to block or inline regardless of markup, token, etc. (will need to change
    /// if/when we handle code callouts)
    open_parse_after_as_text_type: Option<TokenType>,
    // convenience flags
    in_document_header: bool,
    /// designates whether we're to be adding inlines to the previous block until a newline
    in_block_line: bool,
    /// designates whether new literal text should be added to the last span
    in_inline_span: bool,
    /// designates whether, despite newline last_tokens_types, we should append the current block
    /// to the next
    in_block_continuation: bool,
    /// forces a new block when we add inlines; helps distinguish between adding to section.title
    /// and section.blocks
    force_new_block: bool,
    /// Temporarily preserves newline characters as separate inline literal tokens (where ambiguous
    /// blocks, i.e., DListItems, may require splitting the inline_stack on the newline)
    preserve_newline_text: bool,
    /// Some parent elements have non-obvious closing conditions, so we want an easy way to close these
    close_parent_after_push: bool,
    /// Used to see if we need to add a newline before new text; we don't add newlines to the text
    /// literals unless they're continuous (i.e., we never count newline paras as paras)
    dangling_newline: Option<Token>,
}

(As an aside: I’m keeping the comments on this struct, as opposed to many of the others I’ve shown above, in part because they’re useful and in part because I want to shout out to docs.rs for making it SUPER easy to generate really nice documentation for your project. Makes my former technical writer heart happy.)

We keep track of a lot of state, and frankly it got a little over-complicated, but also I didn’t have the time to make it simpler, so: it works, you know?

Again we have a big match statement with a lot of arms like:

TokenType::QuoteVerseBlock => {
    // check if it's verse
    if let Some(metadata) = &self.metadata {
        if metadata.declared_type == Some(AttributeType::Verse) {
            self.parse_delimited_leaf_block(token);
            return;
        }
    } else if self.open_parse_after_as_text_type.is_some() {
        self.parse_delimited_leaf_block(token);
        return;
    }

    self.parse_delimited_parent_block(token);
}

These, in turn, generate various Block and Inline objects that get added to our Abstract Syntax Graph:

#[derive(Serialize, Debug)]
pub struct Asg {
    pub name: String,
    #[serde(rename = "type")]
    pub node_type: NodeTypes,
    #[serde(skip_serializing_if = "Option::is_none")]
    pub attributes: Option<HashMap<String, String>>,
    #[serde(skip_serializing_if = "Option::is_none")]
    pub header: Option<Header>,
    #[serde(skip)]
    /// footnote references
    document_id: String,
    #[serde(skip)]
    /// Hash of all IDs in the document, and the references they point to
    document_id_hash: HashMap<String, Vec<Inline>>,
    /// Document contents
    pub blocks: Vec<Block>,
    pub location: Vec<Location>,
}

So by and by we build our graph, which takes something like:

This document has two paragraphs.

Paragraphs may be separated by one or more empty lines.

Into:

{
  "name": "document",
  "type": "block",
  "blocks": [
    {
      "name": "paragraph",
      "type": "block",
      "inlines": [
        {
          "name": "text",
          "type": "string",
          "value": "This document has two paragraphs.",
          "location": [ { "line": 1, "col": 1 }, { "line": 1, "col": 33 } ]
        }
      ],
      "location": [ { "line": 1, "col": 1 }, { "line": 1, "col": 33 } ]
    },
    {
      "name": "paragraph",
      "type": "block",
      "inlines": [
        {
          "name": "text",
          "type": "string",
          "value": "Paragraphs may be separated by one or more empty lines.",
          "location": [ { "line": 4, "col": 1 }, { "line": 4, "col": 55 } ]
        }
      ],
      "location": [ { "line": 4, "col": 1 }, { "line": 4, "col": 55 } ]
    }
  ],
  "location": [ { "line": 1, "col": 1 }, { "line": 4, "col": 55 } ]
}

Note: All that location stuff is required by the schema; I don’t like it, but hey, it’s not all about me. If ever somebody takes this to create a better asciidoc LSP or something, it’ll be useful information. (Or if I ever start doing more error handling/verification for the user.)

I could perhaps go into more detail about how the parsing actually works, but, you know, it’s just creating objects, and this is getting long. So if you’re curious, look at the code (or holler at me on Bluesky and I’ll do a follow-up post about whichever part you’re interested in). We’ll now turn to doing something with this graph we’ve made.

Turning it Into Something Useful (Templating)

The first, most obvious useful thing for the parser to do is produce HTML, since that can be turned into basically anything else, one way or another. Instead of targeting the kind of HTML that Asciidoctor produces (which I find overly div-heavy), I targeted an HTML standard called "HTMLBook", in part because that’s what I use for work and am therefore most comfortable with, and in part because it’s clean and simple and more like what pick-your-favorite-markdown converter produces. So to make HTML, we use templating. Yes! Our old friend templating. From Dreamweaver templates to Liquid templates to handlebars to Jinja/Django, they’re all more or less the same. More or less usable. Etc. For this project I went with one called tera, after trying one called askama, which was really really cool but ultimately was hard to make work nicely with serde.

tera, on the other hand, is basically just Django templates. I write Django templates at work. Easy:

{% import "inline.html.tera" as inline_macros %}
{% import "block.html.tera" as block_macros %}
<!DOCTYPE html>
<html lang="en">

<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1">
    <title>{%- if header %}
        {%- for inline in header.title %}
        {{- inline_macros::process_inline(inline=inline) -}}
        {% endfor -%}
        {% endif -%}</title>
</head>

<body>{% for block in blocks %}
    {{ block_macros::process_block(block=block,skip_tag=false) -}}
{% endfor %}
</body>

</html>
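
For completeness, the hand-off from graph to template is about what you’d expect from tera. Here’s a sketch of it; the glob, the template name, and the function shape are my assumptions, not necessarily how render_htmlbook actually does it:

use tera::{Context, Tera};

// Sketch: serialize the Asg into a tera Context and render a top-level template.
fn render_htmlbook_sketch(graph: &Asg) -> Result<String, tera::Error> {
    let tera = Tera::new("templates/**/*.tera")?;
    let context = Context::from_serialize(graph)?;
    tera.render("htmlbook.html.tera", &context)
}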

There is a pretty annoying recursion issue (not the fault of tera so much as the fault of what I’m trying to do with it), which means that the block and inline macro code is… ugly. But hey, it works to produce nice, clean documents like the following:

<!DOCTYPE html>
<html lang="en">

<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1">
    <title></title>
</head>

<body>
    <p>What follows is an aside.</p>
    <aside data-type="sidebar">
        <h5>Aside Title</h5>
        <p>Some aside text!</p></aside>
</body>

</html>

Nice.

But Wait! There’s More!

So now we’ve more or less gotten to the point where we’ve duplicated a good chunk of what asciidoctor, the reference implementation, does in terms of parsing and conversion. Of course, asciidocr, this implementation, DOES NOT DO EVERYTHING ASCIIDOCTOR DOES, and doesn’t intend to. But it does handle a whole bunch of the language, including nice things like include:: directives (see the limitations doc in the repo for more). And remember, this all started because I not only wanted a non-interpreted-language implementation (with Rust we can generate binaries), but also because I wanted to do other stuff, more easily.

So let’s talk about a little of that.

Docx

If there is a "killer feature" of asciidocr, it is that it will — eventually — produce Word/docx files natively. Creating docx files is a PAIN IN THE ASS, but it’ll be worth it for folks like me who want to write their fictions and whatever else in asciidoc, but then have to send journals and agents and publishers Word documents.

I’m currently rewriting the implementation of the DOCX backend, but even now, if you install the tool with the --features docx flag enabled (for more on what I’m talking about when I talk about installing a Rust feature, see here), you can get a docx created IF:

  • It’s only prose and headings

  • BUT it can include italics and bold and stuff

The reimplementation will be better and handle more things — tables, lists, etc. — but I wanted to write this post now, instead of waiting for it to be "done," since "done" is a myth when it comes to software. Anyway: go try it out! My hope is for the docx backend to become stable enough that I don’t need to hide it behind a feature flag anymore.

Rust and Python

And, somewhat finally, another feature-flag thing: calling asciidocr from Python, making asciidoc conversions super fast with modern syntax (compared to asciidoc.py).

All the credit for this really goes to the pyo3 project, but building on top of their brilliant work, it’s very easy to do something like:

#![cfg(feature="python")]

use std::path::PathBuf;
use crate::scanner;
use crate::parser;
use crate::backends::htmls::render_htmlbook;
use pyo3::{exceptions::PyRuntimeError, prelude::*};

/// parses a string using the specified backend
#[pyfunction]
fn parse_to_html(adoc_str: &str) -> PyResult<String> {
    let graph = parser::Parser::new(PathBuf::from("-")).parse(scanner::Scanner::new(adoc_str));
    match render_htmlbook(&graph) {
        Ok(html) => Ok(html),
        Err(_) => Err(PyRuntimeError::new_err("Error converting asciidoc string")),
    }
}

#[pymodule]
fn asciidocr(m: &Bound<'_, PyModule>) -> PyResult<()> {
    m.add_function(wrap_pyfunction!(parse_to_html, m)?)
}

Build a wheel, install it, and then from within Python:

$ python
Python 3.13.1 (main, Jan  7 2025, 10:41:20) [Clang 16.0.0 (clang-1600.0.26.6)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import asciidocr
>>> asciidoc = "This is _pretty freakin' cool_, right?!"
>>> html = asciidocr.parse_to_html(asciidoc)
>>> print(html)
<!DOCTYPE html>
<html lang="en">

<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1">
    <title></title>
</head>

<body>

    <p>This is <em>pretty freakin' cool</em>, right?!</p>
</body>

</html>

So that’s nice, and potentially useful. As a friend pointed out recently, I need to get this up on PyPI, but, you know, in time…

Loose Ends

So that’s writing an asciidoc parser in Rust, covered in a pretty high-level way (I could in theory go back and add more detail, but this post is far, far too long). And there are plenty of loose ends so far as the project itself goes, like:

  • Actually covering the entirety of the asciidoc language

  • Allowing users to supply stylesheets for HTML builds via the CLI (and I never talked above about the CLI, did I? Or the packaging process? Maybe separate posts; anyway I used clap).

  • Creating an Asciidoctor-compliant HTML backend, because that means that folks can use this more as a "drop-in replacement" if they want

  • Finishing the docx build

  • …other future dream-big builds that I don’t want to talk about yet (OK: PDFs, I’m talking about PDFs).

  • And much, much more!

In any case.

As with all newer skills, the biggest benefit to my Rust knowledge was just having to write an ass-ton of Rust. I also think I learned something about design patterns, about balance (i.e., maybe it would have been more "pure" to keep some things in the Parser, but it was so much easier to just make the Scanner a little bit smarter sometimes), and about writing software more generally. I like Rust, in part, because it makes you really consider what the "right" thing to do is (okay: I really like it mostly because the tooling is so damn good), and this in turn makes me think about writing all code differently (apologies to my coworkers, who now have to put up with me importing Rust-y patterns into Python — I promise I’ll only do it when it makes sense!).

Mostly, though: I’m just happy I now have a tool that does more or less what I want it to do, and quickly (not to brag, but some very non-scientific testing has asciidocr converting a file to HTML in 0.01s user time, whereas asciidoctor takes a whole 0.32s user time. It’s an admittedly small but noticeable difference, especially for larger documents). So in that sense: mission achieved. Yay.



2024: Another Year in Reading

It’s December 31, and the probability of my finishing the two-and-a-half books I’ve got in the hopper before midnight is diminishing rapidly. One I only started yesterday (The Third Lie), one is poems and I like to take my time with poems (Monad+Monadnock), and the half is a book club book (The Shape of Content), and I’ve found that it’s better if I read the chapters closer to book club when we do book club. All of which is to say that, despite the apparent trend towards late November/early December "wrapped"-like things, what follows is an accurate and complete list of the books I’ve finished this year.

The primary observation from this "year of reading" is that I read a heckuva lot more than most years. It’s the highest "count" (not that the number really means anything) I’ve had since I’ve been keeping track, and perhaps the highest count ever for me. The primary motivation for this was the need to find "comp titles" for the novel manuscript I’m trying to convince someone to agent or publish early in the year. This meant that I read contemporary fiction in an intentional way for the first time in ever. It was pretty instructive, I think, and though I’m still not sure my comps are really great comps, I did read a lot, and feel like I have at least some provisional understanding of the "market" for "literary" novels, even if I also have ideas about comp titles in general for literary — and especially "experimental"-leaning — novels. But that’s a separate thing from the book list.

I did a few rereads this year, with notables including Life of a Star and The Anthologist, both books I love for very different reasons. Speaking of Janes, Jane Alison’s Villa E was maybe my favorite "I’ve been waiting forever to read this" of the year, given that I am 99% sure I remember her mentioning an "architecture book" as far back as 2012, when I took classes with her. It lived up to the hype in my head.

I was happy to come across an ARC of James, which I really enjoyed, even if I preferred — for my own tastes and interests — Dr. No. A few books I don’t really remember at all, e.g., the Rovelli and the Christie, and there are a few (which I will keep to myself) that I still can’t decide if I liked or not, or if they were good or not. For nonfiction, I really loved The Essays of Leonard Michaels, and Once Upon a Prime made me very happy. I’m very glad I finally got to Monsters, which I greatly enjoyed. For history, I cannot recommend The Blazing World highly enough. It paired nicely with The Name of the Rose, which admittedly took me forever to read, but this was in part because I was savoring it, reading it only before bed, etc.

The latter part of the year was a lot of poetry, I think partly due to my having gone to a proper poetry reading for the first time in a very long time. And luckily, the book I got there, Visitors from the Red Star, was super-excellent. The Gold Cell is one of the best collections I’ve ever read, and I’m kind of glad I waited (not intentionally) to read it in my 30s. I’ve had such good luck with the "dollar carts" outside Commonwealth Books and the Brattle Bookshop (both of which are on my "lunch walk" route when I go into the office downtown) this year; The Gold Cell was one find, but maybe the best was Country Cooking and Other Stories, even if this means that I am now forever on the hunt for books published by Burning Deck (RIP).

There were a couple more programming books in there this year, and they were fine. The best ones I read (e.g., Command-Line Rust) I never finished for one reason or another, and only "finished" books end up on the list. I feel like this is often true of programming books, though, which means they’ll always be under-represented here. I’m not going to lose sleep over this.

The year more or less finished with Claudia Rankine, Agota Kristof, Robert Coover (RIP), and Lucia Berlin. This has been a nice way to end the year, even if the Kristof novels are deeply upsetting (and excellent), and the Lucia Berlin stories make me feel far too many things.

There were other notables I didn’t necessarily mention above — e.g., The Organs of Sense might have been my favorite novel of the year, if you were to account for my very intense recency bias when it comes to "favorites" — but, you know, I’m around if you ever want to talk about one or another of them.

In any case, it was a lovely year of reading.

And so the list:

  1. Ladies' Lunch and Other Stories by Lore Segal

  2. How the Internet Happened by Brian McCullough

  3. Seven Brief Lessons on Physics by Carlo Rovelli

  4. The Physiology of Taste by Jean Anthelme Brillat-Savarin, trans. MFK Fisher

  5. Life of a Star by Jane Unrue

  6. The Blazing World by Jonathan Healey

  7. The Ladies-in-Waiting by Javier Olivares and Santiago García

  8. So Long, See You Tomorrow by William Maxwell

  9. Pandora’s Jar by Natalie Haynes

  10. The Essays of Leonard Michaels by Leonard Michaels

  11. Bonsai by Alejandro Zambra

  12. One Woman Show by Christine Coulson

  13. The Age of Wire and String by Ben Marcus

  14. A Life of One’s Own by Joanna Biggs

  15. LaserWriter II by Tamara Shopsin

  16. Marigold and Rose: A Fiction by Louise Glück

  17. Before and After the Book Deal by Courtney Maum

  18. The Beautiful Race: The Story of the Giro d’Italia by Colin O’Brien

  19. Drifts by Kate Zambreno

  20. Conversation of the three wayfarers by Peter Weiss

  21. Crooked House by Agatha Christie

  22. The Berlin Wall (Bookfair Presale Edition) by David Leo Rice

  23. The Name of the Rose by Umberto Eco

  24. Dr. No by Percival Everett

  25. A Shock by Keith Ridgway

  26. Electronic Literature by Scott Rettberg

  27. Interior Chinatown by Charles Yu

  28. re: f(gesture) by Percival Everett

  29. Chaucer by Marion Turner

  30. The Organs of Sense by Adam Ehrlich Sachs

  31. How to Travel with a Salmon and Other Essays by Umberto Eco

  32. The Heart is a Lonely Hunter by Carson McCullers

  33. Once Upon a Prime by Sarah Hart

  34. Villa E by Jane Alison

  35. Visitors from the Red Star by August Smith

  36. Robust Python by Patrick Viafore

  37. James by Percival Everett

  38. Consider the Oyster by MFK Fisher

  39. Lunch Poems by Frank O’Hara

  40. The Anthologist by Nicholson Baker

  41. Tidy First? by Kent Beck

  42. The Poetics of Space by Gaston Bachelard

  43. The Invention of Morel by Adolfo Bioy Casares

  44. Fierce Poise: Helen Frankenthaler and 1950s New York by Alexander Nemerov

  45. Country Cooking and Other Stories by Harry Mathews

  46. Small Things Like These by Claire Keegan

  47. Pockets: An Intimate History of How We Keep Things Close by Hannah Carlson

  48. The Hatred of Poetry by Ben Lerner

  49. In Concrete by Anne Garréta, trans. Emma Ramadan

  50. Meditations in an Emergency by Frank O’Hara

  51. Monsters by Claire Dederer

  52. The Gold Cell by Sharon Olds

  53. Citizen by Claudia Rankine

  54. The Notebook by Ágota Kristóf

  55. The Enchanted Prince by Robert Coover

  56. A Manual for Cleaning Women by Lucia Berlin

  57. The Proof by Ágota Kristóf

Tell Just One Person About

I’ve been listening to a podcast lately while cooking called A History of Rock Music in 500 Songs, which always includes the following line in the outro:[1]

If you’ve enjoyed this episode, please by all means subscribe… but more importantly, please tell just one other person about this podcast. Word of mouth is the best way to get information out about any creative work. So please, if you like this, tell someone. Thank you very much.

I’ve been thinking about this a lot in relation to the books I read and recommend. Specifically the novel Villa E by Jane Alison, who, for disclosure, I took classes with way back at U Miami. It’s a damn good fucking book. Jane is so lovely. But mostly it’s a great fucking book, and I want everybody to read it, though specifically a couple writer friends of mine, one of whom is also working (albeit on the back burner) on an "architecture novel," the other of whom likes/writes literary historical fiction and good prose (Villa E is both). And I really enjoyed it, myself,[2] and so I’ve told people about it.

And funny enough, I only heard about that podcast because one of my cycling buddies (who is also a musician) told me about it: so it must be working.

This is all in the context, too, of — you know — the political climate, and I can’t help but think that in the face of gigantic, large, looming disasters like climate change et al, the only thing to do — the only thing I can think to do, anyway, given my time, abilities, influence — is go local, go talking to people. Thus Two Page Tuesday. Thus a few other projects I’ve got simmering on my own mental back burner. Thus why I keep telling everyone to go read Jane’s book. Why I’ve mentioned the podcast to a few folks. Why I keep trying to think of ways to get journals and agents and folks to the readings so they can see my great friends' great work and sign it.

But the shit takes time. We’ll get there,[3] eventually.


1. At least for the first however-many episodes; I’m still way behind current.
2. So much so that I read it faster than I meant to on a trip, and needed to acquire another book to get me home.
3. Fun fact: the novel I wrote in one of Jane’s classes was called "When We Get There." I have not returned to said novel in about a decade and can no longer remember if it was any good. I’m sure it was fine. I was twenty-two, a child, really.

Growth for Other than Growth’s Sake

Last night at Two Page Tuesday,[1] we had the largest crowd yet at one of the readings (or, to be honest, any of the related events that have been running since January). We had around 20 at the first reading, about 15 at the second, 25 or so at the third, and last night, according to Megan’s count, we had somewhere in the area of 35 to 40 fucking people there, which is insane.[2] All seven of the readers were excellent, and pretty much all of the new (to the party) readers brought friends. Somehow a creative writing club at one of the colleges caught wind of it and decided to come. We had at least three people in attendance with ties to either gallery/arts spaces or arts/book events there, you know, for networking.[3] A lot of people seemed to meet a lot of people. I’d had an unusual kind of day[4] and was aggressively squirrel brained, but even still (or perhaps because of this) I’m pretty sure I shook everyone’s hand in the room and collected emails and future readers. The bar was happy with us, too, and has invited us back again for January. By pretty much all (nonexistent) metrics, it was a wild success. I’m still buzzing.

The growth is good, I think, but not because it’s "growth" but because people want to be there and seem to be getting something from it. I don’t know how many times I told the "origin story," i.e., explained that it was a bar night that turned into a bi-monthly reading which is now aspirationally monthly.

But it’s been slow to build. Intentionally.

I’m thinking about this today while I scribble these notes (let’s say) on my lunch break at work, where the imperative — for good or no — is more or less always to grow, more or less as fast as we can.[5] And I’m thinking about this in terms of reach and purpose and everything else, about how there’s no point in trying to grow a community if there aren’t community ties, how there’s no point in adding people if the people don’t actually get to know each other, if the people being added don’t meet folks or have the kind of experience that makes them go, "Well fuck, I guess all these people are writing or working on novels or whatever, maybe I ought to do some of that, too."[6]

And part of it is functional and/or selfish, sure: I can only do so much, and scale requires resources of time if not many other things, and I have only so much time. So scaling slowly means I have the time to build the infrastructure I need for it to stay fun and not a pain in the ass, like what more or less happened with Response. And as was pointed out by one of the writers I was talking to earlier today who had generously offered help, "delegation takes brain batteries," and there’s no point in delegating if I don’t have the energy to. So slow is good. I think of what a Navy vet I worked with about a decade ago used to say: "Slow is smooth and smooth is fast." I think about this phrase often — like, a lot.

So we grow slow. But we do grow. And I love that it’s not linear, or even always growing larger. And I love that folks took time out of their busy lives to come and read and/or listen and have a good time, and I love that my hunch that many would stay for karaoke afterwards panned out, and I love that when I say "we" about it I really do mean a "we," that though the organization of it is still more or less a one-person operation, there is a "we": there are regulars. People to point to. People who help.

It’s a beautiful thing. And I’m already looking forward to the next one on December 3.


1. While I do intend, at some point, to write something for this blog other than nonsense about Two Page Tuesday, this is not that post.
2. In a good way, obviously.
3. They were not actually there to network, but rather to support their friends (much better). But some networking did happen, nevertheless, yes.
4. A separate, unrelated story.
5. To my company’s credit, we tend to actually be pretty conservative about the speed at which we grow, and yet we still had layoffs this quarter, soooooooooooooo.
6. As the not-so-secret goal of this project, like many of my other projects, is to get creative work out of people.

Two Page Tuesday #4, Now With Karaoke

It’s next week, Nov 12! At Charlie’s! From the website:

We are SO EXCITED to announce that the 4th edition of Two Page Tuesday marks our first “odd month” event, as well as the first (of hopefully many, many) journey(s) across the river into Cambridge, and what better place to land than CHARLIE’S KITCHEN. Better yet, an hour or so after our reading, we shall hearken back to days of yore (ask someone about the c.2017-era Breakwater Reading Series) and join in song for KARAOKE at CHARLIE’S, which a few of us were able to confirm is an excellent thing to do after a gathering in September. It’s seriously going to be so, so fun.

Come along if you’re around!