Writing an Asciidoc Parser in Rust: Asciidocr
I really only ever make something when I want something to exist that doesn’t already, or when I want something that does exist to more readily suit my (admittedly) idiosyncratic needs or thoughts about how it should exist. For better or worse, I have a lot of wants, and so I make a lot of things (e.g., Two Page Tuesday, or last night’s mostly-successful attempt at tapering a pair of pants I got at Global Thrift, or an early solve for the problem I’m solving here).
So: I wrote an asciidoc parser in Rust. I called it asciidocr because the Command-Line Rust book put an r after all the "clone a UNIX tool" projects, and I liked that convention.
Asciidoc is a lightweight markup language that is, in my opinion, the best one. Why it’s the best one is a separate issue entirely, but we can at least safely assume that it’s a good one, and the one that, for better or worse, I’ve been using to write nearly everything I’ve written for personal or professional use in the last five years or so. While it started as a Python project, it got new life (and a bunch of new features) when it was more or less taken over by the fine Asciidoctor folks, who wrote their converter in Ruby. It works very well, and does a lot of things. But.
It’s in Ruby, a language I have petty beef with and which, more importantly, is an interpreted, not compiled, language, which means that for every new machine I want to convert asciidoc files on, I need to install Ruby. And there are some other things too, in part pertaining to the way that templates must be written for custom output(s); it’s frankly a little slow; and whatever else.
But mostly it was the "I don’t want to have to write Ruby to extend the thing"
that got me thinking. I was dreaming about a
text-based writing management tool (like a
Scrivener but for folks who use vim), and having already written a tool to
make generating PDFs
from asciidoc easier, I knew that if I wanted to write this next app in
anything but Ruby, I’d need to either (a) subprocess out to the Ruby; (b) rely
on the old asciidoc.py
project, with its limitations (and also therefore
limiting myself to writing in Python, which, like Ruby, means that if I wanted
to share my tool, the folks using it would need to be able to install Python);
or (c) find or build a converter in a different language. So after getting part
of the way through an (a) implementation in Python, I cut my losses and started
looking more seriously into option (c), for I was learning Rust and Go(lang).
There is, in fact, a pretty good Go implementation of an asciidoc parser/converter. And there was a hot second when it looked like my company might transition to Go for some backend stuff, so I picked up Powerful Command-Line Applications in Go and got to work. Unfortunately I realized pretty quickly that I am allergic to the following, oft-repeated pattern in the language:
if err != nil {
    return err
}
And then it became clear that we weren’t going to be using Go at work, so I dropped it.
Rust, on the other hand: boy-howdy did I love (and still do) working in that. And sure, there wasn’t a very feature-complete asciidoc parser or converter yet, but I liked the language and figured I could learn something: so I asked for some mentorship (thanks big time to Kit Dallege for everything that follows) and got to work.
My background is, of course, very humanities-focused. I mean, sure, there was a math minor in there somewhere, but that was all in service of a brief glimmer of a future doing philosophy of math, so. I’ve written a lot of code, and have been writing some kind of code or other since I was a small kid (thank you, hackable Geocities sites), but I have no "computer science education." Learning how to write a parser seemed like a good way to go.
And instead of relying on a lexing package (e.g., something like pest), where you write a grammar and the thing does it for you, Kit recommended I do the whole thing by hand, since I’d learn more (and potentially it could be faster, or at least a smaller binary).
So that’s more or less what I did. It’s not perfect; it could, of course, be improved; there are some decisions I made early on that I would not make today, knowing what I know now; and I am very fucking proud of it. So we can dig in.
Pretending it’s a Compiler
Googling around got me to a few resources that seemed like they’d be relevant, specifically the Commonmark Spec section about parsing, but what really ended up sticking in my brain was a book called Crafting Interpreters, which I someday would love to go back and really read for its intended purpose. But since I was going to be doing more or less the first half (up to the point where you do something with the tree you’ve created by scanning and parsing the code), I figured this would be a good place to start, and it was! Very well-written, too. So much so that it made sense even though I haven’t pretended to know anything about reading Java in years.
What this meant, anyway, was that I had a clear path forward. Prior to asking for help, I’d written a half-of-a-half implementation that mixed up the lexing and the parsing and the output all together, but this was going to be better: in terms of building it, in terms of architecture, and in terms of being able to do other things with the tree/graph once I had it. So what I would then do was:
- Scan the document into tokens
- Parse those tokens into a tree
- Take that tree and do something with it
Easy enough, right?
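(Jumping ahead a little: in the finished crate those three steps end up wired together in just a couple of lines. Here is a hedged sketch of that pipeline, using the same calls the pyo3 bindings near the end of this post use; the module paths are my assumption based on that example, not a promise about the crate’s public API.)

use std::path::PathBuf;

// A sketch of the three-step pipeline. Paths and signatures mirror the pyo3
// example shown later in this post; treat them as illustrative, not gospel.
use asciidocr::backends::htmls::render_htmlbook;
use asciidocr::parser::Parser;
use asciidocr::scanner::Scanner;

fn convert_to_html(adoc: &str) -> Option<String> {
    let scanner = Scanner::new(adoc); // 1. scan the document into tokens
    let graph = Parser::new(PathBuf::from("-")).parse(scanner); // 2. parse tokens into a tree/graph
    render_htmlbook(&graph).ok() // 3. do something with it (here: make HTML)
}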
Scanning, Lexing, Whatever You Want to Call It
I’m still not sure what the difference between "scanning" and "lexing" is, if there is one at all, but anyway I needed to generate some tokens. I don’t plan on going into too much detail about the why/how of this (instead I refer you back to Crafting Interpreters), but there are a few interesting (annoying?) things about asciidoc that I think are worth mentioning here.
Like markdown, asciidoc is essentially a line-based language. The most
significant character is therefore the line break, \n, and in some
worlds/lights it makes sense to parse asciidoc line-by-line. If I were to go
back and do it as a "one-shot" parser (which, according to the chatter in the
Asciidoc community chat, isn’t possible anyway), I might do it as a
line-by-line thing. Instead, however, I did the scanning character-by-character,
in part because that’s what the book told me to do, and in part because keeping
track of the newline tokens actually made parsing much easier in the end (I
think/hope, anyway).
So the scanning.
Maybe the best "new thing I started using a lot" of 2024 was the humble Enum. I started using them in Python for a specific thing, and then started using them more, and one of the things I like best about Rust is that it takes its Enums seriously. So, to wit, the first thing I did was create a big-ass TokenType enum:
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TokenType {
    NewLineChar,
    LineContinuation,
    ThematicBreak,
    PageBreak,
    Comment,
    PassthroughBlock, // i.e., "++++"
    SidebarBlock,     // i.e., "****"
    SourceBlock,      // i.e., "----"
    // ...snip
Note: All source can be found in the Github repo. I’m going to condense and remove some comments and things in this post as needed to keep it clean.
And then a Struct for each token:
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Token {
    pub token_type: TokenType,
    pub lexeme: String, // raw string of code
    pub literal: Option<String>, // our literals are only ever strings (or represented as such)
    pub line: usize,
    pub startcol: usize,
    pub endcol: usize,
    /// The file's stack hierarchy if it's an include, otherwise stays empty
    pub file_stack: Vec<String>,
}
There is a draft official schema for how an asciidoc document should be (able
to be) represented, and that’s why we’re keeping track of line, startcol, etc.
I think if I were to go back and clean this up, we could probably drop the
literal attribute, since we don’t really need it (this was inspired/copied from
the Crafting Interpreters way of doing things, which has different requirements
than what we have, ultimately).
So once we have our Token structs to play with, we can then proceed to actually scanning the document into tokens. We create a Scanner struct to hold some state and the source and things:
#[derive(Debug)]
/// Scans an asciidoc `&str` into [`Token`]s to be consumed by the Parser.
pub struct Scanner<'a> {
    pub source: &'a str,
    start: usize,
    startcol: usize,
    current: usize,
    line: usize,
    file_stack: Vec<String>,
}
And then, because Rust has such good pattern matching, the actual work just becomes a(n admittedly gigantic) match/switch statement:
fn scan_token(&mut self) -> Token {
    let c = self.source.as_bytes()[self.current] as char;
    self.current += 1;
    match c {
        '\n' => self.add_token(TokenType::NewLineChar, false, 1),
        '\'' => {
            if self.starts_repeated_char_line(c, 3) {
                self.current += 2;
                self.add_token(TokenType::ThematicBreak, false, 0)
            } else if ['\0', ' ', '\n'].contains(&self.peek_back()) && self.peek() == '`' {
                self.current += 1;
                self.add_token(TokenType::OpenSingleQuote, true, 0)
            } else {
                self.add_text_until_next_markup()
            }
        }
        // ...snip
In order to keep things moving along speedily (because, in addition to being "cool," Rust is also supposed to be "fast"), the actual scanning function is implemented as an Iterator (a "generator" in Python-speak):
impl<'a> Iterator for Scanner<'a> {
    type Item = Token;

    fn next(&mut self) -> Option<Self::Item> {
        if !self.is_at_end() {
            self.start = self.current;
            return Some(self.scan_token());
        }
        None
    }
}
(It was amazing how easy it was to do that, really.)
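Because the scanner is just an Iterator over Tokens, driving it is ordinary iterator code. A quick illustrative sketch (the strings and printing here are mine, not from the repo; it assumes Scanner and Token are in scope):

fn main() {
    // Hypothetical driver code, for illustration: Scanner implements
    // Iterator<Item = Token>, so we can loop over it like any other iterator.
    let scanner = Scanner::new("== A Heading\n\nSome text.\n");
    for token in scanner {
        println!("{:?} on line {}", token.token_type, token.line);
    }

    // ...or gather the whole token stream up front, e.g. for a test assertion:
    let tokens: Vec<Token> = Scanner::new("Some text.\n").collect();
    assert!(!tokens.is_empty());
}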
Some fun nuances came up because we’re dealing with "text" instead of "code,"
and they ended up being about character boundaries. So take something like the
humble ellipsis (…) or an emoji: these require multiple bytes to represent.
This means that sometimes you might try to do something between the bytes it
takes to represent the character, which makes the scanner sad (and die, or in
Rust-parlance, panic!).
(It occurs to me now that I should have specified earlier that we’re scanning byte by byte, not character-by-character; there are some reasons for doing this that I don’t feel like explaining to do with the way text is encoded and then handled by Rust, so, just, like, trust me that this was a good way to do it.)
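A tiny standalone illustration of the boundary problem (nothing to do with the parser itself, just plain std):

fn main() {
    let s = "…"; // U+2026 HORIZONTAL ELLIPSIS: one character, three bytes in UTF-8
    assert_eq!(s.len(), 3); // len() counts bytes, not characters
    assert!(s.is_char_boundary(0)); // the start of the character is a boundary
    assert!(!s.is_char_boundary(1)); // byte 1 is in the middle of the character
    // let oops = &s[0..1]; // uncommenting this slice would panic at runtime
}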
Getting around this means that we just check for character boundaries when we
look around to see, based on context, what kind of token we should be producing.
And we do a lot of looking around! Here are a few of those lookaround helpers,
noting the easy-to-use is_char_boundary() function in there:
fn peek(&self) -> char {
    if self.is_at_end() || !self.source.is_char_boundary(self.current) {
        return '\0';
    }
    self.source.as_bytes()[self.current] as char
}

fn peek_back(&self) -> char {
    if self.start == 0 || !self.source.is_char_boundary(self.start - 1) {
        return '\0';
    }
    self.source.as_bytes()[self.start - 1] as char
}

fn peeks_ahead(&self, count: usize) -> &str {
    if self.is_at_end()
        || self.current + count > self.source.len()
        || !self.source.is_char_boundary(self.current + count)
    {
        return "\0";
    }
    &self.source[self.current..self.current + count]
}
This means that, say, if we get a character -, and know it’s the beginning of
a new line (i.e., that self.peek_back() == '\n'), and we can peeks_ahead to
see that self.peeks_ahead(4) == "---\n", we know that we should generate a
TokenType::SourceBlock delimiter token. Scanning is essentially that, but,
like, a bunch of times with a bunch of edge cases and nuances (e.g., because
that four-repeated-characters-before-a-newline is such a common pattern, you
write a function that checks that for you).
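Since that check comes up a lot, here is a rough, free-standing sketch of what one can look like. To be clear, this is not asciidocr’s actual starts_repeated_char_line helper, just an illustration of the idea:

/// Illustrative only (not asciidocr's actual helper): report whether, starting
/// at byte offset `start`, the source contains `count` copies of `c` followed
/// by a newline (or the end of input), i.e., a delimiter line like "----".
fn is_repeated_char_line(source: &str, start: usize, c: char, count: usize) -> bool {
    let mut chars = source[start..].chars();
    for _ in 0..count {
        if chars.next() != Some(c) {
            return false;
        }
    }
    matches!(chars.next(), Some('\n') | None)
}

fn main() {
    assert!(is_repeated_char_line("----\nfn main() {}\n----", 0, '-', 4));
    assert!(!is_repeated_char_line("--- just a dash-y paragraph\n", 0, '-', 4));
}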
This, naturally, segues into unit testing!
There are a lot of tests around the scanner! I haven’t yet gotten around to
running coverage on it, but I think it’s pretty good. One thing I don’t like
about Rust is that, by convention, you keep unit tests in the same file as the
code they’re testing. I see why you’d want to do that, but also my
scanner/mod.rs file is a whopping 1932 lines long. Coming from Python-land…
ouch! Still: it works, especially if you use my new best friend rstest, which
works so analogously to our dear friend pytest that I was able to get up and
running in a matter of minutes with it, simplifying the test-cases dramatically:
#[rstest]
#[case("NOTE", TokenType::NotePara)]
#[case("TIP", TokenType::TipPara)]
#[case("IMPORTANT", TokenType::ImportantPara)]
#[case("CAUTION", TokenType::CautionPara)]
#[case("WARNING", TokenType::WarningPara)]
fn inline_admonitions(#[case] markup_check: &str, #[case] expected_token: TokenType) {
    let markup = format!("{}: bar.", markup_check);
    let expected_tokens = vec![
        Token::new_default(
            expected_token,
            format!("{}: ", markup_check),
            Some(format!("{}: ", markup_check)),
            1,
            1,
            markup_check.len() + 2, // account for space
        ),
        Token::new_default(
            TokenType::Text,
            "bar.".to_string(),
            Some("bar.".to_string()),
            1,
            markup_check.len() + 3,
            markup_check.len() + 6,
        ),
    ];
    scan_and_assert_eq(&markup, expected_tokens);
}
Easy, right? So let’s now suppose we scan our document-as-a-&str into a bunch of tokens. We then parse them. Yay!
Parser-ing
…and again we use a big-ass match statement. But before we can really get into
that, we need to look at what we’re doing all this parsing into, namely a
(mostly) spec-compliant Abstract Syntax Graph.

So parsing then becomes a matter of looking at a given Token and deciding what
to do with it. Because "what to do with it" is often a matter of context, we
build a lot of that context into our Parser:
/// Parses a stream of tokens into an [`Asg`] (Abstract Syntax Graph), returning the graph once all
/// tokens have been parsed.
pub struct Parser {
    /// Where the parsing "starts," i.e., the adoc file passed to the script
    origin_directory: PathBuf,
    /// allows for "what just happened" matching
    last_token_type: TokenType,
    /// optional document header
    document_header: Header,
    /// document-level attributes, used for replacements, etc.
    document_attributes: HashMap<String, String>,
    /// holding ground for graph blocks until it's time to push to the main graph
    block_stack: Vec<Block>,
    /// holding ground for inline elements until it's time to push to the relevant block
    inline_stack: VecDeque<Inline>,
    /// holding ground for includes file names; if inside an include push to stack, popping off
    /// once the file's tokens have been accommodated (this allows for simpler nesting)
    file_stack: Vec<String>,
    /// holding ground for a block title, to be applied to the subsequent block
    block_title: Option<Vec<Inline>>,
    /// holding ground for block metadata, to be applied to the subsequent block
    metadata: Option<ElementMetadata>,
    /// counts in/out delimited blocks by line reference; allows us to warn/error if they are
    /// unclosed at the end of the document
    open_delimited_block_lines: Vec<usize>,
    /// appends text to block or inline regardless of markup, token, etc. (will need to change
    /// if/when we handle code callouts)
    open_parse_after_as_text_type: Option<TokenType>,
    // convenience flags
    in_document_header: bool,
    /// designates whether we're to be adding inlines to the previous block until a newline
    in_block_line: bool,
    /// designates whether new literal text should be added to the last span
    in_inline_span: bool,
    /// designates whether, despite newline last_tokens_types, we should append the current block
    /// to the next
    in_block_continuation: bool,
    /// forces a new block when we add inlines; helps distinguish between adding to section.title
    /// and section.blocks
    force_new_block: bool,
    /// Temporarily preserves newline characters as separate inline literal tokens (where ambiguous
    /// blocks, i.e., DListItems, may require splitting the inline_stack on the newline)
    preserve_newline_text: bool,
    /// Some parent elements have non-obvious closing conditions, so we want an easy way to close these
    close_parent_after_push: bool,
    /// Used to see if we need to add a newline before new text; we don't add newlines to the text
    /// literals unless they're continuous (i.e., we never count newline paras as paras)
    dangling_newline: Option<Token>,
}
(As an aside: I’m keeping the comments on this struct, as opposed to many of the others I’ve shown above, in part because it’s useful and in part because I want to shout out to docs.rs for making it SUPER easy to generate really nice documentation for your project. Makes my former technical writer heart happy.)
We keep track of a lot of state, and frankly it got a little over-complicated, but also I didn’t have the time to make it simpler, so: it works, you know?
Again we have a big match statement with a lot of arms like:
TokenType::QuoteVerseBlock => {
    // check if it's verse
    if let Some(metadata) = &self.metadata {
        if metadata.declared_type == Some(AttributeType::Verse) {
            self.parse_delimited_leaf_block(token);
            return;
        }
    } else if self.open_parse_after_as_text_type.is_some() {
        self.parse_delimited_leaf_block(token);
        return;
    }
    self.parse_delimited_parent_block(token);
}
These, in turn, generate various Block and Inline objects that get added to our Abstract Syntax Graph:
#[derive(Serialize, Debug)]
pub struct Asg {
    pub name: String,
    #[serde(rename = "type")]
    pub node_type: NodeTypes,
    #[serde(skip_serializing_if = "Option::is_none")]
    pub attributes: Option<HashMap<String, String>>,
    #[serde(skip_serializing_if = "Option::is_none")]
    pub header: Option<Header>,
    #[serde(skip)]
    /// footnote references
    document_id: String,
    #[serde(skip)]
    /// Hash of all IDs in the document, and the references they point to
    document_id_hash: HashMap<String, Vec<Inline>>,
    /// Document contents
    pub blocks: Vec<Block>,
    pub location: Vec<Location>,
}
So by and by we build our graph, which takes something like:
This document has two paragraphs.


Paragraphs may be separated by one or more empty lines.
Into:
{
  "name": "document",
  "type": "block",
  "blocks": [
    {
      "name": "paragraph",
      "type": "block",
      "inlines": [
        {
          "name": "text",
          "type": "string",
          "value": "This document has two paragraphs.",
          "location": [ { "line": 1, "col": 1 }, { "line": 1, "col": 33 } ]
        }
      ],
      "location": [ { "line": 1, "col": 1 }, { "line": 1, "col": 33 } ]
    },
    {
      "name": "paragraph",
      "type": "block",
      "inlines": [
        {
          "name": "text",
          "type": "string",
          "value": "Paragraphs may be separated by one or more empty lines.",
          "location": [ { "line": 4, "col": 1 }, { "line": 4, "col": 55 } ]
        }
      ],
      "location": [ { "line": 4, "col": 1 }, { "line": 4, "col": 55 } ]
    }
  ],
  "location": [ { "line": 1, "col": 1 }, { "line": 4, "col": 55 } ]
}
Note: All that location stuff is required by the schema; I don’t like it, but hey, it’s not all about me. If ever somebody takes this to create a better asciidoc LSP or something, it’ll be useful information. (Or if I ever start doing more error handling/verification for the user.)
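Getting that JSON out of the graph is, happily, not much work on the Rust side, since the Asg derives Serialize. A sketch (serde_json as an assumed dependency, with a generic bound so I don’t have to guess at module paths):

/// Sketch: the Asg derives serde::Serialize (as shown above), so producing the
/// schema-shaped JSON is a single serde_json call on a parsed graph.
fn to_asg_json<T: serde::Serialize>(graph: &T) -> serde_json::Result<String> {
    serde_json::to_string_pretty(graph)
}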
I could perhaps go into more detail about how the parsing actually works, but, you know, it’s just creating objects, and this is getting long. So if you’re curious, look at the code (or holler at me on Bluesky and I’ll do a follow-up post about whichever part you’re interested in). We’ll now turn to doing something with this graph we’ve made.
Turning it Into Something Useful (Templating)
The first, most obvious useful thing for the parser to do is produce HTML, since
that can be turned into basically anything else, one way or another. Instead of
targeting the kind of HTML that Asciidoctor produces (which I find overly
div-heavy), I targeted an HTML standard called "HTMLBook", in part because
that’s what I use for work and am therefore most comfortable with, and in part
because it’s clean and simple and more like what pick-your-favorite-markdown
converter produces. So to make HTML, we use templating. Yes! Our old friend
templating. From Dreamweaver templates to LiquidTemplates to handlebars to
Jinja/Django, they’re all more or less the same. More or less usable. Etc. For
this project I went with one called tera, after trying one called askama, which
was really really cool but ultimately was hard to make work nicely with serde.

tera, on the other hand, is basically just Django templates. I write Django
templates at work. Easy:
{% import "inline.html.tera" as inline_macros %}
{% import "block.html.tera" as block_macros %}
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1">
<title>{%- if header %}
{%- for inline in header.title %}
{{- inline_macros::process_inline(inline=inline) -}}
{% endfor -%}
{% endif -%}</title>
</head>
<body>{% for block in blocks %}
{{ block_macros::process_block(block=block,skip_tag=false) -}}
{% endfor %}
</body>
</html>
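On the Rust side, handing the graph to those templates is mostly serde’s doing. Something along these lines works (a sketch using tera’s public API; the glob and template name below are made up for illustration and are not the exact asciidocr wiring):

use serde::Serialize;
use tera::{Context, Tera};

// Sketch, not the exact asciidocr wiring: anything that derives Serialize --
// like the Asg -- can be turned into a tera Context and rendered. The template
// glob and entry-point name here are hypothetical.
fn render_with_tera<T: Serialize>(graph: &T) -> Result<String, tera::Error> {
    let tera = Tera::new("templates/**/*.tera")?; // compile the .tera templates
    let context = Context::from_serialize(graph)?; // header, blocks, etc.
    tera.render("htmlbook.html.tera", &context) // render the top-level template
}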
There is a pretty annoying recursion issue (not the fault of tera so much as
the fault of what I’m trying to do with it), which means that the block and
inline macro code is… ugly. But hey, it works to produce nice, clean documents
like the following:
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1">
<title></title>
</head>
<body>
<p>What follows is an aside.</p>
<aside data-type="sidebar">
<h5>Aside Title</h5>
<p>Some aside text!</p></aside>
</body>
</html>
Nice.
But Wait! There’s More!
So now we’ve more or less gotten to the point where we’ve duplicated a good
chunk of what asciidoctor, the reference implementation, does in terms of
parsing and conversion, but of course: asciidocr, this implementation, DOES NOT
DO EVERYTHING ASCIIDOCTOR DOES, and doesn’t intend to. It does, however, handle
a whole bunch of the language, including nice things like include:: directives
(see the limitations doc in the repo for more). But this all started because I
not only wanted a non-interpreted-language implementation (with Rust we can
generate binaries), but also because I wanted to do other stuff, more easily.
So let’s talk about a little of that.
Docx
If there is a "killer feature" of asciidocr, it is that it will — eventually —
produce Word/docx files natively. Creating docx files is a PAIN IN THE ASS, but
it’ll be worth it for folks like me who want to write their fictions and
whatever else in asciidoc, but then have to send journals and agents and
publishers Word documents.
I’m currently rewriting the implementation of the DOCX backend, but even now, if
you install the tool with the --features docx flag enabled (for more on what I’m
talking about when I talk about installing a Rust feature, see here), you can
get a docx created IF:
- It’s only prose and headings
- BUT it can include italics and bold and stuff
The reimplementation will be better and handle more things — tables, lists,
etc. — but I wanted to write this post now, instead of waiting for it to be
"done," since "done" is a myth when it comes to software. Anyway: go try it out!
My hope is for the docx backend to be stable enough that I don’t need to hide
it behind a feature flag anymore.
Rust and Python
And, somewhat finally, another feature-flag thing: calling asciidocr from
Python, making asciidoc conversions super fast with modern syntax (compared to
asciidoc.py). All the credit for this really goes to the pyo3 project, but
building on top of their brilliant work, it’s very easy to do something like:
#![cfg(feature="python")]
use std::path::PathBuf;

use crate::scanner;
use crate::parser;
use crate::backends::htmls::render_htmlbook;
use pyo3::{exceptions::PyRuntimeError, prelude::*};

/// parses a string using the specified backend
#[pyfunction]
fn parse_to_html(adoc_str: &str) -> PyResult<String> {
    let graph = parser::Parser::new(PathBuf::from("-")).parse(scanner::Scanner::new(adoc_str));
    match render_htmlbook(&graph) {
        Ok(html) => Ok(html),
        Err(_) => Err(PyRuntimeError::new_err("Error converting asciidoc string")),
    }
}

#[pymodule]
fn asciidocr(m: &Bound<'_, PyModule>) -> PyResult<()> {
    m.add_function(wrap_pyfunction!(parse_to_html, m)?)
}
Build a wheel, install it, and then from within Python:
$ python
Python 3.13.1 (main, Jan 7 2025, 10:41:20) [Clang 16.0.0 (clang-1600.0.26.6)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import asciidocr
>>> asciidoc = "This is _pretty freakin' cool_, right?!"
>>> html = asciidocr.parse_to_html(asciidoc)
>>> print(html)
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1">
<title></title>
</head>
<body>
<p>This is <em>pretty freakin' cool</em>, right?!</p>
</body>
</html>
So that’s nice, and potentially useful. As a friend pointed out recently, I need to get this up on PyPI, but, you know, in time…
Loose Ends
So there’s writing an asciidoc parser in Rust, in a pretty high-level way (I could in theory go back and add more detail, but this post is far, far too long). And there are plenty of loose ends so far as the project itself goes, like:
- Actually covering the entirety of the asciidoc language
- Allowing users to supply stylesheets for HTML builds via the CLI (and I never talked above about the CLI, did I? Or the packaging process? Maybe separate posts; anyway I used clap).
- Creating an Asciidoctor-compliant HTML backend, because that means that folks can use this more as a "drop-in replacement" if they want
- Finishing the docx build
- …other future dream-big builds that I don’t want to talk about yet (OK: PDFs, I’m talking about PDFs).
- And much, much more!
In any case.
As with all newer skills, the biggest benefit to my Rust knowledge was just
having to write an ass-ton of Rust. I also think I learned something about
design patterns, about balance (i.e., maybe it would have been more "pure" to
keep some things in the Parser, but it was so much easier to just make the
Scanner a little bit smarter sometimes), and about writing software more
generally. I like Rust, in part, because it makes you really consider what the
"right" thing to do is (okay: I really like it mostly because the tooling is so
damn good), and this in turn makes me think about writing all code differently
(apologies to my coworkers, who now have to put up with me importing Rust-y
patterns into Python — I promise I’ll only do it when it makes sense!).
Mostly, though: I’m just happy I now have a tool that does more or less what I
want it to do, and quickly (not to brag, but compare some very non-scientific
testing that has asciidocr converting a file to HTML in 0.01s user, whereas
asciidoctor takes a whole 0.32s user. It’s an admittedly small but noticeable
difference, especially for larger documents). So in that sense: mission
achieved. Yay.