matklad

TIL: Symlinking NixOS Dotfiles

2026-05-21T00:00:00+00:00

The standard answer to managing dotfiles on NixOS is home-manager. I’ve never used it, due to two aesthetic and one practical objection:

I avoid dependencies, especially in nix, which rivals Python in the number of approaches to dependency management.
home-manager installs packages for the current user only, which makes sense on non-NixOS systems. But on a single-user desktop system, I prefer having just one set of packages.
Having a source of truth for dotfiles be in nix store requires rebuilding your system to change config, which gets in the way of Emacs-style direct tinkering.

The approach I like is storing dotfiles in the same repository as flake.nix / configuration.nix and symlinking them in place.

The problem here is that NixOS seemingly doesn’t have a “native” way to say that /a/b/c should be a symlink to /c/d/e. Or does it?

If you search options for symlink, you’ll learn about environment.etc which allows you to configure symlinks, but only for things in /etc, not your ~/.config.

For the latter, you can use gnu stow or some other dotfile link manager, but the complexity of the problem of just managing symlinks doesn’t warrant yet another dependency. It’s fine to do this manually.

But wouldn’t it be nice if this framework for declarative configuration of your system allowed you to declaratively configure symlinks? Turns out this is possible, in a roundabout way. The inaptly named systemd-tmpfiles allows creating symlinks from a declarative config, and you can use NixOS to configure systemd-tmpfiles itself (thanks, NobbZ!).

For example, if I want to symlink ~/dotfiles/git/config to .config/git/config:

{
  systemd.tmpfiles.rules = [
    "L+ /home/matklad/.config/git/config - - - - /home/matklad/dotfiles/git/config"
  ];
}

No opinion at this point how this compares to a bespoke script or something more purpose-built.

Always Be Blaming

2026-05-18T00:00:00+00:00

A few tips on 4D-ing your code comprehension skills.

I wrote on the importance of reading code before: Look Out For Bugs. My default approach to reading is “predictive”: I don’t actually read the code line by line. Rather, I try to understand the problem that it wants to solve, then imagine my own solution, and read the “diff” between what I have in my mind and what I see in the editor. Non-empty “diff” signifies either a bug in my understanding, or an opportunity to improve the code.

This is 2D reading, understanding a snapshot of code, frozen in time. This is usually enough to spot “this feels odd” anomalies, worthy of further investigation.

Ideal code is memoryless — it precisely solves the problem at hand. Most real code is Markov — the shape of the code at time T depends not only on the problem statement, but also on the shape of the code at time T - 1. The 3D step is to trace the evolution of code over time, Where Do We Come From? What Are We? Where Are We Going?.

The step after that is to understand the why. What were we thinking back then, when we wrote this code? It’s useful to have the “theory of mind” concept ready here. I personally learned the term way too late in my life, so let me give a short intro for today’s lucky 10 000. Theory of mind is the ability to imagine yourself in someone else’s skin. Not just in their shoes (“I certainly would have acted differently in that situation”), but with their mind (“I wouldn’t have acted that way, but I get why they did”). This is something people learn. The experimental setup here is to have a child in a room with toys, with a doll sitting near the opposite end of the room, and asking the child “what does the doll see?”. Younger children describe the room from their perspective, older begin to intuit that doll’s perspective is different.

So this is the goal of reading code — understanding what the original author was thinking, and why.

End of the mumbo-jumbo, some practical advice. First, read Every line of code is always documented, it is very good.

Second, make sure it is effortless for you to find out how a given snippet of code evolved. This is harder than it seems! Just git blame isn’t an answer — mind the gap between the problem that’s easy to solve, and the problem in need of solving.

git blame answers spatial question of “how each line appeared in this file”, because there’s a relatively straightforward UI for this — annotate each line with a commit hash. But this is not the question you are asking most of the time! You don’t care about the file! There’s a small snippet of code in the middle, and you want a temporal history of that.

As much as I don’t like working in the browser, GitHub’s web interface for blaming is probably better than what you get locally by default. It starts with the y shortcut, which resolves a symbolic reference like

https://github.com/tigerbeetle/tigerbeetle/blob/main/src/vsr/replica.zig

into the one which has a commit hash in the URL:

https://github.com/tigerbeetle/tigerbeetle/blob/c54f613a2eb2a127a0ba212704e3fa988c42e5cb/src/vsr/replica.zig

This commit hash is critical, because it anchors the entire repository — if you open a different file from the web UI, it will be shown as of that commit. This enables you to not myopically focus on just the diff in question, but to absorb the entire context at that point in time.

So my usual web workflow is:

ctrl+f to find the line I am interested in
b to toggle blame
Click “blame prior to change” a couple of times, repeating ctrl+f to go back to the snippet I am curious about.
cmd-click on the commits that are potentially relevant, pinning their commit hashes in the URL in new tabs.
Then, from the commit page, “Browse files” button to then go and t to other files. Or, cmd+l to focus browser’s address bar, and s/commit/tree/ (or back!) as needed, to switch between diff and snapshot views.

Again, my goal here is not to annotate a diff on a file but rather to get a “virtual checkout” as of the interesting commit.

This web approach is what I was using throughout most of my career, but I’ve finally found a way to replicate it locally. The idea is to make blaming “in-place”. Instead of git blame annotating lines of code, I directly switch to a historical commit. I have the following devil hydra of shortcuts:

, b l blames line. It notes the $line the cursor is at, runs git blame -L $line,$line to find $commit that introduced the line, and then runs git switch --detach $commit to check it out. I have a dedicated worktree for code archeology, so I don’t worry about trashing my work. There’s also a half-hearted attempt to maintain “logical” cursor position, but it doesn’t work very well. Is there some git command that tells me directly “what’s the equivalent of $file:$line:$column in $sha-A for $sha-B?”

, b p blames parent. Which is just switching to the parent commit of the current HEAD, what “blame before this change” does on GitHub (it works slightly differently because it assumes that , b l was the previous command)

, b u undoes the last blaming operation, switching to the previous point. I really love that, on the web, I can cmd-click to create an alternative branch of exploration. In theory, this is replicatable locally, but I prefer to destructively mutate a single working tree on disk. A big reason for preferring in-place blame is that LSP, ./zig/zig build test, rg and the like just work. That’s more important for me than the garden of forking paths, and undo is an acceptable work-around.

Finally, , b w copies GitHub link to the current commit and line, which I can paste into the browser. An enormous problem with modern version control landscape is that absolutely critical information in the form of code review comments is not a part of the git repository, and is locked in someone else’s proprietary database. I failed to solve this problem in one weekend, and had to begrudgingly adapt. Opening the commit in a browser links you to the PR and its discussion as well.

Implementing this blame workflow required a bit of custom code. Feel free to use it, but beware that it’s somewhat crufty, especially around maintaining current cursor position. Making a production-ready version of this sounds like a fun project ;-)

Catch Flakes On Main

2026-05-14T00:00:00+00:00

A small Mechanical Habit today:

When using not rocket science rule / merge queue, continue to redundantly run the full test suite on main. Maintain an easily accessible list of recent main failures — these are the flaky tests to eradicate.

For an example, see the “Flakes” link on https://devhub.tigerbeetle.com

Flaky tests are tests that fail intermittently, once in a thousand runs. This might be due to a genuine bug (assumptions about scheduling that mostly hold) or due to instability of underlying infrastructure (e.g., inability to download a release from GitHub, or to delete a folder on Windows). In either case, flaky tests are a huge productivity drain — as the size and complexity of test suite grows, more and more CI runs fail spuriously, even as each individual test almost always passes.

Flaky tests are challenging to deal with — if you are working on landing a PR and your CI fails due to an obvious flake, the temptation to just re-run the test suite is enormous, especially if there’s a certain background dissatisfaction with infrastructure stability.

If you are of a mind to do some flake squashing, then your PRs will be green just to spite you! And working off of others’ PRs would first require separating flakes from genuine failures.

This is why the merge queue is powerful: if there’s a guarantee that every commit on the main branch passes the tests, then every failure on main is a flake, by definition. Collecting all such failures into a single list compresses time, allows prioritizing the most impactful sources of instability, and reveals correlations between failures.

Learning Software Architecture

2026-05-12T00:00:00+00:00

In reply to an email asking about learning software design skills as a researcher physicist:

I was attached to a bioinformatics lab early in my career, so I think I understand what you are talking about, the phenomenon of “scientific code”! My thoughts:

First meta observation is that “software design” is something best learned by doing. While I had some formal “design” courses at the University, and I was even “an architect” for our course project, that stuff was mostly make-believe, kindergarteners playing fire-fighters. What really taught me how to do stuff was an accident of my career, where my second real project (IntelliJ Rust) propelled me to a position of software leadership, and made design my problem. I did make a few mistakes in IJ Rust, but nothing too horrible, and I learned a lot. So that’s good news — software engineering is simple enough that an inquisitive mind can figure it out from first principles (and reading random blog posts).

Second meta observation, the bad news: Conway’s law is important. Softwaregenesis repeats the social architecture of the organization producing software. Or, as put eloquently by neugierig,

If I were to summarize what I learned in a single sentence, it would be this: we talk about programming like it is about writing code, but the code ends up being less important than the architecture, and the architecture ends up being less important than social issues.

I suspect that the difference you perceive between industrial and scientific software is not so much about software-building knowledge, but rather about the field of incentives that compels people to produce the software. Something like “my PhD needs to publish a paper in three months” is perhaps a significant explainer?

Two things you can do here. One, at times you get a chance to design or nudge an incentive structure for a project. This happens once in a blue moon, but is very impactful. This is the secret sauce behind TIGER_STYLE, not the set of rules per se, but the social context that makes this set of rules a good idea.

Two, you can speedrun the four stages of grief to acceptance. Incentive structure is almost never what you want it to be, but, if you can’t change it, you can adapt to it. This is also true about most industrial software projects — there’s never a time to do a thing properly, you must do the best you can, given constraints.

Let me use rust-analyzer as an example. The physical reality of the project is that it’s simultaneously very deep (it’s a compiler! Yay!) and very wide (opposite to an LLM, a classical IDE is a lot of purpose-built special features). The social reality is that “deep compiler” can attract a few brilliant dedicated contributors, and that the “breadth features” can be a good fit for an army of weekend warriors, people who learn Rust, who don’t have sustained capacity to participate in the project, but who can sink an hour or two to scratch their own itch.

My insistence that rust-analyzer doesn’t require building rustc, that it builds on stable, that it doesn’t have any C dependencies, and that the entire test suite takes seconds, was in the service of the goal of attracting high-impact contributors. I was wrangling the build system to make sure people can work on the borrow checker without thinking about anything else.

To attract weekend warriors, the internals of rust-analyzer are split into multiple independent features, where each feature is guarded by catch_unwind at runtime. The thinking was that I explicitly don’t want to care too much about quality there, that the bar for getting a feature PR in is “happy path works & tested”. It’s fine if the code crashes, it will only attract further contributors, provided that:

the quality is isolated to a feature, and doesn’t spill over,
at runtime, the crash is invisible to the user (it’s crucial that rust-analyzer features work with an immutable snapshot, and can’t poison the data).

In contrast, when working on the core spine which provided support for features, I was relatively more pedantic about quality.

A word of caution about adapting to, rather than fixing incentive structure — the future is uncertain, and tends to happen in the least convenient manner. The original motivation behind rust-analyzer experiment was to avoid the need to write a parallel compiler (the one in IntelliJ Rust), and to prototype a better architecture for LSP, so that the learnings could be backported to rustc. So, even in core (especially in core), the code was very experimental. Oh well. Stuck with one more compiler now, I guess?

I might hazard a guess that something similar happened to the uutils project, which started as the primary destination for people learning Rust, and ended up as Ubuntu’s coreutils implementation.

Third, now to some concrete recommendations. Sadly, I don’t know of a single book I can recommend which contains the truths. I suspect one can only find such a book in an apocryphal short story by Borges: practice seems to be an essential element here. But here are some things worth paying attention to:

Boundaries talk by Gary Bernhardt is an all-time favorite. It contains solid object-level advice, and, for me, it triggered the meta inquiry.

How to Test is something I wish I had. I immediately understood the importance of testing, but it took me a long time to grow arrogant enough to admit that most widely-cited testing advice is shamanistic snake-oil, and to conceptualize what actually works.

∅MQ guide and, more generally, writings by Pieter Hintjens introduced me to Conway’s Law thinking. That “feature development” architecture of rust-analyzer? – optimistic merging, applied.

Reflections on a decade of coding by Jamii is excellent, goes very meta. It is intentionally the first of my links.

Ted Kaminski’s blog is the closest there is to a coherent theory of software development, appropriately framed as a set of notes to a non-existing book!

As for the actual books, Software Engineering at Google and Ousterhout’s The Philosophy of Software Design are often recommended. They are good. SWE, in particular, helped me with a couple of important names. But they weren’t groundbreaking for me.

Steering Zig Fmt

2026-05-08T00:00:00+00:00

Two tips on using zig fmt effectively. Read this if you are writing Zig, or if you are implementing a code formatter.

For me, zig fmt is better than any other formatter I used: rustfmt, the one in IntelliJ, deno fmt. zig fmt is steerable. For every syntactic construct, it has several variations for how it might be laid out. The variation used is selected by looking at what’s currently in a file.

Easier to show a pair of examples:

    f(1, 2,
      3);

// -> zig fmt ->

    f(1, 2, 3);

    f(1, 2,
      3,);

// -> zig fmt ->

    f(
        1,
        2,
        3,
    );

Depending on the trailing comma, a function call is formatted on a single line, or with one argument per line.

The way this plays out in practice is that you decide how you want to lay out the code, add a couple of ,, hit the reformat shortcut (, p is mine), and zig fmt does the rest. For me, this works better than the alternative of the formatter guessing. 90% of great formatting is blank lines between logical blocks and tasteful choice of intermediate variables, so you might as well lean into key choices, rather than eliminate them.

I know of one non-trivial formatting customization point: columnar layout for arrays:

    .{ 1, 2, 3,
       4, 5, 6, 7, 8, 9, 10, 11,  };

One would think that a trailing comma would lead to a number-per-line layout, but, for arrays, zig fmt also takes note of the first line break. In this case, the line break comes after the first three items, so we get three numbers per line, aligned in columns.
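The formatted result is not shown here. Assuming zig fmt pads the columns the way the paragraph above describes, it should look roughly like the sketch below. The `numbers` name and the `[_]u8` element type are additions for illustration, so that the snippet compiles on its own.

```zig
const std = @import("std");

// Hypothetical constant; the layout is what zig fmt is expected to produce
// for an array literal whose first line break comes after three items.
const numbers = [_]u8{
    1,  2,  3,
    4,  5,  6,
    7,  8,  9,
    10, 11,
};

test "layout is purely cosmetic" {
    try std.testing.expectEqual(@as(usize, 11), numbers.len);
}
```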

How cool is that!

Furthermore, with judicious use of ++ (array concatenation), you can vary the number of items per line. When I need to pass --key value pairs to subprocess, I often go for formatting like this:

try run(&(.{ "aws", "s3", "sync", path, url } ++ .{
    "--include",            "*.html",
    "--include",            "*.xml",
    "--metadata-directive", "REPLACE",
    "--cache-control",      "max-age=0",
}));

Minimal Viable Zig Error Contexts

2026-05-03T00:00:00+00:00

fn process_file(io: Io, path: []const u8) !void {
    errdefer log.err("path={s}", .{path});

    const fd = try Io.Dir.cwd().openFile(io, path, .{});
    defer fd.close(io);

    // ...
}

Out of the box, Zig provides minimal and sufficient facilities for error handling — strongly-typed error codes. Error reporting is left to the user. Idiomatic solution is to pass a Diagnostics out parameter (“sink”) to materialize human-readable strings as needed.

Diagnostics pattern works well for “production” code, but for more script-y code it adds too much friction relative to the default option of a plain try fallible(), which of course gives a less than ideal message on failure:

λ zig build
error: FileNotFound
~/.cache/zig/p/../lib/std/Io/Threaded.zig:4866:35: 0x1044126c7 in dirOpenFilePosix (fail)
                        .NOENT => return error.FileNotFound,
                                  ^
~/.cache/zig/p/../lib/std/Io/Dir.zig:578:5: 0x104347d8b in openFile (fail)
    return io.vtable.dirOpenFile(io.userdata, dir, sub_path, options);
    ^
~/fail/main.zig:10:16: 0x10443da5f in f (fail)
    const fd = try Io.Dir.cwd().openFile(io, path, .{});
               ^
~/fail/main.zig:6:5: 0x10443db47 in main (fail)
    try process_file(io, "data.txt");
    ^

Error trace is helpful, but knowing which file is the problem is even more so.

The first attempt at finding a middle ground between fully-fledged diagnostics sink pattern and a plain try is something like this:

const fd = dir.openFile(io, path, .{}) catch |err| {
    log.err("failed to open file '{s}': {t}", .{path, err});
    return err;
};

Unsatisfactory. The friction is high, you need to come up with a reasonably-sounding error message, the “happy path” of the code is obscured, and you need to repeat this for every fallible operation.

A worse-is-better version of the above code is

errdefer log.err("path={s}", .{path});
const fd = try dir.openFile(io, path, .{});

That is, just log error context as key=value pairs, guarded by errdefer. The result is not pretty, but passable:

λ zig build
error: path=./data.txt
error: FileNotFound
~/.cache/zig/p/../lib/std/Io/Threaded.zig:4866:35: 0x1044126c7 in dirOpenFilePosix (fail)
                        .NOENT => return error.FileNotFound,
                                  ^
~/.cache/zig/p/../lib/std/Io/Dir.zig:578:5: 0x104347d8b in openFile (fail)
    return io.vtable.dirOpenFile(io.userdata, dir, sub_path, options);
    ^
~/fail/main.zig:10:16: 0x10443da5f in f (fail)
    const fd = try Io.Dir.cwd().openFile(io, path, .{});
               ^
~/fail/main.zig:6:5: 0x10443db47 in main (fail)
    try process_file(io, "data.txt");
    ^

The friction is reduced a lot:

No need to come up with any error messages beyond existing variable names.
No need to change any of the trys.
The context is set per-block. If a function does several fallible operations on a file, the path needs to be specified only once.
The context is “telescopic”: every function in the call-stack can add its own context.
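The telescoping can be checked with a small sketch. Everything here is hypothetical: `note` and `context_log` stand in for `log.err` so that the order of the recorded context is observable, and `open` is a fake that always fails.

```zig
const std = @import("std");

// Hypothetical in-memory sink standing in for `log.err`.
var context_log: [8][]const u8 = undefined;
var context_len: usize = 0;

fn note(context: []const u8) void {
    context_log[context_len] = context;
    context_len += 1;
}

fn open(path: []const u8) error{FileNotFound}!void {
    _ = path;
    return error.FileNotFound; // pretend that the file is always missing
}

fn process_file(path: []const u8) !void {
    errdefer note(path); // innermost context: which file
    try open(path);
}

fn sync_site(site: []const u8) !void {
    errdefer note(site); // outer context: which site
    try process_file("data.txt");
}

test "each frame on the error path adds its own context" {
    context_len = 0;
    try std.testing.expectError(error.FileNotFound, sync_site("example.org"));
    try std.testing.expectEqual(@as(usize, 2), context_len);
    try std.testing.expectEqualStrings("data.txt", context_log[0]);
    try std.testing.expectEqualStrings("example.org", context_log[1]);
}
```

The innermost context is recorded first, because each errdefer runs as the error leaves its own function.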

There’s one huge drawback though — the error message is logged, even if the error is subsequently handled. This is especially important in Zig 0.16, where cancelation (serendipitous-success) is a possible error for any IO-ing operation, and which is intended to be handled, rather than reported.
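A minimal sketch of that drawback, with hypothetical names throughout: `last_logged` stands in for the log, and `read_config` is a fake that always fails. The caller recovers completely, yet the context has already been written.

```zig
const std = @import("std");

// Hypothetical stand-in for the log: remembers the last context "logged".
var last_logged: ?[]const u8 = null;

fn read_config(path: []const u8) error{FileNotFound}![]const u8 {
    errdefer last_logged = path; // stands in for log.err("path={s}", .{path})
    return error.FileNotFound; // pretend that the file is always missing
}

// The caller fully handles the error by falling back to a default.
fn config_or_default(path: []const u8) []const u8 {
    return read_config(path) catch "default";
}

test "a handled error still leaves a log entry behind" {
    last_logged = null;
    try std.testing.expectEqualStrings("default", config_or_default("app.conf"));
    // The errdefer fired before the caller got a chance to handle the error.
    try std.testing.expectEqualStrings("app.conf", last_logged.?);
}
```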

Generalizing:

Happy path adds context to all operations in-progress.
Errors materialize current context.

This does feel like a better error management strategy than decorating errors individually, when they happen. I wonder which language features facilitate this style?

This article https://goldstein.lol/posts/error-progress/ rather convincingly argues that the answer might be “none”?

256 Lines or Less: Test Case Minimization

2026-04-20T00:00:00+00:00

Property Based Testing and fuzzing are a deep and science-intensive topic. There are enough advanced techniques there for a couple of PhDs, a PBT daemon, and a client-server architecture. But I have this weird parlor-trick PBT library, implementable in a couple of hundred lines of code in one sitting.

This week I’ve been thinking about a cool variation of a consensus algorithm. I implemented it on the weekend. And it took just a couple of hours to write a PBT library itself first, and then a test, that showed a deep algorithmic flaw in my thinking (after a dozen trivial flaws in my coding). So, I don’t get to write more about consensus yet, but I at least can write about the library. It is very simple, simplistic even. To use an old Soviet joke about Babel and Bebel, it’s Gogol rather than Hegel. But for just 256 lines, it’s one of the highest power-to-weight ratio tools in my toolbox.

Read this post if:

You want to stretch your generative testing muscles.
You are a do-it-yourself type, and wouldn’t want to pull a ginormous PBT library off the shelf.
You would pull a library, but want to have a more informed opinion about available options, about essential and accidental complexity.
You want some self-contained real-world Zig examples :P

Zig works well here because it, too, is exceptional in its power-to-weight.

FRNG

The implementation is a single file, FRNG.zig, because the core abstraction here is a Finite Random Number Generator — a PRNG where all numbers are pre-generated, and can run out. We start with standard boilerplate:

const std = @import("std");
const assert = std.debug.assert;

entropy: []const u8,

pub const Error = error{OutOfEntropy};

const FRNG = @This();

pub fn init(entropy: []const u8) FRNG {
    return .{ .entropy = entropy };
}

In Zig, files are structs: you obviously need structs, and the language becomes simpler if structs are re-used for what files are. In the above const FRNG = @This() assigns a conventional name to the file struct, and entropy: []const u8 declares instance fields (only one here). const Error and fn init are “static” (container level) declarations.

The only field we have is just a slice of raw bytes, our pre-generated random numbers. And the only error condition we can raise is OutOfEntropy.

The simplest thing we can generate is a slice of bytes. Typically, the API for this takes a mutable slice as an out parameter:

pub fn fill(prng: *PRNG, bytes: []u8) void { ... }

But, due to the pre-generated nature of FRNG, we can return the slice directly, provided that we have enough entropy. This is going to be our (sole) basis function, everything else is going to be a convenience helper on top:

pub fn bytes(frng: *FRNG, size: usize) Error![]const u8 {
    if (frng.entropy.len < size) return error.OutOfEntropy;
    const result = frng.entropy[0..size];
    frng.entropy = frng.entropy[size..];
    return result;
}
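To see the consuming behavior concretely, here is a minimal sketch. It wraps the declarations from above in a named struct (rather than a file struct) so that the snippet is self-contained; the entropy bytes in the test are made up.

```zig
const std = @import("std");

// Same field and functions as above, wrapped in a named struct so that the
// example fits in a single snippet.
const FRNG = struct {
    entropy: []const u8,

    pub const Error = error{OutOfEntropy};

    pub fn init(entropy: []const u8) FRNG {
        return .{ .entropy = entropy };
    }

    pub fn bytes(frng: *FRNG, size: usize) Error![]const u8 {
        if (frng.entropy.len < size) return error.OutOfEntropy;
        const result = frng.entropy[0..size];
        frng.entropy = frng.entropy[size..];
        return result;
    }
};

test "bytes are handed out front to back until they run out" {
    var frng = FRNG.init(&[_]u8{ 1, 2, 3 });

    try std.testing.expectEqualSlices(u8, &[_]u8{ 1, 2 }, try frng.bytes(2));
    // Only one byte is left, so asking for two fails.
    try std.testing.expectError(error.OutOfEntropy, frng.bytes(2));
    // A failed request consumes nothing, so the last byte is still there.
    try std.testing.expectEqualSlices(u8, &[_]u8{3}, try frng.bytes(1));
}
```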

The next simplest thing is an array (a slice with a fixed size):

pub fn array(frng: *FRNG, comptime size: usize) Error![size]u8 {
    return (try frng.bytes(size))[0..size].*;
}

Notice how Zig goes from runtime-known slice length, to comptime known array type. Because size is a comptime constant, slicing []const u8 with [0..size] returns a pointer to array, *const [size]u8.

We can re-interpret a 4-byte array into u32. But, because this is Zig, we can trivially generalize the function to work for any integer type, by passing in an Int comptime parameter of type type:

const builtin = @import("builtin");

pub fn int(frng: *FRNG, Int: type) Error!Int {
    comptime {
        assert(@typeInfo(Int).int.signedness == .unsigned);
        assert(builtin.cpu.arch.endian() == .little);
    }
    return @bitCast(try frng.array(@sizeOf(Int)));
}

This function is monomorphised for every Int type, so @sizeOf(Int) becomes a compile-time constant we can pass to fn array.

Production code would be endian-clean here, but, for simplicity, we encode our endianness assumption as a compile-time assertion. Note how Zig communicates information about endianness to the program. There isn’t any kind of side-channel or extra input to compilation, like --cfg flags. Instead, the compiler materializes all information about target CPU as Zig code. There’s a builtin.zig file somewhere in the compiler caches directory that contains

pub const cpu: std.Target.Cpu = .{
    .arch = .aarch64,
    .model = &std.Target.aarch64.cpu.apple_m3,
    // ...
}

This file can be accessed via @import("builtin") and all the constants inspected at compile time.
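For comparison, one possible endian-clean variant is sketched below. It is not what the library in this post does: it swaps @bitCast for std.mem.readInt with an explicit byte order, and it only accepts integer widths that are a whole number of bytes. The struct wrapper and the condensed `array` helper are there only to make the snippet self-contained, and the `.int` field name assumes a recent Zig.

```zig
const std = @import("std");
const assert = std.debug.assert;

const FRNG = struct {
    entropy: []const u8,

    pub const Error = error{OutOfEntropy};

    // Condensed stand-in for `bytes` + `array` from above.
    pub fn array(frng: *FRNG, comptime size: usize) Error![size]u8 {
        if (frng.entropy.len < size) return error.OutOfEntropy;
        const result: [size]u8 = frng.entropy[0..size].*;
        frng.entropy = frng.entropy[size..];
        return result;
    }

    // Always decodes the bytes as little-endian, whatever the target CPU is.
    pub fn int(frng: *FRNG, comptime Int: type) Error!Int {
        comptime assert(@typeInfo(Int).int.signedness == .unsigned);
        // @divExact is a compile error for widths like u7.
        const size: usize = comptime @divExact(@typeInfo(Int).int.bits, 8);
        const raw = try frng.array(size);
        return std.mem.readInt(Int, &raw, .little);
    }
};

test "integers are decoded least significant byte first" {
    var frng = FRNG{ .entropy = &[_]u8{ 0x01, 0x02, 0xFF } };
    try std.testing.expectEqual(@as(u16, 0x0201), try frng.int(u16));
    try std.testing.expectEqual(@as(u8, 0xFF), try frng.int(u8));
    try std.testing.expectError(error.OutOfEntropy, frng.int(u8));
}
```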

We can make an integer, and a boolean is even easier:

pub fn boolean(frng: *FRNG) Error!bool {
    return (try frng.int(u8)) & 1 == 1;
}

Strictly speaking, we only need one bit, not one byte, but tracking individual bits is too much of a hassle.

From an arbitrary int, we can generate an int in range. As per Random Numbers Included, we use a closed range, which makes the API infallible and is usually more convenient at the call-site:

pub fn int_inclusive(frng: *FRNG, Int: type, max: Int) Error!Int

As a bit of PRNG trivia, while this could be implemented as frng.int(Int) % (max + 1), the result will be biased (not uniform). Consider the case where Int = u8, and a call like frng.int_inclusive(u8, 64 * 3 - 1), so that the modulus is 64 * 3.

The numbers in 0..64 are going to be twice as likely as the numbers in 64..(64*3), because the last quarter of 256 range will be aliased with the first one.
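The bias is small enough to count by brute force. The sketch below takes the modulus to be 64 * 3 = 192 (that is, max = 191), reduces every possible u8 value with %, and tallies the results. `modulo_histogram` is a hypothetical helper, not part of the library.

```zig
const std = @import("std");

// Counts how often each result in 0..less_than occurs when every possible
// u8 value is reduced with `%`. A uniform generator would give equal counts.
fn modulo_histogram(comptime less_than: usize) [less_than]u32 {
    var counts = [_]u32{0} ** less_than;
    var x: usize = 0;
    while (x < 256) : (x += 1) counts[x % less_than] += 1;
    return counts;
}

test "x % 192 favors the first 64 values" {
    const counts = modulo_histogram(64 * 3);
    // 0..64 is hit twice: once directly, once by 192..256 wrapping around.
    try std.testing.expectEqual(@as(u32, 2), counts[0]);
    try std.testing.expectEqual(@as(u32, 2), counts[63]);
    // 64..192 is hit only once.
    try std.testing.expectEqual(@as(u32, 1), counts[64]);
    try std.testing.expectEqual(@as(u32, 1), counts[191]);
}
```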

Generating an unbiased number is tricky and might require drawing an arbitrary number of bytes from entropy. Refer to https://www.pcg-random.org/posts/bounded-rands.html for details. I didn’t, and copy-pasted code from the Zig standard library. Use at your own risk!

pub fn int_inclusive(frng: *FRNG, Int: type, max: Int) Error!Int {
    comptime assert(@typeInfo(Int).int.signedness == .unsigned);
    if (max == std.math.maxInt(Int)) return try frng.int(Int);

    const bits = @typeInfo(Int).int.bits;
    const less_than = max + 1;

    var x = try frng.int(Int);
    var m = std.math.mulWide(Int, x, less_than);
    var l: Int = @truncate(m);
    if (l < less_than) {
        var t = -%less_than;

        if (t >= less_than) {
            t -= less_than;
            if (t >= less_than) t %= less_than;
        }
        while (l < t) {
            x = try frng.int(Int);
            m = std.math.mulWide(Int, x, less_than);
            l = @truncate(m);
        }
    }
    return @intCast(m >> bits);
}
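The “arbitrary number of bytes” part is observable with hand-picked entropy. The sketch below repeats int_inclusive from above next to a condensed `int` (an assumption made for self-containment: it reads little-endian via std.mem.readInt instead of @bitCast). The byte values in the test were chosen by tracing the algorithm for Int = u8 and max = 5.

```zig
const std = @import("std");
const assert = std.debug.assert;

const FRNG = struct {
    entropy: []const u8,

    pub const Error = error{OutOfEntropy};

    // Condensed stand-in for `bytes` + `array` + `int` from above.
    pub fn int(frng: *FRNG, comptime Int: type) Error!Int {
        const size: usize = comptime @divExact(@typeInfo(Int).int.bits, 8);
        if (frng.entropy.len < size) return error.OutOfEntropy;
        const result = std.mem.readInt(Int, frng.entropy[0..size], .little);
        frng.entropy = frng.entropy[size..];
        return result;
    }

    pub fn int_inclusive(frng: *FRNG, comptime Int: type, max: Int) Error!Int {
        comptime assert(@typeInfo(Int).int.signedness == .unsigned);
        if (max == std.math.maxInt(Int)) return try frng.int(Int);

        const bits = @typeInfo(Int).int.bits;
        const less_than = max + 1;

        var x = try frng.int(Int);
        var m = std.math.mulWide(Int, x, less_than);
        var l: Int = @truncate(m);
        if (l < less_than) {
            var t = -%less_than;
            if (t >= less_than) {
                t -= less_than;
                if (t >= less_than) t %= less_than;
            }
            while (l < t) { // rejected: draw again
                x = try frng.int(Int);
                m = std.math.mulWide(Int, x, less_than);
                l = @truncate(m);
            }
        }
        return @intCast(m >> bits);
    }
};

test "a rejected draw costs extra entropy" {
    // 44 is accepted right away: 44 * 6 = 264, and 264 >> 8 = 1.
    var accepted = FRNG{ .entropy = &[_]u8{44} };
    try std.testing.expectEqual(@as(u8, 1), try accepted.int_inclusive(u8, 5));

    // 0 is rejected, so a second byte is drawn: 255 * 6 = 1530, 1530 >> 8 = 5.
    var rejected = FRNG{ .entropy = &[_]u8{ 0, 255 } };
    try std.testing.expectEqual(@as(u8, 5), try rejected.int_inclusive(u8, 5));
    try std.testing.expectEqual(@as(usize, 0), rejected.entropy.len);
}
```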

Now we can generate an int bounded from above and below:

pub fn range_inclusive(
    frng: *FRNG, Int: type,
    min: Int, max: Int,
) Error!Int {
    comptime assert(@typeInfo(Int).int.signedness == .unsigned);
    assert(min <= max);
    return min + try frng.int_inclusive(Int, max - min);
}

Another common operation is picking a random element from a slice. If you want to return a pointer to an element, you’ll need const and mut versions of the function. A simpler and more general solution is to return an index:

pub fn index(frng: *FRNG, slice: anytype) Error!usize {
    assert(slice.len > 0);
    return try frng.range_inclusive(usize, 0, slice.len - 1);
}

At the call site, xs[try frng.index(xs)] doesn’t look too bad, is appropriately const-polymorphic, and is also usable for multiple parallel arrays.

Simulation

So far, we’ve spent about 40% of our line budget implementing a worse random number generator that can fail with OutOfEntropy at any point in time. What is it good for?

We use it to feed our system under test with random inputs, see how it reacts, and check that it does not crash. If we code our system to crash if anything unexpected happens and our random inputs cover the space of all possible inputs, we get a measure of confidence that bugs will be detected in testing.

For my consensus simulation, I have a World struct that holds a FRNG and a set of replicas:

const World = struct {
    frng: *FRNG,
    replicas: []Replica,
    // ...
};

World has methods like:

fn simulate_request(world: *World) !void {
    const replica = try world.frng.index(world.replicas);
    const payload = try world.frng.int(u64);

    world.send_payload(replica, payload);
}

I then select which method to call at random:

fn step(world: *World) !void {
    const action = try world.frng.weighted(.{
        .request = 10,
        .message = 20,
        .crash = 1,
    });
    switch (action) {
        .request => try world.simulate_request(),
        .message => { ... },
        .crash => { ... },
    }
}

Here, fn weighted is another FRNG helper that selects an action at random, proportional to its weight. This helper needs quite a bit more reflection machinery than we’ve seen so far:

pub fn weighted(
    frng: *FRNG,
    weights: anytype,
) Error!std.meta.FieldEnum(@TypeOf(weights)) {
    const fields =
        comptime std.meta.fieldNames(@TypeOf(weights));

    var total: u32 = 0;
    inline for (fields) |field|
        total += @field(weights, field);
    assert(total > 0);

    var pick = try frng.int_inclusive(u64, total - 1);
    inline for (fields) |field| {
        const weight = @field(weights, field);
        if (pick < weight) {
            return @field(
                std.meta.FieldEnum(@TypeOf(weights)),
                field,
            );
        }
        pick -= weight;
    }
    unreachable;
}

weights: anytype is compile-time duck-typing. It means that our weighted function is callable with any type, and each specific type creates a new monomorphised instance of a function. While we don’t explicitly name the type of weights, we can get it as @TypeOf(weights).

FieldEnum is a type-level function that takes a struct type:

const S = struct {
    foo: bool,
    bar: u32,
    baz: []const u8
};

and turns it into an enum type with one variant per field, which is exactly what we want for the return type:

const E = enum { foo, bar, baz };

Tip: if you want to quickly learn Zig’s reflection capabilities, study the implementation of std.meta and std.enums in Zig’s standard library.

The @field built-in function accesses a field given a comptime-known field name. It works like Python’s getattr / setattr, with the extra restriction that the field name must be known at compile time.
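
To make the reflection pieces concrete, here is a minimal standalone sketch (not part of the FRNG code). It reuses the struct S from above; the helper name sum_fields is hypothetical and exists only for illustration.

```zig
const std = @import("std");

const S = struct {
    foo: bool,
    bar: u32,
    baz: []const u8,
};

// One enum variant per field of S: foo, bar, baz.
const E = std.meta.FieldEnum(S);

// Hypothetical helper: sums every field of `value`, whatever struct type
// it has. The loop is unrolled at compile time, once per field name.
fn sum_fields(value: anytype) u64 {
    var total: u64 = 0;
    inline for (comptime std.meta.fieldNames(@TypeOf(value))) |name| {
        total += @field(value, name);
    }
    return total;
}

test "reflection helpers" {
    // @field on a type looks up an enum variant by its name.
    try std.testing.expect(@field(E, "bar") == E.bar);

    // @field on a value reads or writes the named field.
    var s: S = .{ .foo = true, .bar = 1, .baz = "hi" };
    @field(s, "bar") = 42;
    try std.testing.expectEqual(@as(u32, 42), s.bar);

    // The same function works for an anonymous struct of weights.
    const total = sum_fields(.{ .request = 10, .message = 20, .crash = 1 });
    try std.testing.expectEqual(@as(u64, 31), total);
}
```

Each distinct argument type passed to sum_fields produces its own instance of the function, which is the monomorphisation mentioned earlier.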

To add one more twist here, I always find it hard to figure out which weights are reasonable, and like to generate the weights themselves at random at the start of the test:

pub fn swarm_weights(frng: *FRNG, Weights: type) Error!Weights {
    var result: Weights = undefined;
    inline for (comptime std.meta.fieldNames(Weights)) |field| {
        @field(result, field) = try frng.range_inclusive(u32, 1, 100);
    }
    return result;
}

(If you feel confused here, check out Swarm Testing Data Structures)

Stepping And Running

Now we have enough machinery to describe the overall shape of the test:

fn run_test(gpa: Allocator, frng: *FRNG) !void {
    var world = World.init(gpa, frng) catch |err|
        switch (err) {
            error.OutOfEntropy => return,
            else => return err,
        };
    defer world.deinit(gpa);

    while (true) {
        world.step() catch |err| switch (err) {
            error.OutOfEntropy => break,
        };
    }
}

const World = struct {
    frng: *FRNG,
    weights: ActionWeights,

    // ...

    const ActionWeights = struct {
        request: u32,
        message: u32,
        crash: u32,
        // ...
    };

    pub fn init(gpa: Allocator, frng: *FRNG) !World {
        const weights = try frng.swarm_weights(ActionWeights);
        // ...
    }

    fn step(world: *World) error{OutOfEntropy}!void {
        const action = try world.frng.weighted(world.weights);
        switch (action) {
            .request => { ... },
            // ...
        }
    }
};

A test needs an FRNG (which ultimately determines the outcome) and a general-purpose allocator for the World. We start by creating a simulated World with random action weights. If FRNG entropy is very low, we can run out of entropy even at this stage. We assume that the code is innocent until proven guilty: if we don’t have enough entropy to find a bug, this particular test returns success. Don’t worry, we’ll make sure that we have enough entropy elsewhere.

We use catch |err| switch (err) to peel off the OutOfEntropy error. I find that, whenever I handle errors in Zig, very often I want to discharge just a single error from the error set. I wish I could use parentheses with a catch:

// NOT ACTUALLY ZIG :(

var world = try World.init(gpa, frng)
    catch (error.OutOfEntropy) return;

Anyway, having created the World, we step through it while we still have entropy left. If any step detects an internal inconsistency, the entire World crashes with an assertion failure. If we get to the end of the while (true) loop, we know that at least that particular slice of entropy didn’t uncover anything suspicious.

Notice what isn’t there. We aren’t generating a complete list of actions up-front. Rather, we make random decisions as we go, and can freely use the current state of the World to construct a menu of possible choices (e.g., when sending a message, we can consider only not currently crashed replicas).

Binary Search the Answer

And here we can finally see the reason why we bothered writing a custom Finite PRNG, rather than using an off-the-shelf one. The amount of entropy in FRNG defines the complexity of the test. The fewer random bytes we start with, the faster we exit the step loop. And this gives us an ability to minimize test cases essentially for free.

Suppose you know that a particular entropy slice makes the test fail (cluster enters split brain at the millionth step). Let’s say that the slice was 16KiB. The obvious next step is to see if just 8KiB would be enough to crash it. And, if 8KiB isn’t, then perhaps 12KiB?

You can binary search the minimal amount of entropy that’s enough for the test to fail. And this works for any test, it doesn’t have to be a distributed system. If you can write the code to generate your inputs randomly, you can measure complexity of each particular input by measuring how many random bytes were drawn in its construction.

And now the hilarious part: it might seem that the way to minimize entropy is to start with a particular failing slice and apply genetic-algorithm mutations to it. But a much simpler approach seems to work in practice: just generate a fresh, shorter entropy slice. If you found some failure at random, then you should be able to randomly stumble into a smaller failing example, if one exists. There are far fewer small examples, so finding a failing one becomes easier when the size goes down!

The Searcher

The problem with binary searching for failing entropy is that a tripped assertion crashes the program. There’s no unwinding in Zig. For this reason, we’ll move the search code to a different process. So a single test will be a binary with a main function, that takes entropy on stdin.

Zig’s new juicy main makes writing this easier than in any previous version of Zig :D

pub fn main(init: std.process.Init) !void {
    const gpa = init.gpa;
    const io = init.io;

    var stdin_reader = std.Io.File.stdin().reader(io, &.{});
    const entropy = try stdin_reader.interface
        .allocRemaining(gpa, .unlimited);
    defer gpa.free(entropy);

    var frng = FRNG.init(entropy);

    var world = World.init(gpa, &frng, .{}) catch |err|
        switch (err) {
            error.OutOfEntropy => return,
            else => return err,
        };
    defer world.deinit(gpa);

    world.run();
}

Main gets Init as an argument, which provides access to things like command line arguments, default allocator and a default Io implementation. These days, Zig eschews global ambient IO capabilities, and requires threading an Io instance whenever we need to make a syscall. Here, we need Io to read stdin.

Now we will implement a harness to call this main. This will be FRNG.Driver:

pub const Driver = struct {
    io: std.Io,
    sut: []const u8,
    buffer: []u8,

    const log = std.log;
};

It will be spawning external processes, so it’ll need an Io. We also need a path to an executable with a test main function, a System Under Test. And we’ll need a buffer to hold the entropy. This driver will be communicating successes and failures to the users, so we also prepare a log for textual output.

How do we get entropy to feed into sut? Because we are only interested in entropy size, we won’t be storing the actual entropy bytes, and instead will generate them from a u64 seed. In other words, just two numbers, entropy size and seed, are needed to reproduce a single run of the test:

fn run_once(driver: Driver, options: struct {
    size: u32,
    seed: u64,
    quiet: bool,
}) !enum { pass, fail } {
    assert(options.size <= driver.buffer.len);
    const entropy = driver.buffer[0..options.size];

    var rng = std.Random.DefaultPrng.init(options.seed);
    rng.random().bytes(entropy);

    var child = try std.process.spawn(driver.io, .{
        .argv = &.{driver.sut},
        .stdin = .pipe,
        .stderr = if (options.quiet) .ignore else .inherit,
    });

    try child.stdin.?.writeStreamingAll(driver.io, entropy);
    child.stdin.?.close(driver.io);
    child.stdin = null;

    const term = try child.wait(driver.io);
    return if (success(term)) .pass else .fail;
}

fn success(term: std.process.Child.Term) bool {
    return term == .exited and term.exited == 0;
}

We use the default deterministic PRNG to expand our short seed into an entropy slice of the required size. Then we spawn the sut process, feeding it the resulting entropy via stdin. Closing the child’s stdin signals the end of entropy. We then return either .pass or .fail depending on the child’s exit code. So, both explicit errors and crashes will be recognized as failures.
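
The property that makes replay possible is that the expansion is a pure function of the seed. A minimal sketch of just that step, assuming the same std.Random.DefaultPrng API used in run_once (the function name expand_seed is hypothetical):

```zig
const std = @import("std");

// Hypothetical helper: fills `entropy` with bytes derived only from `seed`.
fn expand_seed(seed: u64, entropy: []u8) void {
    var rng = std.Random.DefaultPrng.init(seed);
    rng.random().bytes(entropy);
}

test "the same seed and size reproduce the same entropy" {
    var first: [64]u8 = undefined;
    var second: [64]u8 = undefined;
    expand_seed(42, &first);
    expand_seed(42, &second);
    try std.testing.expectEqualSlices(u8, &first, &second);
}
```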

Next, we implement the logic for checking if a particular entropy size is sufficient to find a failure. Of course, we won’t be able to say that for sure in a finite amount of time, so we’ll settle for a user-specified number of attempts:

fn run_multiple(driver: Driver, options: struct {
    size: u32,
    attempts: u32,
}) !union(enum) { pass, fail: u64 } {
    // ...
}

The user passes us the number of attempts to make, and we return .pass if they all were successful, or a specific failing seed if we found one:

assert(options.size <= driver.buffer.len);

for (0..options.attempts) |_| {
    var seed: u64 = undefined;
    driver.io.random(@ptrCast(&seed));

    const outcome = try driver.run_once(.{
        .seed = seed,
        .size = options.size,
        .quiet = true,
    });
    switch (outcome) {
        .fail => return .{ .fail = seed },
        .pass => {},
    }
}
return .pass;

To generate a real seed we need “true” cryptographic non-deterministic randomness, which is provided by io.random.

Finally, the search for the size:

fn search(driver: Driver, options: struct {
    attempts: u32 = 100,
}) !union(enum) {
    pass,
    fail: struct { size: u32, seed: u64 },
} {
    // ...
}

Here, we are going to find the smallest entropy size that crashes sut. If we succeed, we return the seed and the size. The upper bound for the size is the space available in the pre-allocated entropy buffer.

The search loop is essentially a binary search, with a twist — rather than using dichotomy on the size directly, we will be doubling a step we use to change the size between iterations.

That is, we start with a small size and step, and, on every iteration, double the step and add it to the size, until we hit a failure (or run out of buffer for the entropy).

Once we found a failure, we continue the search in the other direction — halving the step and subtracting it from the size, keeping the smaller size if it still fails.

On each step, we log the current size and outcome, and report the smallest failing size at the end.

var found_size: ?u32 = null;
var found_seed: ?u64 = null;

var pass: bool = true;
var size: u32 = 16;
var step: u32 = 16;
for (0..1024) |_| {
    if (step == 0) break;
    const size_next = if (pass) size + step else size -| step;
    if (size_next > driver.buffer.len) break;

    const outcome = try driver.run_multiple(.{
        .size = size_next,
        .attempts = options.attempts,
    });
    switch (outcome) {
        .pass => log.info("pass: size={}", .{size_next}),
        .fail => |seed| {
            found_size = size_next;
            found_seed = seed;
            log.err("fail: size={} seed={}", .{ size_next, seed });
        },
    }
    const pass_next = (outcome == .pass);

    if (pass and pass_next) {
        step *= 2;
    } else if (!pass and !pass_next) {
        // Keep the step.
    } else {
        step /= 2;
    }

    if (pass or !pass_next) {
        size = size_next;
        pass = pass_next;
    }
} else @panic("safety counter");

if (found_size == null) return .pass;
return .{ .fail = .{
    .size = found_size.?,
    .seed = found_seed.?,
} };
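
To check that this stepping scheme converges, the loop can be exercised in isolation. The sketch below is an assumption-laden toy: it replaces run_multiple with a hypothetical deterministic predicate (a run fails exactly when the size reaches some threshold), and bounds the candidate size by a hypothetical size_max parameter instead of the driver’s buffer.

```zig
const std = @import("std");

// Hypothetical stand-in for run_multiple: the "test" passes whenever it
// gets fewer than `threshold` bytes of entropy.
fn passes(size: u32, threshold: u32) bool {
    return size < threshold;
}

// Same doubling-then-halving loop as above, minus processes and logging.
// Returns the smallest failing size found, or null if nothing failed.
fn search_size(size_max: u32, threshold: u32) ?u32 {
    var found: ?u32 = null;
    var pass = true;
    var size: u32 = 16;
    var step: u32 = 16;
    for (0..1024) |_| {
        if (step == 0) break;
        const size_next = if (pass) size + step else size -| step;
        // Bound the candidate size, since that is what gets tested next.
        if (size_next > size_max) break;

        const pass_next = passes(size_next, threshold);
        if (!pass_next) found = size_next;

        if (pass and pass_next) {
            step *= 2; // Still growing: take bigger strides.
        } else if (!pass and !pass_next) {
            // Still shrinking and still failing: keep the step.
        } else {
            step /= 2; // Overshot in either direction: refine.
        }

        if (pass or !pass_next) {
            size = size_next;
            pass = pass_next;
        }
    } else @panic("safety counter");
    return found;
}

test "converges to the smallest failing size" {
    try std.testing.expectEqual(@as(?u32, 100), search_size(4096, 100));
}
```

With a real, probabilistic run_multiple the answer is only as good as the number of attempts per size, so the result is a small failing size rather than a provably minimal one.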

Finally, we wrap Driver’s functionality into main that works in two modes — either reproduces a given failure from seed and size, or searches for a minimal failure:

pub fn main(
    gpa: std.mem.Allocator,
    io: std.Io,
    sut: []const u8,
    operation: union(enum) {
        replay: struct { size: u32, seed: u64 },
        search: struct {
            attempts: u32 = 100,
            size_max: u32 = 4 * 1024 * 1024,
        },
    },
) !void {
    const size_max = switch (operation) {
        .replay => |options| options.size,
        .search => |options| options.size_max,
    };

    const buffer = try gpa.alloc(u8, size_max);
    defer gpa.free(buffer);

    var driver: Driver = .{
        .io = io,
        .buffer = buffer,
        .sut = sut,
    };

    switch (operation) {
        .replay => |options| {
            const outcome = try driver.run_once(.{
                .size = options.size,
                .seed = options.seed,
                .quiet = false,
            });
            log.info("{t}", .{outcome});
        },
        .search => |options| {
            const outcome = try driver.search(.{
                .attempts = options.attempts,
            });
            switch (outcome) {
                .pass => log.info("ok", .{}),
                .fail => |fail| {
                    log.err("minimized size={} seed={}", .{
                        fail.size, fail.seed,
                    });
                },
            }
        },
    }
}

Running the search routine looks like this in a terminal:

The final seed and size can then be used for .replay, giving you a minimal reproducible failure for debugging!

This … of course doesn’t look too exciting without visualizing a specific bug we can find this way, but the problem there is that interesting examples of systems to test in this way usually take more than 256 lines to implement. So I’ll leave it to your imagination, but you get the idea: if you can make a system fail under a “random” input, you can also systematically search the space of all inputs for the smallest counter-example, without adding knowledge about the system to the searcher. This article also provides a concrete (but somewhat verbose) example.

Here’s the full code:

https://gist.github.com/matklad/343d13547c8bfe9af310e2ca2fbfe109

Consensus Board Game

2026-03-19T00:00:00+00:00

Consensus Board Game

Mar 19, 2026

I have an early adulthood trauma from struggling to understand consensus amidst a myriad of poor explanations. I am overcompensating for that by adding my own attempts to the fray. Today, I want to draw a series of pictures which could be helpful. You can see this post as a set of missing illustrations for Notes on Paxos, or, alternatively, you can view that post as a more formal narrative counter-part for the present one.

The idea comes from my mathematics of consensus lecture, with versions in English and Russian.

The Preamble

I am going to aggressively hand wave the details away, please refer to Notes for filling in the blanks.

And, before we begin, I want to stress again that here I am focusing strictly on the mathematics behind the algorithm, on the logical structure of the universe that makes some things impossible, and others doable. Consensus is but a small part of the engineering behind real data management systems, and I might do something about pragmatics of consensus at some point, just not today ;)

The Problem

There’s a committee of five members that tries to choose a color for a bike shed, but the committee members are not entirely reliable. We want to arrive at a decision even if some members of the committee are absent.

The Vote

The fundamental idea underpinning consensus is simple majority vote. If R0, … R4 are the five committee members, we can use the following board to record the votes:

A successful vote looks like this:

Here, red collected 3 out of 5 votes and wins. Note that R4 hasn’t voted yet. It might, or might not do so eventually, but that won’t affect the outcome.

The problem with voting is that it can get stuck like this:

Here, we have two votes for red, two votes for blue, but the potential tie-breaker, R4, voted for green, the rascal!

To solve the split vote, we are going to designate R0 as the committee’s leader, make it choose the color, and allow others only to approve. Note that meaningful voting still takes place, as someone might abstain from voting, and you need more than 50% turnout for the vote to be complete:

Here, R0, the leader (marked with a yellow Napoleonic bicorne), chose red, R2 and R3 acquiesced, so red “won”, even as R1 and R4 abstained (x signifies absence of a vote).

The problem with this is that our designated leader might be unavailable itself:

The Board

Which brings us to the central illustration that I wanted to share. What we are going to do now is multiply our voting. Instead of conducting just one vote with a designated leader, the committee will conduct a series of concurrent votes, where the leaders rotate in a round-robin pattern. This gives rise to the following half-infinite 2D board on which the game of consensus is played:

Each column plays independently. If you are a leader in a column, and your cell is blank, you can choose whatever color. If you are a follower, you need to wait for the column leader’s decision, and then you can either fill in the same color, or you can abstain. After several rounds the board might end up looking like this:

The benefit of our 2D setup is that, if any committee member is unavailable, their columns might get stuck, but, as long as the majority is available, some column somewhere might still complete. The drawback is that, while each individual column’s decision is clear and unambiguous, the outcome of the board as a whole is undefined. In the above example, there’s a column where red wins, and a column where blue wins.

So what we are going to do is to scrap the above board as invalid, and instead require that any two columns that achieved majorities must agree on the color. In other words, the outcome of the entire board is the outcome of any of its columns, whichever finishes first, and the safety condition is that no two colors can reach majorities in different columns.

Let’s take a few steps back when the board wasn’t yet hosed, and try to think about the choice of the next move from the perspective of R3:

As R3 and the leader for your column, you need to pick a color which won’t conflict with any past or future decisions in other columns. Given that there are some greens and blues already, it feels like maybe you shouldn’t pick red… But it could be the case that the three partially filled columns won’t move anywhere in the future, and the first column gets a solid red line! Tough choices! You need to worry about the future and the infinite number of columns to your right!

Luckily, the problem can be made much easier if we assume that everyone plays by the same rules, in which case it’s enough to only worry about the columns to your left. Suppose that you and everyone else are carefully choosing moves so as not to conflict with the columns to the left. Then, if you chose red, your column wins, and subsequently some buffoon on the right chooses green, it’s their problem, because you are to their left.

So let’s just focus on the left part of the board. Again, it seems like blue or green might be good bets, as they are already present on the board, but there’s a chance that the first column will eventually vote for red. To prevent that, what we are going to do is to collect a majority of participants (R0, R2, R3) and require them to commit to not voting in the first column. Actually, for that matter, let’s prevent them from voting in any column to the left:

Here, you asked R0, R2 and R3 to abstain from casting further votes in the first three columns, signified by black x. With this picture, we can now be sure that red can not win in the first column — no color can win there, because only two out of the five votes are available there!

Still, we have the choice between green and blue, which one should we pick? The answer is the rightmost. R2, the participant that picked blue in the column to our immediate left, was executing the same algorithm. If they picked blue, they did it because they knew for certain that the second column can’t eventually vote for green. R2 got a different majority of participants to abstain from voting in the second column, and, while we, as R3, don’t know which majority that was, we know that it exists because we know that R2 did pick blue, and we assume fair play.
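
The leader’s rule can be written down as a small sketch. Everything here is a hypothetical model for illustration: the Cell and Column types, the function names, and the example board (which is not the one from the pictures). It assumes the bird’s-eye view of the whole board, which a real participant does not have.

```zig
const std = @import("std");

const Color = enum { red, green, blue };

const replica_count = 5;
const majority = replica_count / 2 + 1;

// One cell of the board: untouched, a promise not to vote, or a vote.
const Cell = union(enum) { blank, abstain, vote: Color };

// One column of the board, indexed by replica (R0..R4).
const Column = [replica_count]Cell;

// The color that reached a majority in this column, if any.
fn column_winner(column: Column) ?Color {
    for ([_]Color{ .red, .green, .blue }) |color| {
        var count: usize = 0;
        for (column) |cell| {
            switch (cell) {
                .vote => |voted| {
                    if (voted == color) count += 1;
                },
                .blank, .abstain => {},
            }
        }
        if (count >= majority) return color;
    }
    return null;
}

// The leader's rule: after a majority (`quorum`) has promised to abstain
// in all columns to the left, copy the color of the rightmost vote cast
// by any quorum member. null means the leader is free to pick any color.
fn choose_color(left: []const Column, quorum: []const usize) ?Color {
    std.debug.assert(quorum.len >= majority);
    var index = left.len;
    while (index > 0) {
        index -= 1;
        for (quorum) |replica| {
            switch (left[index][replica]) {
                .vote => |color| return color,
                .blank, .abstain => {},
            }
        }
    }
    return null;
}

test "R3 follows the rightmost vote it can see" {
    const board = [_]Column{
        // Column 0, led by R0: nobody voted.
        .{ .abstain, .blank, .abstain, .abstain, .blank },
        // Column 1, led by R1: two greens, not enough to win.
        .{ .abstain, .{ .vote = .green }, .abstain, .abstain, .{ .vote = .green } },
        // Column 2, led by R2: R2 picked blue.
        .{ .abstain, .blank, .{ .vote = .blue }, .abstain, .blank },
    };
    const pick = choose_color(&board, &.{ 0, 2, 3 });
    try std.testing.expectEqual(@as(?Color, .blue), pick);
}
```

Scanning from right to left and stopping at the first vote is the whole trick: any column further left was already ruled out by whoever cast that vote.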

That’s all for today, that’s the trick that makes consensus click, in the abstract. In a full distributed system the situation is more complicated. Each participant only sees its own row, the board as a whole remains concealed. Participants can learn something about others’ state by communicating, but the knowledge isn’t strongly anchored in time. By the time a response is received, the answer could just as well be obsolete. And yet, the above bird’s-eye view can be implemented in a few exchanges of messages.

Please see the Notes for further details.

JJ LSP Follow Up

2026-03-05T00:00:00+00:00

JJ LSP Follow Up

Mar 5, 2026

In Majjit LSP, I described an idea of implementing Magit-style UX for jj once and for all, leveraging LSP. I’ve learned today that the upcoming 3.18 version of LSP has a feature to make this massively less hacky: Text Document Content Request.

LSP can now provide virtual documents, which aren’t actually materialized on disk. So this:

can now be such a virtual document, where highlighting is provided by semantic tokens, things like “check out this commit” are code actions, and “goto definition” jumps from the diff in the virtual file to a real file in the working tree.

Exciting!

Against Query Based Compilers

2026-02-25T00:00:00+00:00

Against Query Based Compilers

Feb 25, 2026

Query based compilers are all the rage these days, so it feels only appropriate to chart some treacherous shoals in those waters.

A query-based compiler is a straightforward application of the idea of incremental computations to, you guessed it, compiling. A compiler is just a simple text transformation program, implemented as a lot of functions. You could visualize a run of a compiler on a particular input source code as a graph of function calls:

Here, schematically, squares are inputs like file text or the compiler’s command line arguments, g is an intermediate function (e.g., type checking), which is called twice, with different arguments, and f and h are top-level functions (compile executable, or compute completions for LSP).

Looking at this picture, it’s obvious how to make our compiler “incremental” — if an input changes, it’s enough to re-compute only the results on path from the changed input to the root “query”:

A little more thinking, and you can derive “early cutoff” optimization:

If an input to the function changes, but its result doesn’t (e.g., a function’s type is not affected by a whitespace change), you can stop change propagation early.

And that’s … basically it. The beauty of the scheme is its silvery-bullety hue — it can be applied without thinking to any computation, and, with a touch of meta programming, you won’t even have to change code of the compiler significantly.
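
A toy sketch of the recipe with early cutoff, under heavy simplifying assumptions: a single input, one intermediate query g and one top-level query f. All names (Db, compute_g, compute_f) are hypothetical, and real query engines track dependencies dynamically rather than hard-coding them like this.

```zig
const std = @import("std");

// Toy "signature" query: the number of non-whitespace bytes.
// Reformatting the input leaves the result unchanged.
fn compute_g(text: []const u8) u32 {
    var count: u32 = 0;
    for (text) |byte| {
        if (!std.ascii.isWhitespace(byte)) count += 1;
    }
    return count;
}

// Toy top-level query that depends only on the result of g.
fn compute_f(g: u32) u32 {
    return g * 2;
}

const Db = struct {
    input: []const u8,
    input_dirty: bool = true,
    g_cache: ?u32 = null,
    f_cache: u32 = 0,
    // Counters to observe how much work was redone.
    g_runs: u32 = 0,
    f_runs: u32 = 0,

    fn set_input(db: *Db, text: []const u8) void {
        db.input = text;
        db.input_dirty = true;
    }

    fn query_f(db: *Db) u32 {
        if (db.input_dirty) {
            db.input_dirty = false;
            const g_new = compute_g(db.input);
            db.g_runs += 1;
            // Early cutoff: if g produced the same result as before,
            // the cached result of f is still valid.
            if (db.g_cache == null or db.g_cache.? != g_new) {
                db.g_cache = g_new;
                db.f_cache = compute_f(g_new);
                db.f_runs += 1;
            }
        }
        return db.f_cache;
    }
};

test "whitespace-only edits do not re-run f" {
    var db: Db = .{ .input = "fn f() {}" };
    try std.testing.expectEqual(@as(u32, 14), db.query_f());

    db.set_input("fn  f()  {}"); // Reformatted, same "signature".
    try std.testing.expectEqual(@as(u32, 14), db.query_f());
    try std.testing.expectEqual(@as(u32, 2), db.g_runs);
    try std.testing.expectEqual(@as(u32, 1), db.f_runs);
}
```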

Build Systems à la Carte is the canonical paper to read here. In a build system, a query is an opaque process whose inputs and outputs are files. In a query-based compiler, queries are just functions.

The reason why we want this in the first place is incremental compilation — in IDE context specifically, the compiler needs to react to a stream of tiny edits, and its time budget is about 100ms. Big-O thinking is useful here: the time to react to the change should be proportional to the size of the change, and not the overall size of the codebase. O(1) change leads to O(1) update of the O(N) codebase.

Similar big-O thinking also demonstrates the principal limitation of the scheme — the update work can’t be smaller than the change in the result.

An example. Suppose our “compiler” makes a phrase upper-case:

compile("hello world") == "HELLO WORLD"

This is easy to incrementalize, as changing a few letters in the input changes only a few letters in the output:

compile("hallo world") == "HALLO WORLD"

But suppose now our “compiler” is a hashing or encryption function:

compile("hello world") == "a948904f2f0"
compile("hallo world") == "a7336983eca"

This is provably impossible to make usefully incremental. The encryption can be implemented as a graph of function calls, and you can apply the general incremental recipe to it. It just won’t be very fast.

The reason for that is the avalanche property — for good encryption, a change in any bit of input should flip roughly half of the bits of the output. So just the work of changing the output (completely ignoring the work to compute what needs to be changed) is O(N), not O(1).

The effectiveness of query-based compiler is limited by the dependency structure of the source language.

A particularly nasty effect here is that even if you have only potential avalanche, where a certain kind of change could affect a large fraction of the output, even if it usually doesn’t, your incremental engine will likely spend some CPU time or memory to confirm the absence of dependency.

In my Three Architectures For Responsive IDE, query-based compilation is presented as a third, fall-back option. I still think that that’s basically true: as a language designer, I think it’s worth listening to your inner Grug and push the need for queries as far down the compilation pipeline as possible, sticking to more direct approaches. Not doing queries is simpler, faster, and simpler to make faster (profiling a query-based compiler is a special genre of hurdle racing).

Zig and Rust provide for a nice comparison. In Zig, every file can be parsed completely in isolation, so compilation starts by parsing all files independently and in parallel. Because in Zig every name needs to be explicitly declared (there’s no use *), name resolution also can run on a per-file basis, without queries. Zig goes even further, and directly converts untyped AST into IR, emitting a whole bunch of errors in the process (e.g., “var doesn’t need to be mutable”). See Zig AstGen: AST => ZIR for details. By the time the compiler gets to tracked queries, the data it has to work with is already pretty far from the raw source code, but only because the Zig language is carefully designed to allow this.

In contrast, you can’t really parse a file in Rust. Rust macros generate new source code, so parsing can’t be finished until all the macros are expanded. Expanding macros requires name resolution, which, in Rust, is a crate-wide, rather than a file-wide operation. It’s a fundamental property of the language that typing something in a.rs can change parsing results for b.rs, and that forces fine-grained dependency tracking and invalidation to the very beginning of the front-end.

Similarly, the nature of the trait system is such that impl blocks relevant to a particular method call can be found almost anywhere. For every trait method call, you get a dependency on the impl block that supplies the implementation, but you also get a dependency on non-existence of conflicting impls in every other file!

Again, refer to the Three Architectures for positive ideas, but the general trick is to leverage language semantics to manually cut the compilation tasks into somewhat coarse-grained chunks which are independent by definition (of the source language). Grug builds an incremental map-reduce compiler for his language:

Recursive directory walk finds all files to be compiled.
In parallel, independently, each file is parsed, name-resolved, and lowered. As much as possible, language features (and errors) are syntax driven and not type driven, and can be processed at this stage.
In parallel, a “summary” is extracted from each file, which is essentially just a list of types and signatures, with function bodies empty.
Sequentially, a “signature evaluation” phase is run on this set of summaries, which turns type references in signatures into actual types, dealing with mutual dependencies between files. This phase is re-run whenever a summary of a file changes. Conversely, changes to the body of any function do not invalidate resolved signatures.
In parallel, every function’s body is type-checked, and lowered to type-and-layout resolved IR, applying function-local optimizations.
Sequentially, a thin-lto style set of analyses is run on compiled functions, making inlining decisions and computing call-graph dependent attributes like function purity.
In parallel, each function is codegened to machine code with unresolved references to other functions (relocations).
Sequentially, functions are concatenated into an executable file, receiving an address.
In parallel, all relocations are resolved to now known addresses.

The above scheme works only if the language has a property that changing the body of function foo (not touching its signature) can’t introduce type errors into an unrelated function bar.

Another trick that becomes less available if you blindly apply queries is in-place updates. Consider a language with package declarations and fully qualified names, like Kotlin:

package org.example

fun printMessage() { /*...*/ }
class Message { /*...*/ }

A compiler for this language probably wants to maintain a map of all public declarations, where the keys are fully qualified names, and values are declarations themselves. If you approach the problem of computing this map with query eyes, you might have a base per-file query that returns a map of file’s declarations, and then a recursive per-directory query. And you’ll probably have some kind of structural sharing of the maps, such that changing a single file updates only the “spine”, without actually copying most of the other entries.

But there’s a more direct way to make this sort of structure responsive to changes. You need only two “queries” — per file, and global. When a file changes, you look at the previous version of the map for this file, compute a diff of added or removed declarations, and then apply this diff to the global map.
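
A minimal sketch of the diff-and-patch idea, assuming the managed std.StringHashMap API and representing each file’s declarations as a plain list of fully qualified names. The names Index, apply_file_change and contains are hypothetical, and the quadratic contains check is only there to keep the sketch short.

```zig
const std = @import("std");

// Global index: fully qualified name -> id of the declaring file.
const Index = std.StringHashMap(u32);

fn contains(names: []const []const u8, needle: []const u8) bool {
    for (names) |name| {
        if (std.mem.eql(u8, name, needle)) return true;
    }
    return false;
}

// Patches the global index in place, touching only the entries that
// differ between the old and the new version of one file.
fn apply_file_change(
    index: *Index,
    file: u32,
    old: []const []const u8,
    new: []const []const u8,
) !void {
    for (old) |name| {
        if (!contains(new, name)) _ = index.remove(name);
    }
    for (new) |name| {
        if (!contains(old, name)) try index.put(name, file);
    }
}

test "editing one file patches only its entries" {
    var index = Index.init(std.testing.allocator);
    defer index.deinit();

    const v1 = [_][]const u8{ "org.example.printMessage", "org.example.Message" };
    const v2 = [_][]const u8{ "org.example.Message", "org.example.Sender" };

    try apply_file_change(&index, 1, &.{}, &v1);
    try apply_file_change(&index, 1, &v1, &v2);

    try std.testing.expect(!index.contains("org.example.printMessage"));
    try std.testing.expect(index.contains("org.example.Message"));
    try std.testing.expect(index.contains("org.example.Sender"));
}
```

The work done is proportional to the size of the changed file’s declaration list, not to the size of the global map.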

Zig is planning to use a similar approach to incrementalize linking — rather than producing a new binary gluing mostly unchanged chunks of machine code, the idea is to in-place patch the previous binary.

If you like this article, you might be interested in some other adjacent stuff I’ve written over the years, roughly in the order of importance: