On Async Mutexes


A brief note on a contradiction or confusion in my language design beliefs that I noticed today.

One of the purported advantages of concurrent programming multiplexed on a single thread is that mutexes become unnecessary. A data race is impossible with only one function executing at any given time.

The standard counterargument to this argument is that mutual exclusion is a property of the logic itself, not of the runtime. If a certain snippet of code should be executed atomically with respect to everything else concurrently, it should be annotated as such in the source code. You can still accidentally start a logical race by adding a
.await That atom should be in the middle of the code. And, while programming, you are adding new .awaitIt’s all the time!

This argument makes sense to me, and it also has a logical conclusion. Given that you’ll want to annotate atomic sections of code anyway, it makes sense to move up to Kotlin-style explicit async implicit await.

The paradox I realized today is that for the last few years I have been working on a system built around the implicit exclusion provided by a single thread – TigerBeetle! Consider compaction, a code that is responsible for rewriting data on disk to make it smaller without changing its logical content. During compaction, the tiger beetle schedules Very Concurrent disk read, disk write, and CPU-side merge. Here is the average callback:

fn read_value_block_callback(
    grid_read: *Grid.Read,
    value_block: BlockPtrConst,
) void {
    const read: *ResourcePool.BlockRead =
        @fieldParentPtr("grid_read", grid_read);
    const compaction: *Compaction = read.parent(Compaction);

    const block = read.block;
    compaction.pool.?.reads.release(read);

    assert(block.stage == .read_value_block);
    stdx.copy_disjoint(.exact, u8, block.ptr, value_block);
    block.stage = .read_value_block_done;
    compaction.counters.in +=
        Table.value_block_values_used(block.ptr).len;
    compaction.compaction_dispatch();
}

This is the code (source) that runs when the disk read ends, and it is mutated. *Compaction – Shared status across all outstanding IOs. It is important that no other IO completions concurrently change the compaction, especially inside it. compaction_dispatch
A monster of a function.

Applying the “explain the exclusion” rule to the code would mean that the entire Compaction It needs to be wrapped in a mutex, and each callback needs to start with a lock/unlock pair. And there’s more to Tigerbeetle than just condensation! Whereas Some? Pairs of callbacks may potentially execute concurrently relative to each other, this varies over time. For example, once we start overlapping compaction and execution, they will use our grid cache (buffer manager) at the same time. Explicit locking therefore probably leads to only having a single global lock on the entire state, which is held for the duration of any callback. At which point, it makes sense to push lock acquisition up to the event loop, and we’re back to the built-in locking API!

This appears to be another case of two paradigms for structuring concurrent programs. The async/await discussion usually assumes the CSP programming style, where you define a set of concurrent threads of execution, and the threads are mostly independent, sharing little data. TigerBeetle is written in a state machine/actor style, where the focal point is a large amount of shared state, evolving in different stages in response to IO events (TigerBeetle has only one “actor”). Additionally, TigerBeetle uses manual callbacks instead of async/await syntax, so inserting a .await
This doesn’t really happen in the middle of the critical section. Any new concurrency requires an explicitly named continuation function to be initialized, and each continuation (callback) typically begins with a set of assertions to determine the current state and ensure that the ground has not moved too far since the IO was originally scheduled. or, as in the case
compaction_dispatchSometimes the callback assumes nothing about the state of the world and instead performs a detailed case analysis from scratch.



Leave a Comment