Published on 1665 words, 7 minutes to read
You would think that given the same bytes of input you would get the same bytes of output. Lol. Lmao. No, you don’t. It’s complicated.
Anubis is about to get WebAssembly-based proof of work checks so that administrators can use non-SHA256 proof of work methods to protect their websites. One of the implementation goals of this work is that the check logic is defined in one place for both client and server. The client and server will both call into the same WebAssembly to ensure that they run in lockstep.
However, a small problem arises. What do you do when the client has WebAssembly disabled? I really don’t want to lock people out of websites. Anubis exists in an impossible balance of user experience, administrator experience, and developer experience and any change in any of these factors disrupts the balance for the other factors.
To work around this and also accomplish the goal of defining the check logic once, I decided to take inspiration from the famous talk The Birth and Death of JavaScript and recompile WebAssembly into JavaScript. Sure, the resulting JavaScript equivalent will be slower than WebAssembly (even more so because disabling WASM usually disables the JavaScript JIT, the thing that makes JavaScript fast), but it will work out in the end. Hopefully this will be more efficient than the existing JavaScript on low-end hardware, but research is needed.
Luckily, the tool I needed (wasm2js from the Binaryen project) is packaged in Linux distributions. The bad news is that distributions ship ancient versions of it that don’t produce the same output as the Homebrew copy on my development machine.
To really ensure that its output is deterministic (required for reproducible builds), I need to bundle a copy of wasm2js. I did this by making a build of wasm2js compiled to WebAssembly with wasi-sdk. The rest of this article is the story of the reproducibility pain that led to the implementation I ended up with. Buckle up and enjoy the ride!
Reproducible builds are surprisingly difficult
There are many surprising ways to accidentally create non-deterministic output when developing C/C++. One of the easiest is to use the builtin __DATE__ and __TIME__ macros to stamp the build with the time the compiler ran:
#include <iostream>

int main() {
  std::cout << __DATE__ << " " << __TIME__ << std::endl;
  return 0;
}
When I build and run it the first time, I get this:
$ make clean && make hello.wasm && wasmtime run -W exceptions=y ./hello.wasm
rm -f hello.o hello.wasm
wasi-sdk-33.0-x86_64-linux/bin/wasm32-wasip1-clang++ -O3 -fwasm-exceptions -mllvm -wasm-use-legacy-eh=false -c hello.cpp -o hello.o
wasi-sdk-33.0-x86_64-linux/bin/wasm32-wasip1-clang++ -O3 -fwasm-exceptions -mllvm -wasm-use-legacy-eh=false -fwasm-exceptions -lunwind --no-wasm-opt hello.o -o hello.wasm
Jun 18 2026 00:00:59
The second time I get this:
$ make clean && make hello.wasm && wasmtime run -W exceptions=y ./hello.wasm
rm -f hello.o hello.wasm
wasi-sdk-33.0-x86_64-linux/bin/wasm32-wasip1-clang++ -O3 -fwasm-exceptions -mllvm -wasm-use-legacy-eh=false -c hello.cpp -o hello.o
wasi-sdk-33.0-x86_64-linux/bin/wasm32-wasip1-clang++ -O3 -fwasm-exceptions -mllvm -wasm-use-legacy-eh=false -fwasm-exceptions -lunwind --no-wasm-opt hello.o -o hello.wasm
Jun 18 2026 00:01:11
Even though the source code was the same bytes, the compiler’s output was wildly different.
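One cheap guard against this class of problem is to scan the source tree for the timestamp macros before building. What follows is a minimal sketch, assuming a POSIX shell and a grep that supports -r. The function name find_timestamp_macros is made up for illustration and is not part of the Anubis build:

```shell
#!/bin/sh
# Hypothetical pre-build check: list every use of the compile-time
# timestamp macros under a directory.
# Exit status follows grep: 0 if any use was found, 1 if none.
find_timestamp_macros() {
  grep -rnE '__(DATE|TIME|TIMESTAMP)__' "$1"
}
```

Running find_timestamp_macros ./src prints one file:line:match entry per hit. GCC and clang also accept -Wdate-time, which warns whenever one of these macros is expanded.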
To get users and packagers to trust the wasm2js binaries I commit to the Anubis repo, I need to make sure you can build the same version I built, all the way down to the same bytes. As an added bonus, you should be able to build it on your machine and get the same bytes that I got.
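That property can be checked mechanically: run the build twice and compare the hashes of the result. What follows is a rough sketch, assuming sha256sum from coreutils. check_reproducible is a hypothetical helper, not something that exists in the Anubis repo:

```shell
#!/bin/sh
# Hypothetical helper: run a build command twice and compare artifact hashes.
# Usage: check_reproducible <artifact> <build command...>
# Returns 0 if both runs produce identical bytes, 1 if they differ,
# and 2 if the build command itself fails.
check_reproducible() {
  artifact="$1"
  shift
  "$@" >/dev/null || return 2
  first="$(sha256sum "$artifact" | cut -d' ' -f1)"
  "$@" >/dev/null || return 2
  second="$(sha256sum "$artifact" | cut -d' ' -f1)"
  if [ "$first" = "$second" ]; then
    echo "reproducible: $first"
  else
    echo "NOT reproducible: $first != $second"
    return 1
  fi
}
```

With the hello.cpp example, something like check_reproducible hello.wasm sh -c 'make clean && make hello.wasm' should report a mismatch whenever the two builds land on different seconds. Note that this only catches non-determinism on a single machine; differences between machines need recorded checksums.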
clang quietly runs wasm-opt from $PATH behind your back
Besides wasm2js, Binaryen has a bunch of other useful tools like wasm-opt. wasm-opt optimizes WebAssembly compiler output to let you get more performance. It doesn’t help in every situation, but when it does it makes a huge difference. Enough of a difference that clang shells out to wasm-opt while building.
This normally makes sense, but in this case it caused the build to fail on my DGX Spark because its version of wasm-opt is very old:
$ uname -m && which wasm-opt && wasm-opt --version
aarch64
/usr/bin/wasm-opt
wasm-opt version 108
Compare that to my workstation, which has wasm-opt installed from Homebrew:
$ uname -m && which wasm-opt && wasm-opt --version
x86_64
/home/linuxbrew/.linuxbrew/bin/wasm-opt
wasm-opt version 130
It turns out that wasi-sdk and Binaryen depend on the WebAssembly exception handling extension. That is a fair thing to depend on, given that wasi-sdk mostly assumes that you’re building things for web browsers and that 93.86% of browser users have a browser engine new enough to support it. C++ is also one of the main places where exceptions are used, so I think WebAssembly-native exception handling removes a lot of the boilerplate here.
Both wasmtime and wazero require you to opt in to exception support. That’s okay; we can just pass -W exceptions=y to wasmtime and use a custom runner harness for wazero. The annoying part is that when my Arm machine’s anemic build of wasm-opt is handed exception handling instructions, it crashes. That is what made the build fail.
The solution was to pass --no-wasm-opt at the linking stage. This removed one source of irreproducibility.
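Because the tool is found through $PATH, a build script can at least report which copy of wasm-opt is visible before linking. This is a small sketch; report_wasm_opt is a hypothetical helper and its messages are illustrative only:

```shell
#!/bin/sh
# Hypothetical guard: report which wasm-opt (if any) is visible on $PATH,
# since that is the copy that could be picked up implicitly during a build.
report_wasm_opt() {
  if found="$(command -v wasm-opt)"; then
    echo "wasm-opt found on PATH: $found"
  else
    echo "no wasm-opt on PATH"
  fi
}
```

Logging this at the top of a build makes it much easier to notice when two machines disagree about which wasm-opt they have.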
Clang depends on address layout for ordering things
The version of clang that I use to compile wasm2js has some address-sensitive code generation hidden in its exception handling path. Raw pointer values leak into the order in which some try_table blocks come out. On the surface, each build differs from the next by about 29 bytes:
-002a9af0: 2802 0441 0647 0d00 1f40 0103 0820 0241 (..A.G...@... .A
-002a9b00: 206a 2103 2002 4138 6a20 0141 086a 10b5 j!. .A8j .A.j..
-002a9b10: 8881 8000 2104 0b1f 4001 0304 2003 2004 ....!...@... . .
+002a9af0: 2802 0441 0647 0d00 1f40 0103 041f 4001 (..A.G...@....@.
+002a9b00: 0309 2002 4120 6a21 0320 0241 386a 2001 .. .A j!. .A8j .
+002a9b10: 4108 6a10 b588 8180 0021 040b 2003 2004 A.j......!.. . .
To make the difference easier to see, here is a partial disassembly:
i32.load offset=4 ;; 28 02 04
i32.const 6 ;; 41 06
i32.ne ;; 47
br_if 0 ;; 0d 00
- try_table (catch_all_ref 8) ;; 1f 40 01 03 08
+ try_table (catch_all_ref 4) ;; 1f 40 01 03 04
+ try_table (catch_all_ref 9) ;; 1f 40 01 03 09
local.get 2 ;; 20 02
i32.const 32 ;; 41 20
i32.add ;; 6a
local.set 3 ;; 21 03
local.get 2 ;; 20 02
i32.const 56 ;; 41 38
i32.add ;; 6a
local.get 1 ;; 20 01
i32.const 8 ;; 41 08
i32.add ;; 6a
call 17461 ;; 10 b5 88 81 80 00
local.set 4 ;; 21 04
end ;; 0b
- try_table (catch_all_ref 4) ;; 1f 40 01 03 04
local.get 3 ;; 20 03
local.get 4 ;; 20 04
The computation is nearly the same, but the try_table blocks are emitted in a different order, so the catch labels differ as well. This also kicks in when you build this pinned version of wasm2js on arm64 machines, because the pointer iteration order there is different from the one on my workstation.
To resolve this, I took two steps:
- Disable address-space randomization for this build with setarch --addr-no-randomize.
- Build this program on machines I trust to create known good sha256 checksums for both x86_64 and arm64.
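The first step can be sketched as a wrapper around the build command. This assumes the util-linux setarch, where --addr-no-randomize (short form -R) disables address-space randomization for the child process. run_without_aslr is a hypothetical name, and whether the real build script invokes setarch exactly this way is an assumption:

```shell
#!/bin/sh
# Hypothetical wrapper: run a command with ASLR disabled when setarch allows it.
run_without_aslr() {
  arch="$(uname -m)"
  # Probe first: setarch can be missing or blocked (for example by a
  # container seccomp profile). In that case, warn and run the command as-is.
  if setarch "$arch" --addr-no-randomize true 2>/dev/null; then
    setarch "$arch" --addr-no-randomize "$@"
  else
    echo "warning: cannot disable ASLR, running normally" >&2
    "$@"
  fi
}
```

Usage would look like run_without_aslr ./build.sh. The fallback branch keeps the build working in restricted environments, at the cost of losing the determinism guarantee there, so a stricter script might prefer to fail instead.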
I also made sure to do this in the CI job:
- name: Ensure reproducibility
  run: |
    cd ./utils/wasm/wasm2js
    ./build.sh
    if sha256sum -c --status shasums.x86_64; then
      echo "OK: rebuilt modules match the recorded x86_64 checksums"
    elif sha256sum -c --status shasums.arm64; then
      echo "OK: rebuilt modules match the recorded arm64 checksums"
    else
      echo "::error::rebuilt wasm2js/wasm-opt match neither recorded checksum set on ${{ matrix.runner }}" >&2
      sha256sum wasm-opt_130.wasm wasm2js_130.wasm
      exit 1
    fi
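For context, checksum files of the kind this job verifies can be produced with sha256sum itself. This is a sketch under stated assumptions: the shasums.x86_64 and shasums.arm64 names come from the job above, while the mapping from uname -m to a suffix and the record_checksums helper are hypothetical, and this is not necessarily how the recorded files were generated:

```shell
#!/bin/sh
# Hypothetical helper: write shasums.<arch> for the given files in the
# current directory, in the format that `sha256sum -c` reads back.
# Prints the name of the file it wrote.
record_checksums() {
  case "$(uname -m)" in
    x86_64) suffix=x86_64 ;;
    aarch64 | arm64) suffix=arm64 ;;
    *) echo "unsupported architecture: $(uname -m)" >&2; return 1 ;;
  esac
  sha256sum "$@" > "shasums.$suffix"
  echo "shasums.$suffix"
}
```

After a build on a trusted machine, record_checksums wasm-opt_130.wasm wasm2js_130.wasm would write the file that later runs check with sha256sum -c.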
To be extra sure, this job runs on both x86_64 and arm64 hosts. I would really love for this to be reproducible across all hosts, but this is an upstream LLVM bug that I am not powerful enough to fix. If you work on LLVM and are reading this, it would be good to set some kind of seed to ensure that this iteration order is fixed across all architectures.
At least builds are deterministic within an architecture. This may be good enough for now.
Facts and circumstances may have changed since publication. If something seems wrong or unclear please contact me before jumping to conclusions.