Addressing the ADD situation
Written by me, proof-read by an LLM.
Details at the end.
Yesterday we looked at how compilers zero registers efficiently. Today let’s take a look at something a little less trivial (though not by much): adding two integers. What do you think a simple x86 function to add two ints looks like? An add, right? Let’s take a look!
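As a minimal sketch of the experiment, assuming a function named add and the System V x86-64 calling convention (arguments arrive in edi and esi, the result is returned in eax), it looks something like this. The assembly in the comment is typical of optimising builds; exact output may vary with compiler and flags.

```c
/* Hypothetical example function; the name is an assumption. */
int add(int x, int y) {
    return x + y;
}

/*
 * Typical x86-64 output at -O2 (Intel syntax); may differ between compilers:
 *
 *   add:
 *       lea eax, [rdi + rsi]
 *       ret
 */
```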
Probably not what you were thinking, right? x86 is unusual in that most instructions take a maximum of two operands. There is no add instruction that adds edi to esi and puts the result in eax. On an ARM machine this would be a simple add r0, r0, r1 or similar, because ARM has a separate destination operand. On x86, things like add are not result = lhs + rhs but lhs += rhs. This can be a limitation: we cannot control which register the result goes into, and we actually lose the old value of lhs.
So how do compilers work around this limitation? The answer lies in an unexpected place: x86’s sophisticated memory addressing system. Almost every operand can be a memory reference. There is no specific “load” or “store”; a mov can refer to memory directly. Those memory references are very rich: you can refer to memory addressed by a constant, relative to a register, or relative to a register plus an offset held in a second register (optionally multiplied by 1, 2, 4, or 8), with a further constant offset on top. Something like add eax, dword ptr [rdi + rsi * 4 + 0x1000] is still a single instruction!
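To make that addressing mode concrete, here is a small C sketch whose array access has the same base + index * 4 + constant shape. The function name, parameter names and the 1024-element offset are assumptions chosen for illustration, and whether a compiler folds everything into one memory operand depends on the compiler and flags.

```c
/* Hypothetical helper: adds one table element to a running total. */
int add_element(int total, const int *table, long index) {
    /* 1024 ints * 4 bytes = 0x1000 bytes, so the element lives at
     * table + index * 4 + 0x1000: a base register, a scaled index and a
     * constant offset, which fits in a single x86 memory operand. */
    return total + table[index + 1024];
}
```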
Sometimes you don’t want to access the memory at one of these complex addresses; you just want to calculate what the address would be, something like C’s “address-of” (&) operator. That is what lea (Load Effective Address) does: it calculates the address without touching memory.
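The C analogue can be sketched as follows. The function and parameter names are hypothetical, and the instruction in the comment is only what a compiler would typically choose for this on x86-64.

```c
/* Hypothetical helper: computes where an element lives without reading it. */
int *element_address(int *table, long index) {
    /* Taking the address is pure arithmetic, with no load. On x86-64 this is
     * typically a single instruction such as: lea rax, [rdi + rsi * 4] */
    return &table[index];
}
```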
Why is this useful for addition? Well, if we are not actually accessing memory, we can abuse the addressing hardware as a calculator! That complex addressing mode, with its register-plus-register-times-scale, is really just shift-and-add, so lea becomes a cheeky way to do three-operand addition.
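The shift-and-add view can be written out as a small C model. This is only a sketch of the arithmetic, not of how the hardware is built; the function names are assumptions, and times_five is one illustrative use of the scaled form.

```c
#include <stdint.h>

/* Models what an x86 addressing mode computes: base + index * scale + disp,
 * where scale is 1, 2, 4 or 8, i.e. a left shift by 0, 1, 2 or 3. */
uint64_t effective_address(uint64_t base, uint64_t index,
                           unsigned shift, uint64_t disp) {
    return base + (index << shift) + disp;
}

/* x * 5 written as x + x * 4: the shape of lea rax, [rdi + rdi * 4]. */
uint64_t times_five(uint64_t x) {
    return effective_address(x, x, 2, 0);
}
```

With a shift of 0 and no displacement, effective_address(a, b, 0, 0) is plain a + b, which is exactly the three-operand add.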
The compiler writes our simple addition in terms of the address of memory at rdi offset by rsi. We get a full add of two registers and we get to specify the destination too. You’ll notice that the operands are referred to as rdi and rsi (the 64-bit versions) even though we only wanted a 32-bit add: because we are using the memory addressing system, it calculates a 64-bit address. However, in this case it doesn’t matter; those top bits are discarded when the result is written to the 32-bit eax.
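A short C simulation shows why the discarded top bits are harmless: carries only travel upwards, so the low 32 bits of a 64-bit sum depend only on the low 32 bits of the inputs. The function name and the junk parameters are hypothetical stand-ins for whatever happens to sit in the upper halves of rdi and rsi.

```c
#include <stdint.h>

/* Simulates lea eax, [rdi + rsi] when the upper 32 bits of rdi and rsi
 * hold arbitrary junk. */
uint32_t add_low32(uint32_t a, uint32_t b, uint32_t junk_a, uint32_t junk_b) {
    uint64_t rdi = ((uint64_t)junk_a << 32) | a;
    uint64_t rsi = ((uint64_t)junk_b << 32) | b;
    uint64_t address = rdi + rsi;  /* full 64-bit address calculation */
    return (uint32_t)address;      /* writing eax keeps only the low 32 bits */
}
```

Whatever the junk values are, the result matches a plain 32-bit wrapping add of a and b.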
Using lea often saves an instruction, is useful if both operands are still needed in other calculations later (as it leaves them unchanged), and in its simple forms can execute on several of a modern x86 CPU’s execution units, typically in a single cycle. Compilers know all this, though, so you don’t have to worry!
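As a sketch of the “both operands still needed” case, consider a hypothetical function that reuses x after the addition. The assembly in the comment is the kind of output one would expect under the System V x86-64 convention, not a guaranteed result.

```c
/* Hypothetical example: x is needed again after the addition. */
int sum_times_x(int x, int y) {
    return (x + y) * x;
}

/*
 * With a two-operand add, a copy is needed so that the sum can be formed
 * without destroying x:
 *       mov  eax, edi
 *       add  eax, esi      ; eax = x + y
 *       imul eax, edi
 * With lea the copy and the add collapse into one instruction, and edi (x)
 * is left untouched. Output of roughly this shape is typical:
 *       lea  eax, [rdi + rsi]
 *       imul eax, edi
 *       ret
 */
```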
Watch the video accompanying this post.
This post is day two of Compiler Optimization 2025, a 25-day series exploring how compilers change our code.
This post was written by a human (Matt Godbolt) and reviewed and proof-read by an LLM and humans.
Support Compiler Explorer on Patreon or GitHub, or by purchasing a CE product in the Compiler Explorer Shop.