Binja LLIL

There’s some basic documentation of Binary Ninja’s low level intermediate language (LLIL) here. However, that documentation lacks detail for some of the instructions.

To start off, the documentation of the overall structure is excellent. LLIL essentially forms an abstract syntax tree for each assembly instruction. So to figure out what an instruction does, you just need to traverse the tree in some order, and you can work out what all of the inputs and outputs are and how they are combined.

Each instruction tree is formed from two different kinds of nodes: instructions and operations. The Rust documentation for LLIL calls them instructions and expressions. An instruction will always be the root node of a tree, and expressions will make up all of the child nodes. Generally, instructions are the higher level abstractions of LLIL and expressions are the arithmetic type operations.

An instruction’s output will be one of the child trees of the instruction and the input will be the other. The output of an instruction might be a memory address that gets written to. Calculating that address may take some number of operations, so you might need to evaluate a tree of operations to know the output address.
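To make the tree idea concrete, here is a minimal sketch in plain Python. The `Node` class, the operation names, and the example store are all hypothetical stand-ins I made up for illustration; they are not Binary Ninja API types. It models a store whose destination address has to be computed from a small subtree.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical stand-in for an LLIL node; not a Binary Ninja type."""
    op: str                                   # e.g. "STORE", "ADD", "REG", "CONST"
    operands: list = field(default_factory=list)
    value: object = None                      # register name or constant on leaf nodes


def registers_read(node):
    """Depth-first walk that collects every register leaf under `node`."""
    if node.op == "REG":
        return [node.value]
    regs = []
    for child in node.operands:
        regs.extend(registers_read(child))
    return regs


# Roughly models `str w1, [x0, #8]`: the root is the instruction,
# the first child tree is the output address, the second is the input value.
dest = Node("ADD", [Node("REG", value="x0"), Node("CONST", value=8)])
src = Node("REG", value="w1")
store = Node("STORE", [dest, src])

address_inputs = registers_read(store.operands[0])
value_inputs = registers_read(store.operands[1])
```

Here the output side of the store is itself a tree (`ADD` of a register and a constant), so the address is only known after evaluating that subtree.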

Some of the specific instructions and expressions are very lightly documented on that page and could be better documented. The ones that I have figured out have some extra documentation below.

Call⌗

This is an LLIL call instruction. It will jump execution to the target address, which could be just a constant address or something that needs to be calculated. It will implicitly fall through to the next LLIL instruction that comes after it via a return from the called function.

Binary Ninja will not calculate and set the actual return address and put it in a link register or push it onto the stack. That operation is implied and not explicitly represented in LLIL instructions. On entry to a function, Binary Ninja will assume that the return address has been placed on the stack or link register.

Push and Pop⌗

Binary Ninja will lift some accesses to the stack directly to a push or pop operation. Those operations implicitly access the stack and update the stack pointer by adding or subtracting the size of the value that was pushed or popped. Any other operation that accesses the stack will be based on that assumption.
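As a rough sketch of that implicit behavior, the toy model below tracks a stack pointer across a push and a pop. The `StackModel` class is hypothetical, and it assumes a stack that grows downward, which is the common case but not something LLIL itself guarantees for every architecture.

```python
class StackModel:
    """Toy model of implicit stack pointer updates; not a Binary Ninja type."""

    def __init__(self, sp):
        self.sp = sp
        self.memory = {}  # address -> value

    def push(self, value, size):
        # Assumption: the stack grows downward, so push subtracts the size first.
        self.sp -= size
        self.memory[self.sp] = value

    def pop(self, size):
        # Pop reads at the current stack pointer, then adds the size back.
        value = self.memory[self.sp]
        self.sp += size
        return value


stack = StackModel(sp=0x1000)
stack.push(0xDEAD, size=8)
sp_after_push = stack.sp
popped = stack.pop(size=8)
sp_after_pop = stack.sp
```

Any later stack access expressed as an offset from the stack pointer only makes sense if these implicit adjustments have been applied first.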

Jump, JumpTo, TailCall⌗

These are all essentially the same instruction just with slightly different semantics. They each redirect control flow to a new address. Binary Ninja uses them to differentiate the context in which that control flow jump is happening.

Jump is the simplest in that it just redirects control flow and doesn’t imply any other kind of control structure.

JumpTo implies that the jump is part of some kind of switch statement.

TailCall indicates that the jump is a tail call and that the code being jumped to is not part of the current function.

Goto⌗

This is an LLIL jump that does not relate to the underlying assembly. It will target an LLIL index in the lifted function and not an actual program address. It is mainly used for expressing instructions that have some amount of conditional execution behavior, where it can jump past the conditional code or just jump to the next instruction.
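A small sketch of what targeting an LLIL index means in practice. The tuple-based opcodes below (`"if"`, `"set"`, `"goto"`, `"ret"`) are invented for this example and only loosely mirror LLIL; the point is that the targets are positions in the lifted list, not program addresses.

```python
def run(program, flags):
    """Interpret a toy lifted function; branch targets are list indices."""
    regs = {}
    index = 0
    while index < len(program):
        op = program[index]
        if op[0] == "if":
            _, flag, true_index, false_index = op
            index = true_index if flags[flag] else false_index
        elif op[0] == "goto":
            index = op[1]          # an index into `program`, not an address
        elif op[0] == "set":
            regs[op[1]] = op[2]
            index += 1
        elif op[0] == "ret":
            break
    return regs


# One conditionally executed assembly instruction lifted to several entries.
program = [
    ("if", "eq", 1, 3),   # 0: run the body only when the flag is set
    ("set", "x0", 1),     # 1: the conditional body
    ("goto", 3),          # 2: rejoin the normal flow
    ("ret",),             # 3
]

taken = run(program, {"eq": True})
skipped = run(program, {"eq": False})
```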

Intrinsic⌗

This is a somewhat catch-all instruction that is used to represent an assembly instruction that cannot otherwise be lifted to LLIL. It is generally used for vector extension instructions, security extensions, or similar instructions that do not necessarily need to be explicitly represented by LLIL to properly decompile a function.

Intrinsics can have any number of inputs and output to any number of flags or registers.

Add Overflow⌗

This appears to just be a binary operator that outputs a boolean of whether or not the addition would overflow. I believe the addition is supposed to be a signed addition of the two parameters. Given that the addition is signed, I think that it checks for both overflow and underflow.
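Under that reading (signed operands, flagging results outside the representable range in either direction), the check could be sketched as follows. This is my interpretation written out in plain Python, not something confirmed by the documentation, and the function names are my own.

```python
def to_signed(value, bits):
    """Interpret the low `bits` of `value` as a two's complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def add_overflow(a, b, bits):
    """True when the signed sum of a and b does not fit in `bits` bits."""
    total = to_signed(a, bits) + to_signed(b, bits)
    lowest = -(1 << (bits - 1))
    highest = (1 << (bits - 1)) - 1
    return not (lowest <= total <= highest)
```

For example, `add_overflow(0x7FFFFFFF, 1, 32)` covers the overflow direction and `add_overflow(0x80000000, 0xFFFFFFFF, 32)` covers the underflow direction, while `add_overflow(0xFFFFFFFF, 1, 32)` is just -1 + 1 and does not overflow.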

Shifts⌗

Binary Ninja supports all of the standard shift operations. I’m not sure if the shift amount needs to be bounded to the size of the value being shifted. The arm64 shifts are implicitly bounded to the register size. Binary Ninja lifts those shifts to just the shift instruction without any masking. I have not checked if the same behavior is there for other architectures.
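To show why the bounding matters, here is a sketch comparing two interpretations of a left shift. It assumes that "bounded to the register size" means the shift amount is taken modulo the register width, and both function names are hypothetical.

```python
def lsl_bounded(value, amount, bits=64):
    """Left shift with the amount reduced modulo the register width."""
    amount %= bits
    return (value << amount) & ((1 << bits) - 1)


def lsl_unbounded(value, amount, bits=64):
    """Left shift that uses the amount as given, then truncates the result."""
    return (value << amount) & ((1 << bits) - 1)


bounded = lsl_bounded(1, 65)      # 65 % 64 == 1, so this shifts by one
unbounded = lsl_unbounded(1, 65)  # the bit is shifted out entirely
```

The two agree for any amount below the register width and only diverge for larger amounts, so anything that evaluates lifted shifts needs to pick one interpretation deliberately.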

Double Precision Multiplication⌗

MuluDp and MulsDp are unsigned and signed double precision multiplication instructions respectively. The double precision means that the source operands need to be zero or sign extended to double the number of bits and then multiplied. The output will stay double the bitness of the source operands.
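The description above can be sketched in plain Python like this. The function names are mine, and the result is returned as an unsigned bit pattern of twice the operand width.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` of `value` as a two's complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def mulu_dp(a, b, bits):
    """Unsigned multiply: zero extend both operands, keep 2 * bits of result."""
    mask = (1 << bits) - 1
    return ((a & mask) * (b & mask)) & ((1 << (2 * bits)) - 1)


def muls_dp(a, b, bits):
    """Signed multiply: sign extend both operands, keep 2 * bits of result."""
    product = sign_extend(a, bits) * sign_extend(b, bits)
    return product & ((1 << (2 * bits)) - 1)
```

With 32-bit inputs of `0xFFFFFFFF`, the unsigned version multiplies two large values, while the signed version multiplies -1 by -1, so the same bit patterns give very different 64-bit results.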

I have only seen two versions of this instruction: 32-bit and 64-bit inputs.

I have not come across the double precision division operation yet. I’m guessing it works in a similar way.

Temporary Values⌗

Binary Ninja uses temporary registers and flags to hold intermediate values that don’t get committed to an architectural register. You can tell if a register or flag is temporary by checking the highest bit of its id. If it is set, then it is temporary; otherwise it is architectural.
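A minimal sketch of that check, assuming ids are 32 bits wide so that the highest bit is bit 31. The width, the constant name, and the helper names are my assumptions rather than anything stated above.

```python
TEMP_BIT = 1 << 31  # assumption: ids are 32 bits wide, so this is the highest bit


def is_temporary(reg_id):
    """True when the highest bit of the id is set."""
    return bool(reg_id & TEMP_BIT)


def temp_index(reg_id):
    """Strip the temporary marker to recover the plain index."""
    return reg_id & ~TEMP_BIT
```

For example, an id of `0x80000003` would be temporary number 3 under this assumption, while an id of `3` would be an architectural register or flag.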