CSAPP Chapater 3
Computer Systems: A Programmer’s Perspective
Machine-Level Representation of Programs
Program Encodings
Instruction Set Architecture: ISA, defining the processor state, the format of the instructions, and the effect each of these instructions will have on the state.
x86_64 virtual addresses are represented by 64-bit word, the upper $16$ bits must be set to zero, so an address can potentially specify a byte over range of $2^{48}$. after assembler
converting the assembly-code to machine code, the linker
has shifted the location of this code to a different range of addresses. The term “word” refer to a 16-bit data type, 32-bit refer to “double words” and 64-bit refer to “quad words”.
two conventions for different size of registers:
- Those that generate 1-byte or 2-byte quantities leave the remaining bytes unchanged.
- Those that generate 4-byte quantities set the upper 4 bytes of the register to zero.
Table of X86-86 registers:
64-bit | 32-bit | 16-bit | 8-bit | name | property |
---|---|---|---|---|---|
%rax | %eax | %ax | %al | Accumulator | Return value |
%rbx | %ebx | %bx | %bl | Base | Callee saved |
%rcx | %ecx | %cx | %cl | Counter | 4th argument |
%rdx | %edx | %dx | %dl | Data | 3rd argument |
%rsi | %esi | %si | %sil | Source Index | 2nb argument |
%rdi | %edi | %di | %dil | Destination Index | 1st argument |
%rbp | %ebp | %bp | %blp | Base Pointer | Callee saved |
%rsp | %esp | %sp | %spl | Stack Pointer | Stack Pointer |
%r8 | %r8d | %r8w | %r8b | Register eight | 5th argument |
%r9 | %r9d | %r9w | %r9b | Register nine | 6th argument |
%r10 | %r10d | %r10w | %r10b | Register ten | Caller saved |
%r11 | %r11d | %r11w | %r11b | Register eleven | Caller saved |
%r12 | %r12d | %r12w | %r12b | Register twelve | Callee saved |
%r13 | %r13d | %r13w | %r13b | Register thirteen | Callee saved |
%r14 | %r14d | %r14w | %r14b | Register fourteen | Callee saved |
%r15 | %r15d | %r15w | %r15b | Register fiveteen | Callee saved |
%rsp
, Stack Pointer, used to indicate the end position in the run-time stack (the last element).
The address Imm(rb, ri, s)
has four components: immediate offset $Imm$, a base register $rb$, an index register $ri$, and a scale factor $s$, where $s$ must be $ 1 , 2, 4, 8 $. Both the base and index must be 64-bit registers. effective address is computed as $ Imm + R[rb] + R[ri] * s $;
e.g: walk through an array can loop by $(rbp, rdi, 4)$.
Data Movement instructions
Instruction | Effect | Description |
---|---|---|
MOV S,D | D <- S | Move |
movb | Move byte | |
movw | Move word | |
movl | move double word | |
movq | Move quad word | |
movabsq I, R | R <- I | Move absolute quad word |
Source operand can be a immediate, stored in a register, or stored in memory.
Destination operand can be a location that is eiigher a rigister or a memory address. The two operands cannot both are in memory locations.
For most cases, the mov
instructions will only update the specific register bytes or memory locations nidicated by the destination operand, but movl has a 64-bit register as the destinatoin, it will also set the remaining bytes to 0. Addition, both source and destination are satisfied with the suffix.
Note: thees mov
instructions are means copying, not really move.
Zero extending
The source can be either register or memory while destination only can be a register.
Instruction | Effect | Description |
---|---|---|
MOVZ S, R | R <- ZeroExtend(S) | Move with zero extension |
movzbw | Move zero-extended byte to word | |
movzbl | Move zero-extended byte to double word | |
movzwl | Move zero-extended word to double word | |
movzbq | Move zero-extended byte to quad word | |
movzwq | Move zero-extended word to quad word |
Note: there have no movzlq
, since the 32-bit move to 64-bit is set the high-order 4 bytes of the destination to 0 by default.
Sign extending
The source can be either register or memory while destination only can be a register.
Instruction | Effect | Description |
---|---|---|
MOVS S, R | R <- SignExtend(S) | Move with sign extensioin |
movsbw | Move sign-extended byte to word | |
movsbl | Move sign-extended byte to double word | |
movswl | Move sign-extended word to double word | |
movsbq | Move sign-extended byte to quad word | |
movswq | Move sign-extended word to quad word | |
movslq | Move sign-extended long word to quad word | |
cltq | %rax <- SignExtend(%eax) | Always sign-extended %eax to %rax |
Code example
1 |
|
1 |
|
Arguments are passed to functions in registers or run-time stack. A function returns a value by storing it in register %rax
.
Instruction | Effect | Description |
---|---|---|
pushq S | R[%rsp] <- R[%rsp] - 8, M[R[%rsp]] <- S | Push quad word |
pop S | D <- M[R[%rsp]], R[%rsp] <- R[%rsp] + 8 | Pop quad word |
The stack pointer %rsp
holds the address of the top stack element.
run-time stack can be arbitrary assecced. the instruction movq 8(%rsp), %rdx
will copy the scond quad word form the stack to register %rdx
.
Arithmetic and Logical Operations
The operations are divided into four groups: load effective address, unary, binary, and shifts.
Instruction | Effect | Description |
---|---|---|
leaq S, D | D <- &S | Load effective address |
Unarys | ||
INC D | D <- D + 1 | Increment |
DEC D | D <- D - 1 | Decrement |
NEG D | D <- -D | Negate |
NOT D | D <- ~D | Complement |
Binarys | ||
ADD S, D | D <- D + S | Add |
SUB S, D | D <- D - S | Subtract |
IMUL S, D | D <- D * S | multiply |
XOR S, D | D <- D ^ S | Exclusive-or |
OR S, D | D <- D or S | Or |
AND S, D | D <- D & S | And |
Shifts | ||
SAL k, D | D <- D << k | Left shift |
SHL k, D | D <- D << k | Left shift (same as SAL ) |
SAR k, D | $D <- D >>_A k$ | Arithmetic right shift |
SHR k, D | $D <- D >>_L k$ | Logical right shift |
The leaq
instruction copies the effective address to the destination register, also can be used to compactly describe common arithmetic operations.
The following C program can be implemented by a sequence of three leaq
insructions.
1 |
|
1 |
|
The unary and binary instructions are same as the MOV instructions, which the Source can be either immediate, register or memory reference, while the Destination can be either register or memory reference. But cannot both source and destination are memory reference.
Code example:
1 |
|
1 |
|
Special Arithmetic Operations
The x86-64 provides limited support for operations involving 128-bit(16 bytes) numbers, called oct word. The following instructions support generating the full 128-bitproduct of two 64-bit numbers, as well as division.
Instruction | Effect | Description |
---|---|---|
imulq S | R[%rdx]:R[%rax] <- S * R[%rax] | Signed full multiply |
mulq S | R[%rdx]:R[%rax] <- S * R[%rax] | Unsigned full multiply |
cqto | R[%rdx]:R[%rax] <- SignExtend(R[%rax]) | Convert to oct word |
idivq S | R[%rdx] <- R[%rdx]:R[%rax] mod S; R[%rax] <- R[%rdx]:R[%rax] / S |
Signed divide |
divq S | R[%rdx] <- R[%rdx]:R[%rax] mod S; R[%rax] <- R[%rdx]:R[%rax] / S |
Unsigned divide |
The one operand must be %rax
register, the assembler can tell which one is intended by counting the number of oerands.
Code example(on little-endian machine):
1 |
|
1 |
|
division example:
1 |
|
1 |
|
Control Instructions
The execution order of a set of machine-code instructions can be altered with a jump
instruction, indicating that control should pass to some other part of the program, possibly contingent on the result of some test.
Condition Codes
CPU maintains a set of single-bit condition code registers describing attributes of the most recent arithmetic or logical opeeration.
- CF: Carry flag. The most recent operation generated a carry out of the most significant bit. Used to detect overflow for unsigned operations.
- ZF: Zero flag. The most recent operation yielded zero.
- SF: Sign flag. The most recent operation yielded a negative value.
- OF: Overflow flag. The most recent operation caused a two’s-complement overflow – eigher negative or positive.
The leaq
insstruction does not alter any condition codes. For the logical operations, such as XOR
, the carry and overflow flags are set to zero. For the shift operations, the carry flag is set to the last bit shifted out, while the overflow flag is set to zero.
There are two classes instructions that perform as same as the SUB
or AND
instructions, but only set condition code, not change any registers.
Instruction | Based on | description |
---|---|---|
CMP S1, S2 | S2 - S1 | Compare |
TEST S1, S2 | S1 & S2 | Test |
The following instructions set a single byte to 0 or to 1 depending on some combination of the condition codes. A SET
instruction has either a single-byte register or a single-byte memory location as its destination. To generate a 32-bit or 64-bit result, wwe must also clear the high-order bits by using movsxx
or movzxx
Instruction | Synonym | Effect | Set condition |
---|---|---|---|
SET D | D <- conditions | depending on the suffix | |
sete D | setz | D <- ZF | Equal / Zero |
setne D | setnz | D <- ~ZF | Not equal / zero |
sets D | D <- SF | Negative | |
setns D | D <- ~SF | Nonnegative | |
setg D | setnle | D <- ~(OF ^ SF) & ~ZF | Greater (signed >) |
setge D | setnl | D <- ~(OF ^ SF) | Greater or equal (signeed >=) |
setl D | setnge | D <- OF ^ SF | Less (signed <) |
setle D | setng | D <- (OF ^ SF) or ZF | Less or equal (signed <=) |
seta D | setnbe | D <- ~CF & ~ZF | Above (unsigned) |
setae D | setnb | D <- ~CF | Above or equal (unsigned) |
setb D | setnae | D <- CF | Below (unsigned) |
setbe D | setna | D <- CF or ZF | Below or equal (unsigned) |
Code example:
1 |
|
A jump
instruction can cause the execution to switch to a completely new position in the program.
Instruction | Synonym | Jump condition | Description |
---|---|---|---|
jmp Label | 1 | Direct jump | |
jmp *Operand | 1 | Indirect jump | |
j-suffix Label | condition combination | Conditional jump |
The jmp
instruction jumps unconditionally. It can be either a direct jump, where the jump target is read from a register or a memory location. e.g. jmp .L1
. Indirect jumps are wwritten using ‘*’ followed by an operand specific the address. e.g. jmp *%rax
and jmp *(%rax)
. Other instructions with suffix are called conditional jump, they only can be direct.
jmp
instructions often use PC relative encode method, that is, they encode the difference between the address of the target instruction and the address of the instruction immediately following the jump. These offsets caan be encoded using 1, 2, or 4 bytes. A second encoding method is to give an “absolute” address, using 4 bytes to directly specify the target. The assembler and linker sselectthe appropriate encodings of the jump destinations.
example:
1 |
|
The disassembled version of the .o format is as follows:
1 |
|
The two jump instructions are user PC relative encoding method. The first one get effective address by $5 \plus 0x3$, which $ 5 $ is the next instruction’s address while 0x3 is the second byte of the jmp instruction. By this method, whatever the linker
reload instructions to different address, the jmp
instruction will not change it’s target address.
Conditional Branches with Conditional control
Code example:
(a) Original C code:
1 |
|
(b) Equivalent goto version
1 |
|
© Generated assembly code
1 |
|
The general form of an if-else statement in C is given by the template:
1 |
|
where test-expr is an intege expression that evaluates either to zero (interpreted as meaning “false”) or to a nonzero value (interpreted as “true”). the assembly implementation typically adheres to the following form:
1 |
|
note that the goto
condition always false condition, the true condition always followed if statement.
Conditional Branches with Conditional Moves
Some times use conditional move instruction can get a better matched to the performance characteristics of modern processors. As same example like Conditional control, but do not modify the value of lt_cnt
and ge_cnt
.
(a) Original C code
1 |
|
(b) using conditional assignment
1 |
|
© Generated assembly code
1 |
|
The single cmovge
instruction implements the conditional assignment. modern precessors achieve high performance through pipeline, where an instruction is processed via a sequence of stages, each performing one small portion of the required operations (e.g., fetching the instruction from memory, determining the instruction type, reading from memory, performing an arithmetic operation, writing to memory, and updating the program counter.) By using overlapping the steps of the successive instructions, such as fetching one instruction while performing the arighmetic operations for a previouss instruction. It need to able to determine the sequence of instructions to be executed well aheda of time in order to keep the pipeline full of instructions.
Processors employ sophisticated branch prediction logic to try to guess whether or not each jump instruction will be followed. modern microprocessor designs try to achieve success tates on the order of 90%. The control flow(jmp
instruction) are hard to predict, while cmov
instructions are considered as a single instruction.
on GCC
compiler, using -Og
flag will still to generating control flow
instruction, while using -O2
flag will generating control data
assembly-code by considering optimize.
The CMOV
instructions have same suffix as SET
instruction. The source and destination values can be 16, 32, or 64 bits long. single-byte conditional moves are not supported. The operand length can be inferred from the name of the destination register, so they do not have length suffix like MOV
instructions.
Not all conditional expressions can be compiled using conditional moves, since conditional moves will execute both two conditions. If one of those two expressions could possibly generate an error or a side effect, this could lead to invalid behavior.
Code example:
1 |
|
1 |
|
The upper assembly-code may cause null pointer reference.
Loops
Almost all assembly-code language are not support instructions like while
or for
, but we can implement loop by use goto
or jmp
instruction.
Do-While Loops
The general form of a do-while statement is as follows:
1 |
|
Observe that body-statement is executed at least once. This general form can be translated into conditionals and goto
statements as follows:
1 |
|
(a) C code
1 |
|
(b) goto version
1 |
|
© assembly-code
1 |
|
Reverse engineering assembly code have much challenging for sove complex programs. A key to reverse the loop is to find a mapping values and registers.
While Loops
The general form of a while statement is as follows:
1 |
|
There are a number of ways to translate a while loop tinto machine code, two of which are used in GCC.
In first method, which we refer to as jump to middle, performs the initial test by performing an unconditional jump to the test at the end of the loop. The general template is as follows:
1 |
|
Code example:
(a) C code
1 |
|
(b) goto version
1 |
|
© assmbly-code
1 |
|
The second translation method, which we refer to as guarded do, first transforms the code into a do-while loop by using a conditional branch to skip over the loop if the initial test fails. GCC follows this strategy when compiling with higher levels of optimization.