CSAPP Chapater 3

Computer Systems: A Programmer’s Perspective

Machine-Level Representation of Programs

Program Encodings

Instruction Set Architecture: ISA, defining the processor state, the format of the instructions, and the effect each of these instructions will have on the state.

x86_64 virtual addresses are represented by 64-bit word, the upper $16$ bits must be set to zero, so an address can potentially specify a byte over range of $2^{48}$. after assembler converting the assembly-code to machine code, the linker has shifted the location of this code to a different range of addresses. The term “word” refer to a 16-bit data type, 32-bit refer to “double words” and 64-bit refer to “quad words”.

two conventions for different size of registers:

  • Those that generate 1-byte or 2-byte quantities leave the remaining bytes unchanged.
  • Those that generate 4-byte quantities set the upper 4 bytes of the register to zero.

Table of X86-86 registers:

64-bit 32-bit 16-bit 8-bit name property
%rax %eax %ax %al Accumulator Return value
%rbx %ebx %bx %bl Base Callee saved
%rcx %ecx %cx %cl Counter 4th argument
%rdx %edx %dx %dl Data 3rd argument
%rsi %esi %si %sil Source Index 2nb argument
%rdi %edi %di %dil Destination Index 1st argument
%rbp %ebp %bp %blp Base Pointer Callee saved
%rsp %esp %sp %spl Stack Pointer Stack Pointer
%r8 %r8d %r8w %r8b Register eight 5th argument
%r9 %r9d %r9w %r9b Register nine 6th argument
%r10 %r10d %r10w %r10b Register ten Caller saved
%r11 %r11d %r11w %r11b Register eleven Caller saved
%r12 %r12d %r12w %r12b Register twelve Callee saved
%r13 %r13d %r13w %r13b Register thirteen Callee saved
%r14 %r14d %r14w %r14b Register fourteen Callee saved
%r15 %r15d %r15w %r15b Register fiveteen Callee saved

%rsp, Stack Pointer, used to indicate the end position in the run-time stack (the last element).

The address Imm(rb, ri, s) has four components: immediate offset $Imm$, a base register $rb$, an index register $ri$, and a scale factor $s$, where $s$ must be $ 1 , 2, 4, 8 $. Both the base and index must be 64-bit registers. effective address is computed as $ Imm + R[rb] + R[ri] * s $;

e.g: walk through an array can loop by $(rbp, rdi, 4)$.

Data Movement instructions

Instruction Effect Description
MOV S,D D <- S Move
movb Move byte
movw Move word
movl move double word
movq Move quad word
movabsq I, R R <- I Move absolute quad word

Source operand can be a immediate, stored in a register, or stored in memory.
Destination operand can be a location that is eiigher a rigister or a memory address. The two operands cannot both are in memory locations.

For most cases, the mov instructions will only update the specific register bytes or memory locations nidicated by the destination operand, but movl has a 64-bit register as the destinatoin, it will also set the remaining bytes to 0. Addition, both source and destination are satisfied with the suffix.

Note: thees mov instructions are means copying, not really move.

Zero extending
The source can be either register or memory while destination only can be a register.

Instruction Effect Description
MOVZ S, R R <- ZeroExtend(S) Move with zero extension
movzbw Move zero-extended byte to word
movzbl Move zero-extended byte to double word
movzwl Move zero-extended word to double word
movzbq Move zero-extended byte to quad word
movzwq Move zero-extended word to quad word

Note: there have no movzlq, since the 32-bit move to 64-bit is set the high-order 4 bytes of the destination to 0 by default.

Sign extending
The source can be either register or memory while destination only can be a register.

Instruction Effect Description
MOVS S, R R <- SignExtend(S) Move with sign extensioin
movsbw Move sign-extended byte to word
movsbl Move sign-extended byte to double word
movswl Move sign-extended word to double word
movsbq Move sign-extended byte to quad word
movswq Move sign-extended word to quad word
movslq Move sign-extended long word to quad word
cltq %rax <- SignExtend(%eax) Always sign-extended %eax to %rax

Code example

1
2
3
4
5
6
long exchange (long *xp, long y)
{
long x = *xp;
*xp = y;
return x;
}
1
2
3
4
5
6
; long exchange(long *xp, long y)
; *xp in %rdi, y in %rsi;
exchange:
movq (%rdi), %rax
movq %rax, (%rdi) ; cannot be `movq (%rsi), (%rdi)`
ret

Arguments are passed to functions in registers or run-time stack. A function returns a value by storing it in register %rax.

Instruction Effect Description
pushq S R[%rsp] <- R[%rsp] - 8, M[R[%rsp]] <- S Push quad word
pop S D <- M[R[%rsp]], R[%rsp] <- R[%rsp] + 8 Pop quad word

The stack pointer %rsp holds the address of the top stack element.

run-time stack can be arbitrary assecced. the instruction movq 8(%rsp), %rdx will copy the scond quad word form the stack to register %rdx.

Arithmetic and Logical Operations

The operations are divided into four groups: load effective address, unary, binary, and shifts.

Instruction Effect Description
leaq S, D D <- &S Load effective address
Unarys
INC D D <- D + 1 Increment
DEC D D <- D - 1 Decrement
NEG D D <- -D Negate
NOT D D <- ~D Complement
Binarys
ADD S, D D <- D + S Add
SUB S, D D <- D - S Subtract
IMUL S, D D <- D * S multiply
XOR S, D D <- D ^ S Exclusive-or
OR S, D D <- D or S Or
AND S, D D <- D & S And
Shifts
SAL k, D D <- D << k Left shift
SHL k, D D <- D << k Left shift (same as SAL)
SAR k, D $D <- D >>_A k$ Arithmetic right shift
SHR k, D $D <- D >>_L k$ Logical right shift

The leaq instruction copies the effective address to the destination register, also can be used to compactly describe common arithmetic operations.

The following C program can be implemented by a sequence of three leaq insructions.

1
2
3
4
long scale(long x, long y, long z) {
long t = x + 4 * y + 12 * z;
return t
}
1
2
3
4
scale:
leaq (%rdi, %rsi, 4), %rax
leaq (%rdx, %rdx, 2), %rdx
laeq (%rax, %rdx, 4), %rax

The unary and binary instructions are same as the MOV instructions, which the Source can be either immediate, register or memory reference, while the Destination can be either register or memory reference. But cannot both source and destination are memory reference.

Code example:

1
2
3
4
5
6
7
8
long arith(long x, long y, long z) 
{
long t1 = x ^ y;
long t2 = z * 48;
long t3 = t1 & 0x0F0F0F0F;
long t4 = t2 - t3;
return t4;
}
1
2
3
4
5
6
7
arith:
xorq %rsi, %rdi
leaq (%rdx, %rdx, 2), %rax
salq $4, %rax
andl $252645135, %edi
subq %rdi, %rax
ret

Special Arithmetic Operations

The x86-64 provides limited support for operations involving 128-bit(16 bytes) numbers, called oct word. The following instructions support generating the full 128-bitproduct of two 64-bit numbers, as well as division.

Instruction Effect Description
imulq S R[%rdx]:R[%rax] <- S * R[%rax] Signed full multiply
mulq S R[%rdx]:R[%rax] <- S * R[%rax] Unsigned full multiply
cqto R[%rdx]:R[%rax] <- SignExtend(R[%rax]) Convert to oct word
idivq S R[%rdx] <- R[%rdx]:R[%rax] mod S;
R[%rax] <- R[%rdx]:R[%rax] / S
Signed divide
divq S R[%rdx] <- R[%rdx]:R[%rax] mod S;
R[%rax] <- R[%rdx]:R[%rax] / S
Unsigned divide

The one operand must be %rax register, the assembler can tell which one is intended by counting the number of oerands.

Code example(on little-endian machine):

1
2
3
4
5
6
7
#include <inttypes.h>

typedef unsigned __int128 uint128_t

void store_uprod(uint128_t *dest, uint64_t x, uint64_t y) {
*dest = x * (uint128_t) y;
}
1
2
3
4
5
6
store_uprod:
movq %rsi, %rax
mulq %rdx
movq %rax, (%rdi)
movq %rdx, 8(%rdi)

division example:

1
2
3
4
5
6
void remdiv(long x, long y, long *qp, long *rp) {
long q = x/y;
long r = x%y;
*qp = q;
*rp = r;
}
1
2
3
4
5
6
7
remdiv:
movq %rdx, %r8
movq %rdi, %rax
cqto
idivq %rsi
movq %rax, (%r8)
movq %rdx, (%rcx)

Control Instructions

The execution order of a set of machine-code instructions can be altered with a jump instruction, indicating that control should pass to some other part of the program, possibly contingent on the result of some test.

Condition Codes

CPU maintains a set of single-bit condition code registers describing attributes of the most recent arithmetic or logical opeeration.

  • CF: Carry flag. The most recent operation generated a carry out of the most significant bit. Used to detect overflow for unsigned operations.
  • ZF: Zero flag. The most recent operation yielded zero.
  • SF: Sign flag. The most recent operation yielded a negative value.
  • OF: Overflow flag. The most recent operation caused a two’s-complement overflow – eigher negative or positive.

The leaq insstruction does not alter any condition codes. For the logical operations, such as XOR, the carry and overflow flags are set to zero. For the shift operations, the carry flag is set to the last bit shifted out, while the overflow flag is set to zero.

There are two classes instructions that perform as same as the SUB or AND instructions, but only set condition code, not change any registers.

Instruction Based on description
CMP S1, S2 S2 - S1 Compare
TEST S1, S2 S1 & S2 Test

The following instructions set a single byte to 0 or to 1 depending on some combination of the condition codes. A SET instruction has either a single-byte register or a single-byte memory location as its destination. To generate a 32-bit or 64-bit result, wwe must also clear the high-order bits by using movsxx or movzxx

Instruction Synonym Effect Set condition
SET D D <- conditions depending on the suffix
sete D setz D <- ZF Equal / Zero
setne D setnz D <- ~ZF Not equal / zero
sets D D <- SF Negative
setns D D <- ~SF Nonnegative
setg D setnle D <- ~(OF ^ SF) & ~ZF Greater (signed >)
setge D setnl D <- ~(OF ^ SF) Greater or equal (signeed >=)
setl D setnge D <- OF ^ SF Less (signed <)
setle D setng D <- (OF ^ SF) or ZF Less or equal (signed <=)
seta D setnbe D <- ~CF & ~ZF Above (unsigned)
setae D setnb D <- ~CF Above or equal (unsigned)
setb D setnae D <- CF Below (unsigned)
setbe D setna D <- CF or ZF Below or equal (unsigned)

Code example:

1
2
3
4
5
6
7
; int comp(long a, long b)
; a in %rdi, b in %rsi
comp:
cmpq %rsi, %rdi
setl %al
movzbl %al, %eax
ret

A jump instruction can cause the execution to switch to a completely new position in the program.

Instruction Synonym Jump condition Description
jmp Label 1 Direct jump
jmp *Operand 1 Indirect jump
j-suffix Label condition combination Conditional jump

The jmp instruction jumps unconditionally. It can be either a direct jump, where the jump target is read from a register or a memory location. e.g. jmp .L1. Indirect jumps are wwritten using ‘*’ followed by an operand specific the address. e.g. jmp *%rax and jmp *(%rax) . Other instructions with suffix are called conditional jump, they only can be direct.

jmp instructions often use PC relative encode method, that is, they encode the difference between the address of the target instruction and the address of the instruction immediately following the jump. These offsets caan be encoded using 1, 2, or 4 bytes. A second encoding method is to give an “absolute” address, using 4 bytes to directly specify the target. The assembler and linker sselectthe appropriate encodings of the jump destinations.

example:

1
2
3
4
5
6
7
8
    movq %rdi, %rax
jmp .L2
.L3:
sarq %rax
.L2:
testq %rax, %rax
jg .L3
rep ret

The disassembled version of the .o format is as follows:

1
2
3
4
5
6
0:  48 89 f8        mov %rdi, %rax
3: eb 03 jmp
5: 48 d1 f8 sar %rax
8: 48 85 c0 test %rax, %rax
b: 71 f8 jg 5<loop+0x5>
d: f3 c3 repz retq

The two jump instructions are user PC relative encoding method. The first one get effective address by $5 \plus 0x3$, which $ 5 $ is the next instruction’s address while 0x3 is the second byte of the jmp instruction. By this method, whatever the linker reload instructions to different address, the jmp instruction will not change it’s target address.

Conditional Branches with Conditional control

Code example:
(a) Original C code:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
long lt_cnt = 0;
long ge_cnt = 0;

long absdiff_se(long x, long y) {
long result;
if (x < y) {
lt_cnt ++;
result = y - x;
} else {
ge_cnt ++;
result = x - y;
}
return result;
}

(b) Equivalent goto version

1
2
3
4
5
6
7
8
9
10
11
long gotodiff_se(long x, long y) {
long result;
if (x >= y) goto x_ge_y;
lt_cnt ++;
result = y - x;
return result;
x_ge_y:
ge_cnt ++;
result = x - y;
return result;
}

© Generated assembly code

1
2
3
4
5
6
7
8
9
10
11
12
13
14
; long absdiff_se(long x, long y)
; x in %rdi, y in %rsi
absdiff_se:
cmpq %rsi, %rdi
jge .L2
addq $1, lt_cnt($rip)
movq $rsi, $rax
subq %rdi, $rax
ret
.L2:
addq $1, ge_cnt($rip)
movq $rdi, $rax
subq $rsi, $rax
ret

The general form of an if-else statement in C is given by the template:

1
2
3
4
if (test-expr)
then-statement;
else
else-statement;

where test-expr is an intege expression that evaluates either to zero (interpreted as meaning “false”) or to a nonzero value (interpreted as “true”). the assembly implementation typically adheres to the following form:

1
2
3
4
5
6
7
8
t = test-expr;
if (!it) goto false
then-statement
goto done;
false:
else-statement
done:
return;

note that the goto condition always false condition, the true condition always followed if statement.

Conditional Branches with Conditional Moves

Some times use conditional move instruction can get a better matched to the performance characteristics of modern processors. As same example like Conditional control, but do not modify the value of lt_cnt and ge_cnt.

(a) Original C code

1
2
3
4
5
6
7
8
9
long absdiff(long x, long y)
{
long result;
if (x < y)
result = y - x;
else
result = x - y;
return result;
}

(b) using conditional assignment

1
2
3
4
5
6
7
8
long cmovdiff(long x. long y)
{
long rval = y - x;
long eval = x - y;
long ntest = x >= y;
if (ntest) rval = eval;
return rval;
}

© Generated assembly code

1
2
3
4
5
6
7
8
9
10
; long absdiff(long x, long y)
; x in %rdi, y in %rsi
absdiff:
movq %rsi, %rax
subq %rdi, %rax ; rval = y - x
movq %rdi, %rdx
subq %rsi, %rdx ; eval = x - y
cmpq %rsi, %rdi ; ntest
cmovge %rdx, %rax ; if statement
ret

The single cmovge instruction implements the conditional assignment. modern precessors achieve high performance through pipeline, where an instruction is processed via a sequence of stages, each performing one small portion of the required operations (e.g., fetching the instruction from memory, determining the instruction type, reading from memory, performing an arithmetic operation, writing to memory, and updating the program counter.) By using overlapping the steps of the successive instructions, such as fetching one instruction while performing the arighmetic operations for a previouss instruction. It need to able to determine the sequence of instructions to be executed well aheda of time in order to keep the pipeline full of instructions.

Processors employ sophisticated branch prediction logic to try to guess whether or not each jump instruction will be followed. modern microprocessor designs try to achieve success tates on the order of 90%. The control flow(jmp instruction) are hard to predict, while cmov instructions are considered as a single instruction.

on GCC compiler, using -Og flag will still to generating control flow instruction, while using -O2 flag will generating control data assembly-code by considering optimize.

The CMOV instructions have same suffix as SET instruction. The source and destination values can be 16, 32, or 64 bits long. single-byte conditional moves are not supported. The operand length can be inferred from the name of the destination register, so they do not have length suffix like MOV instructions.

Not all conditional expressions can be compiled using conditional moves, since conditional moves will execute both two conditions. If one of those two expressions could possibly generate an error or a side effect, this could lead to invalid behavior.

Code example:

1
2
3
long cread(long *xp) {
return (xp ? *xp : 0);
}
1
2
3
4
5
6
7
8
9
; long cread(long *xp)
; Invalid implementation of function cread
; *xp in %rdi
cread:
movq (%rdi), %rax
movl $0, %edx
testq %rdi, %rdi
cmove %rdx, %rax
ret

The upper assembly-code may cause null pointer reference.

Loops

Almost all assembly-code language are not support instructions like while or for, but we can implement loop by use goto or jmp instruction.

Do-While Loops
The general form of a do-while statement is as follows:

1
2
3
do
body-statement
while (test-expr)

Observe that body-statement is executed at least once. This general form can be translated into conditionals and goto statements as follows:

1
2
3
4
loop:
body-statement;
t = test-expr;
if (t) goto loop;

(a) C code

1
2
3
4
5
6
7
8
9
long fact_do(long n)
{
long result = 1;
do {
result *= n;
n = n - 1;
} while (n > 1);
return result;
}

(b) goto version

1
2
3
4
5
6
7
8
long fact_do_goto(long n)
{
long result = 1;
loop:
result *= n;
n = n - 1;
if (n > 1) goto loop;
}

© assembly-code

1
2
3
4
5
6
7
8
9
10
11
; long fact_do(long n)
; n in %rdi
fact_do:
movq $1, %eax
.L2:
imulq %rdi, %rax
subq $1, %rdi
cmpq $1, %rdi
jg .L2
rep
ret

Reverse engineering assembly code have much challenging for sove complex programs. A key to reverse the loop is to find a mapping values and registers.

While Loops

The general form of a while statement is as follows:

1
2
while (test-expr)
body-statement

There are a number of ways to translate a while loop tinto machine code, two of which are used in GCC.

In first method, which we refer to as jump to middle, performs the initial test by performing an unconditional jump to the test at the end of the loop. The general template is as follows:

1
2
3
4
5
6
    goto test;
loop:
body-statement
test:
t = test-expr;
if (t) goto loop;

Code example:
(a) C code

1
2
3
4
5
6
7
8
9
long  fact_while(long n)
{
long result = 1;
while (n > 1) {
result *= n;
n = n - 1;
}
return result;
}

(b) goto version

1
2
3
4
5
6
7
8
9
10
long fact_while_jm_goto(long n)
{
long result = 1;
goto test;
loop:
result *= n;
n = n -1;
test:
if (n > 1) goto loop;
}

© assmbly-code

1
2
3
4
5
6
7
8
9
10
11
12
; long fact_while (long n)
; n in %rdi
fact_while:
movq $1, %eax
jmp .L5
.L6:
imulq %rdi, %rax
subq $1, %rdi
.L5:
cmpq $1, %rdi
jg .L6
rep ret

The second translation method, which we refer to as guarded do, first transforms the code into a do-while loop by using a conditional branch to skip over the loop if the initial test fails. GCC follows this strategy when compiling with higher levels of optimization.


CSAPP Chapater 3
https://acmicpc.top/2025/05/02/Computer System - A Programmer Perspective/
作者
江欣婷
发布于
2025年5月2日
许可协议