CSAPP Chapater 3

Computer Systems: A Programmer’s Perspective

Machine-Level Representation of Programs

Program Encodings

Instruction Set Architecture: ISA, defining the processor state, the format of the instructions, and the effect each of these instructions will have on the state.

x86_64 virtual addresses are represented by 64-bit word, the upper $16$ bits must be set to zero, so an address can potentially specify a byte over range of $2^{48}$. after assembler converting the assembly-code to machine code, the linker has shifted the location of this code to a different range of addresses. The term “word” refer to a 16-bit data type, 32-bit refer to “double words” and 64-bit refer to “quad words”.

two conventions for different size of registers:

Those that generate 1-byte or 2-byte quantities leave the remaining bytes unchanged.
Those that generate 4-byte quantities set the upper 4 bytes of the register to zero.

Table of X86-86 registers:

64-bit	32-bit	16-bit	8-bit	name	property
%rax	%eax	%ax	%al	Accumulator	Return value
%rbx	%ebx	%bx	%bl	Base	Callee saved
%rcx	%ecx	%cx	%cl	Counter	4th argument
%rdx	%edx	%dx	%dl	Data	3rd argument
%rsi	%esi	%si	%sil	Source Index	2nb argument
%rdi	%edi	%di	%dil	Destination Index	1st argument
%rbp	%ebp	%bp	%blp	Base Pointer	Callee saved
%rsp	%esp	%sp	%spl	Stack Pointer	Stack Pointer
%r8	%r8d	%r8w	%r8b	Register eight	5th argument
%r9	%r9d	%r9w	%r9b	Register nine	6th argument
%r10	%r10d	%r10w	%r10b	Register ten	Caller saved
%r11	%r11d	%r11w	%r11b	Register eleven	Caller saved
%r12	%r12d	%r12w	%r12b	Register twelve	Callee saved
%r13	%r13d	%r13w	%r13b	Register thirteen	Callee saved
%r14	%r14d	%r14w	%r14b	Register fourteen	Callee saved
%r15	%r15d	%r15w	%r15b	Register fiveteen	Callee saved

%rsp, Stack Pointer, used to indicate the end position in the run-time stack (the last element).

The address Imm(rb, ri, s) has four components: immediate offset $Imm$, a base register $rb$, an index register $ri$, and a scale factor $s$, where $s$ must be $ 1 , 2, 4, 8 $. Both the base and index must be 64-bit registers. effective address is computed as $ Imm + R[rb] + R[ri] * s $;

e.g: walk through an array can loop by $(rbp, rdi, 4)$.

Data Movement instructions

Instruction	Effect	Description
MOV S,D	D <- S	Move
movb		Move byte
movw		Move word
movl		move double word
movq		Move quad word
movabsq I, R	R <- I	Move absolute quad word

Source operand can be a immediate, stored in a register, or stored in memory.
Destination operand can be a location that is eiigher a rigister or a memory address. The two operands cannot both are in memory locations.

For most cases, the mov instructions will only update the specific register bytes or memory locations nidicated by the destination operand, but movl has a 64-bit register as the destinatoin, it will also set the remaining bytes to 0. Addition, both source and destination are satisfied with the suffix.

Note: thees mov instructions are means copying, not really move.

Zero extending
The source can be either register or memory while destination only can be a register.

Instruction	Effect	Description
MOVZ S, R	R <- ZeroExtend(S)	Move with zero extension
movzbw		Move zero-extended byte to word
movzbl		Move zero-extended byte to double word
movzwl		Move zero-extended word to double word
movzbq		Move zero-extended byte to quad word
movzwq		Move zero-extended word to quad word

Note: there have no movzlq, since the 32-bit move to 64-bit is set the high-order 4 bytes of the destination to 0 by default.

Sign extending
The source can be either register or memory while destination only can be a register.

Instruction	Effect	Description
MOVS S, R	R <- SignExtend(S)	Move with sign extensioin
movsbw		Move sign-extended byte to word
movsbl		Move sign-extended byte to double word
movswl		Move sign-extended word to double word
movsbq		Move sign-extended byte to quad word
movswq		Move sign-extended word to quad word
movslq		Move sign-extended long word to quad word
cltq	%rax <- SignExtend(%eax)	Always sign-extended %eax to %rax

Code example

long exchange (long *xp, long y)
{
    long x = *xp;
    *xp = y;
    return x;
}

; long exchange(long *xp, long y)
; *xp in %rdi, y in %rsi;
exchange:
    movq (%rdi), %rax
    movq %rax, (%rdi) ; cannot be `movq (%rsi), (%rdi)`
    ret

Arguments are passed to functions in registers or run-time stack. A function returns a value by storing it in register %rax.

Instruction	Effect	Description
pushq S	R[%rsp] <- R[%rsp] - 8, M[R[%rsp]] <- S	Push quad word
pop S	D <- M[R[%rsp]], R[%rsp] <- R[%rsp] + 8	Pop quad word

The stack pointer %rsp holds the address of the top stack element.

run-time stack can be arbitrary assecced. the instruction movq 8(%rsp), %rdx will copy the scond quad word form the stack to register %rdx.

Arithmetic and Logical Operations

The operations are divided into four groups: load effective address, unary, binary, and shifts.

Instruction	Effect	Description
leaq S, D	D <- &S	Load effective address
Unarys
INC D	D <- D + 1	Increment
DEC D	D <- D - 1	Decrement
NEG D	D <- -D	Negate
NOT D	D <- ~D	Complement
Binarys
ADD S, D	D <- D + S	Add
SUB S, D	D <- D - S	Subtract
IMUL S, D	D <- D * S	multiply
XOR S, D	D <- D ^ S	Exclusive-or
OR S, D	D <- D or S	Or
AND S, D	D <- D & S	And
Shifts
SAL k, D	D <- D << k	Left shift
SHL k, D	D <- D << k	Left shift (same as `SAL`)
SAR k, D	$D <- D >>_A k$	Arithmetic right shift
SHR k, D	$D <- D >>_L k$	Logical right shift

The leaq instruction copies the effective address to the destination register, also can be used to compactly describe common arithmetic operations.

The following C program can be implemented by a sequence of three leaq insructions.

long scale(long x, long y, long z) {
    long t = x + 4 * y + 12 * z;
    return t
}

scale:
    leaq (%rdi, %rsi, 4), %rax
    leaq (%rdx, %rdx, 2), %rdx
    laeq (%rax, %rdx, 4), %rax

The unary and binary instructions are same as the MOV instructions, which the Source can be either immediate, register or memory reference, while the Destination can be either register or memory reference. But cannot both source and destination are memory reference.

Code example:

long arith(long x, long y, long z) 
{
    long t1 = x ^ y;
    long t2 = z * 48;
    long t3 = t1 & 0x0F0F0F0F;
    long t4 = t2 - t3;
    return t4;
}

arith:
    xorq %rsi, %rdi
    leaq (%rdx, %rdx, 2), %rax
    salq $4, %rax
    andl $252645135, %edi
    subq %rdi, %rax
    ret

Special Arithmetic Operations

The x86-64 provides limited support for operations involving 128-bit(16 bytes) numbers, called oct word. The following instructions support generating the full 128-bitproduct of two 64-bit numbers, as well as division.

Instruction	Effect	Description
imulq S	R[%rdx]:R[%rax] <- S * R[%rax]	Signed full multiply
mulq S	R[%rdx]:R[%rax] <- S * R[%rax]	Unsigned full multiply
cqto	R[%rdx]:R[%rax] <- SignExtend(R[%rax])	Convert to oct word
idivq S	R[%rdx] <- R[%rdx]:R[%rax] mod S; R[%rax] <- R[%rdx]:R[%rax] / S	Signed divide
divq S	R[%rdx] <- R[%rdx]:R[%rax] mod S; R[%rax] <- R[%rdx]:R[%rax] / S	Unsigned divide

The one operand must be %rax register, the assembler can tell which one is intended by counting the number of oerands.

Code example(on little-endian machine):

#include <inttypes.h>

typedef unsigned __int128 uint128_t

void store_uprod(uint128_t *dest, uint64_t x, uint64_t y) {
    *dest = x * (uint128_t) y;
}

store_uprod:
    movq %rsi, %rax
    mulq %rdx
    movq %rax, (%rdi)
    movq %rdx, 8(%rdi)

division example:

void remdiv(long x, long y, long *qp, long *rp) {
    long q = x/y;
    long r = x%y;
    *qp = q;
    *rp = r;
}

remdiv:
    movq %rdx, %r8
    movq %rdi, %rax
    cqto
    idivq %rsi
    movq %rax, (%r8)
    movq %rdx, (%rcx)

Control Instructions

The execution order of a set of machine-code instructions can be altered with a jump instruction, indicating that control should pass to some other part of the program, possibly contingent on the result of some test.

Condition Codes

CPU maintains a set of single-bit condition code registers describing attributes of the most recent arithmetic or logical opeeration.

CF: Carry flag. The most recent operation generated a carry out of the most significant bit. Used to detect overflow for unsigned operations.
ZF: Zero flag. The most recent operation yielded zero.
SF: Sign flag. The most recent operation yielded a negative value.
OF: Overflow flag. The most recent operation caused a two’s-complement overflow – eigher negative or positive.

The leaq insstruction does not alter any condition codes. For the logical operations, such as XOR, the carry and overflow flags are set to zero. For the shift operations, the carry flag is set to the last bit shifted out, while the overflow flag is set to zero.

There are two classes instructions that perform as same as the SUB or AND instructions, but only set condition code, not change any registers.

Instruction	Based on	description
CMP S1, S2	S2 - S1	Compare
TEST S1, S2	S1 & S2	Test

The following instructions set a single byte to 0 or to 1 depending on some combination of the condition codes. A SET instruction has either a single-byte register or a single-byte memory location as its destination. To generate a 32-bit or 64-bit result, wwe must also clear the high-order bits by using movsxx or movzxx

Instruction	Synonym	Effect	Set condition
SET D		D <- conditions	depending on the suffix
sete D	setz	D <- ZF	Equal / Zero
setne D	setnz	D <- ~ZF	Not equal / zero
sets D		D <- SF	Negative
setns D		D <- ~SF	Nonnegative
setg D	setnle	D <- ~(OF ^ SF) & ~ZF	Greater (signed >)
setge D	setnl	D <- ~(OF ^ SF)	Greater or equal (signeed >=)
setl D	setnge	D <- OF ^ SF	Less (signed <)
setle D	setng	D <- (OF ^ SF) or ZF	Less or equal (signed <=)
seta D	setnbe	D <- ~CF & ~ZF	Above (unsigned)
setae D	setnb	D <- ~CF	Above or equal (unsigned)
setb D	setnae	D <- CF	Below (unsigned)
setbe D	setna	D <- CF or ZF	Below or equal (unsigned)

Code example:

; int comp(long a, long b)
; a in %rdi, b in %rsi
comp:
    cmpq %rsi, %rdi
    setl %al
    movzbl %al, %eax
    ret

A jump instruction can cause the execution to switch to a completely new position in the program.

Instruction	Jump condition	Description
jmp Label	1	Direct jump
jmp *Operand	1	Indirect jump
j-suffix Label	condition combination	Conditional jump

The jmp instruction jumps unconditionally. It can be either a direct jump, where the jump target is read from a register or a memory location. e.g. jmp .L1. Indirect jumps are wwritten using ‘*’ followed by an operand specific the address. e.g. jmp *%rax and jmp *(%rax) . Other instructions with suffix are called conditional jump, they only can be direct.

jmp instructions often use PC relative encode method, that is, they encode the difference between the address of the target instruction and the address of the instruction immediately following the jump. These offsets caan be encoded using 1, 2, or 4 bytes. A second encoding method is to give an “absolute” address, using 4 bytes to directly specify the target. The assembler and linker sselectthe appropriate encodings of the jump destinations.

example:

    movq %rdi, %rax
    jmp .L2
.L3:
    sarq %rax
.L2:
    testq %rax, %rax
    jg .L3
    rep ret

The disassembled version of the .o format is as follows:

0:  48 89 f8        mov %rdi, %rax
3:  eb 03           jmp 
5:  48 d1 f8        sar %rax
8:  48 85 c0        test %rax, %rax
b:  71  f8          jg  5<loop+0x5>
d:  f3 c3           repz retq

The two jump instructions are user PC relative encoding method. The first one get effective address by $5 \plus 0x3$, which $ 5 $ is the next instruction’s address while 0x3 is the second byte of the jmp instruction. By this method, whatever the linker reload instructions to different address, the jmp instruction will not change it’s target address.

Conditional Branches with Conditional control

Code example:
(a) Original C code:

long lt_cnt = 0;
long ge_cnt = 0;

long absdiff_se(long x, long y) {
    long result;
    if (x < y) {
        lt_cnt ++;
        result = y - x;
    } else {
        ge_cnt ++;
        result = x - y;
    }
    return result;
}

(b) Equivalent goto version

long gotodiff_se(long x, long y) {
    long result;
    if (x >= y) goto x_ge_y;
    lt_cnt ++;
    result = y - x;
    return result;
x_ge_y:
    ge_cnt ++;
    result = x - y;
    return result;
}

; long absdiff_se(long x, long y)
; x in %rdi, y in %rsi
absdiff_se:
    cmpq %rsi, %rdi
    jge .L2
    addq $1, lt_cnt($rip)
    movq $rsi, $rax
    subq %rdi, $rax
    ret
.L2:
    addq $1, ge_cnt($rip)
    movq $rdi, $rax
    subq $rsi, $rax
    ret

The general form of an if-else statement in C is given by the template:

if (test-expr)
    then-statement;
else
    else-statement;

where test-expr is an intege expression that evaluates either to zero (interpreted as meaning “false”) or to a nonzero value (interpreted as “true”). the assembly implementation typically adheres to the following form:

t = test-expr;
if (!it) goto false
then-statement
goto done;
false:
    else-statement
done:
    return;

note that the goto condition always false condition, the true condition always followed if statement.

Conditional Branches with Conditional Moves

Some times use conditional move instruction can get a better matched to the performance characteristics of modern processors. As same example like Conditional control, but do not modify the value of lt_cnt and ge_cnt.

(a) Original C code

long absdiff(long x, long y)
{
    long result;
    if (x < y)
        result = y - x;
    else 
        result = x - y;
    return result;
}

(b) using conditional assignment

long cmovdiff(long x. long y)
{
    long rval = y - x;
    long eval = x - y;
    long ntest = x >= y;
    if (ntest) rval = eval;
    return rval;
}

; long absdiff(long x, long y)
; x in %rdi, y in %rsi
absdiff:
    movq %rsi, %rax
    subq %rdi, %rax     ; rval = y - x
    movq %rdi, %rdx
    subq %rsi, %rdx     ; eval = x - y
    cmpq %rsi, %rdi     ; ntest
    cmovge %rdx, %rax   ; if statement
    ret

The single cmovge instruction implements the conditional assignment. modern precessors achieve high performance through pipeline, where an instruction is processed via a sequence of stages, each performing one small portion of the required operations (e.g., fetching the instruction from memory, determining the instruction type, reading from memory, performing an arithmetic operation, writing to memory, and updating the program counter.) By using overlapping the steps of the successive instructions, such as fetching one instruction while performing the arighmetic operations for a previouss instruction. It need to able to determine the sequence of instructions to be executed well aheda of time in order to keep the pipeline full of instructions.

Processors employ sophisticated branch prediction logic to try to guess whether or not each jump instruction will be followed. modern microprocessor designs try to achieve success tates on the order of 90%. The control flow(jmp instruction) are hard to predict, while cmov instructions are considered as a single instruction.

on GCC compiler, using -Og flag will still to generating control flow instruction, while using -O2 flag will generating control data assembly-code by considering optimize.

The CMOV instructions have same suffix as SET instruction. The source and destination values can be 16, 32, or 64 bits long. single-byte conditional moves are not supported. The operand length can be inferred from the name of the destination register, so they do not have length suffix like MOV instructions.

Not all conditional expressions can be compiled using conditional moves, since conditional moves will execute both two conditions. If one of those two expressions could possibly generate an error or a side effect, this could lead to invalid behavior.

Code example:

1
2
3

long cread(long *xp) {
    return (xp ? *xp : 0);
}

; long cread(long *xp)
; Invalid implementation of function cread
; *xp in %rdi
cread:
    movq (%rdi), %rax
    movl $0, %edx
    testq %rdi, %rdi
    cmove %rdx, %rax
    ret

The upper assembly-code may cause null pointer reference.

Loops

Almost all assembly-code language are not support instructions like while or for, but we can implement loop by use goto or jmp instruction.

Do-While Loops
The general form of a do-while statement is as follows:

1
2
3

do
    body-statement
    while (test-expr)

Observe that body-statement is executed at least once. This general form can be translated into conditionals and goto statements as follows:

loop:
    body-statement;
    t = test-expr;
    if (t) goto loop;

(a) C code

long fact_do(long n)
{
    long result = 1;
    do {
        result *= n;
        n = n - 1;
    } while (n > 1);
    return result;
}

(b) goto version

long fact_do_goto(long n)
{
    long result = 1;
loop:
    result *= n;
    n = n - 1;
    if (n > 1) goto loop;
}

; long fact_do(long n)
; n in %rdi
fact_do:
    movq $1, %eax
.L2:
    imulq %rdi, %rax
    subq $1, %rdi
    cmpq $1, %rdi
    jg .L2
    rep
    ret

Reverse engineering assembly code have much challenging for sove complex programs. A key to reverse the loop is to find a mapping values and registers.

While Loops

The general form of a while statement is as follows:

1 2	`while (test-expr) body-statement`

There are a number of ways to translate a while loop tinto machine code, two of which are used in GCC.

In first method, which we refer to as jump to middle, performs the initial test by performing an unconditional jump to the test at the end of the loop. The general template is as follows:

    goto test;
loop:
    body-statement
test:
    t = test-expr;
    if (t) goto loop;

Code example:
(a) C code

long  fact_while(long n)
{
    long result = 1;
    while (n > 1) {
        result *= n;
        n = n - 1;
    }
    return result;
}

(b) goto version

long fact_while_jm_goto(long n)
{
    long result = 1;
    goto test;
loop:
    result *= n;
    n = n -1;
test:
    if (n > 1) goto loop;
}

; long fact_while (long n)
; n in %rdi
fact_while:
    movq $1, %eax
    jmp .L5
.L6:
    imulq %rdi, %rax
    subq $1, %rdi
.L5:
    cmpq $1, %rdi
    jg .L6
    rep ret

The second translation method, which we refer to as guarded do, first transforms the code into a do-while loop by using a conditional branch to skip over the loop if the initial test fails. GCC follows this strategy when compiling with higher levels of optimization.

CSAPP

#6.S081

CSAPP Chapater 3

https://acmicpc.top/2025/05/02/Computer System - A Programmer Perspective/

作者

江欣婷

发布于

2025年5月2日

许可协议

MIT 6.S081 lab4：traps | 陷入下一篇