programming_under_the_hood/InstructionsAp.tex at master · johnnyb/programming_under_the_hood · GitHub

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
\chapter{Common x86 Instructions}
\label{instructionsappendix}

\section{Reading the Tables}

The tables of instructions presented in this appendix include:

\begin{itemize}\item The instruction code
\item The operands used
\item The flags used
\item A brief description of what the instruction does
\end{itemize}

In the operands section, it will list the type of operands it takes.  If
it takes more than one operand, each operand will be separated by a comma.
Each operand will have a list of codes which tell whether the operand can
be an immediate-mode value (I), a register (R), or a memory address (M).
For example, the \icode{movl} instruction is listed as
\icode{I/R/M, R/M}.  This means that the first operand can
be any kind of value, while the second operand must be a register or memory
location.  Note, however, that in x86 assembly language you cannot have more
than one operand be a memory location.

In the flags\index{flags} section, it lists the flags
in the {\eflagsRegIdx} register affected by the instruction.
The following flags are mentioned:

\begin{description}
\item[O] Overflow flag\index{overflow flag}.  This is set to true if the destination operand was not large enough to hold the result of the instruction.
\item[S] Sign flag\index{sign flag}.  This is set to the sign of the last result.
\item[Z] Zero flag\index{zero flag}.  This flag is set to true if the result of the instruction is zero.
\item[A] Auxiliary carry flag\index{auxiliary carry flag}.  This flag is set for carries and borrows between the third and fourth bit.  It is not often used.
\item[P] Parity flag\index{parity flag}.  This flag is set to true if the low byte of the last result had an even number of 1 bits.
\item[C] Carry flag\index{carry flag}.  Used in arithmetic to say whether or not the result should be carried over to an additional byte.  If the carry flag is set, that usually means
that the destination register could not hold the full result.  It is up to
the programmer to decide on what action to take (i.e. - propogate the result
to another byte, signal an error, or ignore it entirely).
\end{description}

Other flags exist, but they are much less important.

\section{Data Transfer Instructions}
\label{dtins}

These instructions perform little, if any computation.  Instead they are mostly used for moving data from one place to another.

\begin{table}[h]
\begin{tabular}{l | l | l | l}

Instruction & Operands & Affected Flags \\
movl\index{movl} & I/R/M, I/R/M & O/S/Z/A/C & This copies a word of data from one location to another.  \icode{movl {\eaxBare}, {\ebxBare}} copies the contents of {\eaxReg} to {\ebxReg} \\
movb\index{movb} & I/R/M, I/R/M & O/S/Z/A/C & Same as \icode{movl}, but operates on individual bytes. \\
leal\index{leal} & M, I/R/M & O/S/Z/A/C & This takes a memory location given in the standard format, and, instead of loading the contents of the memory location, loads the computed address.  For example, \icode{leal 5({\ebpBare},{\ecxBare},1), {\eaxBare}} loads the address computed by \icode{5 + {\ebpBare} + 1*{\ecxBare}} and stores that in {\eaxReg} \\
popl\index{popl} & R/M & O/S/Z/A/C & Pops the top of the stack into the given location.  This is equivalent to performing \icode{movl ({\espBare}), R/M} followed by \icode{addl \$4, {\espBare}}.  \icode{popfl} is a variant which pops the top of the stack into the {\eflagsReg}\index{\%eflags} register. \\
pushl\index{pushl} & I/R/M & O/S/Z/A/C & Pushes the given value onto the stack.  This is the equivalent to performing \icode{subl \$4, {\espBare}} followed by \icode{movl I/R/M, ({\espBare})}. \icode{pushfl} is a variant which pushes the current contents of the {\eflagsRegIdx} register onto the top of the stack. \\
xchgl\index{xchgl} & R/M, R/M & O/S/Z/A/C Exchange the values of the given operands. \\
\end{tabular}
\caption{Data Transfer Instructions}
\end{table}

\section{Integer Instructions}
\label{intins}

These are basic calculating instructions that operate on signed or unsigned
integers.

\begin{table}[h]
\begin{tabular}{l | l | l | l}

Instruction & Operands & Affected Flags \\
adcl\index{adcl} & I/R/M, R/M & O/S/Z/A/P/C & Add with carry.  Adds the carry bit and the first operand to the second, and, if there is an overflow, sets overflow and carry to true.  This is usually used for operations larger than a machine word.  The addition on the least-significant word would take place using \icode{addl}, while additions to the other words would used the \icode{adcl} instruction to take the carry from the previous add into account.  For the usual case, this is not used, and \icode{addl} is used instead. \\
addl\index{addl} & I/R/M, R/M & O/S/Z/A/P/C & Addition.  Adds the first operand to the second, storing the result in the second.  If the result is larger than the destination register, the overflow and carry bits are set to true.  This instruction operates on both signed and unsigned integers. \\
cdq\index{cdq} &  & O/S/Z/A/P/C & Converts the \index{{\eaxReg}}\index{{\edxReg}}{\eaxReg} word into the double-word consisting of {\edxReg}:{\eaxReg} with sign extension.  The \icode{q} signifies that it is a \emph{quad-word}.  It's actually a double-word, but it's called a quad-word because of the terminology used in the 16-bit days.  This is usually used before issuing an \icode{idivl} instruction. \\
cmpl\index{cmpl} & I/R/M, R/M & O/S/Z/A/P/C & Compares two integers.  It does this by subtracting the first operand from the second.  It discards the results, but sets the flags accordingly.  Usually used before a conditional jump. \\
decl\index{decl} & R/M & O/S/Z/A/P & Decrements the register or memory location.  Use \icode{decb} to decrement a byte instead of a word. \\
divl\index{divl} & R/M & O/S/Z/A/P & Performs unsigned division.  Divides the contents of the double-word contained in the combined {\edxReg}:{\eaxRegIdx} registers by the value in the register or memory location specified.  The {\eaxReg} register contains the resulting quotient, and the {\edxReg} register contains the resulting remainder.  If the
quotient is too large to fit in {\eaxReg}, it triggers a type 0 interrupt. \\
idivl\index{idivl} & R/M & O/S/Z/A/P & Performs signed division.  Operates just like \icode{divl} above. \\
imull\index{imull} & R/M/I, R & O/S/Z/A/P/C & Performs signed multiplication and stores the result in the second operand.  If the second operand is left out, it is assumed to be {\eaxReg}, and the full result is stored in the double-word {\edxRegIdx}:{\eaxRegIdx}. \\
incl\index{incl} & R/M & O/S/Z/A/P & Increments the given register or memory location.  Use \icode{incb} to increment a byte instead of a word. \\
mull\index{mull} & R/M/I, R & O/S/Z/A/P/C & Perform unsigned multiplication.  Same rules as apply to \icode{imull}. \\
negl\index{negl} & R/M & O/S/Z/A/P/C & Negates (gives the two's complement\index{two's complement} inversion of) the given register or memory location. \\
sbbl\index{sbbl} & I/R/M, R/M & O/S/Z/A/P/C & Subtract with borrowing.  This is used in the same way that \icode{adc} is, except for subtraction.  Normally only \icode{subl} is used. \\
subl\index{subl} & I/R/M, R/M & O/S/Z/A/P/C & Subtract the two operands.  This subtracts the first operand from the second, and stores the result in the second operand.  This instruction can be used on both signed and unsigned numbers. \\
\end{tabular}
\caption{Integer Instructions}
\end{table}

\section{Logic Instructions}
\label{logicins}

These instructions operate on memory as bits instead of words.

\begin{table}[h]
\begin{tabular}{l | l | l | l}

Instruction & Operands & Affected Flags \\
andl\index{andl} & I/R/M, R/M & O/S/Z/P/C & Performs a logical and of the contents of the two operands, and stores the result in the second operand.  Sets the overflow and carry flags to false. \\
notl\index{notl} & R/M &  & Performs a logical not on each bit in the operand.  Also known as a one's complement\index{one's complement}. \\
orl\index{orl} & I/R/M, R/M & O/S/Z/A/P/C & Performs a logical or between the two operands, and stores the result in the second operand.  Sets the overflow and carry flags to false. \\
rcll\index{rcll} & I/{\clReg}, R/M & O/C & Rotates the given location's bits to the left the number of times in the first operand, which is either an immediate-mode value or the register {\clReg}.  The carry flag is included in the rotation, making it use 33 bits instead of 32.  Also sets the overflow flag. \\
rcrl\index{rcrl} & I/{\clReg}, R/M & O/C & Same as above, but rotates right. \\
roll\index{roll} & I/{\clReg}, R/M & O/C & Rotate bits to the left.  It sets the overflow and carry flags, but does not count the carry flag as part of the rotation.  The number of bits to roll is either specified in immediate mode or is contained in the {\clReg} register. \\
rorl\index{rorl} & I/{\clReg}, R/M & O/C & Same as above, but rotates right. \\
sall\index{sall} & I/{\clReg}, R/M & C & Arithmetic shift left.  The sign bit is shifted out to the carry flag, and a zero bit is placed in the least significant bit.  Other bits are simply shifted to the left.  This is the same as the regular shift left.  The number of bits to shift is either specified in immediate mode or is contained in the {\clReg} register. \\
sarl\index{sarl} & I/{\clReg}, R/M & C & Arithmetic shift right.  The least significant bit is shifted out to the carry flag.  The sign bit is shifted in, and kept as the sign bit.  Other bits are simply shifted to the right.  The number of bits to shift is either specified in immediate mode or is contained in the {\clReg} register. \\
shll\index{shll} & I/{\clReg}, R/M & C & Logical shift left.  This shifts all bits to the left (sign bit is not treated specially).  The leftmost bit is pushed to the carry flag.  The number of bits to shift is either specified in immediate mode or is contained in the {\clReg} register. \\
shrl\index{shrl} & I/{\clReg}, R/M & C & Logical shift right.  This shifts all bits in the register to the right (sign bit is not treated specially).  The rightmost bit is pushed to the carry flag.  The number of bits to shift is either specified in immediate mode or is contained in the {\clReg} register. \\
testl\index{testl} & I/R/M, R/M & O/S/Z/A/P/C & Does a logical and of both operands and discards the results, but sets the flags accordingly. \\
xorl\index{xorl} & I/R/M, R/M & O/S/Z/A/P/C & Does an exclusive or on the two operands, and stores the result in the second operand.  Sets the overflow and carry flags to false. \\
\end{tabular}
\caption{Logic Instructions}
\end{table}

\section{Flow Control Instructions}
\label{flowins}

These instructions may alter the flow of the program.

\begin{table}[h]
\begin{tabular}{l | l | l | p{2in}}

Instruction & Operands & Affected Flags \\
call\index{call} & destination address & O/S/Z/A/C & This pushes what would be the next value for {\eipReg} onto the stack, and jumps to the destination address.  Used for function calls.  Alternatively, the destination address can be an asterisk followed by a register for an indirect function call.  For example, \icode{call *{\eaxBare}} will call the function at the address in {\eaxReg}. \\
int\index{int} & I & O/S/Z/A/C & Causes an interrupt of the given number.  This is usually used for system calls and other kernel interfaces. \\
Jcc\index{Jcc} & destination address & O/S/Z/A/C & Conditional branch.  \icode{cc} is the \emph{condition code}.  Jumps to the given address if the condition code is true (set from the previous instruction, probably a comparison).  Otherwise, goes to the next instruction.  The condition codes are:
\begin{itemize}
\item \icode{[n]a[e]} - above (unsigned greater than).  An \icode{n} can be added for "not" and an \icode{e} can be added for "or equal to"
\item \icode{[n]b[e]} - below (unsigned less than)
\item \icode{[n]e} - equal to
\item \icode{[n]z} - zero
\item \icode{[n]g[e]} - greater than (signed comparison)
\item \icode{[n]l[e]} - less than (signed comparison)
\item \icode{[n]c} - carry flag set
\item \icode{[n]o} - overflow flag set
\item \icode{[p]p} - parity flag set
\item \icode{[n]s} - sign flag set
\item \icode{ecxz} - {\ecxReg}\index{\%ecx} is zero
\end{itemize} \\
jmp\index{jmp} & destination address & O/S/Z/A/C & An unconditional jump.  This simply sets {\eipReg} to the destination address.  Alternatively, the destination address can be an asterisk followed by a register for an indirect jump.  For example, \icode{jmp *{\eaxBare}} will jump to the address in {\eaxReg}. \\
ret\index{ret} &  & O/S/Z/A/C & Pops a value off of the stack and then sets {\eipReg} to that value.  Used to return from function calls. \\
\end{tabular}
\caption{Flow Control Instructions}
\end{table}

\section{Assembler Directives}
\label{dirins}

These are instructions to the assembler and linker, instead of instructions
to the processor.  These are used to help the assembler put your code together
properly, and make it easier to use.

\begin{table}[h]
\begin{tabular}{l | l | l}

Directive & Operands \\
.ascii\index{.ascii} & QUOTED STRING & Takes the given quoted string and converts it into byte data. \\
.byte\index{.byte} & VALUES & Takes a comma-separated list of values and inserts them right there in the program as data. \\
.endr\index{.endr} &  & Ends a repeating section defined with \icode{.rept}. \\
.equ\index{.equ} & LABEL, VALUE & Sets the given label equivalent to the given value.  The value can be a number, a character, or an constant expression that evaluates to a a number or character.  From that point on, use of the label will be substituted for the given value. \\
.globl\index{.globl} & LABEL & Sets the given label as global, meaning that it can be used from separately-compiled object files. \\
.include\index{.include} & FILE & Includes the given file just as if it were typed in right there. \\
.lcomm\index{.lcomm} & SYMBOL, SIZE & This is used in the \icode{.bss} section to specify storage that should be allocated when the program is executed.  Defines the symbol with the address where the storage will be located, and makes sure that it is the given number of bytes long. \\
.long\index{.long} & VALUES & Takes a sequence of numbers separated by commas, and inserts those numbers as 4-byte words right where they are in the program. \\
.rept\index{.rept} & COUNT & Repeats everything between this directive and the \icode{.endr} directives the number of times specified. \\
.section\index{.section} & SECTION NAME & Switches the section that is being worked on.  Common sections include \icode{.text} (for code), \icode{.data} (for data embedded in the program itself), and \icode{.bss} (for uninitialized global data). \\
.type\index{.type} & SYMBOL, @function & Tells the linker that the given symbol is a function. \\
\end{tabular}
\caption{Assembler Directives}
\end{table}

\section{Differences in Other Syntaxes and Terminology}

The syntax for assembly language used in this book is known at the
\emph{AT\&T syntax}\index{AT\&T syntax}.  It is the one supported by the
GNU tool chain that comes standard with every Linux distribution.  However,
the official syntax for x86 assembly language (known as the Intel\textregistered syntax\index{Intel syntax})
is different.  It is
the same assembly language for the same platform, but it looks different.
Some of the differences include:

\begin{itemize}\item In Intel syntax, the operands of instructions are often reversed.  The destination operand is listed before the source operand.
\item In Intel syntax, registers are not prefixed with the percent sign (\icode{\%}).
\item In Intel syntax, a dollar-sign (\icode{\$}) is not required to do immediate-mode addressing.  Instead, non-immediate addressing is accomplished by surrounding the address with brackets (\icode{[]}).
\item In Intel syntax, the instruction name does not include the size of data being moved.  If that is ambiguous, it is explicitly stated as \icode{BYTE}, \icode{WORD}, or \icode{DWORD} immediately after the instruction name.
\item The way that memory addresses are represented in Intel assembly language is much different (shown below).
\item Because the x86 processor line originally started out as a 16-bit processor, most literature about x86 processors refer to words as 16-bit values, and call 32-bit values double words.  However, we use the term "word" to refer to the standard register size on a processor, which is 32 bits on an x86 processor.  The syntax also keeps this naming convention - \icode{DWORD} stands for "double word" in Intel syntax and is used for standard-sized registers, which we would call simply a "word".
\item Intel assembly language has the ability to address memory as a segment/offset pair.  We do not mention this because Linux does not support segmented memory, and is therefore irrelevant to normal Linux programming.
\end{itemize}

Other differences exist, but they are small in comparison.  To show some
of the differences, consider the following instruction:

\begin{simpletyping}
\begin{lstlisting}
movl {\eaxBare}, 8({\ebxBare},{\ediBare},4)
\end{lstlisting}
\end{simpletyping}

In Intel syntax, this would be written as:

\begin{simpletyping}
\begin{lstlisting}
mov  [8 + {\ebxBare} + 1 * edi], eax
\end{lstlisting}
\end{simpletyping}

The memory reference is a bit easier to read than its AT\&T counterpart
because it spells out exactly how the address
will be computed.  However, but the order of operands in Intel syntax can be
confusing.

\section{Where to Go for More Information}

Intel has a set of comprehensive guides to their processors.  These
are available at http://www.intel.com/design/pentium/manuals/  Note that
all of these use the Intel syntax, not the AT\&T syntax.  The most important
ones are their \documentname{IA-32 Intel Architecture Software Developer's Manual}
in its three volumes::

\begin{itemize}\item Volume 1: System Programming Guide (http://developer.intel.com/design/pentium4/manuals/245470.htm)
\item Volume 2: Instruction Set Reference (http://developer.intel.com/design/pentium4/manuals/245471.htm)
\item Volume 3: System Programming Guide (http://developer.intel.com/design/pentium4/manuals/245472.htm)
\end{itemize}

In addition, you can find a lot of information in the manual for the GNU assembler, available online at http://www.gnu.org/software/binutils/manual/gas-2.9.1/as.html.  Similarly, the manual for the GNU linker is available online at http://www.gnu.org/software/binutils/manual/ld-2.9.1/ld.html.