X86 assembly language
x86 assembly language is the assembly language for the x86 class of processors, which includes Intel's Pentium series and AMD's Athlon series. Like all assembly languages, it uses short mnemonics to represent the fundamental operations that the CPU in a computer can perform. Compilers often produce assembly code as an intermediate step when translating a high-level program into machine code. Regarded as a programming language, assembly is machine-specific and fairly low-level. It is therefore mainly used for detailed or time-critical applications such as bootloaders, operating system kernels, and device drivers, as well as for real-time or small embedded systems.
The Intel 8088 and 8086 were the first CPUs to have the instruction set that is now commonly referred to as x86. These 16-bit CPUs were an evolutionary step up from the previous generation of 8-bit CPUs such as the 8080, and inherited many characteristics and instructions, which were extended for the 16-bit era. Both CPUs had a 20-bit address bus and 16-bit internal registers. The 8086 had a 16-bit data bus, while the 8088, intended as a low-cost option targeted at the embedded market, had an 8-bit data bus. The term x86 assembly language also covers the many CPUs that followed from Intel, such as the 80188, 80186, 80286, 80386, 80486 and Pentium, and non-Intel CPUs from AMD and Cyrix. The term x86 refers to all the CPUs that can run the same original assembly language.
The x86 instruction set is really a series of extensions of instruction sets that began with the Intel 8008 microprocessor. Nearly full binary backward compatibility is present from the Intel 8086 chip through to the modern Pentium 4, Intel Core, Athlon 64, Opteron, etc. processors. (There are certain unusual exceptions, such as the counted shift instructions, corrections to the original PUSHA instruction, some orphaned Intel 80286 semantics, the dropped LOADALL instruction, and the Pentium 4 giving up on precise FPU operation counts.) This is accomplished through its use of two ISAs, something which is commonly criticized.
Mnemonics & Opcodes
Each x86 assembly instruction is represented by a mnemonic, which in turn directly translates to a series of bytes which represent that instruction, called an opcode. For example, the NOP instruction translates to 0x90 and the HLT instruction translates to 0xF4. Some opcodes have no mnemonics named after them and are undocumented. However, processors in the x86 family may interpret undocumented opcodes differently, which might render a program that relies on them useless. In some cases, invalid opcodes also generate processor exceptions.
x86 assembly language has two main syntax branches: "Intel syntax", originally used for documentation of the x86 platform, and "AT&T syntax". [http://www.ibm.com/developerworks/library/l-gas-nasm.html Linux assemblers: A comparison of GAS and NASM, Ram Narayam, 2007-10-17] "Intel syntax" is dominant in the Windows world. In the Unix/Linux world, both are used, because GCC formerly supported only "AT&T syntax". Here is a summarized list of the main differences between "Intel syntax" and "AT&T syntax":
* in "AT&T syntax", the source comes before the destination, in the opposite style from "Intel syntax"
* in "AT&T syntax", the mnemonics are suffixed with a letter indicating the size of the operands (e.g. "l" for dword, "w" for word, and "b" for byte)
* in "AT&T syntax", immediate values must be prefixed with a "$", and registers must be prefixed with a "%"
* in "AT&T syntax", effective addresses use the general syntax "DISP(BASE,INDEX,SCALE)", whereas in "Intel syntax", effective addresses are written as expressions in square brackets; additionally, size keywords like 'byte', 'word' or 'dword' have to be used when the operand size cannot be inferred. For example, the following are equivalent:
** in "AT&T syntax": "movl mem_location(%ebx,%ecx,4), %eax"
** in "Intel syntax": "mov eax, dword [ebx + ecx*4 + mem_location]"
Most x86 assemblers use "Intel syntax", including MASM, TASM, NASM, FASM and YASM. GAS has supported both syntaxes since version 2.10 via the ".intel_syntax" directive. [http://webster.cs.ucr.edu/AsmTools/WhichAsm.html Which Assembler is the Best?, Randall Hyde] [http://sourceware.org/cgi-bin/cvsweb.cgi/src/gas/NEWS?rev=1.93&content-type=text/x-cvsweb-markup&cvsroot=src GNU Assembler News, v2.1 supports Intel syntax, 2008-04-04]
x86 processors have a collection of registers available to be used as stores for binary data. Collectively the data and address registers are called the general registers.
In addition to the general registers, there are:
* segment registers (CS, DS, ES, FS, GS, SS)
* other registers (IP instruction pointer, FLAGS)
* extra extension registers (MMX, 3DNow!, SSE, etc.).
The IP register points to where in the program the processor is currently executing its code. The IP register cannot be accessed by the programmer directly.
The x86 registers can be read and written using the MOV instruction. For example, the sequence "mov ax, 1234h" followed by "mov bx, ax" copies the value 1234h into register ax and then copies the value of the ax register into the bx register. (Intel syntax)
In real and virtual 8086 mode, the x86 architecture uses a process known as segmentation to address memory, rather than the linear method used in other architectures. Segmentation involves decomposing a linear address into two parts: a "segment" and an "offset". The segment points to the beginning of a 64K group of addresses, and the offset is the distance from the base address of that segment. In real mode, to translate back into a linear address, the segment address is shifted four bits left (i.e. multiplied by 16) and then added to the offset.
Two registers are used for a memory address: one to hold the segment, and one to hold the offset.
In real mode only, for example, if DS contains the hexadecimal number 0xDEAD and DX contains the number 0xCAFE, they would together point to the memory address 0xDEAD * 0x10 + 0xCAFE = 0xEB5CE.
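The real-mode translation described above can be sketched as a small Python model. This is only an illustration of the arithmetic, not assembly code; the function name `real_mode_linear` is hypothetical, and the sketch does not model the 20-bit address wraparound of the 8086 bus.

```python
def real_mode_linear(segment: int, offset: int) -> int:
    """Linear address for a 16-bit segment:offset pair in real mode."""
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must be 16-bit values")
    # Shift the segment left by four bits (multiply by 16), then add the offset.
    return (segment << 4) + offset


# The DS:DX example from the text: 0xDEAD:0xCAFE
print(hex(real_mode_linear(0xDEAD, 0xCAFE)))  # 0xeb5ce
```

Because the segment is only shifted by four bits, many different segment:offset pairs map to the same linear address.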
In protected mode, the segment selector can be broken down into three parts: a 13-bit index, a TI (Table Indicator) bit that indicates whether the entry is in the GDT or LDT (the table entry, once loaded, supplies the segment base), and a 2-bit RPL (Requested Privilege Level). See x86 memory segmentation.
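As a rough illustration of the selector layout just described, the following Python sketch splits a 16-bit selector into its three fields. The function name `decode_selector` and the dictionary keys are hypothetical; only the bit positions (RPL in bits 0-1, TI in bit 2, index in bits 3-15) come from the architecture.

```python
def decode_selector(selector: int) -> dict:
    """Split a 16-bit protected-mode segment selector into its fields."""
    if not 0 <= selector <= 0xFFFF:
        raise ValueError("selector must be a 16-bit value")
    return {
        "index": selector >> 3,                         # 13-bit descriptor index
        "table": "LDT" if selector & 0b100 else "GDT",  # TI bit
        "rpl": selector & 0b11,                         # requested privilege level
    }


print(decode_selector(0x0023))  # {'index': 4, 'table': 'GDT', 'rpl': 3}
```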
When referring to an address with a segment and an offset, the notation "segment":"offset" is used. In the above example (for real mode only), the linear address 0xEB5CE can be written as 0xDEAD:0xCAFE, or, if one has a segment and offset register pair, as DS:DX.
There are some special combinations of segment registers and general registers that point to important addresses:
*CS:IP points to the address where the processor will fetch the next byte of code.
*SS:SP points to the location of the last item pushed onto the stack.
*DS:SI is often used to point to data that is about to be copied to ES:DI.
The processor supports numerous modes of operation for x86 code, in which some instructions are available and some are not. A 16-bit subset of instructions is available in "real mode" (available in all x86 processors), "16-bit protected mode" (available since the 80286), or "v86 mode" (available since the Intel 80386). In "32-bit protected mode" (available in processors starting with the Intel 80386) or "legacy mode" (available when 64-bit extensions are enabled), 32-bit instructions (plus SIMD instructions) are available. In "long mode" (available since the AMD Opteron processor) 64-bit instructions are available. The instruction set is based on similar ideas in each mode, but involves different ways of accessing memory and thus employs different programming strategies.
The modes in which x86 code can be executed are:
* Real mode (16-bit)
* Protected mode (16-bit and 32-bit)
* Long mode (64-bit)
* Virtual 8086 mode (16-bit)
* System Management Mode (16-bit)
By default, the processor starts in real mode; an operating system kernel, or other program, must explicitly switch to protected mode if it is to run in that mode, and, on x86-64 processors, must then switch to long mode if it is to run in that mode. Switching modes can be accomplished by modifying certain bits of the processor's control registers.
In general, the features of the modern x86 instruction set are:
* A compact encoding
** Variable length and alignment independent (encoded as little endian, as is all data in the x86 architecture)
** Mainly one-address and two-address instructions, that is to say, the first operand is also the destination.
** Memory operands as both source and destination are supported (frequently used to read/write stack elements addressed using small immediate offsets).
** Both general and implicit register usage; although all seven (counting ebp) general registers can be freely used as accumulators or for addressing, most of them are also "implicitly" used by certain (more or less) special instructions; affected registers must therefore be temporarily preserved (normally stacked), if active during such instruction sequences.
* Produces conditional flags implicitly through most integer ALU instructions.
* Supports various addressing modes including immediate, offset, and scaled index, but not PC-relative (except jumps) until x86-64.
* Includes floating point to a stack of registers.
* Contains special support for atomic instructions (XCHG, CMPXCHG(8B), XADD, and integer instructions which combine with the LOCK prefix)
* Includes SIMD instructions (instructions which perform parallel simultaneous single instructions on many operands encoded in adjacent cells of wider registers).
The x86 architecture has hardware support for an execution stack mechanism. Instructions such as push, call, pop, ret, etc. are used with the properly set up stack to pass parameters, to allocate space for local data, and to save and restore call-return points. The ret "size" instruction is very useful for implementing space-efficient (and thereby fast) calling conventions where the callee is responsible for reclaiming stack space occupied by parameters.
When setting up a stack frame to hold local data of a recursive procedure there are several choices; the high-level enter instruction takes a "procedure-nesting-depth" argument as well as a "local size" argument, and may be faster than more explicit manipulations of the registers (such as push bp, mov bp,sp, sub sp,"size"). Which is faster depends on the particular x86 implementation (i.e. chip), as well as the calling convention and language compiled; the differences are not great, however.
The full range of addressing modes (including "immediate" and "base+offset"), even for instructions such as push and pop, makes direct usage of the stack for integer, floating point, and address quantities simple. This also means that ABI specifications and mechanisms are fairly simple compared to some RISC architectures, which must be more explicit about call stack details.
Integer ALU instructions
x86 assembly has the standard mathematical operations add, sub, mul and idiv; the logical operators and, or, xor and not, plus the arithmetic negation neg; arithmetic and logical bit shifts, sal/sar and shl/shr; rotates with and without carry, rcl/rcr and rol/ror; and a complement of BCD arithmetic instructions, aaa, aad, daa and others.
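The difference between the logical shift (shr), the arithmetic shift (sar) and the rotate (rol) can be illustrated with a small Python model on 8-bit values. This is a sketch under simplifying assumptions: the function names mirror the mnemonics but are otherwise hypothetical, and the model ignores the flags and the way the processor masks the shift count.

```python
def shr(value: int, count: int, bits: int = 8) -> int:
    """Logical right shift: vacated high bits are filled with zeros."""
    mask = (1 << bits) - 1
    return (value & mask) >> count


def sar(value: int, count: int, bits: int = 8) -> int:
    """Arithmetic right shift: vacated high bits copy the sign bit."""
    mask = (1 << bits) - 1
    value &= mask
    if value & (1 << (bits - 1)):
        value -= 1 << bits  # reinterpret as a negative two's-complement number
    return (value >> count) & mask


def rol(value: int, count: int, bits: int = 8) -> int:
    """Rotate left without carry: bits leaving the top re-enter at the bottom."""
    mask = (1 << bits) - 1
    value &= mask
    count %= bits
    return ((value << count) | (value >> (bits - count))) & mask


print(hex(shr(0x90, 1)), hex(sar(0x90, 1)), hex(rol(0x90, 1)))  # 0x48 0xc8 0x21
```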
Floating point instructions
x86 assembly language includes instructions for a stack-based floating point unit. They include addition, subtraction, negation, multiplication, division, remainder, square roots, integer truncation, fraction truncation, and scale by power of two. The operations also include conversion instructions which can load or store a value from memory in any of the following formats: Binary coded decimal, 32-bit integer, 64-bit integer, 32-bit floating point, 64-bit floating point or 80-bit floating point (upon loading, the value is converted to the currently used floating point mode). The x86 also includes a number of transcendental functions including sine, cosine, tangent, arctangent, exponentiation with the base 2 and logarithms to bases 2, 10, or e.
The stack register to stack register format of the instructions is usually F(OP) st, st(*) or F(OP) st(*), st, where st is equivalent to st(0), and st(*) is one of the 8 stack registers (st(0), st(1), ..., st(7)). As with the integer instructions, the first operand is both the first source operand and the destination operand. FSUBR and FDIVR should be singled out as first swapping the source operands before performing the subtraction or division. The addition, subtraction, multiplication, division, store and comparison instructions include instruction modes that will pop the top of the stack after their operation is complete. So for example FADDP st(1), st performs the calculation st(1) = st(1) + st(0), then removes st(0) from the top of the stack, thus making what was the result in st(1) the top of the stack in st(0).
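The FADDP example can be modelled with a Python list whose element 0 plays the role of st(0). This is a minimal sketch of the stack behaviour only; the helper name `faddp` is hypothetical, and 80-bit precision, tags and exceptions are not modelled.

```python
def faddp(stack: list, i: int = 1) -> list:
    """Model of FADDP st(i), st: st(i) = st(i) + st(0), then pop st(0).

    stack[0] stands for st(0), stack[1] for st(1), and so on.
    """
    if not 1 <= i < len(stack):
        raise IndexError("st(i) must name an occupied register other than st(0)")
    stack[i] = stack[i] + stack[0]
    stack.pop(0)  # the old st(1) becomes the new st(0)
    return stack


regs = [2.5, 4.0, 10.0]   # st(0)=2.5, st(1)=4.0, st(2)=10.0
faddp(regs)               # FADDP st(1), st
print(regs)               # [6.5, 10.0]
```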
Modern x86 CPUs contain SIMD instructions, which largely perform the same operation in parallel on many values encoded in a wide SIMD register. Various instruction technologies support different operations on different register sets, but taken as a complete whole (from MMX to SSE4.2) they include general computations on integer or floating point arithmetic (addition, subtraction, multiplication, shift, minimization, maximization, comparison, division or square root). So for example, PADDW MM0, MM1 performs 4 parallel 16-bit (indicated by the W) integer adds (indicated by the PADD) of mm0 values to mm1 and stores the result in mm0.
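What PADDW computes can be sketched in Python by treating a 64-bit integer as four independent 16-bit lanes. The helper names `split_words`, `join_words` and `paddw` are hypothetical; the model shows only the wraparound (non-saturating) addition in each lane.

```python
def split_words(value64: int) -> list:
    """Return the four 16-bit lanes of a 64-bit value, lowest lane first."""
    return [(value64 >> (16 * lane)) & 0xFFFF for lane in range(4)]


def join_words(words: list) -> int:
    """Pack four 16-bit lanes (lowest first) back into a 64-bit value."""
    result = 0
    for lane, word in enumerate(words):
        result |= (word & 0xFFFF) << (16 * lane)
    return result


def paddw(dst: int, src: int) -> int:
    """Model of PADDW dst, src: lane-wise 16-bit add, carries do not cross lanes."""
    sums = [(a + b) & 0xFFFF for a, b in zip(split_words(dst), split_words(src))]
    return join_words(sums)


mm0 = 0x0001_FFFF_1234_8000
mm1 = 0x0001_0001_1111_8000
print(f"{paddw(mm0, mm1):016x}")  # 0002000023450000
```

Note how the lanes holding 0xFFFF + 0x0001 and 0x8000 + 0x8000 wrap to zero without disturbing their neighbours.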
SSE also includes a floating point mode in which only the very first value of the registers is actually modified (expanded in SSE2). Some other unusual instructions have been added, including a sum of absolute differences (used for motion estimation in video compression, such as is done in MPEG) and a 16-bit multiply accumulation instruction (useful for software-based alpha-blending and digital filtering). SSE (since SSE3) and 3DNow! extensions include addition and subtraction instructions for treating paired floating point values like complex numbers.
These instruction sets also include numerous fixed sub-word instructions for shuffling, inserting and extracting values within the registers. In addition there are instructions for moving data between the integer registers and XMM (used in SSE)/FPU (used in MMX) registers.
Data manipulation instructions
The x86 processor also includes complex addressing modes for addressing memory with an immediate offset, a register, a register with an offset, a scaled register with or without an offset, and a register with an optional offset and another scaled register. So for example, one can encode mov eax, [Table + ebx + esi*4] as a single instruction which loads 32 bits of data from the address computed as (Table + ebx + esi * 4) offset from the DS selector, and stores it to the eax register. In general the x86 processor can load and use memory matched to the size of any register it is operating on. (The SIMD instructions also include half-load instructions.)
The x86 instruction set includes string load, store and move instructions (LODS, STOS, and MOVS) which perform each operation on a specified size (B for 8-bit byte, W for 16-bit word, D for 32-bit double word) and then increment or decrement (depending on DF, the direction flag) the implicit address register (SI for LODS, DI for STOS, and both for MOVS). For the load and store, the implicit target/source register is AL, AX or EAX (depending on size). The implicit segment used is DS for LODS, ES for STOS, and both for MOVS (DS for the source, ES for the destination). In modern x86 processors, these complex instructions don't offer any performance advantage over more simply implemented separate load/store and address increment instructions.
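The address-register updates performed by a byte-sized MOVS can be sketched as follows. This is a simplified Python model, assuming a single flat `bytearray` in place of the separate DS and ES segments; the function name `movsb` and its return convention are hypothetical.

```python
def movsb(mem: bytearray, si: int, di: int, df: bool = False):
    """Model of MOVSB: copy one byte from [SI] to [DI], then step both registers.

    With DF clear the registers are incremented; with DF set they are
    decremented. Both wrap around as 16-bit values.
    """
    mem[di] = mem[si]
    step = -1 if df else 1
    return (si + step) & 0xFFFF, (di + step) & 0xFFFF


mem = bytearray(b"HELLO\x00\x00\x00\x00\x00")
si, di = 0, 5
for _ in range(5):          # five consecutive MOVSB executions, DF clear
    si, di = movsb(mem, si, di)
print(mem, si, di)          # bytearray(b'HELLOHELLO') 5 10
```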
The stack is implemented with an implicitly decrementing (push) and incrementing (pop) stack pointer. In 16-bit mode, this implicit stack pointer is addressed as SS:[SP], in 32-bit mode it's SS:[ESP], and in 64-bit mode it's [RSP]. The stack pointer actually points to the last value that was stored, under the assumption that its size will match the operating mode of the processor (i.e., 16, 32, or 64 bits) to match the default width of the PUSH/POP/CALL/RET instructions. Also included are the instructions ENTER and LEAVE which reserve and remove data from the top of the stack while setting up a stack frame pointer in BP/EBP/RBP. However, direct setting, or addition and subtraction to the SP/ESP/RSP register is also supported, so the ENTER/LEAVE instructions are generally unnecessary. Other instructions for manipulating the stack include PUSHF/POPF for storing and retrieving the (E)FLAGS register. The PUSHA/POPA instructions will store and retrieve the entire integer register state to and from the stack.
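The decrement-then-store behaviour of PUSH and the load-then-increment behaviour of POP can be modelled in a few lines of Python. This is a toy sketch of the 16-bit case only: the class name `Stack16` is hypothetical, the `bytearray` stands in for the stack segment, and no protection checks are modelled.

```python
class Stack16:
    """Toy model of the 16-bit stack: `mem` is the stack segment, `sp` the pointer."""

    def __init__(self, size: int = 0x100):
        self.mem = bytearray(size)
        self.sp = size  # empty stack: SP sits just past the end of the region

    def push(self, value: int) -> None:
        if self.sp < 2:
            raise OverflowError("stack region exhausted")
        self.sp -= 2  # decrement first, so SP points at the value just stored
        self.mem[self.sp:self.sp + 2] = (value & 0xFFFF).to_bytes(2, "little")

    def pop(self) -> int:
        value = int.from_bytes(self.mem[self.sp:self.sp + 2], "little")
        self.sp += 2  # increment after reading
        return value


stack = Stack16()
stack.push(0x1234)
stack.push(0xBEEF)
print(hex(stack.sp))     # 0xfc
print(hex(stack.pop()))  # 0xbeef
print(hex(stack.pop()))  # 0x1234
```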
Values for a SIMD load or store are assumed to be packed in adjacent positions in memory and are placed in the SIMD register in sequential little-endian order. Some SSE load and store instructions require 16-byte alignment to function properly. The SIMD instruction sets also include "prefetch" instructions which perform the load but do not target any register, used for cache loading. The SSE instruction sets also include non-temporal store instructions which will perform stores straight to memory without performing a cache allocate if the destination is not already cached (otherwise they behave like a regular store).
Most generic integer and floating point (but no SIMD) instructions can use one parameter as a complex address as the second source parameter. Integer instructions can also accept one memory parameter as a destination operand.
x86 assembly has an unconditional jump operation, jmp, which can take an immediate address, a register or an indirect address as a parameter. (Note that most RISC processors only support a link register or short immediate displacement for jumping.)
Also supported are several conditional jumps, including je (jump on equality), jne (jump on inequality), jg (jump on greater than, signed), jl (jump on less than, signed), ja (jump on above/greater than, unsigned), and jb (jump on below/less than, unsigned). These conditional operations are based on the state of specific bits in the (E)FLAGS register. Many arithmetic and logic operations set, clear or complement these flags depending on their result. The comparison instructions cmp (compare) and test set the flags as if they had performed a subtraction or a bitwise AND operation, respectively, without altering the values of the operands. There are also instructions such as clc (clear carry flag) and cmc (complement carry flag) which work on the flags directly. Floating point comparisons are performed via FCOM or FICOM instructions, whose results eventually have to be converted to integer flags.
Each jump operation has three different forms, depending on the size of the operand. A "short" jump uses an 8-bit signed operand, which is an offset relative to the address of the instruction that follows the jump. A "near" jump is similar to a short jump but uses a 16-bit signed operand (in real or protected mode) or a 32-bit signed operand (in 32-bit protected mode only). A "far" jump is one that uses the full segment base:offset value as an absolute address. There are also indirect and indexed forms of each of these.
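How a short jump's 8-bit operand turns into a target address can be sketched in Python. The displacement is sign-extended and added to the address of the instruction following the jump; the function name `short_jump_target` and its parameters are hypothetical, and the default instruction length of 2 assumes a one-byte opcode plus the one-byte displacement.

```python
def short_jump_target(instr_addr: int, disp8: int,
                      instr_len: int = 2, width: int = 16) -> int:
    """Target offset of a short jump located at `instr_addr`.

    `disp8` is the raw displacement byte (0..255) taken from the encoding.
    """
    if not 0 <= disp8 <= 0xFF:
        raise ValueError("disp8 must be a single byte")
    signed = disp8 - 0x100 if disp8 & 0x80 else disp8  # sign-extend the byte
    next_instr = instr_addr + instr_len
    return (next_instr + signed) & ((1 << width) - 1)  # wrap to the pointer width


print(hex(short_jump_target(0x0100, 0x05)))  # 0x107
print(hex(short_jump_target(0x0100, 0xFE)))  # 0x100 (a jump to itself)
```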
In addition to the simple jump operations, there are the call (call a subroutine) and ret (return from subroutine) instructions. Before transferring control to the subroutine, call pushes the segment offset address of the instruction following the call onto the stack; ret pops this value off the stack, and jumps to it, effectively returning the flow of control to that part of the program. In the case of a far call, the segment base is pushed before the offset; far ret pops the offset and then the segment base to return.
There are also two similar instructions: int (interrupt), which saves the current (E)FLAGS register value on the stack, then performs a far call, except that instead of an address, it uses an "interrupt vector", an index into a table of interrupt handler addresses. Typically, the interrupt handler saves all other CPU registers it uses, unless they are used to return the result of an operation to the calling program (in software-called interrupts). The matching return-from-interrupt instruction is iret, which restores the flags after returning. "Soft interrupts" of the type described above are used by some operating systems for system calls, and can also be used in debugging hard interrupt handlers. "Hard interrupts" are triggered by external hardware events, and their handlers must preserve all register values, as the state of the currently executing program is unknown. In protected mode, interrupts may be set up by the OS to trigger a task switch, which will automatically save all registers of the active task.
Using the flags register
Flags are notably used in the x86 architecture for comparisons. A comparison is made between two registers, for example, and flags are set according to their difference. A jump instruction then checks the respective flag and jumps if the condition holds. For example, "cmp eax, ebx" followed by "jne do_something" jumps to do_something if eax and ebx differ.
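The relationship between a comparison and a conditional jump can be illustrated with a Python model of the flags. This is a sketch, not a complete flag model: the function names `cmp_flags` and `jump_taken` are hypothetical, only ZF, CF, SF and OF are computed, and only six jump conditions are covered.

```python
def cmp_flags(a: int, b: int, bits: int = 32) -> dict:
    """Flags as set by `cmp a, b`, i.e. by the subtraction a - b (result discarded)."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    a &= mask
    b &= mask
    result = (a - b) & mask
    return {
        "ZF": result == 0,                           # operands are equal
        "CF": a < b,                                 # unsigned borrow
        "SF": bool(result & sign),                   # sign bit of the result
        "OF": bool((a ^ b) & (a ^ result) & sign),   # signed overflow
    }


def jump_taken(mnemonic: str, f: dict) -> bool:
    """Whether the given conditional jump would be taken for flags `f`."""
    conditions = {
        "je": f["ZF"],
        "jne": not f["ZF"],
        "jb": f["CF"],
        "ja": not f["CF"] and not f["ZF"],
        "jl": f["SF"] != f["OF"],
        "jg": not f["ZF"] and f["SF"] == f["OF"],
    }
    return conditions[mnemonic]


flags = cmp_flags(0xFFFFFFFF, 1)    # -1 as signed, 4294967295 as unsigned
print(jump_taken("jl", flags))      # True: signed, -1 < 1
print(jump_taken("ja", flags))      # True: unsigned, 0xFFFFFFFF > 1
```

The same bit pattern compares as "less" for the signed jumps and "above" for the unsigned ones, which is why both families of conditional jumps exist.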
Flags are also used in the x86 architecture to turn on and off certain features or execution modes. For example, to disable the processing of interrupts you can use the command: cli
The flags register can also be directly accessed. The low 8 bits of the flag register can be loaded into AH using the LAHF instruction. The entire flags register can also be moved on and off the stack using the instructions PUSHF, POPF, INT (including INTO) and IRET.
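The byte that LAHF places in AH can be sketched in Python as a mask over the low byte of the flags register. The names `lahf`, `flags_in_ah` and `AH_FLAG_BITS` are hypothetical; the bit positions (CF in bit 0, PF in bit 2, AF in bit 4, ZF in bit 6, SF in bit 7, with bit 1 reading as 1) follow the architectural layout of the low flags byte.

```python
# Bit positions of the status flags within the low byte of (E)FLAGS.
AH_FLAG_BITS = {"CF": 0, "PF": 2, "AF": 4, "ZF": 6, "SF": 7}


def lahf(flags: int) -> int:
    """Model of LAHF: the value loaded into AH from the flags register."""
    # Keep SF, ZF, AF, PF and CF; the reserved bit 1 always reads as 1.
    return (flags & 0xD5) | 0x02


def flags_in_ah(ah: int) -> list:
    """Names of the status flags that are set in an AH value produced by LAHF."""
    return [name for name, bit in AH_FLAG_BITS.items() if ah & (1 << bit)]


ah = lahf(0x0246)        # a flags value with IF, ZF and PF set
print(hex(ah))           # 0x46
print(flags_in_ah(ah))   # ['PF', 'ZF']
```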
Using the instruction pointer register
There is also a 32-bit instruction pointer, named EIP. The EIP register points to where in the program the processor is currently executing its code. The EIP register cannot be accessed by the programmer directly. Instead, a sequence like the following can be done to retrieve the address of "next_line" into EAX:
call next_line
next_line:
pop eax
This works even in position-independent code because call takes an EIP-relative immediate operand. To write to EIP is simple: jmp eax
X86 instruction listings
List of assemblers
* [http://www.intel.com/products/processor/manuals/index.htm Intel 64 and IA-32 Software Developer Manuals]
* [http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/24592.pdf AMD64 Architecture Programmer's Manual Volume 1: Application Programming] (PDF)
* [http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/24593.pdf AMD64 Architecture Programmer's Manual Volume 2: System Programming] (PDF)
* [http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/24594.pdf AMD64 Architecture Programmer's Manual Volume 3: General-Purpose and System Instructions] (PDF)
* [http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/26568.pdf AMD64 Architecture Programmer's Manual Volume 4: 128-Bit Media Instructions] (PDF)
* [http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/26569.pdf AMD64 Architecture Programmer's Manual Volume 5: 64-Bit Media and x87 Floating-Point Instructions] (PDF)
* [http://siyobik.info/index.php?document=x86_32bit_asm An Introduction to Writing 32-bit Applications Using the x86 Assembly Language]
Wikimedia Foundation. 2010.