HASM Assembly Language Reference
Overview
HASM is a modern assembly language designed for 32-bit and 64-bit architectures. This documentation provides a comprehensive reference for the instruction set, register usage, memory addressing modes, and programming conventions.
This documentation assumes familiarity with basic computer architecture concepts. For beginners, we recommend starting with an introductory assembly programming guide before diving into this reference.
Key Features
- Support for both 32-bit and 64-bit operation modes
- Comprehensive set of arithmetic, logical, and control flow instructions
- Flexible memory addressing modes
- Standardized register usage conventions
- Support for system calls and interrupts
Immediate Values
Immediate values are constant values directly embedded in instructions. They can be represented in multiple formats:
| Format | Prefix | Example |
|---|---|---|
| Hexadecimal | 0x | 0x1A |
| Decimal | None | 26 |
| Binary | 0b | 0b11010 |
Immediate values must fit within the size constraints of the operation. Using a value that's too large for the target operand will result in truncation or an error.
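As a quick cross-check of the table above, the three notations denote the same value. The sketch below uses Python, whose integer literals happen to share the 0x and 0b prefixes. The helper fits_unsigned is a hypothetical name introduced for illustration; whether HASM truncates or rejects an oversized immediate is not something this sketch settles.

```python
# The same value written in the three immediate notations from the table.
hex_value = 0x1A
dec_value = 26
bin_value = 0b11010

def fits_unsigned(value: int, bits: int) -> bool:
    """Return True if value is representable as an unsigned integer of the given width.

    Hypothetical helper for illustration; not part of HASM.
    """
    return 0 <= value < (1 << bits)

print(hex_value == dec_value == bin_value)  # all three literals are equal
print(fits_unsigned(0x1A, 8))               # fits in a byte
print(fits_unsigned(0x1FF, 8))              # too large for a byte
```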
Registers
Registers are small, fast storage locations within the CPU. HASM provides registers in various sizes for different purposes.
Register Categories
General-Purpose Registers (64-bit)
| Register | Purpose | 32-bit | 16-bit | 8-bit (low) | 8-bit (high) |
|---|---|---|---|---|---|
| RAX | Accumulator | EAX | AX | AL | AH |
| RBX | Base | EBX | BX | BL | BH |
| RCX | Counter | ECX | CX | CL | CH |
| RDX | Data | EDX | DX | DL | DH |
| RSI | Source Index | ESI | SI | - | - |
| RDI | Destination Index | EDI | DI | - | - |
| RBP | Base Pointer | EBP | BP | - | - |
| RSP | Stack Pointer | ESP | SP | - | - |
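The narrower names in the table refer to parts of the same 64-bit register. Here is a minimal Python sketch of that aliasing, assuming the usual x86-64 layout: EAX is the low 32 bits of RAX, AX the low 16, AL the low 8, and AH bits 8 to 15. The function name sub_registers is made up for this illustration.

```python
def sub_registers(rax: int) -> dict:
    """Split a 64-bit RAX value into the narrower views named in the table.

    Hypothetical helper; assumes the standard x86-64 register layout.
    """
    rax &= 0xFFFFFFFFFFFFFFFF          # keep exactly 64 bits
    return {
        "EAX": rax & 0xFFFFFFFF,       # low 32 bits
        "AX": rax & 0xFFFF,            # low 16 bits
        "AL": rax & 0xFF,              # low 8 bits
        "AH": (rax >> 8) & 0xFF,       # bits 8-15
    }

views = sub_registers(0x1122334455667788)
for name, value in views.items():
    print(f"{name} = {value:#x}")
```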
Special-Purpose Registers
| Register | Purpose |
|---|---|
| RIP | Instruction Pointer (not a valid operand in most instructions) |
| RFLAGS | Flags Register (not a valid operand in most instructions) |
Register Usage Conventions
When writing functions, follow these conventions for register preservation:
- Caller-saved: RAX, RCX, RDX, R8, R9, R10, R11
- Callee-saved: RBX, RBP, RDI, RSI, R12-R15
Memory Addressing
Memory in HASM is accessed through various addressing modes that calculate effective addresses.
Addressing Modes
| Mode | Syntax | Example |
|---|---|---|
| Direct | [address] | [0x1000] |
| Register Indirect | [register] | [RAX] |
| Base + Offset | [base + offset] | [RBX + 8] |
| Indexed | [base + index * scale] | [RAX + RBX * 4] |
| Complex | [base + index * scale + offset] | [RBP + RCX * 8 + 16] |
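The Complex mode in the table above is the general case; the other modes are the same formula with some terms left out. The Python sketch below computes the effective address for hypothetical register values. It assumes the scale is limited to 1, 2, 4, or 8, as on x86-64; this reference does not state which scale factors HASM accepts.

```python
def effective_address(base: int = 0, index: int = 0, scale: int = 1, offset: int = 0) -> int:
    """Compute base + index * scale + offset, wrapped to 64 bits.

    Assumption: scale must be 1, 2, 4, or 8 (the x86-64 rule).
    """
    if scale not in (1, 2, 4, 8):
        raise ValueError("scale must be 1, 2, 4, or 8")
    return (base + index * scale + offset) & 0xFFFFFFFFFFFFFFFF

# [RBP + RCX * 8 + 16] with made-up values RBP = 0x7000 and RCX = 3
addr = effective_address(base=0x7000, index=3, scale=8, offset=16)
print(hex(addr))

# [RBX + 8] is the same formula with no index term
print(hex(effective_address(base=0x2000, offset=8)))
```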
Memory Size Specifiers
When accessing memory, you can explicitly specify the operand size:
| Keyword | Size | Example |
|---|---|---|
| BYTE | 8 bits | BYTE [RAX] |
| WORD | 16 bits | WORD [RBX] |
| DWORD | 32 bits | DWORD [RCX] |
| QWORD | 64 bits | QWORD [RDX] |
The base and index registers in memory addressing must be of the same size (both 32-bit or both 64-bit). Mixing sizes is not allowed.
Program Structure
GLOBAL Directive
Specifies the entry point of the program.
Description
The GLOBAL directive defines the ELF headers as well as the entry point into the program. It must appear before any SECTION directives.
Examples
; Single entry point
GLOBAL _start
For the program to run, a GLOBAL symbol is required as the program entry point. Without the GLOBAL symbol, the program will not have an ELF header.
SECTION Directive
Defines memory sections for organizing code and data.
Description
Sections are used to organize different parts of the program:
- TEXT: Contains executable code (instructions)
- DATA: Contains initialized data
- BSS: Contains uninitialized data (variables with no initial value)
The BSS section is for reserving space for variables that do not have an initial value. These variables will be zero-initialized at runtime.
Examples
SECTION TEXT
_start:
MOV RAX, 1
RET
SECTION DATA
message:
DB "Hello", 0
SECTION BSS
buffer:
TIMES 100 DB 0
Code must be placed in the TEXT section, while variables and constants belong in the DATA section. Uninitialized data should be placed in the BSS section.
Comments
Adds explanatory text that is ignored by the assembler.
Description
Comments begin with a semicolon (;) and continue to the end of the line. They can appear anywhere in the program.
Examples
; This is a full-line comment
MOV RAX, 1 ; This is an end-of-line comment
; Comments can describe sections:
SECTION DATA ; Start data section
Data Definitions
DB - Define Byte
Allocates and optionally initializes byte-sized data.
Description
DB allocates one or more bytes of storage, optionally initialized with specified values. Each value can be:
- A numeric constant (decimal, hex, or binary)
- A character or string in quotes
- An expression
Examples
byte_var:
DB 0x1A ; Single byte in hex
DB 27 ; Single byte in decimal
DB 'A' ; Character
DB "Hello", 0 ; String with null terminator
DB 1, 2, 3 ; Multiple bytes
DW - Define Word
Allocates and optionally initializes word-sized (2-byte) data.
Description
DW allocates one or more 16-bit words of storage. Values are stored in little-endian format.
Examples
word_var:
DW 0x1234 ; Single word
DW 1000 ; Decimal value
DW 'A', 'B' ; Two characters
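Little-endian storage means the least significant byte comes first in memory. The Python sketch below shows the byte order for the DW example value. The doubleword and quadword lines assume DD and DQ follow the same byte order, which this reference states only for DW.

```python
# 0x1234 stored as a 16-bit little-endian word: low byte first.
word_bytes = (0x1234).to_bytes(2, byteorder="little")
print(word_bytes.hex())

# Assumption: DD and DQ values use the same byte order as DW.
dword_bytes = (0x12345678).to_bytes(4, byteorder="little")
qword_bytes = (0x123456789ABCDEF0).to_bytes(8, byteorder="little")
print(dword_bytes.hex())
print(qword_bytes.hex())
```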
DD - Define Doubleword
Allocates and optionally initializes doubleword-sized (4-byte) data.
Description
DD allocates one or more 32-bit doublewords of storage. Can be used for 32-bit integers.
Examples
dword_var:
DD 0x12345678 ; 32-bit hex value
DQ - Define Quadword
Allocates and optionally initializes quadword-sized (8-byte) data.
Description
DQ allocates one or more 64-bit quadwords of storage. Can be used for 64-bit integers.
Examples
qword_var:
DQ 0x123456789ABCDEF0 ; 64-bit value
TIMES Directive
Repeats a data definition a specified number of times.
Description
TIMES repeats the data definition that follows it the specified number of times. It is useful for allocating buffers and for filling arrays with a repeated value.
Examples
; Allocate a 100-byte buffer
buffer:
TIMES 100 DB 0
; Initialize array
array:
TIMES 10 DW 0xFFFF
As of now, TIMES can only be used with data definitions.
Data Movement Instructions
MOV - Move Data
Copies data from the source operand to the destination operand.
Description
The MOV instruction is the most fundamental data transfer operation. It copies the value from the source operand to the destination operand without modifying the source.
Operand Types
- Register to Register: MOV RAX, RBX
- Immediate to Register: MOV RCX, 42
- Memory to Register: MOV RDX, [RDI]
- Register to Memory: MOV [RSI], RAX
Examples
; Copy register to register
MOV RAX, RBX
; Load immediate value
MOV RCX, 0x1234
; Load from memory
MOV RDX, [RDI + 8]
; Store to memory
MOV [RSI], RAX
MOV cannot transfer data directly between two memory locations. Use a register as an intermediate step if needed.
PUSH - Push onto Stack
Decrements the stack pointer and stores the source operand on the top of the stack.
Description
PUSH decrements the stack pointer (RSP) by the operand size and then stores the source operand at the new top of stack location.
Operand Types
- Register: PUSH RAX
- Memory: PUSH QWORD [RBP+8]
- Immediate: PUSH 42
Examples
; Push register
PUSH RBX
; Push memory value
PUSH QWORD [RDI]
; Push immediate value
PUSH 0xABCD
POP - Pop from Stack
Loads the value from the top of the stack into the destination operand and increments the stack pointer.
Description
POP loads the value at the current top of stack (pointed to by RSP) into the destination operand, then increments RSP by the operand size.
Operand Types
- Register: POP RAX
- Memory: POP QWORD [RBP-8]
Examples
; Pop to register
POP RCX
; Pop to memory
POP QWORD [RDI+16]
Attempting to POP when the stack is empty will result in undefined behavior or a segmentation fault.
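The ordering of the two steps in PUSH and POP is easy to mix up: PUSH decrements RSP and then stores, while POP loads and then increments. The toy Python model below makes that explicit. It assumes 8-byte operands only and a made-up initial stack pointer of 0x8000; the class name Stack is illustrative and not part of HASM.

```python
class Stack:
    """Toy model of a descending stack with 8-byte slots (illustrative only)."""

    def __init__(self, top: int = 0x8000):
        self.rsp = top          # hypothetical initial stack pointer
        self.memory = {}        # address -> 64-bit value

    def push(self, value: int) -> None:
        self.rsp -= 8                                        # decrement first
        self.memory[self.rsp] = value & 0xFFFFFFFFFFFFFFFF   # then store

    def pop(self) -> int:
        value = self.memory[self.rsp]                        # load first
        self.rsp += 8                                        # then increment
        return value

stack = Stack()
stack.push(0x1111)
stack.push(0x2222)
first = stack.pop()     # last value pushed comes off first
second = stack.pop()
print(hex(first), hex(second), hex(stack.rsp))
```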
LEA - Load Effective Address
Computes the effective address of the source operand and stores it in the destination register.
Description
LEA calculates the memory address specified by the source operand (without actually accessing memory) and stores the computed address in the destination register.
Operand Types
- Register destination: LEA RAX, [RBX+RCX*4]
- Complex addressing: LEA RDI, [RBP+R12*8+32]
Examples
; Array element address calculation
LEA RSI, [RAX+RBX*4]
; Structure field access
LEA RDX, [RDI+16]
LEA is often used for plain arithmetic: it evaluates an expression of the form base + index * scale + offset in a single instruction, without accessing memory.
XCHG - Exchange
Exchanges the contents of two operands.
Description
XCHG swaps the values of its two operands. At least one operand must be a register.
Operand Types
- Register with Register: XCHG RAX, RBX
- Register with Memory: XCHG RCX, [RDX]
Examples
; Swap two registers
XCHG RAX, RBX
; Swap register with memory
XCHG RCX, [RDX+8]
When one operand is memory, XCHG is performed as an atomic operation, making it useful for synchronization in multithreaded code.
Arithmetic Instructions
ADD - Addition
Adds the source operand to the destination operand and stores the result in the destination.
Description
ADD performs integer addition, setting flags based on the result. It supports the same operand combinations as MOV.
Examples
; Register to register
ADD RAX, RBX
; Immediate to register
ADD RCX, 42
; Memory to register
ADD RDX, [RSI+8]
SUB - Subtraction
Subtracts the source operand from the destination operand and stores the result in the destination.
Description
SUB performs integer subtraction, setting flags based on the result. It supports the same operand combinations as ADD.
Examples
; Register from register
SUB RAX, RBX
; Immediate from register
SUB RCX, 10
; Memory from register
SUB RDX, [RDI+16]
MUL - Unsigned Multiply
Performs unsigned multiplication of the accumulator by the source operand.
Description
MUL performs unsigned multiplication. The size of the source operand determines which register is used as the multiplicand and where the result is stored.
Operand Sizes
| Source Size | Multiplicand | Result |
|---|---|---|
| 8-bit | AL | AX (AH:AL) |
| 16-bit | AX | DX:AX |
| 32-bit | EAX | EDX:EAX |
| 64-bit | RAX | RDX:RAX |
Examples
; 64-bit multiplication
MOV RAX, 0x1234
MOV RBX, 0x5678
MUL RBX ; RDX:RAX = RAX * RBX
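To see how the 128-bit product is split across RDX:RAX, the Python sketch below models the 64-bit row of the table. The function name mul64 is an illustrative choice, and the second call uses made-up operands chosen so that the product overflows 64 bits.

```python
MASK64 = (1 << 64) - 1

def mul64(rax: int, src: int) -> tuple:
    """Model unsigned 64-bit MUL: return (RDX, RAX) holding the 128-bit product."""
    product = (rax & MASK64) * (src & MASK64)
    return product >> 64, product & MASK64   # high half, low half

# The operands from the example above: the product fits in RAX, so RDX is 0.
rdx, rax = mul64(0x1234, 0x5678)
print(hex(rdx), hex(rax))

# A product that does not fit in 64 bits spills into RDX.
rdx_big, rax_big = mul64(MASK64, 2)
print(hex(rdx_big), hex(rax_big))
```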
DIV - Unsigned Divide
Performs unsigned division of the dividend by the source operand.
Description
DIV performs unsigned division. The dividend is implicit and depends on the size of the source operand.
Operand Sizes
| Source Size | Dividend | Quotient | Remainder |
|---|---|---|---|
| 8-bit | AX | AL | AH |
| 16-bit | DX:AX | AX | DX |
| 32-bit | EDX:EAX | EAX | EDX |
| 64-bit | RDX:RAX | RAX | RDX |
Examples
; 64-bit division
MOV RAX, 100 ; Dividend low
MOV RDX, 0 ; Dividend high
MOV RBX, 3 ; Divisor
DIV RBX ; RAX = quotient, RDX = remainder
Division by zero or a quotient that exceeds the destination size will trigger a #DE (Divide Error) exception.
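The two #DE conditions can be modelled directly. The Python sketch below treats RDX:RAX as one 128-bit dividend and raises an exception in the two cases described above. The name div64 and the choice of Python exception types are illustrative stand-ins for the hardware exception.

```python
MASK64 = (1 << 64) - 1

def div64(rdx: int, rax: int, divisor: int) -> tuple:
    """Model unsigned 64-bit DIV: divide RDX:RAX by divisor.

    Returns (quotient, remainder). ZeroDivisionError and OverflowError
    both stand in for the #DE exception.
    """
    if divisor == 0:
        raise ZeroDivisionError("#DE: division by zero")
    dividend = ((rdx & MASK64) << 64) | (rax & MASK64)
    quotient, remainder = divmod(dividend, divisor)
    if quotient > MASK64:
        raise OverflowError("#DE: quotient exceeds 64 bits")
    return quotient, remainder

# The example above: RDX = 0, RAX = 100, divisor 3
quotient, remainder = div64(0, 100, 3)
print(quotient, remainder)
```

This is why RDX is cleared before the division in the example: a stale nonzero RDX makes the dividend far larger than intended and can overflow the quotient.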
IMUL - Signed Multiply
Performs signed multiplication with more flexible operand options than MUL.
Description
IMUL has three forms:
- Single operand (like MUL but signed)
- Two operands: destination = destination * source
- Three operands: destination = source * immediate
Examples
; Single operand form (like MUL)
IMUL RBX ; RDX:RAX = RAX * RBX (signed)
; Two operand form
IMUL RAX, RBX ; RAX = RAX * RBX
; Three operand form
IMUL RCX, RDX, 5 ; RCX = RDX * 5
IDIV - Signed Divide
Performs signed division of the dividend by the source operand.
Description
IDIV works like DIV but performs signed division. The same size rules and register usage apply as with DIV.
Examples
; Signed 64-bit division
MOV RAX, -100 ; Dividend low
MOV RDX, -1 ; Dividend high (sign extension)
MOV RBX, 3 ; Divisor
IDIV RBX ; RAX = quotient (-33), RDX = remainder (-1)
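The quotient and remainder in the example follow from division that truncates toward zero, with the remainder taking the sign of the dividend. This is the x86 behaviour, which HASM is assumed to inherit. The Python sketch below reproduces it; idiv64 is an illustrative name, and its dividend argument is the signed value held in RDX:RAX. Python's own divmod rounds toward negative infinity, so the last line prints a different pair.

```python
def idiv64(dividend: int, divisor: int) -> tuple:
    """Model signed division that truncates toward zero, as IDIV does.

    Returns (quotient, remainder); exceptions stand in for #DE.
    """
    if divisor == 0:
        raise ZeroDivisionError("#DE: division by zero")
    quotient = abs(dividend) // abs(divisor)
    if (dividend < 0) != (divisor < 0):
        quotient = -quotient
    remainder = dividend - quotient * divisor   # takes the sign of the dividend
    if not -(1 << 63) <= quotient < (1 << 63):
        raise OverflowError("#DE: quotient exceeds 64 bits")
    return quotient, remainder

print(idiv64(-100, 3))    # truncating division
print(divmod(-100, 3))    # Python's flooring division, for contrast
```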
ADC - Add with Carry
Adds the source operand, the destination operand, and the carry flag, storing the result in the destination.
Description
ADC is used for multi-word arithmetic. It adds the two operands plus the value of the carry flag from previous operations.
Examples
; 128-bit addition
ADD RAX, RCX ; Add low 64 bits
ADC RDX, R8 ; Add high 64 bits with carry
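The two-instruction sequence above adds the 128-bit value in R8:RCX to the one in RDX:RAX. The Python sketch below models the same carry propagation with explicit 64-bit halves. The function name add128 and the sample operands are made up for illustration.

```python
MASK64 = (1 << 64) - 1

def add128(lo_a: int, hi_a: int, lo_b: int, hi_b: int) -> tuple:
    """Add two 128-bit values held as (low, high) 64-bit halves.

    Mirrors ADD on the low halves followed by ADC on the high halves.
    Returns (low, high, carry_out).
    """
    low_sum = lo_a + lo_b
    carry = low_sum >> 64                 # carry flag after ADD
    high_sum = hi_a + hi_b + carry        # ADC adds the carry in
    return low_sum & MASK64, high_sum & MASK64, high_sum >> 64

# Low half overflows, so the carry must reach the high half.
low, high, carry_out = add128(MASK64, 0, 1, 0)
print(hex(low), hex(high), carry_out)
```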
SBB - Subtract with Borrow
Subtracts the source operand and the carry flag from the destination operand.
Description
SBB is used for multi-word subtraction. It subtracts the source and the carry flag from the destination.
Examples
; 128-bit subtraction
SUB RAX, RCX ; Subtract low 64 bits
SBB RDX, R8 ; Subtract high 64 bits with borrow
NEG - Negate
Replaces the operand with its two's complement negation.
Description
NEG performs the equivalent of subtracting the operand from zero, effectively flipping the sign of a number.
Examples
; Negate register
NEG RAX
; Negate memory value
NEG DWORD [RBP-8]
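Two's complement negation can be written either as "subtract from zero" or as "invert all bits and add one"; both give the same bit pattern at a fixed width. The Python sketch below checks this for a 64-bit operand. The helper names neg and to_signed are illustrative.

```python
def neg(value: int, bits: int = 64) -> int:
    """Two's complement negation within a fixed width: 0 - value, wrapped."""
    mask = (1 << bits) - 1
    return (0 - value) & mask

def to_signed(pattern: int, bits: int = 64) -> int:
    """Interpret a fixed-width bit pattern (0 <= pattern < 2**bits) as signed."""
    sign_bit = 1 << (bits - 1)
    return (pattern ^ sign_bit) - sign_bit

pattern = neg(5)
print(hex(pattern), to_signed(pattern))

# Equivalent formulation: invert all bits, then add one.
same = ((~5) + 1) & ((1 << 64) - 1)
print(same == pattern)
```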
INC - Increment
Adds 1 to the operand.
Description
INC is a compact alternative to ADD operand, 1 for single increments. Unlike ADD, it does not affect the carry flag.
Examples
; Increment register
INC RCX
; Increment memory
INC BYTE [RDI]
DEC - Decrement
Subtracts 1 from the operand.
Description
DEC is a compact alternative to SUB operand, 1 for single decrements. Unlike SUB, it does not affect the carry flag.
Examples
; Decrement register
DEC R8
; Decrement memory
DEC WORD [RSI+4]
Logical and Bitwise Instructions
AND - Logical AND
Performs bitwise AND between the operands and stores the result in the destination.
Description
AND compares each bit of the operands and produces a 1 in the result bit only if both corresponding bits were 1.
Examples
; Clear bits using AND
AND EAX, 0xFFFF0000 ; Keep only the upper 16 bits of EAX
; Test if even
AND RCX, 1 ; ZF set if RCX was even (RCX is overwritten with 0 or 1)
OR - Logical OR
Performs bitwise OR between the operands and stores the result in the destination.
Description
OR compares each bit of the operands and produces a 1 in the result bit if either corresponding bit was 1.
Examples
; Set bits using OR
OR EAX, 0x80000000 ; Set sign bit of EAX
; Combine flags
OR RCX, RDX ; RCX = RCX | RDX
XOR - Logical XOR
Performs bitwise exclusive OR between the operands and stores the result in the destination.
Description
XOR compares each bit of the operands and produces a 1 in the result bit if the corresponding bits were different.
Examples
; Toggle bits
XOR RAX, 0xFFFFFFFF ; Flip lower 32 bits
; Clear register efficiently
XOR RCX, RCX ; Faster than MOV RCX, 0
NOT - Logical NOT
Performs bitwise inversion of the operand.
Description
NOT flips all bits in the operand (ones become zeros and zeros become ones).
Examples
; Invert register
NOT RAX
; Invert memory value
NOT BYTE [RDI]
TEST - Logical Compare
Performs bitwise AND without storing the result, only setting flags.
Description
TEST is like AND but doesn't store the result. It's commonly used to check if specific bits are set.
Examples
; Check if zero
TEST RAX, RAX
JZ IsZero
; Check if bit 3 is set
TEST RCX, 8
JNZ Bit3Set
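The mask 8 in the second example is 1 shifted left by 3, so it selects bit 3 only. The Python sketch below models the zero and sign flags that TEST produces, assuming the x86 flag rules; flags_after_test is an illustrative name. JZ is taken when ZF is set and JNZ when it is clear.

```python
def flags_after_test(a: int, b: int, bits: int = 64) -> dict:
    """Model the ZF and SF results of TEST a, b (bitwise AND, result discarded)."""
    mask = (1 << bits) - 1
    result = a & b & mask
    return {
        "ZF": result == 0,                   # zero flag: no common bits set
        "SF": bool(result >> (bits - 1)),    # sign flag: top bit of the result
    }

BIT3 = 1 << 3   # the mask 8 used in the example above

print(flags_after_test(0, 0))            # TEST RAX, RAX with RAX = 0
print(flags_after_test(0b1010, BIT3))    # bit 3 set, so ZF is clear
print(flags_after_test(0b0101, BIT3))    # bit 3 clear, so ZF is set
```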
Comparison and Jump Instructions
CMP - Compare
Compares two operands by subtracting them without storing the result.
Description
CMP performs operand1 - operand2 and sets flags based on the result, but doesn't store the result. It's used before conditional jumps.
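How the flags drive a conditional jump can be sketched in Python. The model below assumes HASM follows the x86 flag definitions, under which JE is taken when ZF is set and JG when ZF is clear and SF equals OF. The function names are illustrative, and this reference does not document the individual jump conditions.

```python
def cmp_flags(a: int, b: int, bits: int = 64) -> dict:
    """Model the flags CMP a, b produces by computing a - b and discarding it."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    a &= mask
    b &= mask
    result = (a - b) & mask
    return {
        "ZF": result == 0,                            # operands equal
        "CF": a < b,                                  # unsigned borrow
        "SF": bool(result & sign),                    # sign of the result
        "OF": bool((a ^ b) & (a ^ result) & sign),    # signed overflow
    }

def taken_je(flags: dict) -> bool:
    """JE: jump if equal."""
    return flags["ZF"]

def taken_jg(flags: dict) -> bool:
    """JG: jump if signed greater (not equal, and sign matches overflow)."""
    return not flags["ZF"] and flags["SF"] == flags["OF"]

print(taken_je(cmp_flags(42, 42)))   # equal operands
print(taken_jg(cmp_flags(5, -3)))    # 5 > -3 as signed values
print(taken_jg(cmp_flags(-3, 5)))    # -3 is not greater than 5
```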
Examples
; Compare register with immediate
CMP RAX, 42
JE EqualTo42
; Compare two registers
CMP RCX, RDX
JG GreaterThan
Control Flow Instructions
JMP - Unconditional Jump
Transfers control unconditionally to the specified location.
Description
JMP can transfer control to a label (direct jump) or to an address stored in a register or memory (indirect jump).
Examples
; Direct jump
JMP LoopStart
; Indirect jump through register
JMP RAX
; Indirect jump through memory
JMP [RIP+JumpTable]
CALL - Call Procedure
Pushes the return address and transfers control to the specified procedure.
Description
CALL pushes the address of the next instruction onto the stack (RSP adjusted) and then jumps to the target address.
Examples
; Direct call
CALL MyFunction
; Indirect call through register
CALL RAX
; Indirect call through memory
CALL [RIP+FunctionPointer]
RET - Return from Procedure
Pops the return address from the stack and transfers control there.
Description
RET pops the return address from the stack (RSP adjusted) and jumps to that address. The optional immediate operand specifies additional bytes to pop from the stack (for cleaning up arguments).
Examples
; Simple return
RET
WAIT - Wait
Waits for pending floating-point exceptions.
Description
WAIT checks for and handles pending unmasked floating-point exceptions before proceeding. In modern systems, it's often used as a no-op.
Examples
; Wait for FPU
WAIT
SYSCALL - Fast System Call
Transfers control to the operating system kernel.
Description
SYSCALL is the modern, fast method for making system calls in 64-bit mode. It saves minimal state and transfers control to the OS kernel.
The assembler only emits the SYSCALL opcode. It does not handle full Linux ABI compliance or argument passing conventions. You are responsible for setting up registers and arguments as required by your operating system.
Examples
; Linux exit syscall
MOV RAX, 60 ; syscall number
MOV RDI, 0 ; exit code
SYSCALL
System call numbers and parameter passing conventions vary by operating system. On Linux, RAX contains the syscall number and RDI, RSI, RDX, R10, R8, R9 contain the arguments.
Interrupt and System Instructions
INT - Software Interrupt
Generates a software interrupt.
Description
INT transfers control to an interrupt handler specified by the interrupt number. In real mode, it uses the interrupt vector table; in protected mode, it uses the IDT.
Examples
; DOS system call
MOV AH, 0x09
MOV DX, message
INT 0x21
In 64-bit mode, INT is largely superseded by SYSCALL/SYSENTER for system calls, but may still be used for debugging (INT 3) or virtualization.
Miscellaneous Instructions
NOP - No Operation
Performs no operation.
Description
NOP does nothing except occupy space and time. It's commonly used for padding, alignment, and debugging.
Examples
; Alignment padding
MyFunction:
NOP
NOP
Modern processors may optimize away NOPs or execute them in parallel with other instructions.