HASM Assembly Language Reference

Overview

HASM is a modern assembly language designed for 32-bit and 64-bit architectures. This documentation provides a comprehensive reference for the instruction set, register usage, memory addressing modes, and programming conventions.

Note

This documentation assumes familiarity with basic computer architecture concepts. For beginners, we recommend starting with an introductory assembly programming guide before diving into this reference.

Key Features

  • Support for both 32-bit and 64-bit operation modes
  • Comprehensive set of arithmetic, logical, and control flow instructions
  • Flexible memory addressing modes
  • Standardized register usage conventions
  • Support for system calls and interrupts

Immediate Values

Immediate values are constant values directly embedded in instructions. They can be represented in multiple formats:

Format        Prefix   Example
Hexadecimal   0x       0x1A
Decimal       None     26
Binary        0b       0b11010

Important

Immediate values must fit within the size constraints of the operation. Using a value that's too large for the target operand will result in truncation or an error.
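
As a rough illustration of the three literal formats and the size constraint, the following Python sketch parses each form and checks whether a value fits in a given operand width. The helper names parse_immediate and fits are hypothetical and are not part of HASM; whether HASM truncates or rejects an oversized value is not modelled here.

```python
# Illustrative model only; parse_immediate and fits are hypothetical helpers,
# not part of HASM.

def parse_immediate(text: str) -> int:
    """Parse a literal written with the 0x prefix, the 0b prefix, or no prefix."""
    return int(text, 0)  # base 0 lets Python infer the base from the prefix


def fits(value: int, bits: int) -> bool:
    """True if value is representable in `bits` bits, signed or unsigned."""
    return -(1 << (bits - 1)) <= value < (1 << bits)


# All three spellings from the table denote the same number.
print(parse_immediate("0x1A"), parse_immediate("26"), parse_immediate("0b11010"))

# 0x1A fits in a byte; 0x1234 does not.
print(fits(0x1A, 8), fits(0x1234, 8))
```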

Registers

Registers are small, fast storage locations within the CPU. HASM provides registers in various sizes for different purposes.

Register Categories

General-Purpose Registers (64-bit)

Register   Purpose             32-bit   16-bit   8-bit (low)   8-bit (high)
RAX        Accumulator         EAX      AX       AL            AH
RBX        Base                EBX      BX       BL            BH
RCX        Counter             ECX      CX       CL            CH
RDX        Data                EDX      DX       DL            DH
RSI        Source Index        ESI      SI       -             -
RDI        Destination Index   EDI      DI       -             -
RBP        Base Pointer        EBP      BP       -             -
RSP        Stack Pointer       ESP      SP       -             -

Special-Purpose Registers

Register   Purpose
RIP        Instruction Pointer (not a valid operand in most instructions)
RFLAGS     Flags Register (not a valid operand in most instructions)

Register Usage Conventions

When writing functions, follow these conventions for register preservation:

  • Caller-saved: RAX, RCX, RDX, R8, R9, R10, R11
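  • Callee-saved: RBX, RBP, RDI, RSI, R12-R15

This split matches the Microsoft x64 calling convention. The System V AMD64 ABI used on Linux instead treats RDI and RSI as caller-saved argument registers, so follow the platform ABI when calling external code.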

Memory Addressing

Memory in HASM is accessed through various addressing modes that calculate effective addresses.

Addressing Modes

Mode                Syntax                            Example
Direct              [address]                         [0x1000]
Register Indirect   [register]                        [RAX]
Base + Offset       [base + offset]                   [RBX + 8]
Indexed             [base + index * scale]            [RAX + RBX * 4]
Complex             [base + index * scale + offset]   [RBP + RCX * 8 + 16]
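
The effective-address arithmetic behind these modes can be sketched in Python as follows. The function effective_address is a hypothetical helper, not a HASM feature. The restriction of scale to 1, 2, 4, or 8 and the wrap-around at 64 bits are assumptions taken from the x86-64 architecture rather than from this reference.

```python
# Illustrative model only; effective_address is a hypothetical helper.
MASK64 = (1 << 64) - 1


def effective_address(base: int = 0, index: int = 0, scale: int = 1, offset: int = 0) -> int:
    """Compute base + index * scale + offset, wrapped to 64 bits."""
    if scale not in (1, 2, 4, 8):  # assumption: x86-64 scale factors
        raise ValueError("scale must be 1, 2, 4, or 8")
    return (base + index * scale + offset) & MASK64


# [RBP + RCX * 8 + 16] with RBP = 0x7000 and RCX = 3
print(hex(effective_address(base=0x7000, index=3, scale=8, offset=16)))

# [0x1000] is the same formula with only an offset
print(hex(effective_address(offset=0x1000)))
```

The simpler modes in the table are special cases of the complex mode with the unused terms left at zero.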

Memory Size Specifiers

When accessing memory, you can explicitly specify the operand size:

Keyword   Size      Example
BYTE      8 bits    BYTE [RAX]
WORD      16 bits   WORD [RBX]
DWORD     32 bits   DWORD [RCX]
QWORD     64 bits   QWORD [RDX]

Note

The base and index registers in memory addressing must be of the same size (both 32-bit or both 64-bit). Mixing sizes is not allowed.

Program Structure

GLOBAL Directive

Specifies the entry point of the program.

GLOBAL label

Description

The GLOBAL directive defines the ELF headers as well as the entry point into the program. It must appear before any SECTION directives.

Examples

; Single entry point
GLOBAL _start

Note

For the program to run, a GLOBAL symbol is required as the program entry point. Without the GLOBAL symbol, the program will not have an ELF header.

SECTION Directive

Defines memory sections for organizing code and data.

SECTION name

Description

Sections are used to organize different parts of the program:

  • TEXT: Contains executable code (instructions)
  • DATA: Contains initialized data
  • BSS: Contains uninitialized data (variables with no initial value)

Note

The BSS section is for reserving space for variables that do not have an initial value. These variables will be zero-initialized at runtime.

Examples

SECTION TEXT
_start:
    MOV RAX, 1
    RET

SECTION DATA
message:
    DB "Hello", 0

SECTION BSS
buffer:
    TIMES 100 DB 0

Important

Code must be placed in the TEXT section, while variables and constants belong in the DATA section. Uninitialized data should be placed in the BSS section.

Comments

Adds explanatory text that is ignored by the assembler.

; This is a comment

Description

Comments begin with a semicolon (;) and continue to the end of the line. They can appear anywhere in the program.

Examples

; This is a full-line comment
MOV RAX, 1  ; This is an end-of-line comment
; Comments can describe sections:
SECTION DATA  ; Start data section

Data Definitions

DB - Define Byte

Allocates and optionally initializes byte-sized data.

label: DB value [, value...]

Description

DB allocates one or more bytes of storage, optionally initialized with specified values. Each value can be:

  • A numeric constant (decimal, hex, or binary)
  • A character or string in quotes
  • An expression

Examples

byte_var:
    DB 0x1A       ; Single byte in hex
    DB 27         ; Single byte in decimal
    DB 'A'        ; Character
    DB "Hello", 0 ; String with null terminator
    DB 1, 2, 3    ; Multiple bytes

DW - Define Word

Allocates and optionally initializes word-sized (2-byte) data.

label: DW value [, value...]

Description

DW allocates one or more 16-bit words of storage. Values are stored in little-endian format.

Examples

word_var:
    DW 0x1234     ; Single word
    DW 1000       ; Decimal value
    DW 'A', 'B'   ; Two characters

DD - Define Doubleword

Allocates and optionally initializes doubleword-sized (4-byte) data.

label: DD value [, value...]

Description

DD allocates one or more 32-bit doublewords of storage. Can be used for 32-bit integers.

Examples

dword_var:
    DD 0x12345678 ; 32-bit hex value

DQ - Define Quadword

Allocates and optionally initializes quadword-sized (8-byte) data.

label: DQ value [, value...]

Description

DQ allocates one or more 64-bit quadwords of storage. Can be used for 64-bit integers.

Examples

qword_var:
    DQ 0x123456789ABCDEF0 ; 64-bit value
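
To make the storage layout concrete, this Python sketch uses the standard struct module to show the bytes that the example values above would occupy in memory. Little-endian order is stated in this reference only for DW; applying it to DD and DQ as well is an assumption based on the x86 architecture.

```python
import struct

# "<" selects little-endian; B, H, I, Q are 1-, 2-, 4-, and 8-byte unsigned integers.
db = struct.pack("<B", 0x1A)                # DB 0x1A
dw = struct.pack("<H", 0x1234)              # DW 0x1234
dd = struct.pack("<I", 0x12345678)          # DD 0x12345678 (assumed little-endian)
dq = struct.pack("<Q", 0x123456789ABCDEF0)  # DQ 0x123456789ABCDEF0 (assumed little-endian)

# The least significant byte comes first in memory.
for name, data in (("DB", db), ("DW", dw), ("DD", dd), ("DQ", dq)):
    print(name, len(data), data.hex(" "))
```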

TIMES Directive

Repeats a data definition a specified number of times.

TIMES count DB|DW|DD|DQ value

Description

TIMES repeats the following data definition the specified number of times. It is useful for allocating buffers and initializing arrays.

Examples

; Allocate a 100-byte buffer
buffer:
    TIMES 100 DB 0

; Initialize array
array:
    TIMES 10 DW 0xFFFF

Note

As of now, TIMES can only be used with data definitions.

Data Movement Instructions

MOV - Move Data

Copies data from the source operand to the destination operand.

MOV destination, source

Description

The MOV instruction is the most fundamental data transfer operation. It copies the value from the source operand to the destination operand without modifying the source.

Operand Types

  • Register to Register: MOV RAX, RBX
  • Immediate to Register: MOV RCX, 42
  • Memory to Register: MOV RDX, [RDI]
  • Register to Memory: MOV [RSI], RAX

Examples

    ; Copy register to register
    MOV RAX, RBX

    ; Load immediate value
    MOV RCX, 0x1234

    ; Load from memory
    MOV RDX, [RDI + 8]

    ; Store to memory
    MOV [RSI], RAX

Restrictions

MOV cannot transfer data directly between two memory locations. Use a register as an intermediate step if needed.

PUSH - Push onto Stack

Decrements the stack pointer and stores the source operand on the top of the stack.

PUSH source

Description

PUSH decrements the stack pointer (RSP) by the operand size and then stores the source operand at the new top of stack location.

Operand Types

  • Register: PUSH RAX
  • Memory: PUSH QWORD [RBP+8]
  • Immediate: PUSH 42

Examples

    ; Push register
    PUSH RBX

    ; Push memory value
    PUSH QWORD [RDI]

    ; Push immediate value
    PUSH 0xABCD

POP - Pop from Stack

Loads the value from the top of the stack into the destination operand and increments the stack pointer.

POP destination

Description

POP loads the value at the current top of stack (pointed to by RSP) into the destination operand, then increments RSP by the operand size.

Operand Types

  • Register: POP RAX
  • Memory: POP QWORD [RBP-8]

Examples

    ; Pop to register
    POP RCX

    ; Pop to memory
    POP QWORD [RDI+16]

Warning

Attempting to POP when the stack is empty will result in undefined behavior or a segmentation fault.
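
The interplay between PUSH, POP, and RSP can be sketched with a small Python model. The Stack class is hypothetical and assumes 8-byte operands and a fixed-size memory region. Where real hardware would fault or read unrelated memory, the model raises an exception.

```python
# Illustrative model only; Stack is a hypothetical class, not part of HASM.
MASK64 = (1 << 64) - 1


class Stack:
    def __init__(self, size: int = 64):
        self.mem = bytearray(size)
        self.rsp = size  # the stack grows downward; an empty stack points past the end

    def push(self, value: int) -> None:
        """PUSH: decrement RSP by 8, then store the value at the new top."""
        if self.rsp < 8:
            raise OverflowError("stack region exhausted")
        self.rsp -= 8
        self.mem[self.rsp:self.rsp + 8] = (value & MASK64).to_bytes(8, "little")

    def pop(self) -> int:
        """POP: load the value at RSP, then increment RSP by 8."""
        if self.rsp + 8 > len(self.mem):
            raise IndexError("pop from empty stack")
        value = int.from_bytes(self.mem[self.rsp:self.rsp + 8], "little")
        self.rsp += 8
        return value


s = Stack()
s.push(0xABCD)
s.push(1)
print(s.rsp)                     # two 8-byte slots below the initial RSP
print(s.pop(), hex(s.pop()))     # values come back in reverse order
```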

LEA - Load Effective Address

Computes the effective address of the source operand and stores it in the destination register.

LEA destination, source

Description

LEA calculates the memory address specified by the source operand (without actually accessing memory) and stores the computed address in the destination register.

Operand Types

  • Register destination: LEA RAX, [RBX+RCX*4]
  • Complex addressing: LEA RDI, [RBP+R12*8+32]

Examples

    ; Array element address calculation
    LEA RSI, [RAX+RBX*4]

    ; Structure field access
    LEA RDX, [RDI+16]

Note

LEA is often used for arithmetic operations since it can compute complex address calculations in a single instruction without memory access.

XCHG - Exchange

Exchanges the contents of two operands.

XCHG operand1, operand2

Description

XCHG swaps the values of its two operands. At least one operand must be a register.

Operand Types

  • Register with Register: XCHG RAX, RBX
  • Register with Memory: XCHG RCX, [RDX]

Examples

    ; Swap two registers
    XCHG RAX, RBX

    ; Swap register with memory
    XCHG RCX, [RDX+8]

Atomic Operation

When one operand is memory, XCHG is performed as an atomic operation, making it useful for synchronization in multithreaded code.

Arithmetic Instructions

ADD - Addition

Adds the source operand to the destination operand and stores the result in the destination.

ADD destination, source

Description

ADD performs integer addition, setting flags based on the result. It supports the same operand combinations as MOV.

Examples

    ; Register to register
    ADD RAX, RBX

    ; Immediate to register
    ADD RCX, 42

    ; Memory to register
    ADD RDX, [RSI+8]

SUB - Subtraction

Subtracts the source operand from the destination operand and stores the result in the destination.

SUB destination, source

Description

SUB performs integer subtraction, setting flags based on the result. It supports the same operand combinations as ADD.

Examples

    ; Register from register
    SUB RAX, RBX

    ; Immediate from register
    SUB RCX, 10

    ; Memory from register
    SUB RDX, [RDI+16]

MUL - Unsigned Multiply

Performs unsigned multiplication of the accumulator by the source operand.

MUL source

Description

MUL performs unsigned multiplication. The size of the source operand determines which register is used as the multiplicand and where the result is stored.

Operand Sizes

Source Size   Multiplicand   Result
8-bit         AL             AX (AH:AL)
16-bit        AX             DX:AX
32-bit        EAX            EDX:EAX
64-bit        RAX            RDX:RAX

Examples

    ; 64-bit multiplication
    MOV RAX, 0x1234
    MOV RBX, 0x5678
    MUL RBX      ; RDX:RAX = RAX * RBX
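
The way the 128-bit product is split across RDX:RAX can be modelled in Python. The function mul64 is a hypothetical helper that covers only the 64-bit row of the table above.

```python
# Illustrative model only; mul64 is a hypothetical helper.
MASK64 = (1 << 64) - 1


def mul64(rax: int, src: int) -> tuple[int, int]:
    """Return (RDX, RAX) after an unsigned 64-bit MUL."""
    product = (rax & MASK64) * (src & MASK64)
    return product >> 64, product & MASK64  # high half, low half


# The example above: the product is small, so RDX ends up zero.
rdx, rax = mul64(0x1234, 0x5678)
print(hex(rdx), hex(rax))

# A product that overflows 64 bits spills into RDX.
rdx, rax = mul64(MASK64, 2)
print(hex(rdx), hex(rax))
```

Note that RDX is overwritten even when the product fits in RAX, so any value held in RDX must be saved before the multiplication.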

DIV - Unsigned Divide

Performs unsigned division of the dividend by the source operand.

DIV source

Description

DIV performs unsigned division. The dividend is implicit and depends on the size of the source operand.

Operand Sizes

Source Size   Dividend   Quotient   Remainder
8-bit         AX         AL         AH
16-bit        DX:AX      AX         DX
32-bit        EDX:EAX    EAX        EDX
64-bit        RDX:RAX    RAX        RDX

Examples

    ; 64-bit division
    MOV RAX, 100    ; Dividend low
    MOV RDX, 0      ; Dividend high
    MOV RBX, 3      ; Divisor
    DIV RBX         ; RAX = quotient, RDX = remainder

Division Error

Division by zero or a quotient that exceeds the destination size will trigger a #DE (Divide Error) exception.
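
Both error conditions can be seen in a small Python model of the 64-bit form. The function div64 is a hypothetical helper, and Python exceptions stand in for the #DE exception.

```python
# Illustrative model only; div64 is a hypothetical helper.
MASK64 = (1 << 64) - 1


def div64(rdx: int, rax: int, src: int) -> tuple[int, int]:
    """Return (quotient in RAX, remainder in RDX) for an unsigned 64-bit DIV."""
    if src == 0:
        raise ZeroDivisionError("#DE: division by zero")
    dividend = ((rdx & MASK64) << 64) | (rax & MASK64)  # the 128-bit value RDX:RAX
    quotient, remainder = divmod(dividend, src)
    if quotient > MASK64:
        raise OverflowError("#DE: quotient does not fit in RAX")
    return quotient, remainder


# The example above: RDX = 0, RAX = 100, divisor = 3
print(div64(0, 100, 3))
```

The overflow case is why RDX is normally cleared before an unsigned division: a stale nonzero value in RDX makes the dividend far larger than intended.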

IMUL - Signed Multiply

Performs signed multiplication with more flexible operand options than MUL.

IMUL source
IMUL destination, source
IMUL destination, source, immediate

Description

IMUL has three forms:

  1. Single operand (like MUL but signed)
  2. Two operands: destination = destination * source
  3. Three operands: destination = source * immediate

Examples

    ; Single operand form (like MUL)
    IMUL RBX      ; RDX:RAX = RAX * RBX (signed)

    ; Two operand form
    IMUL RAX, RBX ; RAX = RAX * RBX

    ; Three operand form
    IMUL RCX, RDX, 5 ; RCX = RDX * 5

IDIV - Signed Divide

Performs signed division of the dividend by the source operand.

IDIV source

Description

IDIV works like DIV but performs signed division. The same size rules and register usage apply as with DIV.

Examples

    ; Signed 64-bit division
    MOV RAX, -100  ; Dividend low
    MOV RDX, -1    ; Dividend high (sign extension)
    MOV RBX, 3     ; Divisor
    IDIV RBX       ; RAX = quotient (-33), RDX = remainder (-1)
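
The result in the example follows from the fact that signed division on x86 truncates toward zero, so the remainder takes the sign of the dividend. The Python sketch below models that rule. The function idiv is a hypothetical helper that assumes a nonzero divisor and ignores the 64-bit range check. Python's own // operator rounds toward negative infinity and is shown only for contrast.

```python
# Illustrative model only; idiv is a hypothetical helper.

def idiv(dividend: int, divisor: int) -> tuple[int, int]:
    """Return (quotient, remainder) with the quotient truncated toward zero."""
    quotient = abs(dividend) // abs(divisor)
    if (dividend < 0) != (divisor < 0):
        quotient = -quotient
    remainder = dividend - quotient * divisor  # same sign as the dividend
    return quotient, remainder


print(idiv(-100, 3))           # matches the IDIV example above
print(-100 // 3, -100 % 3)     # Python's floor division gives a different pair
```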

ADC - Add with Carry

Adds the source operand, the destination operand, and the carry flag, storing the result in the destination.

ADC destination, source

Description

ADC is used for multi-word arithmetic. It adds the two operands plus the value of the carry flag from previous operations.

Examples

    ; 128-bit addition: RDX:RAX = RDX:RAX + R8:RCX
    ADD RAX, RCX    ; Add low 64 bits, setting CF on carry out
    ADC RDX, R8     ; Add high 64 bits plus the carry

SBB - Subtract with Borrow

Subtracts the source operand and the carry flag from the destination operand.

SBB destination, source

Description

SBB is used for multi-word subtraction. It subtracts the source and the carry flag from the destination.

Examples

    ; 128-bit subtraction: RDX:RAX = RDX:RAX - R8:RCX
    SUB RAX, RCX    ; Subtract low 64 bits, setting CF on borrow
    SBB RDX, R8     ; Subtract high 64 bits minus the borrow
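
The carry and borrow propagation used by the ADC and SBB examples can be modelled in Python. All function names here are hypothetical, and each 128-bit value is represented as a (low, high) pair of 64-bit halves.

```python
# Illustrative model only; all helpers are hypothetical.
MASK64 = (1 << 64) - 1


def add_with_carry(a: int, b: int, carry_in: int = 0) -> tuple[int, int]:
    """Return (64-bit result, carry out), as ADD (carry_in = 0) or ADC would."""
    total = a + b + carry_in
    return total & MASK64, total >> 64


def sub_with_borrow(a: int, b: int, borrow_in: int = 0) -> tuple[int, int]:
    """Return (64-bit result, borrow out), as SUB (borrow_in = 0) or SBB would."""
    diff = a - b - borrow_in
    return diff & MASK64, 1 if diff < 0 else 0


def add128(lo_a: int, hi_a: int, lo_b: int, hi_b: int) -> tuple[int, int]:
    lo, carry = add_with_carry(lo_a, lo_b)       # ADD RAX, RCX
    hi, _ = add_with_carry(hi_a, hi_b, carry)    # ADC RDX, R8
    return lo, hi


def sub128(lo_a: int, hi_a: int, lo_b: int, hi_b: int) -> tuple[int, int]:
    lo, borrow = sub_with_borrow(lo_a, lo_b)     # SUB RAX, RCX
    hi, _ = sub_with_borrow(hi_a, hi_b, borrow)  # SBB RDX, R8
    return lo, hi


# (2^64 - 1) + 1 carries into the high half.
print(add128(MASK64, 0, 1, 0))

# 2^64 - 1 borrows from the high half.
print(sub128(0, 1, 1, 0))
```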

NEG - Negate

Replaces the operand with its two's complement negation.

NEG operand

Description

NEG performs the equivalent of subtracting the operand from zero, effectively flipping the sign of a number.

Examples

    ; Negate register
    NEG RAX

    ; Negate memory value
    NEG DWORD [RBP-8]
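
Two's complement negation is equivalent to inverting every bit and adding 1, with the result wrapped to the operand width. The Python sketch below shows this; the function neg is a hypothetical helper.

```python
# Illustrative model only; neg is a hypothetical helper.

def neg(value: int, bits: int = 64) -> int:
    """Return the two's complement negation of value in a `bits`-wide operand."""
    mask = (1 << bits) - 1
    return (~value + 1) & mask


print(hex(neg(1)))       # 64-bit pattern for -1
print(hex(neg(5, 32)))   # 32-bit pattern for -5, as NEG DWORD would produce
print(neg(neg(42)))      # negating twice restores the original value
```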

INC - Increment

Adds 1 to the operand.

INC operand

Description

INC typically has a shorter encoding than ADD operand, 1. Unlike ADD, it leaves the carry flag unchanged while updating the other arithmetic flags.

Examples

    ; Increment register
    INC RCX

    ; Increment memory
    INC BYTE [RDI]

DEC - Decrement

Subtracts 1 from the operand.

DEC operand

Description

DEC typically has a shorter encoding than SUB operand, 1. Unlike SUB, it leaves the carry flag unchanged while updating the other arithmetic flags.

Examples

    ; Decrement register
    DEC R8

    ; Decrement memory
    DEC WORD [RSI+4]

Logical and Bitwise Instructions

AND - Logical AND

Performs bitwise AND between the operands and stores the result in the destination.

AND destination, source

Description

AND compares each bit of the operands and produces a 1 in the result bit only if both corresponding bits were 1.

Examples

    ; Clear bits using AND
    AND EAX, 0xFFFF0000  ; Keep only the upper 16 bits of EAX

    ; Test if even (overwrites RCX with its lowest bit)
    AND RCX, 1           ; ZF set if RCX was even

OR - Logical OR

Performs bitwise OR between the operands and stores the result in the destination.

OR destination, source

Description

OR compares each bit of the operands and produces a 1 in the result bit if either corresponding bit was 1.

Examples

    ; Set bits using OR
    OR EAX, 0x80000000  ; Set the sign bit of EAX (bit 31)

    ; Combine flags
    OR RCX, RDX         ; RCX = RCX | RDX

XOR - Logical XOR

Performs bitwise exclusive OR between the operands and stores the result in the destination.

XOR destination, source

Description

XOR compares each bit of the operands and produces a 1 in the result bit if the corresponding bits were different.

Examples

    ; Toggle bits
    XOR EAX, 0xFFFFFFFF  ; Flip all 32 bits of EAX

    ; Clear register efficiently
    XOR RCX, RCX         ; RCX = 0; shorter encoding than MOV RCX, 0

NOT - Logical NOT

Performs bitwise inversion of the operand.

NOT operand

Description

NOT flips all bits in the operand (ones become zeros and zeros become ones).

Examples

    ; Invert register
    NOT RAX

    ; Invert memory value
    NOT BYTE [RDI]

TEST - Logical Compare

Performs bitwise AND without storing the result, only setting flags.

TEST operand1, operand2

Description

TEST is like AND but doesn't store the result. It's commonly used to check if specific bits are set.

Examples

    ; Check if zero
    TEST RAX, RAX
    JZ  IsZero

    ; Check if bit 3 is set
    TEST RCX, 8
    JNZ Bit3Set

Comparison and Jump Instructions

CMP - Compare

Compares two operands by subtracting them without storing the result.

CMP operand1, operand2

Description

CMP performs operand1 - operand2 and sets flags based on the result, but doesn't store the result. It's used before conditional jumps.

Examples

    ; Compare register with immediate
    CMP RAX, 42
    JE  EqualTo42

    ; Compare two registers
    CMP RCX, RDX
    JG  GreaterThan
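
How the flags set by CMP drive the JE and JG jumps used above can be sketched in Python. The helpers cmp_flags, je, and jg are hypothetical. The flag definitions and the JG condition (ZF clear and SF equal to OF) follow the x86 architecture and are assumed to apply to HASM unchanged.

```python
# Illustrative model only; cmp_flags, je, and jg are hypothetical helpers.

def cmp_flags(a: int, b: int, bits: int = 64) -> dict:
    """Compute the flags that CMP a, b sets by evaluating a - b."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    a &= mask
    b &= mask
    result = (a - b) & mask
    return {
        "ZF": result == 0,                         # operands are equal
        "SF": bool(result & sign),                 # result is negative
        "CF": a < b,                               # unsigned borrow
        "OF": bool((a ^ b) & (a ^ result) & sign), # signed overflow
    }


def je(flags: dict) -> bool:
    """JE is taken when the operands are equal."""
    return flags["ZF"]


def jg(flags: dict) -> bool:
    """JG is taken when operand1 is greater than operand2 as signed values."""
    return not flags["ZF"] and flags["SF"] == flags["OF"]


print(je(cmp_flags(42, 42)))   # CMP RAX, 42 with RAX = 42
print(jg(cmp_flags(5, 3)))     # 5 > 3
print(jg(cmp_flags(-1, 1)))    # -1 is not greater than 1 when compared as signed
```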

Control Flow Instructions

JMP - Unconditional Jump

Transfers control unconditionally to the specified location.

JMP target

Description

JMP can transfer control to a label (direct jump) or to an address stored in a register or memory (indirect jump).

Examples

    ; Direct jump
    JMP LoopStart

    ; Indirect jump through register
    JMP RAX

    ; Indirect jump through memory
    JMP [RIP+JumpTable]

CALL - Call Procedure

Pushes the return address and transfers control to the specified procedure.

CALL target

Description

CALL pushes the address of the next instruction onto the stack (RSP adjusted) and then jumps to the target address.

Examples

    ; Direct call
    CALL MyFunction

    ; Indirect call through register
    CALL RAX

    ; Indirect call through memory
    CALL [RIP+FunctionPointer]

RET - Return from Procedure

Pops the return address from the stack and transfers control there.

RET [n]

Description

RET pops the return address from the stack (RSP adjusted) and jumps to that address. The optional immediate operand specifies additional bytes to pop from the stack (for cleaning up arguments).

Examples

    ; Simple return
    RET

WAIT - Wait

Waits for pending floating-point exceptions.

WAIT

Description

WAIT checks for and handles pending unmasked floating-point exceptions before proceeding. In modern systems, it's often used as a no-op.

Examples

    ; Wait for FPU
    WAIT

SYSCALL - Fast System Call

Transfers control to the operating system kernel.

SYSCALL

Description

SYSCALL is the modern, fast method for making system calls in 64-bit mode. It saves minimal state and transfers control to the OS kernel.

Limitation

The assembler only emits the SYSCALL opcode. It does not handle full Linux ABI compliance or argument passing conventions. You are responsible for setting up registers and arguments as required by your operating system.

Examples

    ; Linux exit syscall
    MOV RAX, 60     ; syscall number
    MOV RDI, 0      ; exit code
    SYSCALL

System Call Conventions

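System call numbers and parameter passing conventions vary by operating system. On 64-bit Linux, RAX holds the syscall number and RDI, RSI, RDX, R10, R8, and R9 hold the arguments, in that order. The result is returned in RAX, and the SYSCALL instruction itself overwrites RCX and R11.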

Interrupt and System Instructions

INT - Software Interrupt

Generates a software interrupt.

INT n

Description

INT transfers control to an interrupt handler specified by the interrupt number. In real mode, it uses the interrupt vector table; in protected mode, it uses the IDT.

Examples

    ; DOS system call
    MOV AH, 0x09
    MOV DX, message
    INT 0x21

Modern Usage

In 64-bit mode, INT is largely superseded by SYSCALL/SYSENTER for system calls, but may still be used for debugging (INT 3) or virtualization.

Miscellaneous Instructions

NOP - No Operation

Performs no operation.

NOP

Description

NOP does nothing except occupy space and time. It's commonly used for padding, alignment, and debugging.

Examples

; Alignment padding
MyFunction:
    NOP
    NOP

Optimization

Modern processors may optimize away NOPs or execute them in parallel with other instructions.