System Overview

This operating system is designed from the ground up for x86-64 architecture, leveraging modern hardware capabilities while maintaining simplicity and performance. The system boots via UEFI and implements all critical operating system components including memory management, process scheduling, device drivers, and filesystems.

Note

This documentation assumes familiarity with operating system concepts and x86-64 architecture.

Design Philosophy

  • Modularity: Components are designed to be independent and replaceable
  • Performance: Optimized for modern hardware with attention to cache efficiency
  • Security: Memory protection, privilege separation, and careful validation
  • Extensibility: Designed to accommodate future features and hardware

Key Features

  • UEFI-based bootloader with ELF64 kernel loading
  • Advanced virtual memory system with 4KB and 2MB pages
  • Preemptive multitasking with round robin scheduling
  • Virtual filesystem layer with EXT2 implementation
  • TTY and framebuffer console support

Standard Library

The custom standard library provides implementations of common C standard library functions along with additional utilities specific to the operating system environment.

Existing C Standard Implementations

ctype

/**
 * @brief Check if character is alphanumeric
 * @param c Character to check
 * @return Non-zero if alphanumeric, zero otherwise
 */
int isalnum(int c);

/**
 * @brief Check if character is alphabetic
 * @param c Character to check
 * @return Non-zero if alphabetic, zero otherwise
 */
int isalpha(int c);

/**
 * @brief Check if character is a control character
 * @param c Character to check
 * @return Non-zero if control character, zero otherwise
 */
int iscntrl(int c);

/**
 * @brief Check if character is a decimal digit
 * @param c Character to check
 * @return Non-zero if decimal digit, zero otherwise
 */
int isdigit(int c);

/**
 * @brief Check if character is lowercase
 * @param c Character to check
 * @return Non-zero if lowercase, zero otherwise
 */
int islower(int c);

/**
 * @brief Check if character has graphical representation
 * @param c Character to check
 * @return Non-zero if graphical, zero otherwise
 */
int isgraph(int c);

/**
 * @brief Check if character is printable
 * @param c Character to check
 * @return Non-zero if printable, zero otherwise
 */
int isprint(int c);

/**
 * @brief Check if character is punctuation
 * @param c Character to check
 * @return Non-zero if punctuation, zero otherwise
 */
int ispunct(int c);

/**
 * @brief Check if character is whitespace
 * @param c Character to check
 * @return Non-zero if whitespace, zero otherwise
 */
int isspace(int c);

/**
 * @brief Check if character is uppercase
 * @param c Character to check
 * @return Non-zero if uppercase, zero otherwise
 */
int isupper(int c);

/**
 * @brief Check if character is hexadecimal digit
 * @param c Character to check
 * @return Non-zero if hexadecimal digit, zero otherwise
 */
int isxdigit(int c);

/**
 * @brief Check if character is blank (space or tab)
 * @param c Character to check
 * @return Non-zero if blank, zero otherwise
 */
int isblank(int c);

/* Case conversion functions */

/**
 * @brief Convert character to lowercase
 * @param c Character to convert
 * @return Lowercase equivalent if uppercase, otherwise unchanged
 */
int tolower(int c);

/**
 * @brief Convert character to uppercase
 * @param c Character to convert
 * @return Uppercase equivalent if lowercase, otherwise unchanged
 */
int toupper(int c);
                

stdio

/**
 * @brief Open a file
 * @param filename Path to the file to open
 * @param mode Opening mode ("r", "w", "a", etc.)
 * @return FILE pointer on success, NULL on failure
 */
FILE* fopen(const char* filename, const char* mode);

/**
 * @brief Close a file
 * @param stream FILE pointer to close
 * @return 0 on success, EOF on failure
 */
int fclose(FILE* stream);

/**
 * @brief Read data from a file
 * @param ptr Pointer to buffer where data will be stored
 * @param size Size of each element to read
 * @param nmemb Number of elements to read
 * @param stream FILE pointer to read from
 * @return Number of elements successfully read
 */
size_t fread(void* ptr, size_t size, size_t nmemb, FILE* stream);

/**
 * @brief Write data to a file
 * @param ptr Pointer to data to write
 * @param size Size of each element to write
 * @param nmemb Number of elements to write
 * @param stream FILE pointer to write to
 * @return Number of elements successfully written
 */
size_t fwrite(const void* ptr, size_t size, size_t nmemb, FILE* stream);

/**
 * @brief Reposition file position indicator
 * @param stream FILE pointer to seek on
 * @param offset Number of bytes to offset
 * @param whence Position from which offset is measured (SEEK_SET, SEEK_CUR, SEEK_END)
 * @return 0 on success, non-zero on failure
 */
int fseek(FILE* stream, long offset, int whence);

/**
 * @brief Get current file position
 * @param stream FILE pointer to examine
 * @return Current file position, or -1 on error
 * @note Currently unimplemented, always returns 0
 */
long ftell(FILE* stream);

/**
 * @brief Reset file position to beginning
 * @param stream FILE pointer to rewind
 */
void rewind(FILE* stream);

/**
 * @brief Print formatted output to a file
 * @param stream FILE pointer to write to
 * @param format Format string
 * @return Number of characters written, or negative on error
 * @note Currently unimplemented, always returns 0
 */
int fprintf(FILE* stream, const char* format, ...);

/**
 * @brief Print formatted output to stdout
 * @param format Format string
 * @return Number of characters written
 */
int printf(const char* format, ...);

/**
 * @brief Read formatted input from stdin
 * @param format Format string
 * @return Number of input items successfully matched and assigned
 */
int scanf(const char* format, ...);

/**
 * @brief Read a character from a file
 * @param stream FILE pointer to read from
 * @return Character read as unsigned char cast to int, or EOF on error
 * @note Currently unimplemented, always returns EOF
 */
int fgetc(FILE* stream);

/**
 * @brief Read a character from a file (macro version of fgetc)
 * @param stream FILE pointer to read from
 * @return Character read as unsigned char cast to int, or EOF on error
 */
int getc(FILE* stream);

/**
 * @brief Read a character from stdin
 * @return Character read as unsigned char cast to int, or EOF on error
 */
int getchar(void);

/**
 * @brief Write a character to a file
 * @param c Character to write
 * @param stream FILE pointer to write to
 * @return Character written as unsigned char cast to int, or EOF on error
 * @note Currently unimplemented, always returns EOF
 */
int fputc(int c, FILE* stream);

/**
 * @brief Write a character to a file (macro version of fputc)
 * @param c Character to write
 * @param stream FILE pointer to write to
 * @return Character written as unsigned char cast to int, or EOF on error
 */
int putc(int c, FILE* stream);

/**
 * @brief Write a character to stdout
 * @param c Character to write
 * @return Character written as unsigned char cast to int, or EOF on error
 */
int putchar(int c);

/**
 * @brief Read a line from a file
 * @param s Buffer to store the read line
 * @param size Maximum number of characters to read
 * @param stream FILE pointer to read from
 * @return s on success, NULL on error or when no characters are read
 */
char* fgets(char* s, int size, FILE* stream);

/**
 * @brief Initialize standard streams
 * @note Internal initialization function
 */
void __stdio_init(void);
                

stdlib

/**
 * @brief Terminate the program with the specified status code
 * @param __status The exit status code (0 for success, non-zero for error)
 */
void exit(int __status);

/**
 * @brief Convert a string to an integer
 * @param __nptr The string to convert
 * @return The converted integer value
 */
int atoi(const char* __nptr);

/**
 * @brief Allocate memory block
 * @param __size Size of the memory block to allocate (in bytes)
 * @return Pointer to allocated memory block, or NULL if allocation fails
 */
void* malloc(size_t __size);

/**
 * @brief Reallocate memory block
 * @param __ptr Pointer to previously allocated memory block
 * @param __size New size for the memory block (in bytes)
 * @return Pointer to reallocated memory block, or NULL if reallocation fails
 * @note If __ptr is NULL, behaves like malloc(). If __size is 0, behaves like free().
 */
void* realloc(void* __ptr, size_t __size);

/**
 * @brief Allocate and zero-initialize memory block
 * @param __nmemb Number of elements to allocate
 * @param __size Size of each element (in bytes)
 * @return Pointer to allocated memory block, or NULL if allocation fails
 * @note The allocated memory is initialized to zero
 */
void* calloc(size_t __nmemb, size_t __size);

/**
 * @brief Free allocated memory block
 * @param __ptr Pointer to memory block to free
 * @note If __ptr is NULL, no operation is performed
 */
void free(void* __ptr);
                

string

/**
 * @brief Calculate the length of a string
 * @param __s The string to measure
 * @return The length of the string in bytes
 */
size_t strlen(const char* __s);

/**
 * @brief Copy a string
 * @param __dest Destination buffer
 * @param __src Source string
 * @return Pointer to destination buffer
 */
char* strcpy(char* __restrict __dest, const char* __restrict __src);

/**
 * @brief Copy up to n characters from a string
 * @param __dest Destination buffer
 * @param __src Source string
 * @param __n Maximum number of characters to copy
 * @return Pointer to destination buffer
 * @note If src is shorter than n, the remainder is filled with null bytes
 */
char* strncpy(char* __restrict __dest, const char* __restrict __src, size_t __n);

/**
 * @brief Compare two strings
 * @param __s1 First string to compare
 * @param __s2 Second string to compare
 * @return Negative if s1 < s2, zero if equal, positive if s1 > s2
 */
int strcmp(const char* __s1, const char* __s2);

/**
 * @brief Compare first n characters of two strings
 * @param __s1 First string to compare
 * @param __s2 Second string to compare
 * @param __n Maximum number of characters to compare
 * @return Negative if s1 < s2, zero if equal, positive if s1 > s2
 */
int strncmp(const char* __s1, const char* __s2, size_t __n);

/**
 * @brief Concatenate two strings
 * @param __dest Destination buffer (must have enough space)
 * @param __src Source string to append
 * @return Pointer to destination buffer
 */
char* strcat(char* __restrict __dest, const char* __restrict __src);

/**
 * @brief Concatenate up to n characters from a string
 * @param __dest Destination buffer
 * @param __src Source string to append
 * @param __n Maximum number of characters to append
 * @return Pointer to destination buffer
 */
char* strncat(char* __restrict __dest, const char* __restrict __src, size_t __n);

/**
 * @brief Locate first occurrence of character in string
 * @param __s String to search
 * @param __c Character to locate
 * @return Pointer to first occurrence, or NULL if not found
 */
char* strchr(const char* __s, int __c);

/**
 * @brief Locate last occurrence of character in string
 * @param __s String to search
 * @param __c Character to locate
 * @return Pointer to last occurrence, or NULL if not found
 */
char* strrchr(const char* __s, int __c);

/**
 * @brief Locate substring
 * @param __haystack String to search
 * @param __needle Substring to find
 * @return Pointer to beginning of substring, or NULL if not found
 */
char* strstr(const char* __haystack, const char* __needle);

/**
 * @brief Copy memory area
 * @param __dest Destination buffer
 * @param __src Source buffer
 * @param __n Number of bytes to copy
 * @return Pointer to destination buffer
 * @note The memory areas must not overlap (use memmove if they do)
 */
void* memcpy(void* __restrict __dest, const void* __restrict __src, size_t __n);

/**
 * @brief Fill memory with a constant byte
 * @param __s Memory area to fill
 * @param __c Value to set (converted to unsigned char)
 * @param __n Number of bytes to fill
 * @return Pointer to the memory area
 */
void* memset(void* __s, int __c, size_t __n);

/**
 * @brief Compare memory areas
 * @param __s1 First memory area
 * @param __s2 Second memory area
 * @param __n Number of bytes to compare
 * @return Negative if s1 < s2, zero if equal, positive if s1 > s2
 */
int memcmp(const void* __s1, const void* __s2, size_t __n);

/**
 * @brief Copy memory area (handles overlapping regions)
 * @param __dest Destination buffer
 * @param __src Source buffer
 * @param __n Number of bytes to copy
 * @return Pointer to destination buffer
 */
void* memmove(void* __dest, const void* __src, size_t __n);

/**
 * @brief Scan memory for a character
 * @param __s Memory area to scan
 * @param __c Character to locate (converted to unsigned char)
 * @param __n Number of bytes to scan
 * @return Pointer to matching byte, or NULL if not found
 */
void* memchr(const void* __s, int __c, size_t __n);

/**
 * @brief Get length of initial segment consisting entirely of characters in accept
 * @param __s String to analyze
 * @param __accept Set of characters to accept
 * @return Length of initial segment containing only characters from accept
 */
size_t strspn(const char* __s, const char* __accept);

/**
 * @brief Get length of initial segment not containing reject characters
 * @param __s String to analyze
 * @param __reject Set of characters to reject
 * @return Length of initial segment without any characters from reject
 */
size_t strcspn(const char* __s, const char* __reject);

/**
 * @brief Locate first occurrence in string of any character from accept
 * @param __s String to search
 * @param __accept Set of characters to search for
 * @return Pointer to first matching character, or NULL if none found
 */
char* strpbrk(const char* __s, const char* __accept);
                

unistd

/**
 * @brief Duplicate a file descriptor
 * @param __fd Old file descriptor
 * @param __fd2 New file descriptor (if already open, it's closed first)
 * @return New file descriptor on success, -1 on error
 */
int dup2(int __fd, int __fd2);

/**
 * @brief Read from a file descriptor
 * @param __fd File descriptor to read from
 * @param __buf Buffer to store read data
 * @param __nbytes Maximum number of bytes to read
 * @return Number of bytes read, 0 on EOF, -1 on error
 * @note Currently unimplemented, always returns 0
 */
ssize_t read(int __fd, void* __buf, size_t __nbytes);

/**
 * @brief Write to a file descriptor
 * @param __fd File descriptor to write to
 * @param __buf Buffer containing data to write
 * @param __n Number of bytes to write
 * @return Number of bytes written, -1 on error
 * @note Currently unimplemented, always returns 0
 */
ssize_t write(int __fd, const void* __buf, size_t __n);

/**
 * @brief Close a file descriptor
 * @param __fd File descriptor to close
 * @return 0 on success, -1 on error
 * @note Currently unimplemented, always returns 0
 */
int close(int __fd);

/**
 * @brief Create a new process
 * @return 0 to child process, child PID to parent, -1 on error
 */
pid_t fork(void);

/**
 * @brief Get process ID
 * @return Process ID of calling process
 * @note Currently unimplemented
 */
pid_t getpid(void);

/**
 * @brief Set process group ID
 * @param __pid Process ID (0 means calling process)
 * @param __pgid Process group ID (0 means use PID)
 * @return 0 on success, -1 on error
 */
int setpgid(pid_t __pid, pid_t __pgid);

/**
 * @brief Get session ID
 * @param __pid Process ID (0 means calling process)
 * @return Session ID on success, -1 on error
 */
pid_t getsid(pid_t __pid);

/**
 * @brief Execute program
 * @param __path Path to executable
 * @param __argv Argument vector (NULL-terminated)
 * @param __envp Environment vector (NULL-terminated)
 * @return Does not return on success, -1 on error
 */
int execve(const char* __path, char* const __argv[], char* const __envp[]);

/**
 * @brief Suspend execution for interval
 * @param __seconds Number of seconds to sleep
 * @return 0 on success, unslept seconds if interrupted
 * @note Currently unimplemented, always returns 0
 */
unsigned int sleep(unsigned int __seconds);

/**
 * @brief Create pipe
 * @param __pipedes Array to store read (0) and write (1) file descriptors
 * @return 0 on success, -1 on error
 * @note Currently unimplemented, always returns 0
 */
int pipe(int __pipedes[2]);

/**
 * @brief Set terminal foreground process group
 * @param __fd Terminal file descriptor
 * @param __pgrp_id Process group ID
 * @return 0 on success, -1 on error
 */
int tcsetpgrp(int __fd, pid_t __pgrp_id);

/**
 * @brief Get terminal foreground process group
 * @param __fd Terminal file descriptor
 * @return Process group ID on success, -1 on error
 */
pid_t tcgetpgrp(int __fd);

/**
 * @brief Get current working directory
 * @param __buf Buffer to store path (NULL to let system allocate)
 * @param __size Size of buffer
 * @return Pointer to buffer containing path, NULL on error
 */
char* getcwd(char* __buf, size_t __size);

/**
 * @brief Change working directory
 * @param __path Path to new working directory
 * @return 0 on success, -1 on error
 */
int chdir(const char* __path);
                

wait

/**
 * @brief Wait for process state changes
 * @param pid  >0: specific child, -1: any child, 0: group, <-1: group -pid
 * @param stat_loc Pointer to store status (use W* macros to examine)
 * @param options WNOHANG|WUNTRACED|WCONTINUED
 * @return PID of changed child, 0 (WNOHANG), or -1 on error
 */
pid_t waitpid(pid_t pid, int* stat_loc, int options);
                    

Kernel Setup

UEFI Bootloader

The system begins execution in UEFI firmware environment which loads the operating system. The bootloader performs the following tasks:

  • Initializes basic graphics mode (GOP) for early display output
  • Sets up memory map and reserves necessary regions
  • Loads the 64-bit kernel ELF binary into memory
  • Prepares system tables (ACPI, SMBIOS) for kernel use
  • Transitions to 64-bit long mode before jumping to kernel

Boot Services

When UEFI Loads the operating system, various UEFI boot services are provided. Before doing anything with paging, the OS must exit the boot services, which disable all UEFI tools. Before doing so, the OS performs the following tasks:

  • Initializes Framebuffer: The framebuffer address and information is stored, so the framebuffer can be used at a later time.
  • Loads default system font: A system font is conveniently loaded before the filesystem is initialized.
  • Collects the UEFI memory map: The UEFI memory map is collected which includes the operating system's code, so the kernel knows which memory to reserve initially.

Early Memory Management

When the boot services first exit, all the kernel knows about memory is the size and the reserved pages. In the early memory stages, there is no memory allocator or page allocator, so the memory map has to be manually parsed to find places that are free. This step initially sets up memory for the page allocator, kernel stack, and more.

  • Kernel Stack: An initial stack is allocated so the kernel doesn't have to use the UEFI allocated stack anymore
  • Page Allocation Table: The page allocation table is used for storing allocation statuses of pages
  • Page Table: Memory for the kernel page table is allocated
  • Global Variables: Global variables are all allocated instead of defined, since the kernel code is moved to a higher address after setup

Core Systems

Global Descriptor Table (GDT)

The GDT is a fundamental x86-64 data structure that defines memory segments and their attributes. This implementation handles both segment descriptors and the Task State Segment (TSS) for privilege level transitions and interrupt handling.

Key Features:

  • Complete GDT Setup - Initializes 7 entries including null, kernel/user segments, and TSS
  • 64-bit TSS Support - Implements the x86-64 Task State Segment structure with interrupt stacks
  • Ring 0 & Ring 3 Segments - Separate code/data segments for kernel and user modes
  • Low-level Assembly Integration - Uses inline assembly for LGDT, segment register updates, and far jumps

Implementation Details:

  • Uses packed structures to match x86 memory layout requirements exactly
  • Sets up kernel stack pointer (RSP0) for privilege level transitions
  • Implements proper descriptor flags for 64-bit mode operation
  • Handles the 2-slot TSS descriptor requirement for 64-bit systems

Technical Highlights:

The GDT initialization carefully constructs each descriptor with proper access flags (0x9A for kernel code, 0x92 for kernel data, 0xFA for user code, and 0xF2 for user data). The TSS setup includes space for 7 interrupt stack table entries (IST1-7) which are crucial for handling exceptions and interrupts with separate stacks.

The assembly sequence after loading the GDT performs a far jump to reload CS with the new code segment descriptor and updates all data segment registers to point to the kernel data segment.

Interrupt Descriptor Table (IDT)

The IDT is a critical x86-64 structure that defines the processor's response to interrupts and exceptions. This comprehensive implementation handles hardware interrupts, CPU exceptions, and system calls with robust error recovery and signal delivery to user processes.

Key Features:

  • Complete Interrupt Handling - Supports all 256 interrupt vectors including CPU exceptions and hardware IRQs
  • Signal Delivery System - Maps exceptions to POSIX-style signals (SIGSEGV, SIGILL, etc.) for user processes
  • Dual-stack Architecture - Uses separate stacks for interrupts and exceptions via TSS IST mechanism
  • Hardware Abstraction - Integrates with PIC (8259A) for legacy interrupt handling

Technical Implementation:

  • Structured IDT entries with proper privilege separation (Ring 0-3)
  • Exception-specific handling with detailed error code analysis
  • Page fault handler with Copy-on-Write (COW) support
  • Context-aware interrupt processing with process scheduling integration
  • Low-level assembly integration for register preservation and control transfer

Notable Components:

  • idt_set_descriptor() - Configures interrupt gates with flexible privilege levels
  • exception_handler() - Detailed CPU exception processing with 20+ specific cases
  • interrupt_handler() - Hardware interrupt dispatcher (timer, keyboard, mouse)
  • check_signal() - POSIX-compatible signal delivery mechanism

Advanced Features:

The implementation includes sophisticated error handling like page fault analysis with CR2 register inspection, general protection fault error code decoding, and proper signal delivery for user-space processes. The timer interrupt handler integrates directly with the process scheduler for preemptive multitasking.

Special attention was given to security considerations, including proper privilege level separation (especially for system calls at vector 0x80) and interrupt stack isolation to prevent stack overflow scenarios during nested interrupts.

x86-64 Interrupt Service Routines (ISRs)

This low-level assembly implementation provides the foundation for the operating system's interrupt handling, featuring optimized context switching, robust error handling, and system call support. The code directly interfaces with CPU hardware features while maintaining compatibility with the C-based kernel architecture.

Core Features:

  • Complete ISR Stubs - Handlers for all 256 interrupt vectors (0-255) including CPU exceptions and hardware IRQs
  • Dual Macro System - isr_no_err and isr_err macros automate proper error code handling
  • Full Context Preservation - Saves/restores all 16 general-purpose registers (RAX-R15) plus CR3
  • Kernel Stack Switching - Automatic transition to dedicated kernel stack during interrupts
  • System Call Gateway - Specialized handler for syscall dispatch via system call table

Technical Highlights:

  • Register Preservation - Meticulous push/pop sequence maintains execution context
  • Error Code Handling - Special handling for exceptions that push error codes (e.g., Page Fault, GPF)
  • PIC EOI Signaling - Proper End-of-Interrupt notification to Programmable Interrupt Controller
  • CR3 Management - Page table base register preservation for process isolation
  • Signal Integration - Calls check_signal after each interrupt for POSIX signal delivery

Key Components:

Exception Handlers (0-31)

  • Divide-by-zero (#DE)
  • Page Fault (#PF)
  • General Protection Fault (#GP)
  • Double Fault (#DF)
  • All x86-64 defined exceptions

Hardware IRQs (32-47)

  • Timer interrupt (IRQ0)
  • Keyboard (IRQ1)
  • Cascade (IRQ2)
  • All legacy PIC interrupts

Special Handlers

  • System call gateway (0x80)
  • Generated stubs for unused vectors
  • GDT reload functionality

Optimization Techniques:

The implementation uses NASM macros to eliminate code duplication, generating 256 optimized ISR stubs with only 2 macro definitions. The system call handler features direct dispatch to kernel functions via a system call table, while maintaining full register state across the privilege boundary.

Special care was taken to handle the x86-64 interrupt stack frame requirements, including proper alignment and the use of iretq for 64-bit interrupt returns. The design supports nested interrupts while preventing stack overflow through dedicated interrupt stacks.

System Call Interface

A low-level x86-64 syscall implementation providing POSIX-compatible system services through hardware-accelerated SYSCALL/SYSRET instructions with full process isolation.

Core Architecture

  • Hardware Acceleration
    • SYSCALL/SYSRET instruction support
    • MSR-configured privilege transition
    • Automatic interrupt masking during execution
  • Process Isolation
    • User-space argument handling via process stack
    • Per-process context preservation
    • Signal-based security enforcement
  • System Call Table
    • 21 POSIX-compatible system calls
    • Default handler for unused syscalls (SIGSYS)
    • Macro-based registration (syscall_def)

Key System Calls

Process Management
  • exit (1): Terminate calling process with exit code
  • fork (8): Create duplicate process context
  • execve (2): Execute program with argument vector
  • waitpid (17): Wait for process termination
  • kill (20): Send signal to process/group
File Operations
  • open (12): Open file and return descriptor
  • close (14): Close file descriptor
  • dup2 (13): Duplicate file descriptor
  • seek (21): Reposition file offset
  • write (4): Write to file descriptor
  • input (3): Read from file descriptor
Memory Management
  • mmap (7): Map memory to process space (anonymous only)
Process Groups
  • setpgid (11): Set process group ID
  • getpgid (10): Get process group ID
  • setsid (18): Create new session
  • getsid (19): Get session ID
  • tcsetpgrp (15): Set terminal foreground group
  • tcgetpgrp (16): Get terminal foreground group
Filesystem Navigation
  • chdir (5): Change working directory
  • getcwd (6): Get current working directory

Technical Implementation

  • MSR Configuration
    • IA32_STAR (0xC0000081): Sets CS segments for privilege transition
    • IA32_LSTAR (0xC0000082): Syscall entry point (syscall_stub)
    • IA32_FMASK (0xC0000084): Masks RFLAGS during syscalls
  • Argument Handling
    • Six-parameter support via registers:
      • ARG1: RDI
      • ARG2: RSI
      • ARG3: RDX
      • ARG4: R10
      • ARG5: R8
      • ARG6: R9
    • SYS_ARG_1 to SYS_ARG_6 macros for access
  • Security Model
    • SIGSYS for invalid syscalls
    • SIGTTOU/SIGTTIN for terminal access violations
    • Process group verification for I/O operations

Device Drivers

Framebuffer Console (fbcon)

The framebuffer console provides high-performance text output with advanced features:

  • Supports multiple fonts loaded from PSF format
  • Hardware-accelerated scrolling and blitting
  • Multiple color schemes and customizable palettes
  • Backbuffer for flicker-free updates
  • Cursor rendering with customizable styles

Virtual Console (vcon)

The TTY driver provides traditional terminal functionality:

  • Line discipline for canonical mode processing
  • VT100-compatible escape sequence handling
  • Multiple virtual terminals with hotkey switching
  • Input/output queues with flow control
  • Session and process group management

PS/2 Keyboard Driver

A complete keyboard driver with scancode translation, modifier handling, and event queuing for a custom operating system.

Key Features

  • Scancode Translation
    • Handles standard PS/2 scancode set 1
    • Supports extended keys (0xE0 prefix)
    • Separate mappings for shifted/unshifted keys
  • Modifier Handling
    • Tracks Ctrl, Shift, Alt, AltGr states
    • Manages toggle keys (Caps Lock, Num Lock, Scroll Lock)
    • Automatic case conversion with Caps Lock
  • Event System
    • Circular buffer for key events
    • Press/release detection
    • Event queue with overflow protection
  • Hardware Interface
    • PS/2 controller communication
    • Interrupt-driven operation (IRQ1)
    • Proper EOI handling

Technical Details

  • Supports all standard alphanumeric keys
  • Handles function keys F1-F12
  • Processes navigation keys (arrows, home, end, etc.)
  • Supports keypad keys with Num Lock
  • Implements special keys (Print Screen, Pause, etc.)

Implementation Notes

  • Uses a state machine for extended key sequences
  • Maintains separate keycode and scancode representations
  • Provides device file interface for user-space access
  • Includes proper initialization sequence for PS/2 controller

Memory Management

Kernel Virtual Memory Map

The kernel maintains this virtual address space layout:

0x0000000000000000 - 0x00007FFFFFFFFFFF
User Space
Per-process address space for applications
128TB
0xFFF800000000000 - 0xFFFFFFFFFFFFFFFF
Kernel Space
Higher-half kernel mappings, identical across all address spaces
128TB
0xFFFF800000000000 - 0xFFFF80FFFFFFFFFF
Kernel Code/Data
Direct mapping of kernel ELF sections with appropriate permissions
1TB
0xFFFF810000000000 - 0xFFFF81FFFFFFFFFF
Kernel Stack
Kernel's Interrupt Stack, every Interrupt is sent to the beginning
1TB
0xFFFF820000000000 - 0xFFFF82FFFFFFFFFF
Kernel Heap
Kernel's Heap used for dynamic allocation
1TB
0xFFFF830000000000 - 0xFFFF83FFFFFFFFFF
Page Allocation Table
Memory Used for Memory Page Tracking
1TB
0xFFFF840000000000 - 0xFFFF84FFFFFFFFFF
Kernel Shared Page Table
The shared page table kernel entries are used commonly in every page table
1TB
0xFFFF860000000000 - 0xFFFF86FFFFFFFFFF
Global Variables
Memory for Global Variables (backwards allocation see below)
1TB
0xFFFF870000000000 - 0xFFFF87FFFFFFFFFF
Frame Buffer
Space for Frame Buffer
1TB
0xFFFF890000000000 - 0xFFFFFFFFFFFFFFFF
Memory Pools
Kernel Memory Reserved For Memory Pools (see below)
1TB

Global Variables Manager

A custom-designed global variables allocation system that provides safe, predictable access to kernel globals regardless of virtual memory movements, avoiding traditional pitfalls of global variables in kernel space.

Core Features

  • Safe Global Access
    • Fixed relative addressing prevents virtual memory relocation issues
    • Deterministic memory layout for all global variables
    • No dependency on runtime memory addresses
  • Reverse Allocation Strategy
    • Variables allocated from high memory downward
    • Automatic size calculation through macros
    • Single contiguous block for all globals
  • Comprehensive Coverage
    • Memory management structures
    • Process management variables
    • Graphics system state
    • Filesystem objects
    • Hardware state tracking

Key Advantages

  • Virtual Memory Safety
    • Traditional globals become unsafe when kernel code moves in virtual memory
    • This system maintains correct references regardless of virtual address changes
    • Essential for features like higher-half kernels and address space randomization
  • Deterministic Layout
    • Explicit control over variable placement
    • No compiler-dependent ordering
    • Precise knowledge of memory consumption
  • Single Source of Truth
    • All global declarations centralized in one header
    • Clear documentation of each variable's purpose
    • Easy to track dependencies between globals

Technical Implementation

  • Uses macro-based offset calculation:
    • createGlobal(type, last_global) for single variables
    • createGlobalArray(type, count, last_global) for arrays
  • Fixed end address (GLOBAL_VARS_END) serves as allocation anchor
  • Variables declared in reverse order of allocation
  • Automatic size tracking through chained references

Memory Organization

  • Structured into logical sections:
    • Memory management (bitmaps, page tables, pools)
    • Process management (PID tracking, hash tables)
    • Graphics system (framebuffer, layers)
    • Filesystem (VFS, device nodes)
    • Hardware state (keyboard, mouse)
  • Total size automatically calculated via GLOBALS_SIZE
  • Alignment handled naturally through type sizes

Design Rationale

  • Solves the kernel relocation problem for global variables
  • Provides better organization than scattered extern declarations
  • Enables precise control over kernel memory layout
  • Facilitates debugging through centralized variable tracking

Virtual Memory Manager

A complete implementation of x86-64 4-level paging with support for 4KB, 2MB, and 1GB pages, including advanced features like copy-on-write and page table forking.

Core Features

  • Full 4-Level Paging Support
    • PML4 (Page Map Level 4)
    • PDPT (Page Directory Pointer Table)
    • PD (Page Directory)
    • PT (Page Table)
  • Multiple Page Sizes
    • 4KB standard pages
    • 2MB large pages
    • 1GB huge pages
  • Advanced Memory Management
    • Copy-on-Write (COW) support
    • Page table forking
    • Kernel memory mapping
    • TLB management

Key Functions

  • pageTable_addPage - Maps physical pages into virtual address space
  • pageTable_addKernel - Maps kernel memory regions
  • pageTable_set - Activates page tables via CR3
  • pageTable_fork - Creates copy-on-write page table copies
  • pageTable_find_entry - Looks up page table entries

Technical Implementation

  • Uses canonical 48-bit virtual addresses
  • Implements proper flag handling (Present, Writable, User, etc.)
  • Handles all x86-64 paging structure levels
  • Includes TLB invalidation when modifying page tables
  • Maintains consistency between different page sizes

Optimizations

  • Large page support reduces TLB pressure
  • Copy-on-Write minimizes memory duplication
  • Recursive table copying for efficient forks
  • Careful CR3 management during operations

Physical Memory Manager

A comprehensive physical page allocator supporting both 4KB and 2MB pages, with bitmap tracking and stack-based free page management.

Core Features

  • Dual-Page Size Support
    • 4KB page allocations
    • 2MB large page allocations
    • Automatic consistency between page sizes
  • Memory Tracking
    • Bitmap-based allocation tracking
    • Stack-based free page management
    • Memory reservation capability
  • Management Functions
    • pages_initAllocTable - Initialize memory tracking structures
    • pages_allocatePage - Allocate physical pages
    • pages_free - Release allocated pages
    • pages_reservePage - Mark pages as reserved
    • pages_generateFreeStack - Rebuild free page lists

Technical Implementation

  • Uses two separate bitmaps for 4KB and 2MB pages
  • Maintains consistency between page sizes:
    • Allocating a 2MB page marks all 512 constituent 4KB pages
    • Freeing a 2MB page clears all 512 constituent 4KB pages
  • Free stacks provide O(1) allocation/free operations
  • Defensive checks against double-free and invalid operations

Memory Layout

  • Single contiguous allocation table contains:
    • 2MB page bitmap
    • 4KB page bitmap
    • 2MB free page stack
    • 4KB free page stack
  • Bitmaps use 1 bit per page
  • Free stacks contain page indices

Kernel Heap Implementation

A complete heap for the kernel, featuring allocation, deallocation, and memory operations with both basic and optimized implementations.

Core Features

  • Heap Management
    • Block-based allocation with free list
    • 8-byte aligned allocations
    • Block splitting for efficient memory use
    • Coalescing of adjacent free blocks
  • Allocation Functions
    • kmalloc - Basic memory allocation
    • kaligned_alloc - Alignment-aware allocation
    • krealloc - Memory block resizing
    • kfree - Memory deallocation

Technical Details

  • Uses block headers for metadata (size + next pointer)
  • Free list management with O(n) allocation time
  • Handles arbitrary alignment requirements
  • Includes special cases for NULL/zero-size operations
  • Provides both simple and optimized implementations

Optimizations

  • Automatic block splitting to reduce fragmentation
  • Alignment handling without wasted space
  • Fallback to byte-by-byte operations when needed

Kernel Memory Pool Allocator

A high-performance fixed-size object allocator designed for kernel use, featuring demand-paged backing storage and efficient object reuse.

Key Features

  • Fixed-Size Allocation
    • Optimized for same-sized object allocation
    • Guaranteed alignment of all objects
    • Efficient reuse of freed objects
  • Memory Management
    • 1TB virtual address space per pool
    • On-demand physical page allocation
    • Automatic page mapping when needed
    • LIFO free list for fast allocations
  • Core Functions
    • pool_create - Initialize new memory pool
    • pool_allocate - Allocate object from pool
    • pool_free - Return object to pool

Technical Implementation

  • Uses separate allocation and free paths:
    • Sequential allocation for new objects
    • Free stack for recycled objects
  • Handles page boundary crossings automatically
  • Tracks pool metadata at base of virtual space
  • Allocates free stack pages on demand

Performance Characteristics

  • O(1) allocation from free list
  • O(1) freeing of objects
  • Minimal fragmentation for fixed-size objects
  • Low memory overhead (only requires alignment padding)

Kernel Heap

A complete heap for the kernel, featuring allocation, deallocation, and memory operations with both basic and optimized implementations.

Core Features

  • Heap Management
    • Block-based allocation with free list
    • 8-byte aligned allocations
    • Block splitting for efficient memory use
    • Coalescing of adjacent free blocks
  • Allocation Functions
    • kmalloc - Basic memory allocation
    • kaligned_alloc - Alignment-aware allocation
    • krealloc - Memory block resizing
    • kfree - Memory deallocation

Technical Details

  • Uses block headers for metadata (size + next pointer)
  • Free list management with O(n) allocation time
  • Handles arbitrary alignment requirements
  • Includes special cases for NULL/zero-size operations
  • Provides both simple and optimized implementations

Optimizations

  • Automatic block splitting to reduce fragmentation
  • Alignment handling without wasted space
  • Fallback to byte-by-byte operations when needed

Kernel Memory Pool Allocator

A high-performance fixed-size object allocator designed for kernel use, featuring demand-paged backing storage and efficient object reuse.

Key Features

  • Fixed-Size Allocation
    • Optimized for same-sized object allocation
    • Guaranteed alignment of all objects
    • Efficient reuse of freed objects
  • Memory Management
    • 1TB virtual address space per pool
    • On-demand physical page allocation
    • Automatic page mapping when needed
    • LIFO free list for fast allocations
  • Core Functions
    • pool_create - Initialize new memory pool
    • pool_allocate - Allocate object from pool
    • pool_free - Return object to pool

Technical Implementation

  • Uses separate allocation and free paths:
    • Sequential allocation for new objects
    • Free stack for recycled objects
  • Handles page boundary crossings automatically
  • Tracks pool metadata at base of virtual space
  • Allocates free stack pages on demand

Performance Characteristics

  • O(1) allocation from free list
  • O(1) freeing of objects
  • Minimal fragmentation for fixed-size objects
  • Low memory overhead (only requires alignment padding)

Process Management

Process Structures

Comprehensive process control implementation handling process creation, memory management, signal delivery, and process group/session management.

Core Structures

  • process_t
    • Complete process control block (PCB)
    • Contains saved registers, memory maps, file descriptors
    • Maintains process relationships (parent/child, groups)
  • process_stack_layout_t
    • Saved register state for context switching
    • Matches hardware interrupt stack frame
  • process_group_t / process_session_t
    • Collections of related processes
    • Used for terminal control and signal delivery

Key Functionality

  • Process Lifecycle
    • process_fork: Creates child process with COW memory
    • process_execvp: Replaces process with new program
    • process_exit: Terminates process and cleans up resources
  • Memory Management
    • process_add_page: Maps physical memory into process space
    • vmm_fork: Implements copy-on-write page tables
    • Heap and shared memory region management
  • Signal Handling
    • process_signal: Delivers signals to individual processes
    • process_group_signal: Broadcasts signals to groups
    • Special handling for SIGKILL, SIGSTOP, etc.
  • Process Groups/Sessions
    • Hierarchical process organization
    • Terminal control via foreground/background groups

Implementation Details

  • Context Switching
    • CR3 register updates for address space switching
    • TSS stack pointer management
    • Register state preservation/restoration
  • Memory Regions
    • Heap grows upward from 1GB
    • Shared memory grows downward from 128GB
    • Stack fixed at 0x600000-0x7FFFFF (2MB)
  • Signal Delivery
    • SIGKILL forces immediate termination
    • SIGCONT/SIGSTOP manage blocking state
    • Pending signals stored in process control block

Process States

  • Running: Currently executing on CPU
  • Blocked: Waiting for event (PROCESS_BLOCKING)
  • Zombie: Terminated but not reaped (PROCESS_ZOMBIE)

PID Management

  • 64-bit process IDs
  • Global counter with no recycling
  • Hash table for PID lookup

Process Scheduler

A round-robin process scheduler with blocking support, implementing circular queue management and context switching for kernel processes.

Core Functions

  • scheduler_nextProcess
    • Finds next runnable process in circular queue
    • Skips blocked processes (PROCESS_BLOCKING flag)
    • Updates global PROCESSES pointer
  • scheduler_schedule
    • Adds process to scheduler queue
    • Handles first process case (self-referential)
    • Inserts new processes after current
    • Updates PID hash table
  • schedule_end
    • Removes process from scheduler queue
    • Handles last process case (queue cleanup)
    • Maintains neighbor process links
  • scheduler_currentProcess
    • Returns currently executing process
    • Direct access to global PROCESSES pointer
  • schedule_block/schedule_unblock
    • Sets/clears PROCESS_BLOCKING flag
    • Controls process eligibility for scheduling

Implementation Details

  • Data Structures
    • Circular doubly-linked process queue
    • Global PROCESSES pointer tracks current
    • PID hash table for process lookup
  • Scheduling Algorithm
    • Round-robin with blocking support
    • Processes skipped when blocked
    • Queue remains circular after removals
  • Edge Cases
    • Empty queue handling
    • Single process case
    • Removal of current process

PID Hash Table

A memory-efficient hash table implementation for fast process ID (PID) to process structure mapping, featuring custom page-based node allocation and O(1) average case operations.

Core Features

  • Fixed-Size Hash Table
    • 1024 buckets (2^10)
    • Chaining collision resolution
    • Simple bitmask hash function
  • Memory Management
    • Page-based node allocation (4KB pages)
    • Freelist for recycled nodes
    • Virtual memory reservation
  • Basic Operations
    • Insert: O(1) average case
    • Lookup: O(1) average case
    • Delete: O(1) average case

Key Structures

  • pid_hash_table_t
    • Array of 1024 bucket pointers
    • Freelist for node reuse
    • Memory management tracking
  • pid_hash_node_t
    • Contains PID key and process pointer
    • Next pointer for chaining
    • 32-byte structure (on 64-bit systems)

Implementation Details

  • Hashing
    • Simple bitmask: pid & (PID_HASH_SIZE - 1)
    • Assumes random PID distribution
  • Memory Allocation
    • Nodes allocated in 4KB pages (≈128 nodes/page)
    • Pages mapped to reserved virtual address space
    • Freelist tracks available nodes
  • Operation Flow
    • Insert: Hash PID → check duplicates → allocate node → add to chain
    • Lookup: Hash PID → search bucket chain
    • Delete: Hash PID → find node → remove from chain → add to freelist

Performance Characteristics

  • Average case O(1) for all operations
  • Worst case O(n) with many hash collisions
  • Memory overhead: ~8 bytes per node (next pointer)
  • No dynamic resizing - fixed 1024 buckets

Filesystem

Virtual File System (VFS)

A unified filesystem abstraction layer providing path resolution, directory management, and filesystem-agnostic file operations.

Core Features

  • Unified Filesystem Interface
    • Supports multiple filesystem types
    • Common operations table for all files
    • Path resolution independent of underlying FS
  • Hierarchical Structure
    • Tree-based directory organization
    • Lazy-loaded directory contents
    • Special directory handling (., ..)
  • Storage Integration
    • GPT partition table parsing
    • IDE disk driver support
    • EXT2 filesystem integration

Key Structures

  • vfs_entry_t
    • Inode pointer and metadata
    • Parent/child relationships
    • Filesystem operations table
    • Name hashing for fast lookup
  • list_head_t
    • Doubly-linked list for siblings
    • Circular list for children
    • Container_of macro for type safety

Core Functions

  • vfs_init - Initializes VFS and mounts root filesystem
  • vfs_find_entry - Resolves paths to VFS entries
  • vfs_open_file - Creates file descriptors
  • vfs_create_entry - Creates new filesystem objects
  • vfs_populate_directory - Lazy-loads directory contents

Technical Implementation

  • Path resolution:
    • Handles absolute and relative paths
    • Tokenizes path components
    • Special handling for . and ..
  • Memory management:
    • Kernel pools for VFS entries and inodes
    • FNV-1a hashing for efficient lookups
    • On-demand allocation of child entries
  • Filesystem operations:
    • Uniform read/write interface
    • Type-specific operations (regfile vs dir)
    • Block device abstraction layer

Initialization Process

  • GPT partition table parsing
    • Reads partition headers from disk
    • Identifies filesystem partitions
  • EXT2 filesystem mounting
    • Initializes superblock and group descriptors
    • Locates root inode (inode 2)
  • VFS tree construction
    • Creates root directory entry
    • Pre-creates /dev directory
    • Sets up initial operations tables

File Descriptor Manager

A hierarchical file descriptor system supporting 1024 descriptors per process, with sparse allocation, reference counting, and automatic cleanup.

Core Features

  • Two-Level Descriptor Tables
    • 32 top-level entries with 32 descriptors each
    • On-demand allocation of descriptor blocks
    • Total capacity of 1024 descriptors per process
  • File Instance Tracking
    • Current file position (offset) per descriptor
    • Reference counting for shared descriptors
    • Filesystem-specific operations table
  • Management Functions
    • fdm_open_file - Creates and initializes file instances
    • fdm_set/fdm_get - Manages descriptor allocation/lookup
    • fdm_copy - Duplicates tables during process fork
    • fdm_free - Releases all resources on process exit

Technical Implementation

  • Memory-efficient sparse allocation:
    • Only allocates descriptor blocks when needed
    • Uses kernel memory pools for allocations
    • Null-checking on all operations
  • Descriptor table operations:
    • Index calculation via division/modulus (32x32)
    • Automatic block allocation on first use
    • Bounds checking for all descriptor accesses

Process Integration

  • Per-process isolation:
    • Independent descriptor tables per process
    • Copy-on-fork semantics
    • Automatic cleanup during process termination
  • Filesystem interaction:
    • Uniform VFS interface for all filesystems
    • Support for filesystem-private data
    • Direct inode references for fast access

EXT2 Filesystem Implementation

A functional implementation of core EXT2 filesystem operations with support for basic file and directory management.

Implemented Features

  • Filesystem Initialization
    • Superblock reading and validation
    • Block group descriptor handling
    • Inode table access
  • File Operations
    • File creation and deletion
    • Reading and writing file data
    • File seeking and truncation
    • Inode metadata management
  • Directory Operations
    • Directory creation and deletion
    • Directory entry management
    • Directory iteration
    • Entry lookup by name
  • Block Management
    • Block allocation and freeing
    • Direct block pointers
    • Single indirect block pointers
    • Block bitmap handling

Key Technical Details

  • Inode structure with 12 direct blocks and 1 indirect block
  • Directory entries with variable-length names
  • Block group allocation strategy
  • Basic permission and file type handling
  • Filesystem metadata updates (timestamps, counters)

Implementation Notes

The code includes:

  • Complete inode operations (read/write/update)
  • Directory entry packing/unpacking
  • Block allocation with bitmap tracking
  • Error handling for filesystem operations
  • Basic filesystem consistency checks

Shell

Shell Process

The system shell provides a command-line interface with:

  • Command history and recall
  • Tab completion for commands and filenames
  • Input/output redirection
  • Pipeline support for chaining commands
  • Background process execution
  • Scripting capabilities