Microkernel Architecture

VeridianOS implements a capability-based microkernel architecture that prioritizes security, reliability, and performance through minimal kernel design and component isolation.

Design Philosophy

Core Principles

  1. Principle of Least Privilege: Each component runs with minimal required permissions
  2. Fault Isolation: Critical system components isolated in separate address spaces
  3. Minimal Kernel: Only essential services in kernel space
  4. Capability-Based Security: All access control via unforgeable tokens
  5. Zero-Copy Communication: Efficient IPC without data copying

Microkernel vs. Monolithic

AspectVeridianOS MicrokernelMonolithic Kernel
Kernel Size~15,000 lines15M+ lines
Fault IsolationStrong (user-space drivers)Weak (kernel crashes)
SecurityCapability-basedPermission-based
Performance~1μs IPC overheadDirect function calls
ReliabilityIndividual component faultsSystem-wide failures
ModularityHigh (plug-and-play)Low (monolithic)

System Architecture

Component Overview

┌─────────────────────────────────────────────────────────────┐
│                        User Applications                    │
├─────────────────────────────────────────────────────────────┤
│                      System Services                        │
│  ┌─────────┐ ┌─────────┐ ┌──────────┐ ┌────────────┐        │
│  │   VFS   │ │ Network │ │ Device   │ │   Other    │        │
│  │ Service │ │  Stack  │ │ Manager  │ │  Services  │        │
│  └─────────┘ └─────────┘ └──────────┘ └────────────┘        │
├─────────────────────────────────────────────────────────────┤
│                      Device Drivers                         │
│  ┌─────────┐ ┌─────────┐ ┌──────────┐ ┌────────────┐        │
│  │ Storage │ │ Network │ │  Input   │ │   Other    │        │
│  │ Drivers │ │ Drivers │ │ Drivers  │ │  Drivers   │        │
│  └─────────┘ └─────────┘ └──────────┘ └────────────┘        │
├─────────────────────────────────────────────────────────────┤
│                    VeridianOS Microkernel                   │
│  ┌─────────┐ ┌─────────┐ ┌──────────┐ ┌────────────┐        │
│  │ Memory  │ │  IPC    │ │Scheduler │ │Capability  │        │
│  │  Mgmt   │ │ System  │ │          │ │  System    │        │
│  └─────────┘ └─────────┘ └──────────┘ └────────────┘        │
├─────────────────────────────────────────────────────────────┤
│                      Hardware (x86_64, AArch64, RISC-V)     │
└─────────────────────────────────────────────────────────────┘

Kernel Components

Memory Management

The kernel provides only fundamental memory management services:

#![allow(unused)]
fn main() {
// Physical memory allocation
fn allocate_frames(count: usize, zone: MemoryZone) -> Result<PhysFrame>;
fn free_frames(frame: PhysFrame, count: usize);

// Virtual memory management
fn map_page(page_table: &mut PageTable, virt: VirtPage, 
           phys: PhysFrame, flags: PageFlags) -> Result<()>;
fn unmap_page(page_table: &mut PageTable, virt: VirtPage) -> Result<PhysFrame>;

// Address space management
fn create_address_space() -> Result<AddressSpace>;
fn switch_address_space(space: &AddressSpace);
}

Features:

  • Hybrid frame allocator (bitmap + buddy system)
  • 4-level page table management
  • NUMA-aware allocation
  • Memory zones (DMA, Normal, High)
  • TLB shootdown for multi-core systems

Inter-Process Communication

Zero-copy IPC system with capability passing:

#![allow(unused)]
fn main() {
// Message passing
fn send_message(channel: ChannelId, msg: Message, cap: Option<Capability>) -> Result<()>;
fn receive_message(endpoint: EndpointId, timeout: Duration) -> Result<(Message, MessageHeader)>;

// Synchronous call-reply
fn call(channel: ChannelId, request: Message, timeout: Duration) -> Result<Message>;
fn reply(reply_token: ReplyToken, response: Message) -> Result<()>;

// Shared memory
fn create_shared_region(size: usize, perms: Permissions) -> Result<SharedRegionId>;
fn map_shared_region(process: ProcessId, region: SharedRegionId) -> Result<VirtAddr>;
}

Performance Targets:

  • Small messages (≤64 bytes): <1μs latency ✅
  • Large transfers: <5μs latency ✅
  • Zero-copy for bulk data transfers

Scheduling

Minimal scheduler providing basic time-slicing:

#![allow(unused)]
fn main() {
// Thread management
fn schedule_thread(thread: ThreadId, priority: Priority) -> Result<()>;
fn unschedule_thread(thread: ThreadId) -> Result<()>;
fn yield_cpu() -> Result<()>;

// Blocking/waking
fn block_thread(thread: ThreadId, reason: BlockReason) -> Result<()>;
fn wake_thread(thread: ThreadId) -> Result<()>;

// Context switching
fn context_switch(from: ThreadId, to: ThreadId) -> Result<()>;
}

Scheduling Classes:

  • Real-time (0-99): Hard real-time tasks
  • Interactive (100-139): User interface, interactive applications
  • Batch (140-199): Background processing

Capability System

Unforgeable tokens for access control:

#![allow(unused)]
fn main() {
// Capability management
fn create_capability(object_type: ObjectType, object_id: ObjectId, 
                    rights: Rights) -> Result<Capability>;
fn derive_capability(parent: &Capability, new_rights: Rights) -> Result<Capability>;
fn validate_capability(cap: &Capability, required_rights: Rights) -> Result<()>;
fn revoke_capability(cap: &Capability) -> Result<()>;

// Token structure (64-bit)
struct Capability {
    object_id: u32,     // Bits 0-31: Object identifier
    generation: u16,    // Bits 32-47: Generation counter
    rights: u16,        // Bits 48-63: Permission bits
}
}

Capability Properties:

  • Unforgeable (cryptographically secure)
  • Transferable (delegation)
  • Revocable (immediate invalidation)
  • Hierarchical (restricted derivation)

User-Space Services

Device Drivers

All device drivers run in user space for isolation:

#![allow(unused)]
fn main() {
trait Driver {
    async fn init(&mut self, capabilities: HardwareCapabilities) -> Result<()>;
    async fn start(&mut self) -> Result<()>;
    async fn handle_interrupt(&self, vector: u32) -> Result<()>;
    async fn shutdown(&mut self) -> Result<()>;
}

// Hardware access via capabilities
struct HardwareCapabilities {
    mmio_regions: Vec<MmioRegion>,
    interrupts: Vec<InterruptLine>,
    dma_capability: Option<DmaCapability>,
}
}

Driver Isolation Benefits:

  • Driver crash doesn't bring down system
  • Security: hardware access only via capabilities
  • Debugging: easier to debug user-space code
  • Modularity: drivers can be loaded/unloaded dynamically

System Services

Core system functionality implemented as user-space services:

Virtual File System (VFS)

#![allow(unused)]
fn main() {
trait FileSystem {
    async fn open(&self, path: &str, flags: OpenFlags) -> Result<FileHandle>;
    async fn read(&self, handle: FileHandle, buffer: &mut [u8]) -> Result<usize>;
    async fn write(&self, handle: FileHandle, buffer: &[u8]) -> Result<usize>;
    async fn close(&self, handle: FileHandle) -> Result<()>;
}
}

Network Stack

#![allow(unused)]
fn main() {
trait NetworkStack {
    async fn create_socket(&self, domain: Domain, type: SocketType) -> Result<SocketHandle>;
    async fn bind(&self, socket: SocketHandle, addr: SocketAddr) -> Result<()>;
    async fn listen(&self, socket: SocketHandle, backlog: u32) -> Result<()>;
    async fn accept(&self, socket: SocketHandle) -> Result<(SocketHandle, SocketAddr)>;
}
}

Device Manager

#![allow(unused)]
fn main() {
trait DeviceManager {
    async fn register_driver(&self, driver: Box<dyn Driver>) -> Result<DriverHandle>;
    async fn enumerate_devices(&self) -> Result<Vec<DeviceInfo>>;
    async fn hotplug_event(&self, event: HotplugEvent) -> Result<()>;
}
}

Security Model

Capability-Based Access Control

Every resource access requires a valid capability:

#![allow(unused)]
fn main() {
// File access
let file_cap = request_capability(CapabilityType::File, file_id, Rights::READ)?;
let data = sys_read(file_cap, buffer, size, offset)?;

// Memory access  
let memory_cap = request_capability(CapabilityType::Memory, region_id, Rights::WRITE)?;
let addr = sys_mmap(None, size, PROT_READ | PROT_WRITE, MAP_PRIVATE, memory_cap, 0)?;

// Device access
let device_cap = request_capability(CapabilityType::Device, device_id, Rights::CONTROL)?;
driver.init(HardwareCapabilities::from_capability(device_cap))?;
}

No Ambient Authority

  • No global namespaces (no filesystem paths by default)
  • No superuser/root privileges
  • All access explicitly granted via capabilities
  • Principle of least privilege enforced by design

Fault Isolation

#![allow(unused)]
fn main() {
// Driver crash isolation
match driver_process.wait_for_exit() {
    ProcessExit::Crash(signal) => {
        log::error!("Driver {} crashed with signal {}", driver_name, signal);
        
        // Restart driver without affecting system
        restart_driver(driver_name, hardware_caps)?;
    }
    ProcessExit::Normal(code) => {
        log::info!("Driver {} exited normally with code {}", driver_name, code);
    }
}
}

Performance Characteristics

Measured Performance

OperationTargetAchievedNotes
IPC Small Message<5μs~0.8μs≤64 bytes, register-based
IPC Large Transfer<10μs~3.2μsZero-copy shared memory
Context Switch<10μs~8.5μsIncluding TLB flush
Memory Allocation<1μs~0.6μsSlab allocator
Capability Validation<500ns~0.2μsO(1) lookup
System Call<1μs~0.4μsKernel entry/exit

Performance Optimizations

  1. Fast-Path IPC: Register-based transfer for small messages
  2. Capability Caching: Avoid repeated validation
  3. Zero-Copy Design: Shared memory for large data
  4. NUMA Awareness: Local allocation preferred
  5. Lock-Free Data Structures: Where possible

Memory Layout

Virtual Address Space (x86_64)

┌─────────────────────────────────────────────────────────────┐
│ 0x0000_0000_0000_0000 - 0x0000_7FFF_FFFF_FFFF               │
│ User Space (128 TB)                                         │
│ ┌─────────────┐ Process code/data                           │
│ │ Stack       │ ← 0x0000_7FFF_FFFF_0000 (grows down)       │
│ │     ↓       │                                            │
│ │             │                                            │
│ │     ↑       │                                            │
│ │ Heap        │ ← Dynamic allocation                       │
│ │ Libraries   │ ← Shared libraries (ASLR)                 │
│ │ Code        │ ← Executable code                          │
│ └─────────────┘                                            │
├─────────────────────────────────────────────────────────────┤
│ 0x0000_8000_0000_0000 - 0xFFFF_7FFF_FFFF_FFFF               │
│ Non-canonical (CPU enforced hole)                          │
├─────────────────────────────────────────────────────────────┤
│ 0xFFFF_8000_0000_0000 - 0xFFFF_FFFF_FFFF_FFFF               │
│ Kernel Space (128 TB)                                      │
│ ┌─────────────┐                                            │
│ │ MMIO        │ ← 0xFFFF_F000_0000_0000 Memory-mapped I/O  │
│ │ Stacks      │ ← 0xFFFF_E000_0000_0000 Kernel stacks     │
│ │ Heap        │ ← 0xFFFF_C000_0000_0000 Kernel heap       │
│ │ Phys Map    │ ← 0xFFFF_8000_0000_0000 Physical memory   │
│ └─────────────┘                                            │
└─────────────────────────────────────────────────────────────┘

AArch64 and RISC-V

Similar layouts adapted for each architecture's specific requirements:

  • AArch64: 48-bit virtual addresses, 4KB/16KB/64KB page sizes
  • RISC-V: Sv39 (39-bit) or Sv48 (48-bit) virtual addresses

Comparison with Other Systems

vs. Linux (Monolithic)

Advantages:

  • Better fault isolation (driver crashes don't kill system)
  • Stronger security model (capabilities vs. DAC)
  • Smaller trusted computing base (~15K vs 15M+ lines)
  • Cleaner architecture and modularity

Trade-offs:

  • IPC overhead vs. direct function calls
  • More complex system service implementation
  • Learning curve for capability-based programming

vs. seL4 (Microkernel)

Similarities:

  • Capability-based security
  • Formal verification goals
  • Minimal kernel design
  • IPC-based communication

Differences:

  • Language: Rust vs. C for memory safety
  • Target: General purpose vs. embedded/real-time focus
  • API: Higher-level abstractions vs. minimal primitives
  • Performance: Optimized for throughput vs. determinism

vs. Fuchsia (Hybrid)

Similarities:

  • Capability-based security
  • Component isolation
  • User-space drivers

Differences:

  • Architecture: Pure microkernel vs. hybrid approach
  • Kernel size: Smaller vs. larger kernel
  • Language: Rust throughout vs. mixed languages

Development and Debugging

Kernel Debugging

# Start QEMU with GDB support
just debug-x86_64

# In GDB
(gdb) target remote :1234
(gdb) break kernel_main
(gdb) continue

User-Space Debugging

# Debug user-space process
gdb ./my_service
(gdb) set environment VERIDIAN_IPC_DEBUG=1
(gdb) run

Performance Profiling

#![allow(unused)]
fn main() {
// Built-in performance counters
let metrics = kernel_metrics();
println!("IPC latency: {}μs", metrics.average_ipc_latency_ns / 1000);
println!("Context switches: {}", metrics.context_switches);
}

Future Evolution

Planned Enhancements

  1. Hardware Security: Integration with TDX, SEV-SNP, ARM CCA
  2. Formal Verification: Mathematical proofs of security properties
  3. Real-Time Support: Predictable scheduling and interrupt handling
  4. Distributed Systems: Multi-node capability passing
  5. GPU Computing: Secure GPU resource management

Research Areas

  1. ML-Assisted Scheduling: AI-driven performance optimization
  2. Quantum-Resistant Security: Post-quantum cryptography
  3. Energy Efficiency: Power-aware resource management
  4. Edge Computing: Lightweight deployment scenarios

This microkernel architecture provides a strong foundation for building secure, reliable, and high-performance systems while maintaining the flexibility to evolve with changing requirements and technologies.