Crate archibald


§Archibald - High-Performance Instruction Decoder Generator

Archibald is a procedural macro that generates optimized instruction decoders with compile-time branch elimination. It allows you to define instruction patterns using intuitive bit patterns (like "1011xxyy") and automatically generates a dispatcher function that calls const-generic handlers.

§Features

  • Compile-time branch elimination: Variable bits are expanded into const generics, allowing the compiler to eliminate branches and inline specialized code paths
  • Declarative syntax: Easy-to-read instruction tables that resemble hardware specs
  • Flexible mappings: Support for enum mappings, const function decoders, or passing the raw bits through unchanged

§Pattern Syntax

Patterns use the following characters:

  • 0, 1 - Fixed bits that must match exactly
  • a-z - Variable bits (e.g., rr, mm) that are extracted and passed as const generics
  • _, . - Wildcard bits - useful for immediate values embedded in opcodes
  • ' - Visual separator (ignored) - use for readability like "1100'xxyy"
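To make the pattern rules concrete, here is a minimal sketch in plain Rust of how such a string could be turned into bit masks. It is not archibald's implementation: `PatternMasks`, `parse_pattern`, and `extract` are hypothetical names introduced for illustration, and `extract` assumes each variable occupies contiguous bits.

```rust
use std::collections::BTreeMap;

/// Hypothetical result type (not part of archibald's API).
#[derive(Debug, Default, PartialEq)]
struct PatternMasks {
    width: u32,                    // number of bits after removing ' separators
    fixed_mask: u64,               // 1 where the pattern has a literal 0 or 1
    fixed_value: u64,              // the literal bits themselves
    variables: BTreeMap<char, u64>, // one mask per variable letter
}

/// Hypothetical parser following the character rules listed above.
fn parse_pattern(pattern: &str) -> Result<PatternMasks, String> {
    let mut out = PatternMasks::default();
    for c in pattern.chars() {
        if c == '\'' {
            continue; // visual separator, ignored
        }
        // Make room for the next (less significant) bit.
        out.fixed_mask <<= 1;
        out.fixed_value <<= 1;
        for mask in out.variables.values_mut() {
            *mask <<= 1;
        }
        out.width += 1;
        match c {
            '0' => out.fixed_mask |= 1,
            '1' => {
                out.fixed_mask |= 1;
                out.fixed_value |= 1;
            }
            'a'..='z' => *out.variables.entry(c).or_insert(0) |= 1,
            '_' | '.' => {} // wildcard: neither matched nor extracted
            other => return Err(format!("unexpected pattern character {other:?}")),
        }
    }
    if !matches!(out.width, 8 | 16 | 32 | 64) {
        return Err(format!("pattern is {} bits; expected 8, 16, 32, or 64", out.width));
    }
    Ok(out)
}

/// Pull one variable's value out of an opcode (assumes contiguous bits).
fn extract(opcode: u64, mask: u64) -> u64 {
    if mask == 0 {
        0
    } else {
        (opcode & mask) >> mask.trailing_zeros()
    }
}

fn main() {
    let p = parse_pattern("0001'ddss").expect("valid 8-bit pattern");
    let opcode = 0b0001_1001;
    println!("matches: {}", opcode & p.fixed_mask == p.fixed_value);
    for (name, mask) in &p.variables {
        println!("{name} = {:#04b}", extract(opcode, *mask));
    }
}
```

An opcode matches a pattern when `opcode & fixed_mask == fixed_value`; the wildcard bits appear in no mask, so they never affect matching or extraction.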

§Usage

This example demonstrates all major features:

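#![feature(adt_const_params)] // nightly only; needed for enum const generics

use archibald::instruction_table;

// Context type passed to every handler. Its definition is not shown in the
// original example; these fields are an assumption inferred from the handlers below.
struct Cpu {
    regs: [u8; 4],
    memory: [u8; 256],
}
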
#[derive(Debug, PartialEq, Eq, core::marker::ConstParamTy)]
enum Register { R0, R1, R2, R3 }

#[derive(Debug, PartialEq, Eq, core::marker::ConstParamTy)]
enum Mode { Direct, Indirect }

// Const function for decoding mode bit
const fn decode_mode(bit: u8) -> Mode {
    match bit { 0 => Mode::Direct, 1 => Mode::Indirect, _ => unreachable!() }
}

// Const function for converting bit to bool
const fn bit_to_bool(bit: u8) -> bool { bit != 0 }

instruction_table! {
    type Opcode = u8;

    dispatcher = dispatch;
    context = Cpu;

    // Fixed pattern - no variables
    "0000'0000" => nop;

    // Manual enum mapping with multiple variables
    "0001'ddss" => move_reg<{d}, {s}> where {
        d: Register = { 0b00 => R0, 0b01 => R1, 0b10 => R2, 0b11 => R3 },
        s: Register = { 0b00 => R0, 0b01 => R1, 0b10 => R2, 0b11 => R3 }
    };

    // Const function mapping - enum
    "0010'mrr_" => load<{m}, {r}> where {
        m: Mode = decode_mode(m),
        r: Register = { 0b00 => R0, 0b01 => R1, 0b10 => R2, 0b11 => R3 }
    };

    // Const function mapping - bool
    "0011'i___" => store<{i}> where {
        i: bool = bit_to_bool(i)
    };

    // Passthrough u8 - no where clause (currently always u8 regardless of Opcode type)
    "0100'00oo" => alu<{o}>;
}

// Handler implementations. Each handler is instantiated once per const-generic value,
// so branches on those values can be resolved at compile time.
fn nop(_cpu: &mut Cpu, _opcode: u8) { }

fn move_reg<const DST: Register, const SRC: Register>(cpu: &mut Cpu, _opcode: u8) {
    cpu.regs[DST as usize] = cpu.regs[SRC as usize];
}

fn load<const MODE: Mode, const REG: Register>(cpu: &mut Cpu, opcode: u8) {
    let addr = match MODE {
        Mode::Direct => opcode & 0x1,
        Mode::Indirect => cpu.regs[opcode as usize & 0x1],
    };
    cpu.regs[REG as usize] = cpu.memory[addr as usize];
}

fn store<const IMMEDIATE: bool>(cpu: &mut Cpu, opcode: u8) {
    if IMMEDIATE {
        cpu.memory[0] = opcode & 0x0F;
    } else {
        cpu.memory[0] = cpu.regs[0];
    }
}

fn alu<const OP: u8>(cpu: &mut Cpu, _opcode: u8) {
    cpu.regs[0] = match OP {
        0 => cpu.regs[0] << 1,  // SHL
        1 => cpu.regs[0] >> 1,  // SHR
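        2 => cpu.regs[0].wrapping_add(1),   // INC (wraps instead of panicking on overflow)
        3 => cpu.regs[0].wrapping_sub(1),   // DEC (wraps instead of panicking on underflow)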
        _ => unreachable!()
    };
}

§Dispatcher Function

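The dispatcher generated by the macro is the function you call to decode and execute an instruction. It takes a mutable reference to the context type and the opcode, and forwards both to the matching handler. The expanded function looks similar to this (the expansion shown is for a different table than the Usage example):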

#[inline]
pub fn dispatch(ctx: &mut (), opcode: u8) {
    match opcode {
        op if op & 240u8 == 0u8 => handler::<{ decode_mode(0u8) }>(ctx, opcode),
        op if op & 240u8 == 16u8 => handler::<{ decode_mode(1u8) }>(ctx, opcode),
        op if op & 240u8 == 32u8 => handler::<{ decode_mode(2u8) }>(ctx, opcode),
        op if op & 240u8 == 48u8 => handler::<{ decode_mode(3u8) }>(ctx, opcode),
        op if op & 240u8 == 64u8 => handler::<{ Mode::A }>(ctx, opcode),
        op if op & 240u8 == 80u8 => handler::<{ Mode::B }>(ctx, opcode),
        op if op & 240u8 == 96u8 => handler::<{ Mode::C }>(ctx, opcode),
        op if op & 240u8 == 112u8 => handler::<{ Mode::D }>(ctx, opcode),
        _ => {
            ::core::panicking::panic_fmt(
                format_args!("Unhandled opcode: 0x{0:02X}", opcode),
            );
        }
    }
}
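To relate that expansion to the Usage example, the following is a hand-written sketch of what the dispatcher could look like for just the `"0011'i___"` and `"0100'00oo"` rows. It runs on stable Rust because it only uses `bool` and `u8` const generics. The `Cpu` fields and the exact mask/value constants are assumptions worked out by hand, not output copied from the macro.

```rust
// Assumed context type; the fields are inferred from the handlers.
struct Cpu {
    regs: [u8; 4],
    memory: [u8; 16],
}

fn store<const IMMEDIATE: bool>(cpu: &mut Cpu, opcode: u8) {
    // IMMEDIATE is a compile-time constant, so each instantiation keeps one arm.
    if IMMEDIATE {
        cpu.memory[0] = opcode & 0x0F;
    } else {
        cpu.memory[0] = cpu.regs[0];
    }
}

fn alu<const OP: u8>(cpu: &mut Cpu, _opcode: u8) {
    cpu.regs[0] = match OP {
        0 => cpu.regs[0] << 1,            // SHL
        1 => cpu.regs[0] >> 1,            // SHR
        2 => cpu.regs[0].wrapping_add(1), // INC
        3 => cpu.regs[0].wrapping_sub(1), // DEC
        _ => unreachable!(),
    };
}

// One arm per combination of variable bits. Fixed and variable bits are part
// of the mask; wildcard bits are left out of it.
fn dispatch(cpu: &mut Cpu, opcode: u8) {
    match opcode {
        // "0011'i___": mask 0b1111_1000, i = 0 or 1
        op if op & 0xF8 == 0x30 => store::<false>(cpu, opcode),
        op if op & 0xF8 == 0x38 => store::<true>(cpu, opcode),
        // "0100'00oo": every bit is fixed or variable, so the mask is 0xFF
        op if op & 0xFF == 0x40 => alu::<0>(cpu, opcode),
        op if op & 0xFF == 0x41 => alu::<1>(cpu, opcode),
        op if op & 0xFF == 0x42 => alu::<2>(cpu, opcode),
        op if op & 0xFF == 0x43 => alu::<3>(cpu, opcode),
        _ => panic!("Unhandled opcode: 0x{:02X}", opcode),
    }
}

fn main() {
    let mut cpu = Cpu { regs: [5, 0, 0, 0], memory: [0; 16] };
    dispatch(&mut cpu, 0x42); // INC
    dispatch(&mut cpu, 0x3A); // store with the immediate bit set
    println!("r0 = {}, mem[0] = {:#04x}", cpu.regs[0], cpu.memory[0]);
}
```

The only runtime decision is which arm to take; inside `store::<true>` or `alu::<2>` the const parameter is already known, which is what lets the compiler drop the unused branches.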

§Notes

  • Requires nightly Rust with #![feature(adt_const_params)] for enum const generics
  • Enums used as const generics must implement ConstParamTy, PartialEq, and Eq
  • Decoder functions used in where clauses must be declared const fn
  • Patterns must be exactly 8, 16, 32, or 64 bits (after removing ' separators)
  • Variables without where clauses (passthrough) always generate u8 literals

Macros§

instruction_table
The main procedural macro for generating instruction decoders.