3 points by blockmandev 5 hours ago | 1 comment
  • blockmandev 5 hours ago
    Pure Rust ternary inference engine based on BitNet b1.58-2B-4T. No Python, no CUDA, no external ML frameworks. Single executable + model weights = portable AI that runs on any machine.

    Zero-multiplication inference: ternary weights {-1, 0, +1} mean the inner GEMV loop uses only addition and subtraction, with no floating-point multiplies.

    Smart system awareness: detects RAM and CPU at startup and adjusts generation limits automatically.
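To make the zero-multiplication claim concrete, here is a minimal sketch of a ternary matrix-vector product. It is not the project's actual kernel: the `Trit` enum, the `ternary_gemv` name, the row-major layout, and the `i8` activations with `i32` accumulators are all assumptions chosen for readability. A real engine would most likely pack weights into a couple of bits each and apply scaling factors afterwards.

```rust
/// Hypothetical ternary weight type: one of {-1, 0, +1}.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Trit {
    Neg,
    Zero,
    Pos,
}

/// Computes y = W * x where W is a row-major `rows x cols` matrix of
/// ternary weights. Activations are assumed to be quantized to i8 and
/// are accumulated in i32. The inner loop only adds, subtracts, or skips.
fn ternary_gemv(weights: &[Trit], x: &[i8], rows: usize, cols: usize) -> Vec<i32> {
    assert!(cols > 0, "cols must be non-zero");
    assert_eq!(weights.len(), rows * cols, "weight matrix has wrong size");
    assert_eq!(x.len(), cols, "input vector has wrong length");

    weights
        .chunks_exact(cols)
        .map(|row| {
            let mut acc: i32 = 0;
            for (w, &xi) in row.iter().zip(x) {
                match w {
                    Trit::Pos => acc += i32::from(xi),  // weight +1: add
                    Trit::Neg => acc -= i32::from(xi),  // weight -1: subtract
                    Trit::Zero => {}                    // weight 0: skip
                }
            }
            acc
        })
        .collect()
}
```

For example, with rows `[+1, -1, 0]` and `[0, +1, +1]` and input `[3, 5, -2]`, the function returns `[-2, 3]`. The `match` stands in for a multiply: each weight selects add, subtract, or no-op, which is why no multiplier is needed in the hot loop.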