1 point by gpu_systems 7 hours ago | 1 comment
  • gpu_systems 7 hours ago
    I built PascalVal to validate whether NVIDIA GPUs are healthy and correctly integrated on Linux systems.

    Rather than chasing peak benchmark numbers, it runs three focused validation stages that directly test GPU health and system integration:

    • PCIe link validation (negotiated width, generation, and effective efficiency)
    • Memory bandwidth validation (including Unified Memory paths)
    • Compute validation using SGEMM GFLOPS
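
    To make the PCIe stage concrete, here is a minimal Python sketch of how negotiated width, generation, and efficiency could be derived. It is not PascalVal's actual code. It assumes the link strings come from the Linux sysfs attributes current_link_speed, current_link_width, max_link_speed, and max_link_width under /sys/bus/pci/devices/<address>/, and that a measured host-to-device copy bandwidth is supplied by the caller. The function names and result keys are hypothetical.

```python
# Illustrative sketch only; not PascalVal's actual implementation.
# Per-lane raw rate in GT/s -> (PCIe generation, line-encoding efficiency).
PCIE_GENS = {
    2.5: (1, 8 / 10),      # Gen1 and Gen2 use 8b/10b encoding
    5.0: (2, 8 / 10),
    8.0: (3, 128 / 130),   # Gen3 and later use 128b/130b encoding
    16.0: (4, 128 / 130),
    32.0: (5, 128 / 130),
}

def parse_speed(text):
    """Parse a sysfs link-speed string such as '8.0 GT/s PCIe' into GT/s."""
    return float(text.split()[0])

def theoretical_gbps(speed_gts, width):
    """Theoretical one-direction link bandwidth in GB/s."""
    _, encoding = PCIE_GENS[speed_gts]
    return speed_gts * encoding / 8 * width

def check_link(cur_speed, cur_width, max_speed, max_width, measured_gbps):
    """Summarize the negotiated link and its efficiency (hypothetical helper)."""
    cur = parse_speed(cur_speed)
    width = int(cur_width)
    return {
        "generation": PCIE_GENS[cur][0],
        "width": width,
        # True when the link negotiated below what the device supports.
        "downgraded": cur < parse_speed(max_speed) or width < int(max_width),
        # Measured bandwidth as a fraction of the negotiated link's peak.
        "efficiency": measured_gbps / theoretical_gbps(cur, width),
    }
```

    A Gen3 card that trained at x8 in an x16-capable slot would come back with downgraded set to True. One caveat: GPUs commonly drop the link speed when idle to save power, so the sysfs values are most meaningful when read under load.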

    PascalVal supports multi-GPU systems and evaluates each GPU independently, so per-device faults are exposed rather than averaged away.
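
    To illustrate why per-device evaluation matters, here is a small Python sketch. It is not PascalVal's code: the 0.85 pass ratio, the expected throughput, and the per-GPU figures in the usage note are made-up values chosen only to show the effect. SGEMM throughput is computed with the standard 2*m*n*k operation count.

```python
# Illustrative sketch only; thresholds and names are hypothetical.

def sgemm_gflops(m, n, k, seconds):
    """GFLOPS for C = A (m x k) times B (k x n), counted as 2*m*n*k operations."""
    return 2.0 * m * n * k / seconds / 1e9

def evaluate(per_gpu_gflops, expected_gflops, min_ratio=0.85):
    """Return a PASS/FAIL verdict per GPU index; devices are never pooled."""
    floor = min_ratio * expected_gflops
    return {
        idx: "PASS" if gflops >= floor else "FAIL"
        for idx, gflops in per_gpu_gflops.items()
    }

def pooled_verdict(per_gpu_gflops, expected_gflops, min_ratio=0.85):
    """What a naive averaged check would report, shown for contrast."""
    mean = sum(per_gpu_gflops.values()) / len(per_gpu_gflops)
    return "PASS" if mean >= min_ratio * expected_gflops else "FAIL"
```

    With hypothetical results of {0: 9000, 1: 9100, 2: 5200, 3: 8950} GFLOPS against an expected 9000, evaluate flags GPU 2 as FAIL while pooled_verdict still reports PASS, because the mean of 8062.5 clears the 7650 floor.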

    The goal is not tuning or optimization, but answering a single question with confidence: “Is this GPU, and its integration into this system, behaving correctly?”