GPU benchmarks (RTX 5060 Ti, kernel-level):
• ECDSA Sign: 4.88 M/s (204.8 ns/op, RFC 6979 + low-S)
• ECDSA Verify: 2.44 M/s (Shamir + GLV)
• Schnorr Sign: 3.66 M/s (BIP-340)
• Schnorr Verify: 2.82 M/s (BIP-340 + GLV)
• Field Mul: 4,142 M/s (0.2 ns/op)
4 GPU backends: CUDA, OpenCL, Metal, ROCm (HIP). 12+ platform targets: x86-64, ARM64, RISC-V, WASM, iOS, Android, ESP32-S3, ESP32, STM32, plus all 4 GPU backends.
Key features:
• Dual-layer security: variable-time FAST path for throughput, constant-time CT path for secret-key operations (no secret-dependent branches)
• Stable C ABI (ufsecp) with 45 exported functions, plus FFI bindings for C#, Python, Go, Rust, Java, Node.js, Dart, PHP, Ruby, Swift, React Native
• Full protocol coverage: ECDSA, Schnorr/BIP-340, ECDH, BIP-32/44, MuSig2/BIP-327, Taproot/BIP-341, FROST t-of-n, Pedersen commitments, adaptor signatures, batch verification
• 5×52 field representation with __int128 lazy reduction (2.76× faster than 4×64)
• Montgomery batch inverse on GPU: 1 inversion + 3(N-1) muls for N elements
• Runs on ESP32-S3 in 2.5 ms per scalar×G, fast enough for IoT signing
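For anyone wondering where the "1 inversion + 3(N-1) muls" figure comes from, here is a minimal sketch of Montgomery's batch-inversion trick in plain Python. It is an illustration only, not the library's GPU kernel: the function name batch_inverse is hypothetical, and Python's built-in modular inverse stands in for whatever field inversion the real code uses. The modulus is the secp256k1 base-field prime.

```python
# secp256k1 base-field prime: p = 2^256 - 2^32 - 977
P = 2**256 - 2**32 - 977


def batch_inverse(values, p=P):
    """Invert N nonzero field elements with 1 inversion and 3(N-1) multiplications.

    Hypothetical helper for illustration; not part of the ufsecp API.
    """
    n = len(values)
    if n == 0:
        return []
    if any(v % p == 0 for v in values):
        raise ZeroDivisionError("zero has no modular inverse")

    # Forward pass: prefix[i] = values[0] * ... * values[i]   (N-1 muls)
    prefix = [values[0] % p]
    for v in values[1:]:
        prefix.append(prefix[-1] * v % p)

    # The single expensive step: invert the product of all elements.
    # pow(x, -1, p) requires Python 3.8+.
    inv = pow(prefix[-1], -1, p)

    # Backward pass: peel off one element per step          (2(N-1) muls)
    out = [0] * n
    for i in range(n - 1, 0, -1):
        out[i] = inv * prefix[i - 1] % p  # inverse of values[i]
        inv = inv * values[i] % p         # now the inverse of prefix[i-1]
    out[0] = inv
    return out


if __name__ == "__main__":
    xs = [2, 3, 5, 7]
    for x, x_inv in zip(xs, batch_inverse(xs)):
        print(x, (x * x_inv) % P == 1)
```

The trade is worthwhile because a field inversion costs far more than a multiplication, so amortizing one inversion over N elements is nearly free per element once N is large.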
Packages are available via npm, NuGet, RubyGems, and Maven, with direct downloads for the other languages (pip-ready, cargo-ready, pub-ready, CocoaPods-ready).
Not audited — this is a research project. For production, use bitcoin-core/secp256k1. But if you need raw throughput across heterogeneous hardware, this might be useful.
AGPL-3.0 licensed.