1 point by ck33 6 hours ago | 1 comment
  • ck33 6 hours ago
    Hi HN, I'm the author.

Breiman and Cutler's original Random Forest implementation (early 2000s) included far more than what modern libraries provide — classification, regression, unsupervised learning, proximity-based similarity, outlier detection, missing value imputation, and visualization. When scikit-learn implemented Random Forests, it included only classification, regression, and overall permutation importance.
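For context on what "proximity-based similarity" means: Breiman and Cutler defined the proximity of two samples as the fraction of trees in which they land in the same leaf. You can sketch that idea on top of scikit-learn's public `RandomForestClassifier.apply` method (this is an illustration of the classic definition, not the author's RFX implementation):

```python
# Breiman-Cutler proximities sketched with scikit-learn:
# prox(i, j) = fraction of trees in which samples i and j share a leaf.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

X, y = load_iris(return_X_y=True)
rf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

leaves = rf.apply(X)  # (n_samples, n_trees) array of leaf indices
prox = (leaves[:, None, :] == leaves[None, :, :]).mean(axis=2)

# Each sample trivially shares every leaf with itself, so the
# diagonal is 1; the matrix is symmetric by construction.
assert np.allclose(np.diag(prox), 1.0)
```

The resulting matrix is what the original implementation used downstream for clustering, outlier detection, and imputation.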

I wanted to fix that, so I implemented the full original vision and extended it with something new: Native Explainable Similarity. The model can now answer "what makes this sample similar to its neighbors?" directly via Proximity Importance — as far as I'm aware, this is the first natively explainable similarity measure in ML.
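To make the question concrete, here is one plausible way such a measure could work — permute a feature of the query sample and see how much its proximity to its nearest neighbors drops. To be clear, this is my own hypothetical reconstruction of the idea using scikit-learn, not the author's actual Proximity Importance algorithm:

```python
# HYPOTHETICAL sketch of a proximity-importance-style measure:
# how much does scrambling each feature disrupt a sample's
# proximity to its nearest neighbors? Not the author's algorithm.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

X, y = load_iris(return_X_y=True)
rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

def proximity_to(rf, X_ref, x):
    """Fraction of trees in which query x shares a leaf with each reference sample."""
    ref_leaves = rf.apply(X_ref)            # (n_ref, n_trees)
    q_leaves = rf.apply(x.reshape(1, -1))   # (1, n_trees)
    return (ref_leaves == q_leaves).mean(axis=1)

rng = np.random.default_rng(0)
i = 0                                       # sample to explain
base = proximity_to(rf, X, X[i])
neighbors = np.argsort(base)[::-1][1:6]     # 5 nearest neighbors by proximity

importance = np.zeros(X.shape[1])
for j in range(X.shape[1]):
    x_perm = X[i].copy()
    x_perm[j] = rng.choice(X[:, j])         # replace feature j with a random draw
    drop = base[neighbors].mean() - proximity_to(rf, X, x_perm)[neighbors].mean()
    importance[j] = drop                    # bigger drop = feature matters more
```

A real implementation would presumably average over many permutation draws and work at the tree level; this just shows the shape of the question being answered.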

The practical result is that a single trained model (one set of trees, grown once) becomes a competitive alternative to workflows that would otherwise require 3-5 separate tools. For example, a recommender system that would normally need FAISS + XGBoost + SHAP + Isolation Forests + custom code can be done with 1-2 RFX-Fuse models.
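As one example of getting several of those capabilities from a single forest: Breiman's original writeup derives an outlier measure directly from the proximity matrix (raw outlyingness of sample i within its class is n / Σ_j prox(i, j)², normalized by the class median and median absolute deviation). A sketch of that on scikit-learn, again as an illustration of the classic method rather than the author's code:

```python
# Breiman's proximity-based outlier measure, sketched with scikit-learn.
# Raw outlyingness within a class: n / sum_j prox(i, j)^2 —
# samples with few close proximity-neighbors score high.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

X, y = load_iris(return_X_y=True)
rf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
leaves = rf.apply(X)
prox = (leaves[:, None, :] == leaves[None, :, :]).mean(axis=2)

outlyingness = np.zeros(len(X))
for c in np.unique(y):
    idx = np.where(y == c)[0]
    p = prox[np.ix_(idx, idx)].copy()
    np.fill_diagonal(p, 0.0)                # exclude self-proximity
    raw = len(idx) / (p ** 2).sum(axis=1)
    # Normalize within class by median / median absolute deviation,
    # following Breiman's description.
    med = np.median(raw)
    mad = np.median(np.abs(raw - med)) or 1.0
    outlyingness[idx] = (raw - med) / mad
```

The same trained trees that produced the classifier also produce the similarity matrix and the outlier scores — no separate Isolation Forest or nearest-neighbor index needed.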

    Special thanks to Dr. Adele Cutler for sharing original Breiman-Cutler source materials, which made this possible.

    Written in C++/CUDA with Python bindings. GPU and CPU versions available.

    Happy to answer questions about the implementation or methodology.