1 point by ck33 6 hours ago | 1 comment
  • ck33 6 hours ago
    Hi HN, I'm the author.

Breiman and Cutler's original Random Forest implementation (early 2000s) included far more than what modern libraries provide — classification, regression, unsupervised learning, proximity-based similarity, outlier detection, missing value imputation, and visualization. When scikit-learn implemented Random Forests, it included only classification, regression, and overall permutation importance.
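For context on what "proximity-based similarity" means: Breiman and Cutler defined the proximity of two samples as the fraction of trees in which they land in the same leaf. You can sketch that idea on top of scikit-learn's public `RandomForestClassifier.apply` method (this is an illustration of the classic definition, not the author's RFX implementation):

```python
# Breiman-Cutler proximities sketched with scikit-learn:
# prox(i, j) = fraction of trees in which samples i and j share a leaf.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

X, y = load_iris(return_X_y=True)
rf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

leaves = rf.apply(X)  # (n_samples, n_trees) array of leaf indices
prox = (leaves[:, None, :] == leaves[None, :, :]).mean(axis=2)

# Each sample trivially shares every leaf with itself, so the
# diagonal is 1; the matrix is symmetric by construction.
assert np.allclose(np.diag(prox), 1.0)
```

The resulting matrix is what the original implementation used downstream for clustering, outlier detection, and imputation.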

I wanted to fix that, so I implemented the full original vision and extended it with something new: Native Explainable Similarity. The model can now answer "what makes this sample similar to its neighbors?" directly via Proximity Importance — as far as I'm aware, this is the first natively explainable similarity measure in ML.
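To make the question concrete, here is one plausible way such a measure could work — permute a feature of the query sample and see how much its proximity to its nearest neighbors drops. To be clear, this is my own hypothetical reconstruction of the idea using scikit-learn, not the author's actual Proximity Importance algorithm:

```python
# HYPOTHETICAL sketch of a proximity-importance-style measure:
# how much does scrambling each feature disrupt a sample's
# proximity to its nearest neighbors? Not the author's algorithm.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

X, y = load_iris(return_X_y=True)
rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

def proximity_to(rf, X_ref, x):
    """Fraction of trees in which query x shares a leaf with each reference sample."""
    ref_leaves = rf.apply(X_ref)            # (n_ref, n_trees)
    q_leaves = rf.apply(x.reshape(1, -1))   # (1, n_trees)
    return (ref_leaves == q_leaves).mean(axis=1)

rng = np.random.default_rng(0)
i = 0                                       # sample to explain
base = proximity_to(rf, X, X[i])
neighbors = np.argsort(base)[::-1][1:6]     # 5 nearest neighbors by proximity

importance = np.zeros(X.shape[1])
for j in range(X.shape[1]):
    x_perm = X[i].copy()
    x_perm[j] = rng.choice(X[:, j])         # replace feature j with a random draw
    drop = base[neighbors].mean() - proximity_to(rf, X, x_perm)[neighbors].mean()
    importance[j] = drop                    # bigger drop = feature matters more
```

A real implementation would presumably average over many permutation draws and work at the tree level; this just shows the shape of the question being answered.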

The practical result is that a single trained model (one set of trees, grown once) becomes a competitive alternative to workflows that would otherwise require 3-5 separate tools. For example, a recommender system that would normally need FAISS + XGBoost + SHAP + Isolation Forests + custom code can be done with 1-2 RFX-Fuse models.
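As one example of getting several of those capabilities from a single forest: Breiman's original writeup derives an outlier measure directly from the proximity matrix (raw outlyingness of sample i within its class is n / Σ_j prox(i, j)², normalized by the class median and median absolute deviation). A sketch of that on scikit-learn, again as an illustration of the classic method rather than the author's code:

```python
# Breiman's proximity-based outlier measure, sketched with scikit-learn.
# Raw outlyingness within a class: n / sum_j prox(i, j)^2 —
# samples with few close proximity-neighbors score high.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

X, y = load_iris(return_X_y=True)
rf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
leaves = rf.apply(X)
prox = (leaves[:, None, :] == leaves[None, :, :]).mean(axis=2)

outlyingness = np.zeros(len(X))
for c in np.unique(y):
    idx = np.where(y == c)[0]
    p = prox[np.ix_(idx, idx)].copy()
    np.fill_diagonal(p, 0.0)                # exclude self-proximity
    raw = len(idx) / (p ** 2).sum(axis=1)
    # Normalize within class by median / median absolute deviation,
    # following Breiman's description.
    med = np.median(raw)
    mad = np.median(np.abs(raw - med)) or 1.0
    outlyingness[idx] = (raw - med) / mad
```

The same trained trees that produced the classifier also produce the similarity matrix and the outlier scores — no separate Isolation Forest or nearest-neighbor index needed.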

    Special thanks to Dr. Adele Cutler for sharing original Breiman-Cutler source materials, which made this possible.

    Written in C++/CUDA with Python bindings. GPU and CPU versions available.

    Happy to answer questions about the implementation or methodology.