2 pointsby pberlizov6 hours ago1 comment
  • pberlizov6 hours ago
    This is ARL, a system I've built to automatically mitigate model drift. If you're building a model on temporal data where patterns will change over time, your model's performance will naturally drift, which means you must occasionally retrain. While tools exist to detect drift by observing changing patterns in the data, they don't provide teams with recourse beyond fully retraining their models and are simply an observability layer. Many still default to a once-a-[insert time interval here] approach to retraining, which is inefficient and expensive (in both time and direct costs).

    This system is designed to deal with model drift without blindly fully retraining: after detecting shifts, it learns from delayed labels and takes the smallest bounded steering step. Retrain only occurs when such a shift is unavoidable. This makes your system smarter, less expensive to run, and, as you will see below, outperforms raw periodic retraining.

    So far, it's been tested on public benchmarks in several industries and has performed well, delivering gains in utility/risk reduction on almost every one. On 3 core public fraud streams, it beats scheduled retraining on utility and reduces proxy risk by 7.2%, 8.7%, and 6.0% versus frozen. I encourage you to test it by running pip install "adaptive-reliability-layer[torch,serving]>=0.3.4" arl-demo. arl-demo is a quick-running toy dataset demo you can look at, or you can run a larger test suite with arl-hn-launch. The code is source-available under BUSL.

    If you deal with training models on time-series data, I'd love to hear from you at pberlizov@college.harvard.edu. Feel free to reach out with any tips, criticism, collaboration proposals, or if you'd be interested in trying it on your data! Thank you very much!