1 point by sentinelowl 7 hours ago | 1 comment
  • sentinelowl 7 hours ago
    Hey HN,

    I built LILITH, an open-source ML weather prediction system that runs on consumer hardware. The model trains in 15 minutes on an RTX 3060, the checkpoint is 22MB, and inference takes under a second.

    THE PROBLEM

    GraphCast, Pangu-Weather, and similar models are impressive but require:
    - ERA5 reanalysis data (controlled by ECMWF)
    - 80GB+ VRAM for inference
    - Institutional-scale compute

    Meanwhile, NOAA's GHCN dataset has 100K+ weather stations and 150+ years of data, all of it in the public domain.

    THE APPROACH

    Instead of requiring gridded reanalysis, LILITH learns directly from sparse station observations:

    - Transformer encoder on 30 days of historical data
    - Autoregressive decoder for multi-day prediction
    - Multi-timescale rollout: 6h steps for days 1-14, daily for 15-42, weekly for 43-90
    - Climate signal injection (ENSO, MJO) for extended range

    Total parameters: 1.87M. You could email the checkpoint.
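
To make the multi-timescale rollout concrete, here is a minimal sketch of the step schedule. It is not taken from the LILITH code: the function name `rollout_lead_hours`, the `stages` table, and the choice to clip the last weekly step so it lands exactly on day 90 are all assumptions for illustration.

```python
def rollout_lead_hours():
    """Return forecast lead times (in hours) for a three-stage rollout.

    Hypothetical helper: the stage boundaries come from the post,
    everything else is an assumption.
    """
    # (last day covered by the stage, step size in hours)
    stages = [(14, 6), (42, 24), (90, 7 * 24)]
    leads = []
    t = 0
    for last_day, step in stages:
        end = last_day * 24
        while t < end:
            # Assumption: clip the final step of a stage to its boundary,
            # so the weekly stage ends exactly at day 90.
            t = min(t + step, end)
            leads.append(t)
    return leads


leads = rollout_lead_hours()
print(len(leads), "steps, last lead at day", leads[-1] // 24)
```

Under these assumptions the decoder takes 91 autoregressive steps to reach day 90 (56 six-hourly, 28 daily, 7 weekly), versus 360 steps if every step were 6 hours. Fewer steps means fewer chances for errors to compound at long range.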

    RESULTS

    Trained on 915K sequences from 300 US stations:
    - Temperature RMSE: 3.96C
    - Temperature MAE: 3.01C
    - Climatology baseline is ~7C RMSE
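
For readers who want to relate these numbers, here is a small sketch of how RMSE, MAE, and a skill score against a baseline can be computed. The helper names and the toy arrays are hypothetical, and the skill definition (fractional RMSE reduction) is one common choice, not necessarily the one LILITH uses.

```python
import math


def rmse(pred, obs):
    """Root-mean-square error between two equal-length sequences."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))


def mae(pred, obs):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(p - o) for p, o in zip(pred, obs)) / len(obs)


def rmse_skill(model_rmse, baseline_rmse):
    """Fractional RMSE reduction vs. a baseline (1 = perfect, 0 = no better)."""
    return 1.0 - model_rmse / baseline_rmse


# Toy temperatures in C, invented purely to exercise the functions.
toy_pred = [1.0, 2.0, 3.0]
toy_obs = [1.0, 2.0, 5.0]
toy_rmse = rmse(toy_pred, toy_obs)
toy_mae = mae(toy_pred, toy_obs)

# Figures from the post: 3.96C model RMSE vs. roughly 7C for climatology.
skill = rmse_skill(3.96, 7.0)
print(f"RMSE skill vs climatology: {skill:.2f}")
```

With the reported figures this works out to roughly a 43% reduction in RMSE relative to climatology. The baseline is only quoted as "~7C", so treat that as approximate.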

    For context, this beats just predicting historical averages, though it is not GraphCast-accurate for short range. The value is accessibility, not beating ECMWF.

    HONEST LIMITATIONS

    - Days 1-7 are worse than operational models
    - 90-day "forecasts" are really climate outlooks, not weather predictions
    - Currently US stations only
    - No ensemble/uncertainty quantification yet

    TECH STACK

    - PyTorch 2.x with Flash Attention
    - FastAPI backend
    - Next.js 14 frontend with glassmorphism UI
    - Trains on 8GB VRAM with mixed precision

    The frontend has interactive 90-day charts, a station command center showing all 300 stations with predicted vs actual temps, and historical data exploration.

    WHY IT MATTERS

    Weather prediction has been an institutional monopoly. The data is public, consumer GPUs are powerful enough, and transformer architectures are well understood. There is no reason useful forecasting should be locked behind institutional walls.

    Would love feedback on the station-native approach versus requiring ERA5, and on whether the multi-timescale rollout makes sense for extended range.