3 points by pw 8 hours ago | 2 comments
  • pw 8 hours ago
    HHS released a massive dataset of every Medicaid payment to every provider in the US: 227 million rows covering $1.09 trillion in spending across 617,000 billing providers. The data was released explicitly to crowdsource fraud detection.

    The raw data is a 2.9 GB Parquet file. I built MedicaidSpending.org to make it searchable and browsable.

    You can search by provider name or NPI, browse by state/city/specialty, and see individual provider pages with monthly spending trends, billing code breakdowns, and automated billing flags for statistical outliers.

    Some of the patterns are striking. Brooklyn alone accounts for $31.8 billion in personal care services (code T1019), more than most states spend on all Medicaid combined. Some authorized officials control hundreds of billing entities. Early analysts scanning just 0.16% of providers flagged $90 billion in likely fraudulent payments.

    Technical details:
    - Go single binary, ~15 MB
    - 3.3 GB SQLite database (read-only, pre-aggregated from the 227M rows using DuckDB)
    - 900,000+ indexable pages generated from 13 templates
    - No JavaScript framework: server-rendered HTML, Chart.js for one chart per provider page
    - Runs on a single VPS behind Caddy

    Data sources: HHS Medicaid Provider Spending dataset, NPPES provider registry, HCPCS code descriptions, OIG exclusion list, NUCC taxonomy codes.

    All public data, no login required.

    • floxy 8 hours ago
      Thanks for doing this. I really like the idea of open/transparent government.
  • johng 2 hours ago
    Site appears to be down but based on your description this is amazing. If even 10% of this is fraud it needs to be fixed. I'd be surprised if fraud isn't actually somewhere closer to 30%.