1 point by hunterleaman 7 hours ago | 2 comments
  • hunterleaman 3 hours ago
    Every spirits label sold in the US needs a Certificate of Label Approval (COLA) from the TTB. Those records are public but locked behind a session-gated search interface: detail page URLs return empty HTML unless the request carries an active server-side session. I built this to make the data navigable. The dataset covers 12,000+ vodka label approvals, consolidated into 9,000+ product groups across 6,000+ brands and 2,400+ producers.

    The interesting part is DSP (Distilled Spirits Plant) permit cross-referencing. Every label lists statements like "Distilled by DSP-IN-15012" or "Bottled by DSP-KY-354." Mapping those permits to facilities reconstructs who actually produces what. About 1,035 producers show up as contract distillers, meaning they make products for brands that don't operate their own stills. The "distilled by" vs. "bottled by" distinction is legally meaningful, and the data reveals the real production topology of the US spirits market.
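
    A minimal sketch of the cross-referencing idea, not the actual pipeline code. It assumes label statements follow the "Distilled by DSP-XX-NNN" shape quoted above; the function names, the `permit_owner` lookup table, and the sample brand and facility names are all hypothetical.

```python
import re
from collections import defaultdict

# Assumed statement shape: a role phrase followed by a DSP permit number,
# e.g. "Distilled by DSP-IN-15012". Real labels may vary more than this.
PERMIT_RE = re.compile(
    r"\b(distilled|bottled)\s+by\s+(DSP-[A-Z]{2}-\d+)",
    re.IGNORECASE,
)


def extract_permits(label_text):
    """Return (role, permit) pairs found in one label's text."""
    return [
        (role.lower(), permit.upper())
        for role, permit in PERMIT_RE.findall(label_text)
    ]


def contract_distillers(labels, permit_owner):
    """Map each distilling producer to the other brands it distills for.

    labels: iterable of (brand, label_text) pairs.
    permit_owner: dict mapping a permit number to a producer name
                  (hypothetical lookup table).
    """
    clients = defaultdict(set)
    for brand, text in labels:
        for role, permit in extract_permits(text):
            producer = permit_owner.get(permit)
            # Only "distilled by" statements count, and a producer
            # distilling its own brand is not a contract relationship.
            if role == "distilled" and producer and producer != brand:
                clients[producer].add(brand)
    return dict(clients)


# Hypothetical sample data; the permit numbers are the ones quoted above,
# but the brand and facility names are made up.
labels = [
    ("Brand X", "Distilled by DSP-IN-15012 and bottled by DSP-KY-354"),
    ("Facility A", "Distilled by DSP-IN-15012"),
]
permit_owner = {"DSP-IN-15012": "Facility A", "DSP-KY-354": "Facility B"}
result = contract_distillers(labels, permit_owner)
```

    Keeping the role next to each permit is what lets "distilled by" and "bottled by" be treated differently downstream; here only the distilling role marks a contract relationship.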

    Stack: a Python pipeline pulls TTB CSV exports and scrapes HTML detail pages for status, formula IDs, producer names, and label images. The site is statically generated with Astro 5 and hosted on Cloudflare Pages (Lighthouse 100). A Worker + KV handles TTB ID redirects.

    Starting with vodka, expanding to other categories. Entirely public government data. No accounts, no paywall.

    Contact: team@buy.vodka

  • hunterleaman 7 hours ago
    [dead]