1 point by MelStan 7 hours ago | 1 comment
  • MelStan 7 hours ago
    OP here. I built this out of sheer frustration with how slow traditional financial wires are.

    If a refinery gets hit or a tanker gets targeted in the Middle East, Bloomberg and Reuters usually take 20 to 45 minutes to get a verified headline out. By the time it hits retail screens, the algorithms have already moved the Brent Crude (UKOIL) market.

    I wanted to get as close to the actual event as possible (T+0), so I built a pipeline that pulls raw OSINT chatter from Middle Eastern defense wires and intelligence nodes every 60 seconds, parses it, and spits it out as machine-readable JSON.

    The hardest part of this wasn't the scraping - it was the echo chamber. Middle East OSINT is insanely noisy. One missile gets fired, and 5 different channels report the exact same thing phrased slightly differently within a 2-minute window. If you plug a trading bot directly into that, it fires 5 times and you get liquidated.

    I initially tried throwing the feeds at an LLM to filter duplicates, but it added a 4-second delay and hallucinated correlations that weren't there. I ended up ripping that out and writing a strict Jaccard overlap filter. It's purely lexical: it strips noise words, compares the core nouns of each new report against a rolling memory of the last 100 events, and quietly burns duplicates in about 40ms.
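For anyone who wants to picture the filter, here is a minimal sketch of the idea, not the production code. The noise-word list, the 0.6 threshold, and the names `DedupeFilter`, `tokens`, and `jaccard` are all assumptions for illustration, and it keeps every non-noise token instead of doing real noun extraction.

```python
import re
from collections import deque

# Hypothetical noise-word list; the real one is not shown in this post.
NOISE_WORDS = {
    "a", "an", "the", "in", "on", "at", "of", "to", "near", "and", "by",
    "is", "was", "has", "have", "been", "reports", "reported", "breaking",
}

def tokens(text):
    """Lowercase, split on non-alphanumerics, and drop noise words."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return frozenset(w for w in words if w not in NOISE_WORDS)

def jaccard(a, b):
    """Jaccard similarity |A & B| / |A | B|; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

class DedupeFilter:
    def __init__(self, threshold=0.6, memory=100):
        # threshold is an assumed value; memory=100 matches the rolling window above.
        self.threshold = threshold
        self.recent = deque(maxlen=memory)

    def is_new(self, text):
        """Return True and remember the event unless it overlaps a recent one."""
        current = tokens(text)
        if any(jaccard(current, seen) >= self.threshold for seen in self.recent):
            return False
        self.recent.append(current)
        return True
```

The `deque(maxlen=100)` silently evicts the oldest event once the window is full, so each check is at most 100 set comparisons. You would call `is_new()` on every parsed report and only forward the ones that return True.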

    To actually prove the data is useful, I added a background sweeper. When a major energy strike is flagged, it logs the live UKOIL price. Exactly two hours later, it wakes up, fetches the T+2h price, and maps the impact so you can backtest the geopolitical risk premium.
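Roughly, the sweeper logic looks like the sketch below. This is a simplified in-memory version under stated assumptions: `ImpactSweeper`, `fetch_price`, and the result field names are hypothetical, the price source is whatever callable you pass in, and a real deployment would need to persist pending events across restarts.

```python
import heapq
import time

class ImpactSweeper:
    def __init__(self, fetch_price, delay_s=2 * 60 * 60, clock=time.time):
        # fetch_price: hypothetical callable returning the current UKOIL price.
        self.fetch_price = fetch_price
        self.delay_s = delay_s          # two hours, per the description above
        self.clock = clock              # injectable so the logic can be tested
        self._pending = []              # min-heap of (due_time, seq, event_id, price_at_flag)
        self._seq = 0                   # tie-breaker for events due at the same time
        self.results = {}

    def flag(self, event_id):
        """Log the live price when an event is flagged and schedule the follow-up."""
        price_now = self.fetch_price()
        due = self.clock() + self.delay_s
        heapq.heappush(self._pending, (due, self._seq, event_id, price_now))
        self._seq += 1

    def sweep(self):
        """Resolve every event whose delay has elapsed; return their ids."""
        done = []
        now = self.clock()
        while self._pending and self._pending[0][0] <= now:
            _, _, event_id, p0 = heapq.heappop(self._pending)
            p1 = self.fetch_price()
            self.results[event_id] = {
                "price_t0": p0,
                "price_t2h": p1,
                "pct_change": (p1 - p0) / p0 * 100,
            }
            done.append(event_id)
        return done
```

Calling `sweep()` from a periodic job (say once a minute) is enough; the heap keeps the next due event at the front, so an idle sweep only inspects one entry.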

    I put a heavy edge-cache on the API so the backend doesn't melt. If you want to bypass the UI and just look at the raw payload in your terminal, you can hit the demo key here:

    curl -s "https://kineticalpha.io/api/v1/stream?key=HN-DEMO" | jq

    Happy to talk about the filtering math, the latency challenges, or how messy raw war-zone data actually is.