3 points by 9wzYQbTYsAIc 3 hours ago | 2 comments
  • 9wzYQbTYsAIc 3 hours ago
    A few things I'd flag upfront before anyone digs in:

    The corpus is HN front-page stories only — self-selected, tech-heavy, English-language. The aggregate patterns (authorship rates, expert-knowledge assumptions, retrospective framing) describe this specific feed, not journalism or the web broadly. HN skews in ways that make some results unsurprising (high jargon density) and others more interesting (still only 18% conflict-of-interest disclosure, even in content written for a technically sophisticated audience).

    The structural channel for the free Llama models (Llama 4 Scout, Llama 3.3 70B on Workers AI) is genuinely noisy — 86% of their structural scores land on two integers. I say so in the post, but it's worth repeating: those model scores should be weighted accordingly. Claude Haiku 4.5 full evaluations are more reliable; the calibration set and baselines are on the About page.
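
    If anyone wants to sanity-check that concentration figure against the exported scores, the check is nothing fancy. A minimal Python sketch (the score list is a placeholder, not the actual export format):

      from collections import Counter

      def top2_concentration(scores):
          # fraction of integer scores falling on the two most common values
          counts = Counter(scores)
          top_two = sum(n for _, n in counts.most_common(2))
          return top_two / len(scores)

      # e.g. top2_concentration(llama_structural_scores) comes out around 0.86
      # for the free models in the current data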

    The methodology itself is preliminary. I haven't done external validation — no convergent validity test against an independent rights-focused coding scheme, no discriminant validity test against plain sentiment. Phase 0 of the roadmap is construct validity work. If anyone has experience in psychometrics or NLP evaluation and wants to poke at this, I'd genuinely welcome it.
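
    For the curious, the Phase 0 checks I have in mind are roughly this shape, sketched in Python (scipy assumed; all three inputs are placeholders, since none of those datasets exist yet):

      from scipy.stats import pearsonr

      def validity_check(model_rights_scores, human_rights_codes, sentiment_scores):
          # convergent: scores should correlate with an independent rights-focused coding
          r_convergent, _ = pearsonr(model_rights_scores, human_rights_codes)
          # discriminant: scores should not simply track plain sentiment
          r_discriminant, _ = pearsonr(model_rights_scores, sentiment_scores)
          # rough success criterion: r_convergent clearly higher than r_discriminant
          return r_convergent, r_discriminant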

  • 9wzYQbTYsAIc 3 hours ago
    The most useful entry point is probably the detail page for a specific story you already have an opinion on. The homepage shows aggregate patterns (rights heatmap, transparency rates, SETL tension by provision), but the detail page is where you can actually audit the reasoning.

    On a detail page: the per-provision table shows which of the 31 UDHR provisions the model touched, with direction and evidence strength. Expand any provision row and you get the Fair Witness breakdown — observable facts the model cited (e.g., "article has no byline, no author bio link, no Twitter attribution") vs. the inference drawn (e.g., "assessed as reducing authorship transparency, negative on Article 19"). The FW Ratio column at the bottom shows the fact-to-inference ratio for the whole evaluation.
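
    The ratio itself is just counting. Roughly this, assuming a simple per-evaluation aggregation (field names here are hypothetical, not the real schema):

      def fw_ratio(provisions):
          # provisions: per-provision dicts, each with 'facts' and 'inferences' lists
          facts = sum(len(p["facts"]) for p in provisions)
          inferences = sum(len(p["inferences"]) for p in provisions)
          return facts / inferences if inferences else float("inf")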

    The most useful thing you can do: find a provision where the inference doesn't follow from the facts. That's the exact failure mode I'm trying to surface. "Defensible evidence, defensible chain, wrong normative call" is what I can't catch from inside the system.