2 points by kreerc 3 hours ago | 1 comment

kreerc 3 hours ago
Introducing WebAccessBench, a novel benchmark for AI language models to assess accessibility quality and WCAG conformance in generated web interfaces under realistic prompting conditions.
I did a bit of research and found that LLMs are incredibly bad at basic digital accessibility tasks. You can compare models and read the full white paper at conesible.de/wab.
Overall, the data show that guiding a model with expert-grade prompts has very little effect beyond what a small nudge achieves. The benchmark results suggest that the objective error count is too high to rely on LLM technology at all in digital accessibility work, even under explicit expert guidance. They also suggest massive implications for society at large, and major discrimination against people with disabilities.