This was experimental tech... while I admire cities attempting to implement AI, it seems they did not spend enough tax dollars on it!
[0] https://abc7ny.com/post/ai-artificial-intelligence-eric-adam...
Or was it perhaps one of those cases where they found issues, but the only way to really know for sure that the deleterious impact is significant enough is by pushing it to prod?
I'm sure they QA'd it, but QA was probably "does this give me good results" (almost certainly 'yes' with an LLM), not "does this consistently not give me bad results".
LLMs can handle search because search is intentionally garbage now and because they can absorb that into their training set.
Asking highly specific questions about NYC governance, which can change daily, is almost certainly 'not' going to give you good results with an LLM. The technology is not well suited to this particular problem.
Meanwhile, if an LLM actually did give you good results, it's an indication that the city is so bad at publishing information that citizens cannot reasonably discover it on their own. This is a fundamental problem and should be solved instead of layering a $600k barely working "chat bot" on top of the mess.
The thing is (and maybe this is what parent meant by non-determinism, in which case I agree it's a problem), in this brave new technological use-case, the space of possible interactions dwarfs anything machines have dealt with before. And it seems inevitable that the space of possible misunderstandings which can arise during these interactions will balloon similarly, simply because of the radically different nature of our AI interlocutor compared to what (actually, who) we're used to interacting with in this world of representation and human life situations.
It's the training data that matters. Your "AI interlocutor" is nothing more than a lossy compression algorithm.
Considering Louis Rossmann's videos on his adventures with NYC bureaucracy (e.g. [0]), the QAers might not have known the laws any better than the chat bot.
When was the last time there was positive news involving Microsoft? This bot could've easily been on AWS or GCP, but I find it hilarious that here they are, getting dragged yet again.
Journalism works.