If you follow generative AI news at all, you know that LLM chatbots have a tendency to “confabulate” false information while presenting it as authoritative truth. That tendency now seems poised to cause serious problems: a chatbot run by the New York City government is making up incorrect answers to some important questions about local law and city policy.
New York City's MyCity chatbot launched as a “pilot” program last October. The announcement touted it as a way to help business owners “save time and money by instantly providing actionable, reliable information from more than 2,000 NYC Business web pages and articles on topics such as code and regulatory compliance, available business incentives, and best practices to avoid violations and fines.”
But a new report from The Markup and local nonprofit news site The City found that the MyCity chatbot was giving dangerously wrong information about some fairly basic city policies. To cite just one example, the bot said that New York City buildings “do not have to accept Section 8 vouchers,” even though the city government's own information page clearly states that Section 8 housing subsidies are one of many lawful sources of income that landlords must accept without discrimination. The Markup also received incorrect information in response to chatbot queries about worker pay and working-hour regulations, as well as industry-specific information such as funeral home pricing.
Further testing by BlueSky user Kathryn Tewson found the MyCity chatbot giving some dangerously wrong answers about the treatment of workplace whistleblowers, as well as some hilariously bad answers about the need to pay rent.
This will continue to happen
The results aren't all that surprising if you take a closer look at the token-based predictive models that power these kinds of chatbots. MyCity's Microsoft Azure-powered chatbot uses a complex process of statistical association across millions of tokens to essentially guess the most likely next word in any given sequence.
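For readers unfamiliar with how that works, here is a minimal, illustrative sketch of next-token prediction. The tiny vocabulary and the logit values are invented for this example; a real LLM scores tens of thousands of possible tokens using billions of learned parameters, but the final step is the same: convert raw scores into probabilities and pick a plausible next word.

```python
import math
import random

def softmax(logits):
    """Convert raw model scores (logits) into a probability distribution."""
    exps = [math.exp(x) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Toy vocabulary and made-up scores for a prompt like
# "Landlords must accept Section 8 ..." -- purely illustrative values.
vocab = ["vouchers", "payments", "never", "sometimes"]
logits = [2.4, 1.1, 0.3, 0.2]

probs = softmax(logits)
for token, p in zip(vocab, probs):
    print(f"{token}: {p:.2f}")

# Greedy decoding: always take the highest-probability token.
print("greedy pick:", vocab[probs.index(max(probs))])

# Sampled decoding: draw from the distribution, so lower-probability
# (and possibly wrong) continuations still get chosen some of the time.
print("sampled pick:", random.choices(vocab, weights=probs)[0])
```

Note that nothing in this process checks the guess against the facts; the model is choosing words by statistical likelihood, not verifying claims.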
This can cause problems when the single factual answer to a question isn't precisely reflected in the training data. In fact, The Markup says that at least one of its tests yielded the correct answer to the same question about Section 8 housing voucher acceptance (even as “10 separate Markup staffers” got the incorrect answer when repeating the same question).
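Since chatbots typically sample from the model's probability distribution rather than always taking the single top-ranked token, the same prompt can produce different answers on different runs. Here is a minimal sketch of that behavior, using made-up answer strings and probabilities chosen purely for illustration:

```python
import random

# Hypothetical probabilities a model might assign to two competing
# answers for the same question -- the values are invented.
answers = [
    "Landlords must accept Section 8 vouchers.",       # correct
    "Buildings do not have to accept Section 8 vouchers.",  # incorrect
]
probs = [0.7, 0.3]  # even a mostly-right model is sometimes wrong

random.seed(42)  # fixed seed so this demo is reproducible
for i in range(10):
    # Each repeated query is an independent draw from the distribution.
    print(f"query {i + 1}: {random.choices(answers, weights=probs)[0]}")
```

With odds like these, repeating the question ten times would be expected to surface the wrong answer several times, which is consistent with what The Markup's staffers saw.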
The MyCity chatbot, which is prominently labeled as a “beta” product, does warn users who bother to read the disclaimers that it “may produce inaccurate, harmful, or biased content” and that its responses “should not be relied upon as a substitute for professional advice.” But the page also goes out of its way to state that the bot is “trained to provide official NYC Business information” and is being sold as a way to “help business owners navigate government.”
Andrew Rigie, executive director of the NYC Hospitality Alliance, told The Markup that he has encountered inaccuracies from the bot himself and has received similar reports from at least one local business owner. But Leslie Brown, a spokesperson for the NYC Office of Technology and Innovation, told The Markup that the bot has “already provided thousands of people with timely, accurate answers” and that “we will continue to focus on upgrading this tool so that we can better support small businesses across the city.”
The Markup's report highlights the danger of governments and companies rolling out chatbots to the public before they have been thoroughly vetted for accuracy and reliability. Last month, a court forced Air Canada to honor a refund policy invented by a chatbot available on its website. A recent Washington Post report found that chatbots integrated into major tax preparation software provide “random, misleading, or inaccurate” answers to many tax questions. And some crafty prompt engineers have reportedly been able to trick a car dealership's chatbot into accepting a “legally binding offer – no take backsies” for a $1 car.
These kinds of issues are already leading some companies away from more generalized LLM-powered chatbots and toward retrieval-augmented generation (RAG) models, which are tuned to draw on only a small set of relevant, vetted information. That kind of focus could become even more important if the FTC's efforts to hold chatbots liable for “false, misleading, or defamatory” information are successful.
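As a rough sketch of what the retrieval-augmented approach looks like: instead of letting the model answer from whatever its training data happened to contain, the system first retrieves relevant passages from a curated document set and instructs the model to answer only from that retrieved text. Everything below (the documents, the keyword-overlap scoring, and the prompt wording) is an invented, simplified stand-in; real deployments use embedding-based vector search and an actual LLM call rather than the placeholders shown here.

```python
# Minimal retrieval-augmented generation (RAG) skeleton. The documents,
# scoring method, and prompt format are illustrative assumptions, not
# any specific product's implementation.

DOCUMENTS = [
    "Landlords must accept Section 8 housing vouchers as a lawful source of income.",
    "Funeral homes must display pricing information for their services.",
    "Employers must pay workers at least the city minimum wage.",
]

def retrieve(question: str, docs: list[str], k: int = 1) -> list[str]:
    """Rank documents by naive keyword overlap with the question.
    Real systems use vector embeddings and a similarity search index."""
    q_words = set(question.lower().split())
    scored = sorted(docs,
                    key=lambda d: len(q_words & set(d.lower().split())),
                    reverse=True)
    return scored[:k]

def build_prompt(question: str, context: list[str]) -> str:
    """Constrain the model to the retrieved, vetted text."""
    joined = "\n".join(context)
    return (f"Answer using ONLY the following official text:\n{joined}\n\n"
            f"Question: {question}\n"
            f"If the text does not answer it, say you don't know.")

question = "Do landlords have to accept Section 8 vouchers?"
prompt = build_prompt(question, retrieve(question, DOCUMENTS))
print(prompt)  # this prompt would then be sent to the LLM
```

The design trade-off is narrower coverage in exchange for answers grounded in documents the operator actually controls, which makes wrong answers easier to trace and fix.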