Researchers call ChatGPT Search answers ‘confidently wrong’

A study from Columbia University has found that ChatGPT Search plays fast and loose with accuracy in the answers it returns.

Dec 4, 2024 - 21:05




ChatGPT was already a threat to Google Search, and ChatGPT Search was supposed to clinch its victory while also serving as an answer to Perplexity AI. But according to a newly released study by Columbia’s Tow Center for Digital Journalism, ChatGPT Search struggles to provide accurate answers to its users’ queries.

The researchers selected 20 publications spanning three categories: those partnered with OpenAI to use their content in ChatGPT Search results, those involved in lawsuits against OpenAI, and unaffiliated publishers who have either allowed or blocked ChatGPT’s crawler.

“From each publisher, we selected 10 articles and extracted specific quotes,” the researchers wrote. “These quotes were chosen because, when entered into search engines like Google or Bing, they reliably returned the source article among the top three results. We then evaluated whether ChatGPT’s new search tool accurately identified the original source for each quote.”

Forty of the quotes were taken from publications that are currently suing OpenAI and have not allowed their content to be scraped. But that didn’t stop ChatGPT Search from confidently hallucinating an answer anyway.

“In total, ChatGPT returned partially or entirely incorrect responses on a hundred and fifty-three occasions, though it only acknowledged an inability to accurately respond to a query seven times,” the study found. “Only in those seven outputs did the chatbot use qualifying words and phrases like ‘appears,’ ‘it’s possible,’ or ‘might,’ or statements like ‘I couldn’t locate the exact article.’”

ChatGPT Search’s cavalier attitude toward telling the truth could harm not just its own reputation but also the reputations of the publishers it cites. In one test during the study, the AI misattributed a Time story as being written by the Orlando Sentinel. In another, the AI didn’t link directly to a New York Times piece, but rather to a third-party website that had copied the news article wholesale.

OpenAI, unsurprisingly, argued that the study’s results were due to Columbia doing the tests wrong.

“Misattribution is hard to address without the data and methodology that the Tow Center withheld,” OpenAI told the Columbia Journalism Review in its defense, “and the study represents an atypical test of our product.”

The company promises to “keep enhancing search results.”

Andrew Tarantola

Andrew Tarantola is a journalist with more than a decade reporting on emerging technologies ranging from robotics and machine…

This massive upgrade to ChatGPT is coming in January — and it’s not GPT-5


OpenAI is set to launch a new AI agent in January, code-named Operator, that will enable ChatGPT to take action on the user's behalf. You may never have to book your own flights ever again.

The company's leadership made the announcement during a staff meeting Wednesday, reports Bloomberg. The company plans to roll out the new feature as a research preview through the company’s developer API.

Is AI already plateauing? New reporting suggests GPT-5 may be in trouble


OpenAI's next-generation Orion model of ChatGPT, which is both rumored and denied to be arriving by the end of the year, may not be all it's been hyped to be once it arrives, according to a new report from The Information.

Citing anonymous OpenAI employees, the report claims the Orion model has shown a "far smaller" improvement over its GPT-4 predecessor than GPT-4 showed over GPT-3. Those sources also note that Orion "isn’t reliably better than its predecessor [GPT-4] in handling certain tasks," specifically coding applications, though the new model is notably stronger at general language capabilities, such as summarizing documents or generating emails.

ChatGPT monthly usage may now rival Google Chrome


A number of popular generative AI platforms are seeing consistent growth as users figure out how they want to use the tools, and ChatGPT is at the top of the list with the most visits, at 3.7 billion worldwide. So many people are visiting the AI chatbot that its figures rival browser usage: the only comparable number is Google Chrome's monthly user base, which is estimated at around 3.45 billion.

Statistics from Similarweb indicate that ChatGPT saw 17.2% month-over-month (MoM) growth and 115.9% year-over-year (YoY) traffic growth. Some highlights that spurred ChatGPT's growth during 2024 include its parent company, OpenAI, moving its web address from a subdomain, chat.openai.com, to a main domain, chatgpt.com. The tool saw a particular surge of traffic in May 2024, when it hit a 2.2-billion-visit milestone, and has been growing ever since, according to Similarweb researcher David F. Carr.