A recent Columbia University study highlights significant inaccuracies in how OpenAI’s ChatGPT Search cites news publishers, with 153 of 200 evaluated responses found to be incorrect.
OpenAI’s ChatGPT Search is reportedly struggling to cite news publishers accurately, according to a recent study by Columbia University’s Tow Center for Digital Journalism. The scrutiny comes shortly after OpenAI launched ChatGPT Search, a product the company said was built in close collaboration with the news industry to ensure proper attribution and visibility for publishers.
The study’s findings raise concerns that misquoting and incorrect attribution could undermine the credibility of news organisations. Of 200 queries evaluated across 20 publications, 153 produced incorrect responses. Such misrepresentation can jeopardise brand visibility and erode the control publishers have over their own content, a critical factor in an increasingly competitive media landscape.
OpenAI’s push into news search follows its rollout of ChatGPT in 2022, after which many publishers expressed concern that their content had been used to train AI models without consent. In a bid to address these concerns, OpenAI now lets publishers control their participation in ChatGPT Search results through the robots.txt file, as sketched below. However, the Tow Center’s research indicates that whether a publisher opts in or out, the risks of misattribution and misrepresentation still loom large.
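For illustration, here is a minimal robots.txt sketch of how a publisher might exercise that control. The user-agent tokens OAI-SearchBot (search indexing) and GPTBot (model training) are the ones OpenAI has publicly documented, but publishers should verify the current names against OpenAI’s own crawler documentation before relying on them.

```
# Sketch of a robots.txt that opts a site out of ChatGPT Search
# while separately controlling the training crawler.
# User-agent tokens follow OpenAI's published crawler documentation;
# confirm current names with OpenAI before deploying.

# Block the crawler that feeds ChatGPT Search results
User-agent: OAI-SearchBot
Disallow: /

# Separately block the crawler used for model training
User-agent: GPTBot
Disallow: /

# Leave all other crawlers unaffected
User-agent: *
Allow: /
```

As the Tow Center’s findings make clear, though, these directives govern crawling; they do not guarantee how, or whether, a publisher’s content ultimately surfaces in ChatGPT Search.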
The accuracy problems identified during the evaluation are particularly concerning. The AI’s frequent failure to acknowledge its mistakes raises questions about user trust and the quality of information being disseminated. Hedging language such as “possibly” appeared in just seven responses, suggesting the model tends to prioritise user satisfaction over factual correctness. ChatGPT Search was also markedly inconsistent: identical questions yielded different answers, highlighting a potential flaw in its underlying language model.
The problem extends beyond flawed attribution. The report found that the chatbot sometimes cites copied or syndicated articles instead of the original source, which can sow confusion and diminish the authority of the original publishers. For instance, queries about New York Times content showed ChatGPT linking to unauthorised copies of articles, even as the paper pursues an ongoing lawsuit against OpenAI and blocks its crawlers. Likewise, a request for quotes from MIT Technology Review yielded citations to syndicated copies rather than direct links to the original pieces.
These challenges point to a broader issue with how OpenAI retrieves and filters content. As the study notes, allowing crawlers access does not guarantee visibility, and blocking them does not fully prevent material from appearing in search results. This has significant implications for publishers, who may find their brand misrepresented or their content inadequately credited in an environment increasingly dominated by AI-driven tools.
In response to the findings, an OpenAI spokesperson reiterated the organisation’s commitment to supporting publishing partners by helping ChatGPT’s 250 million weekly users find quality content through improved summaries and attribution. While OpenAI has acknowledged the difficulty of resolving specific attribution errors, the company says it remains committed to refining its search product.
As the news industry watches these developments keenly, the outcomes of ongoing legal challenges could lead to more robust control for publishers over their content.
Source: Noah Wire Services