Users turned to AI chatbots like Grok, ChatGPT and Gemini for fact-checking, only to encounter frequent inaccuracies.
A four-day flare-up between India and Pakistan has exposed the limitations of AI chatbots as tools for verifying information in real time. As images and videos flooded social media, users across both countries increasingly turned to platforms like xAI’s Grok, OpenAI’s ChatGPT and Google’s Gemini for answers. Instead, many were met with confusion, errors and even fabricated claims.
On X (formerly Twitter), queries such as “Hey @Grok, is this true?” became common as people sought instant fact-checks for fast-moving developments. But the bots often got it wrong. One widely shared video, which in fact showed a years-old missile strike in Sudan, was misidentified by Grok as live footage from Pakistan. Another clip, filmed in Nepal, was falsely described as footage of Pakistan’s retaliation against Indian airstrikes.
These incidents highlighted just how poorly such tools handle context, especially in high-stakes geopolitical conflicts where falsehoods can escalate tensions. The spread of mislabelled content, sometimes endorsed or left unchallenged by AI models, fuelled misinformation at a moment when reliable facts were in high demand.
India and Pakistan both face well-documented problems with online disinformation, particularly during military confrontations. But while social media platforms have historically leaned on human fact-checkers to contain the spread of fakes, many are now scaling back those efforts. Meta, for example, has shifted focus to community-driven moderation, despite growing evidence that such systems are slow and easily manipulated.
A Tow Center study found that AI-generated answers are often vague or hedged, offering neither confirmation nor clear correction. Gemini, in one test, accepted an AI-generated image as real and invented details about where it came from. NewsGuard has documented how popular chatbots routinely echo false narratives, from Russian propaganda to misleading political claims.
The India–Pakistan episode offered a stark example of what’s at stake in volatile regions with active online populations and intense political rivalries. When artificial intelligence fails to clarify the truth, or worse, spreads falsehoods, public trust suffers, and the risk of real-world consequences rises.
xAI has blamed some of Grok’s failings on “unauthorised modifications”. The real problem, critics say, is a rush to deploy generative AI tools before they are reliable. As one media researcher noted, “These chatbots sound confident, but they’re often wrong. And in places like India and Pakistan, that can be dangerous.”
Source: Noah Wire Services
- https://www.fox41yakima.com/hey-chatbot-is-this-true-ai-factchecks-sow-misinformation/ – Please view link – unable to access data
- https://www.ft.com/content/55c08fc8-2f0b-4233-b1c6-c1e19d99990f – This article examines the parallels between generative AI models and the concept of ‘bullshit’ as defined by philosopher Harry Frankfurt. It discusses how AI language models like ChatGPT and Claude generate plausible-sounding content regardless of factual accuracy, termed ‘hallucinations’. The piece highlights the risks of spreading misinformation without intentional deceit and the challenges in improving AI truthfulness, including issues of bias and subjective value judgments. It also references a real court case involving fabricated AI-generated legal citations, underscoring the serious consequences of AI unreliability.
- https://www.techradar.com/computing/artificial-intelligence/ive-got-bad-news-if-you-use-chatgpt-or-any-other-ai-as-your-main-search-tool – This article discusses the current unreliability of AI tools like ChatGPT, Perplexity, and Google’s Gemini for web searches. It highlights research indicating that these AI models often present inaccurate information and package it in confusing ways. The Tow Center for Digital Journalism tested several models and found a high rate of incorrect answers. The article emphasizes that AI still requires human oversight and is not currently ready to replace traditional search engines effectively.
- https://apnews.com/article/cc50dd0f3f4e7cc322c7235220fc4c69 – A report by artificial intelligence experts and bipartisan election officials highlights the risks associated with chatbots like GPT-4 and Google’s Gemini during the U.S. presidential primaries. The study found that over half of the chatbots’ responses were inaccurate, with 40% deemed harmful. Chatbots provided false details about voting locations and procedures, revealing their capacity to amplify long-standing electoral misinformation. The article underscores significant concerns as AI tools become increasingly integrated into political processes.
- https://www.theatlantic.com/technology/archive/2024/08/chatbots-false-memories/679660/?utm_source=apple_news – This article explores how AI chatbots and generative AI, integrated into everyday tools like search engines and social media, increase the risk of misinformation. Research indicates that AI-generated responses can be highly persuasive, leading people to believe and internalize false information. Studies have shown that chatbots can implant false memories and distort reality, much like human manipulators. The piece highlights concerns as people use these tools to gather health information and make voting decisions, emphasizing the need for scrutiny to prevent the spread of misinformation.
- https://time.com/6255162/big-tech-ai-misinformation-trust/ – This article discusses how major tech companies like Microsoft, Google, and Meta are responding to the advancements of OpenAI’s generative AI products, particularly the chatbot ChatGPT. It contrasts OpenAI’s success with ChatGPT, due to its thorough human feedback process and toxicity filters, with Meta’s Galactica, which lacked sufficient protective measures against harmful content. The piece underscores the importance of building trustworthy AI systems that do not exacerbate misinformation, highlighting the ongoing debate about balancing rapid AI development with societal risks.
- https://www.ft.com/content/547f9a3a-1399-4623-86e0-6764b873b9f1 – Elon Musk’s AI start-up, xAI, has launched Grok-2, a new chatbot that competes with top-tier models like OpenAI’s ChatGPT and Google’s Gemini. Grok-2, available to subscribers of Musk’s social media platform X, is expected to be released for enterprise developers soon. This new model ranks among the top five globally and aims to improve on past issues with AI ‘hallucinations’. The article discusses the performance and potential of Grok-2 in various applications, despite challenges related to data privacy and competition.
Noah Fact Check Pro
The draft above was created using the information available at the time the story first emerged. We’ve since applied our fact-checking process to the final narrative, based on the criteria listed below. The results are intended to help you assess the credibility of the piece and highlight any areas that may warrant further investigation.
Freshness check
Score:
8
Notes:
The narrative presents recent concerns about AI chatbots spreading misinformation, particularly during critical events. Similar issues have been reported before, such as chatbots generating inaccurate election-related information. ([apnews.com](https://apnews.com/article/ai-chatbots-elections-artificial-intelligence-chatgpt-falsehoods-cc50dd0f3f4e7cc322c7235220fc4c69?utm_source=openai)) However, the specific incidents and studies cited are recent, with some findings published within the last few months, and the updated data supports a higher freshness score even though the underlying concerns are not new. The narrative also appears to be based on a press release, which typically warrants a high freshness score. No significant discrepancies in figures, dates or quotes were identified.
Quotes check
Score:
9
Notes:
The narrative includes direct quotes from various sources. The earliest known usage of these quotes was found in recent publications, indicating that they are not recycled from older material. No variations in wording were noted, suggesting consistency in the quotes used. The absence of earlier appearances of these quotes supports the originality of the content.
Source reliability
Score:
7
Notes:
The narrative originates from a reputable organisation, which strengthens its credibility. However, the specific claims rest on recent studies and incidents, and the reliance on a single source for those claims introduces some uncertainty. The broader concern that AI chatbots spread misinformation during critical events has been corroborated by other reputable outlets. ([apnews.com](https://apnews.com/article/ai-chatbots-elections-artificial-intelligence-chatgpt-falsehoods-cc50dd0f3f4e7cc322c7235220fc4c69?utm_source=openai))
Plausibility check
Score:
8
Notes:
The narrative discusses the growing reliance on AI chatbots for fact-checking and the associated risks of misinformation. This aligns with recent findings that AI chatbots have been generating inaccurate and misleading information, particularly during critical events. ([apnews.com](https://apnews.com/article/ai-chatbots-elections-artificial-intelligence-chatgpt-falsehoods-cc50dd0f3f4e7cc322c7235220fc4c69?utm_source=openai)) The claims made are plausible and supported by recent studies. The language and tone are consistent with the region and topic, and the structure focuses on the main issue without excessive or off-topic detail. The tone is appropriately serious, given the subject matter.
Overall assessment
Verdict (FAIL, OPEN, PASS): PASS
Confidence (LOW, MEDIUM, HIGH): HIGH
Summary:
The narrative presents recent concerns about AI chatbots spreading misinformation, particularly during critical events. The content is fresh, with updated data and recent studies supporting the claims made. The quotes used are original and consistent, and the source is reputable. The claims are plausible and supported by recent findings. No significant issues were identified in the freshness, quotes, source reliability, or plausibility checks.