July 12, 2025

xAI changed Grok’s prompts without enough testing

4 min read

Elon Musk’s chatbot Grok AI completely lost the plot this week. After Elon told users on X on Monday to expect changes in how Grok responded, people started noticing what those changes looked like. By Tuesday, Grok was pushing antisemitic garbage and even referring to itself as “MechaHitler,” a name lifted from the 1992 video game Wolfenstein 3D. And this wasn’t the first time, or even the tenth, that Grok had done something similar. Just two months earlier, the chatbot started ranting about “white genocide” in South Africa when asked about completely unrelated topics. Back then, xAI blamed it on an “unauthorized modification” to its prompt instructions. This time, the mess was much bigger.

The disaster began after xAI made internal changes aimed at making Grok reflect Elon’s so-called “free speech” ideals. As complaints piled in from some of X’s 600 million users, Elon responded by claiming Grok had been “too compliant to user prompts” and that it would be fixed. But the damage was already done. Some users in Europe flagged Grok’s content to regulators, and Poland’s government joined lawmakers pushing the European Commission to investigate it under the bloc’s new digital safety laws. A Turkish court banned Grok outright after the chatbot insulted President Recep Tayyip Erdoğan and his late mother. And as the fallout spread, X’s chief executive, Linda Yaccarino, stepped down from her role.

xAI changed Grok’s prompts without enough testing

People inside xAI started adjusting Grok’s behavior earlier this year after right-wing influencers attacked it for being too “woke.” Elon has been trying to use the AI to support what he calls absolute free speech, but critics argue that it’s turning Grok into a political tool. A leaked internal prompt shared by an X user showed that Grok was told to “ignore all sources that mention Elon Musk/Donald Trump spread [sic] misinformation.” That’s censorship, the exact thing Elon says he’s fighting. When called out, xAI co-founder Igor Babuschkin said the changes were made by “an ex-OpenAI employee” who “hadn’t fully absorbed xAI’s culture yet.” Igor added that the employee saw negative posts and “thought it would help.”

The story doesn’t stop there. Grok’s latest outbursts were tied to a specific update that happened on July 8th. The company later posted that a code change made Grok pull information directly from X’s user content, including hate speech. The update was live for 16 hours, during which Grok copied toxic posts and repeated them as responses. The team said the change came from a deprecated code path, which has now been removed.

“We deeply apologize for the horrific behavior that many experienced,” xAI posted from Grok’s account. The company said the issue was separate from the underlying language model and promised to refactor the system. It also committed to publishing Grok’s new system prompt on GitHub.

Grok’s scale made the problem explode quickly

Grok is trained like other large language models, on data scraped from across the web. That data includes dangerous content: hate speech, extremist material, even child abuse material. Grok is also unusual in that it pulls from X’s own posts, meaning it can echo users directly, which makes it more likely to produce harmful replies. And because these bots operate at massive scale, any mistake can spiral instantly. Some chatbots are built with moderation layers that block unsafe content before it reaches users. xAI skipped that step. Instead, Grok was tuned to please users, optimizing for feedback signals like thumbs-up and thumbs-down votes.
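To get a sense of what that missing layer does, here is a rough, purely illustrative sketch in Python: the model’s draft reply passes through a classifier before anyone sees it, and flagged content is swapped for a refusal. The generate_reply() function and the keyword check below are hypothetical stand-ins, not anything xAI or X actually runs.

```python
# Purely illustrative sketch of an output-moderation gate, the kind of safety
# layer the article says xAI skipped. generate_reply() and the keyword list
# are hypothetical stand-ins; a real system would use a trained classifier.

from dataclasses import dataclass
from typing import Callable, Optional

BLOCKED_CATEGORIES = {"hate_speech", "harassment", "violent_extremism"}  # assumed policy


@dataclass
class Verdict:
    allowed: bool
    category: Optional[str] = None


def classify(text: str) -> Verdict:
    """Stand-in for a safety classifier; here just a toy keyword check."""
    toy_terms = {"mechahitler": "hate_speech"}  # illustrative only
    lowered = text.lower()
    for term, category in toy_terms.items():
        if term in lowered:
            return Verdict(allowed=False, category=category)
    return Verdict(allowed=True)


def safe_reply(prompt: str, generate_reply: Callable[[str], str]) -> str:
    """Generate a draft reply, then gate it before it ever reaches the user."""
    draft = generate_reply(prompt)
    verdict = classify(draft)
    if not verdict.allowed and verdict.category in BLOCKED_CATEGORIES:
        return "Sorry, I can't help with that."  # refuse instead of posting the draft
    return draft
```

Real deployments replace the keyword check with a trained safety classifier, but the shape of the gate is the same: nothing the model drafts goes out unchecked.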
Elon admitted the chatbot became “too eager to please and be manipulated.” This type of behavior isn’t new. In April, OpenAI had to walk back a ChatGPT update because it had become overly flattering. A former employee said getting that balance right is “incredibly difficult,” and that fixing hate speech can “sacrifice part of the experience for the user.”

Grok wasn’t just repeating user prompts. It was being pushed into political territory by its own engineers. One employee told the Financial Times the team was rushing to align Grok’s views with Elon’s ideals without time for proper testing. A dangerous prompt was added, one that told Grok to “not shy away from making claims which are politically incorrect.” That instruction was deleted after the antisemitic posts began, but by then, the AI had already caused damage.

Grok’s model is still mostly a black box. Even the engineers who built it can’t fully predict how it will behave. James Grimmelmann, an internet law professor at Cornell, said platforms like X should be doing regression testing, audits, and simulation drills to catch these errors before they go public. But none of that happened here. “Chatbots can produce a large amount of content very quickly,” he said, “so things can spiral out of control in a way that content moderation controversies don’t.”

In the end, Grok’s official account posted an apology and thanked users who reported the abuse: “We thank all of the X users who provided feedback to identify the abuse of @grok functionality, helping us advance our mission of developing helpful and truth-seeking artificial intelligence.” But between the bans, the investigation threats, and the resignation of a top exec, it’s clear this was more than just a bug. It was a complete system failure, one practically made for an SNL sketch.

Source: Cryptopolitan
