xAI unveils Grok 4.1 with major leap in accuracy and speed
Elon Musk’s artificial intelligence company, xAI, has rolled out Grok 4.1, its newest and most refined version of the Grok model. Announced by Musk through his platform X, the upgrade promises notable improvements in speed, accuracy, and overall user experience—marking another significant step in xAI’s growing rivalry with established AI giants.
Musk shared the update directly with users, writing, “Grok 4.1 just released. You should notice a significant increase in speed and quality.” His post set the stage for a wave of anticipation, as xAI detailed the model’s capabilities and the extensive work that went into reducing errors and boosting reliability.
According to xAI, Grok 4.1 not only enhances performance but also deepens the model’s emotional and creative responsiveness. The company says the new version excels in collaborative conversations while preserving the intelligence and stability of earlier releases. Internal evaluations reportedly show that Grok 4.1 is “exceptionally capable in creative, emotional, and collaborative interactions,” a sign that xAI is aiming for a model that feels more human while remaining technically competent.
One of the biggest challenges AI companies face today is factual hallucination, in which a model confidently presents made-up information as fact. xAI claims to have tackled this issue head-on. The company focused its post-training efforts on minimising hallucinations specifically in real-world, information-seeking prompts. Its testing involved production-level traffic and the widely used FActScore benchmark, which measures accuracy across 500 biographical questions.
The results show a clear improvement. While the earlier Grok 4 Fast model had a hallucination rate of 12 per cent, Grok 4.1 cut that figure to 4 per cent, a threefold reduction. The FActScore benchmark reinforced this progress, with Grok 4.1 recording an error rate of 2.97 per cent, down from the previous model’s 9.89 per cent.
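The arithmetic behind those comparisons is easy to check. The short Python sketch below is purely illustrative: it uses only the rounded percentages quoted above, and the helper names are hypothetical rather than part of any xAI tooling.

```python
def fold_reduction(before: float, after: float) -> float:
    """Return how many times smaller `after` is than `before`."""
    if after <= 0:
        raise ValueError("after must be positive")
    return before / after


def relative_reduction(before: float, after: float) -> float:
    """Return the fractional drop from `before` to `after` (0.5 means halved)."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (before - after) / before


# Rounded figures as quoted in this article, in per cent (before, after).
figures = {
    "hallucination rate": (12.0, 4.0),
    "FActScore error rate": (9.89, 2.97),
}

for label, (before, after) in figures.items():
    print(
        f"{label}: {fold_reduction(before, after):.2f}x lower, "
        f"{relative_reduction(before, after):.0%} relative reduction"
    )
```

On these inputs both measures fall by roughly a factor of three, which is the comparison the reported figures support. A lower error rate on these tests does not by itself say how the model behaves on other kinds of prompts.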
Beyond accuracy, Grok 4.1 has also made waves in competitive rankings. On LMArena, a respected platform for evaluating large language models, Grok 4.1’s Thinking mode (code name quasarflux) secured the top position with an Elo score of 1483, 31 points ahead of the closest non-xAI competitor. Even the non-reasoning mode (code name tensor) outperformed many full-reasoning setups from other companies, securing the second-highest rank overall.
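For a sense of what a 31-point lead means in practice, the conventional Elo formula converts a rating gap into an expected head-to-head win probability. The sketch below assumes the standard 400-point logistic scale; this article does not describe how LMArena fits its ratings, so the result should be read as a rough guide only, and the function name is hypothetical.

```python
def expected_win_probability(rating_gap: float, scale: float = 400.0) -> float:
    """Expected win probability for the higher-rated side under the
    conventional Elo logistic curve (assumption: 400-point scale)."""
    return 1.0 / (1.0 + 10.0 ** (-rating_gap / scale))


gap = 31.0  # lead over the closest non-xAI model, as quoted above
print(f"Expected win probability at a {gap:.0f}-point gap: "
      f"{expected_win_probability(gap):.1%}")
```

Under that assumption, a 31-point gap corresponds to the higher-rated model being preferred in about 54 per cent of direct comparisons: a clear lead at the top of a leaderboard, but far from a sweep.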
Before its official launch, xAI conducted a two-week silent deployment from November 1 to 14, gradually introducing the upgraded model to users. During this time, continuous blind evaluations were carried out to measure user preference. In direct comparisons with the previous production model, Grok 4.1 achieved a win rate of 64.78 per cent, indicating a strong positive response from users.
Grok 4.1 is now widely available. Users can access it on grok.com, the X platform, and through both iOS and Android apps. It can be selected directly through the model picker or used in Auto mode for an optimised experience.
With its renewed focus on accuracy, responsiveness, and real-world reliability, Grok 4.1 positions xAI as a formidable player in the rapidly evolving AI landscape.