xAI Grok 4.1: #1 in Text Arena, #1 in EQ-bench, and better Creative Writing

**xAI** launched **Grok 4.1**, which reached the #1 rank on the LM Arena Text Leaderboard with an Elo score of **1483** and shows improvements in creative writing and hallucination reduction. **OpenAI's GPT-5.1 "Thinking"** demonstrates efficiency gains, spending ~60% less "thinking" on easy queries, along with strong ARC-AGI performance. **Google DeepMind** released **WeatherNext 2**, an ensemble generative model for global weather forecasting that is **8× faster** and more accurate, and is integrated into multiple Google products. **Sakana AI** raised **¥20B ($135M)** in Series B funding at a **$2.63B** valuation to focus on efficient AI for resource-constrained enterprise applications in Japan. New evaluations highlight tradeoffs between hallucination and knowledge accuracy across models, including **Anthropic**'s **Claude 4.1 Opus**.
