1. OpenAI Launches GPT-5.4 with 1M-Token Context and Autonomous Workflows
OpenAI released GPT-5.4 on March 5, combining advanced coding and reasoning with a 1,000,000-token context window. The model scored 75% on the OSWorld-V benchmark — slightly above the human baseline of 72.4% — and can autonomously execute multi-step workflows across software environments with fewer user interventions. The release signals a fundamental shift from AI as a chat tool to AI as an autonomous digital coworker capable of driving long-running tasks.
2. Google DeepMind Ships Gemini 3.1 Pro with 77.1% on ARC-AGI-2
Google DeepMind's Gemini 3.1 Pro launched this month with a 1-million-token context window and multimodal reasoning across text, images, audio, video, and code. The model achieved 77.1% on the ARC-AGI-2 benchmark, placing it among the top frontier models for general reasoning. Google also expanded its "AI Mode" in Search on March 6, allowing users to draft documents, generate code, and build tools directly within the search interface.
3. Anthropic Enables Memory for All Claude Users
In early March 2026, Anthropic completed the rollout of persistent memory to all Claude users, allowing the assistant to remember preferences, context, and prior conversations across sessions. The company also released Claude Sonnet 4.6 (February 17) and Claude Opus 4.6 (February 5), both with a 1-million-token context window in beta. Separately, MiniMax's new M2.5 model is drawing attention for reportedly rivaling Claude Opus 4.6 in quality at a significantly lower cost.
4. OpenAI Deploys GPT-5.3-Codex-Spark on Cerebras Wafer-Scale Chips
OpenAI launched GPT-5.3-Codex-Spark, its first production AI model running on Cerebras wafer-scale chips rather than traditional Nvidia GPUs. The switch delivers improved throughput and lower latency for interactive coding experiences, reducing dependence on Nvidia hardware. This move marks a broader industry trend of diversifying AI inference infrastructure as demand for real-time agentic workloads grows.
5. MIT Research: Inference-Time Scaling Lets Smaller Models Match Larger Ones
MIT researchers presented new findings showing that "inference-time scaling" — giving models more time to reason at inference rather than training larger networks — allows LLMs to use as little as half the compute of existing methods while achieving comparable accuracy. The approach enables smaller models to match or outperform larger ones on complex problems. A parallel study from Google and MIT also outlined a predictive framework for scaling multi-agent systems, identifying a tool-coordination trade-off useful for selecting optimal agentic architectures.
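The core idea behind inference-time scaling can be illustrated with self-consistency decoding, one common way to trade extra inference compute for accuracy: sample several independent answers and keep the majority vote. The sketch below is a toy illustration, not the MIT method; `sample_answer` is a hypothetical stand-in for a real (stochastic) model call.

```python
import random
from collections import Counter

def sample_answer(problem: str, seed: int) -> str:
    """Hypothetical stand-in for one stochastic model call; a real
    system would sample a chain-of-thought completion from an LLM."""
    rng = random.Random(seed)
    # Toy model: returns the right answer 70% of the time,
    # otherwise a random wrong guess.
    return "42" if rng.random() < 0.7 else str(rng.randint(0, 9))

def self_consistency(problem: str, n_samples: int) -> str:
    """Spend more compute at inference by drawing n_samples answers
    and returning the majority vote."""
    answers = [sample_answer(problem, seed) for seed in range(n_samples)]
    return Counter(answers).most_common(1)[0][0]

print(self_consistency("What is 6 * 7?", n_samples=32))
```

Raising `n_samples` buys accuracy with compute at inference time rather than with a larger network, which is why a smaller model given a bigger sampling budget can match a larger one on hard problems.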
6. US Military Confirms AI Use in Iran War Operations
US Central Command head Brad Cooper confirmed this week that American forces are using "a variety" of AI tools in the ongoing conflict with Iran to process large volumes of intelligence data. The disclosure has intensified debate over AI's role in lethal decision-making and the transparency of autonomous systems in warfare. Human rights groups are raising concerns about mounting civilian casualties and the opacity of AI-assisted targeting.
7. Anthropic Puts $20M Into AI Regulation Political Push
Anthropic donated $20 million to Public First Action, a bipartisan political group supporting candidates who back AI safety legislation, ahead of the 2026 US midterm elections. The move puts Anthropic at odds with the White House's current "innovation-first" deregulatory stance. The funding underscores a growing rift between safety-focused labs and the federal government's hands-off approach to AI oversight.
8. Trump Executive Order Moves to Preempt State AI Laws
A December 2025 Executive Order titled "Ensuring a National Policy Framework for Artificial Intelligence" is reshaping the regulatory landscape: it tasks federal agencies with sustaining US AI dominance via a "minimally burdensome" national framework and directs the Commerce Secretary to identify state AI laws deemed "onerous," making those states ineligible for BEAD broadband funding. Colorado's AI Act (effective June 30, 2026) and California's Transparency in Frontier AI Act — both requiring algorithmic discrimination safeguards — are among the laws most at risk of federal challenge.
KEY TAKEAWAYS
This week confirms that 2026 is the year AI moves from experimentation to deployment. Frontier models (GPT-5.4, Gemini 3.1 Pro) are crossing into human-level performance on agentic benchmarks, while hardware diversification and inference-time efficiency research are broadening access to powerful AI. At the same time, the governance gap is widening: the US federal government is moving to preempt state-level AI safeguards even as safety-focused labs pour money into the political arena, and the military's acknowledged use of AI in active combat underscores how urgently the world needs clear rules for autonomous systems.