Google launches Gemini 3.5 Flash as its new default model for AI Mode and the Gemini app

May 21, 2026 · news

Google has launched Gemini 3.5 Flash, a new model aimed at agentic workflows and coding, and is already making it the default in the Gemini app and AI Mode in Search.

Google Gemini 3.5 Flash official launch image from Google's blog

Google has launched Gemini 3.5 Flash, positioning it as its strongest Flash-series model yet for agentic workflows and coding. The release landed during Google I/O and quickly picked up momentum on X, where Google’s official account highlighted the model’s speed, coding focus, and immediate availability across several products.

According to Google’s official blog post, Gemini 3.5 Flash is the first release in the new Gemini 3.5 family. Google says the model is designed for complex, long-horizon agentic tasks, with benchmark gains over Gemini 3.1 Pro in areas such as coding, tool use, multimodal understanding, and enterprise-style workflows. Google also says Gemini 3.5 Flash can deliver output at roughly four times the speed of comparable frontier models, and that it is becoming the default model in the Gemini app and AI Mode in Search.

Google’s own product pages add more detail to that launch framing. On the DeepMind model page for Gemini 3.5 Flash, Google describes the model as best suited for frontier performance across agents and coding, with support for a 1 million token context window, function calling, structured output, search as a tool, and code execution. The company also lists the model as available through the Gemini app, Gemini API, Google AI Studio, Google Antigravity, Android Studio, Gemini Enterprise, and Google AI Mode.

The story is trending on X for a simple reason: it is not just another benchmark announcement. Google is tying Gemini 3.5 Flash to actual product defaults and developer surfaces at the same time. The official @Google posts around the launch stressed that the model is now live globally in the Gemini app and AI Mode, while additional posts from Google AI Studio focused on its 1 million token context window, 65k-token output limit, and developer tooling support. That combination makes the release feel more concrete than a typical model teaser, and it has also prompted developers and AI analysts on X to compare the model with rivals on latency, coding quality, and price-performance.

For developers and product teams, the launch matters because Google is trying to collapse the gap between a consumer-facing assistant model and a production-ready developer model. If the company’s claims hold up in real workloads, Gemini 3.5 Flash could become one of the more practical options for teams building agentic coding tools, multimodal workflows, and long-context assistants without paying flagship-model latency costs. The fact that Google is also making it the default in end-user products suggests it wants one model line to power both experimentation and mass-market usage.

What remains unclear is how Gemini 3.5 Flash will perform outside Google’s own benchmark framing, especially against the newest coding-focused models from OpenAI, Anthropic, and GitHub’s broader Copilot stack. Google has also said Gemini 3.5 Pro is coming next month, which could change how developers evaluate Flash as a long-term default. Pricing, rate-limit behavior at scale, and how much of the agentic experience depends on Google Antigravity versus the raw API are the details many builders will still want to test for themselves.