Google’s CEO Sundar Pichai still loves the web. He wakes up every morning and reads Techmeme, a news aggregator resplendent with links, accessible only via the web. The web is dynamic and resilient, he says, and can still—with help from a search engine—provide whatever information a person is looking for.
Yet the web and its critical search layer are changing. We can all see it happening: Social media apps, short-form video, and generative AI are challenging our outdated ideals of what it means to find information online. Quality information online. Pichai sees it, too. But he has more power than most to steer it.
The way Pichai is rolling out Gemini, Google’s most powerful AI model yet, suggests that much as he likes the good ol’ web, he’s much more interested in a futuristic version of it. He has to be: The chatbots are coming for him.
Today Google announced that the chatbot it launched to counter OpenAI’s ChatGPT, Bard, is getting a new name: Gemini, like the AI model it’s based on that was first unveiled in December. The Gemini chatbot is also going mobile, and inching away from its “experimental” phase and closer to general availability. It will have its own app on Android and prime placement in the Google search app on iOS. And the most advanced version of Gemini will also be offered as part of a $20 per month Google One subscription package.
In releasing the most powerful version of Gemini with a paywall, Google is taking direct aim at the fast-ascendant ChatGPT and the subscription service ChatGPT Plus. Pichai is also experimenting with a new vision for what Google offers—not replacing search, not yet, but building an alternative to see what sticks.
“This is how we’ve always approached search, in the sense that as search evolved, as mobile came in and user interactions changed, we adapted to it,” Pichai says, speaking with WIRED ahead of the Gemini launch. “In some cases we’re leading users, as we are with multimodal AI. But I want to be flexible about the future, because otherwise we’ll get it wrong.”
Sensory Overload
“Multimodal” is one of Pichai’s favorite things about the Gemini AI model—one of the elements that Google claims sets it apart from the guts of OpenAI’s ChatGPT and Microsoft’s Copilot AI assistants, which are also powered by OpenAI technology. It means that Gemini was trained with data in multiple formats—not just text, but also imagery, audio, and code. As a result, the finished modal is fluent in all those modes, too, and can be prompted to respond using text or voice or by snapping and sharing a photo.
“That’s how the human mind works, where you’re constantly seeking things and have a real desire to connect to the world you see,” Pichai enthuses, saying that he has long sought to add that capability to Google’s technology. “That’s why in Google Search we added multi-search, that’s why we did Google Lens [for visual search]. So with Gemini, which is natively multimodal, you can put images into it and then start asking it questions. That glimpse into the future is where it really shines.”