How Google’s Multimodal AI is Changing Global User Behaviour

In an effort to make search more visual, intelligent and context-aware for everyday users, Google has expanded access to AI Mode through Labs to more users across the US.
The feature was originally launched for Google One AI Premium subscribers and offered fast responses to complex and open-ended questions.
Now, Google has combined Gemini with Google Lens to offer multimodal capabilities, meaning users can snap or upload photos, ask questions about the image and then receive context-aware answers.
Google’s AI Mode offers highly relevant and detailed responses by using query fan-out techniques and visual search to understand the materials, context and relationships of objects.
The feature is available for testing in the Google app on iOS and Android through Labs sign-up.
Evolving user search behaviour
User behaviour continues to evolve with the rise of tools like Google’s AI Mode.
With AI integrated into search, including multimodal models like Gemini, people can interact with search conversationally and ask more complex and nuanced questions.
Robby Stein, VP of Product at Google Search, explains: “With AI Mode’s new multimodal understanding, you can snap a photo or upload an image, ask a question about it and get a rich, comprehensive response with links to dive deeper.
“This experience brings together powerful visual search capabilities in Lens with a custom version of Gemini, so you can easily ask complex questions about what you see.
“AI Mode builds on our years of work on visual search and takes it a step further. With Gemini’s multimodal capabilities, AI Mode can understand the entire scene in an image, including the context of how objects relate to one another and their unique materials, colors, shapes and arrangements.”
According to Google, queries in AI Mode are, on average, twice as long as those in traditional search, indicating a shift in user behaviour in which people rely on search engines to understand context and act as virtual assistants.
Users are beginning to rely on AI more for open-ended inquiries, exploratory searches and task-specific assistance. This points to a new era of human-computer interaction, where AI is becoming a co-pilot for complex tasks, search is becoming more personalised and users are demanding more contextual intelligence (rather than just accurate answers).
The power behind AI Mode
Google’s AI Mode offers a single, unified experience by combining several advanced AI techniques. As a result, it can give intelligent and nuanced responses that traditional search cannot.
A multimodal approach helps AI assist users effectively because it more closely resembles how people understand the world. Multimodal AI (via Gemini) lets Google interpret images and text together, giving users richer interactions suited to real-world use cases such as understanding scenes.
Google AI Mode can also understand the context of the user’s question and precisely identify objects and materials by combining Google Lens with natural language processing.
The query fan-out technique allows AI Mode to automatically generate multiple, layered queries from a single user input and return a comprehensive response with greater depth and breadth.
Combined, these techniques aim to make Google AI Mode a deeply intelligent system that offers contextually aware results, mirrors human understanding and anticipates user needs.
Stein continues: “Drawing on our deep visual search expertise, Lens precisely identifies each object in the image. Using our query fan-out technique, AI Mode then issues multiple queries about the image as a whole and the objects within the image, accessing more breadth and depth of information than a traditional search on Google.
“The result is a response that’s incredibly nuanced and contextually relevant, so you can take the next step.”
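For readers who want a concrete picture of query fan-out, the idea can be sketched in a few lines of Python. This is a simplified illustration only, not Google's implementation: the `DetectedObject` type and the `fan_out` and `merge_results` functions are hypothetical names invented here, and the object labels stand in for whatever a visual model such as Lens might return.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DetectedObject:
    """One object found in an image (hypothetical structure for this sketch)."""
    label: str
    attributes: tuple[str, ...] = ()


def fan_out(question: str, scene: str, objects: list[DetectedObject]) -> list[str]:
    """Expand one user question into several sub-queries.

    The first query covers the image as a whole; the others cover each
    detected object, mirroring the two levels described in the quote.
    """
    queries = [f"{question} ({scene})"]
    for obj in objects:
        details = " ".join(obj.attributes + (obj.label,))
        queries.append(f"{question} {details}")
    # Drop duplicate queries while preserving their order.
    return list(dict.fromkeys(queries))


def merge_results(results_per_query: dict[str, list[str]]) -> list[str]:
    """Combine the results of every sub-query into one de-duplicated list."""
    merged: list[str] = []
    seen: set[str] = set()
    for results in results_per_query.values():
        for item in results:
            if item not in seen:
                seen.add(item)
                merged.append(item)
    return merged


if __name__ == "__main__":
    # Assumed example input: labels a visual model might produce for a photo.
    found = [
        DetectedObject("sofa", ("green", "velvet")),
        DetectedObject("floor lamp", ("brass",)),
    ]
    for query in fan_out("where can I buy this?", "living room photo", found):
        print(query)
```

In a real system each sub-query would be sent to a search backend and the merged results passed to a language model to compose the final answer. The sketch only shows how a single question becomes several.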
What impact will AI Mode have on the AI industry?
Google’s AI Mode points to a shift in how AI is developed, deployed and experienced by mainstream users.
Not only will it redefine the search experience by combining contextual reasoning with visual search and natural language understanding, but it will also accelerate the shift to multimodal AI.
It will democratise advanced AI capabilities by making cutting-edge technology accessible to the mainstream. This will help reduce barriers to AI adoption and allow AI to enhance productivity across several industries.
Google’s AI Mode will catalyse a new wave of innovation, set new design and performance benchmarks and reshape user experiences.
AI Magazine is a BizClik brand


