Visual search engines: the role of AI and machine vision

Visual searches are already being conducted billions of times a month. As technology gets more sophisticated, how consumers use search engines is changing.

The search engine is undergoing a transformation.

Thanks to advances in artificial intelligence (AI) and machine vision, people can increasingly input searches and get results in more natural and more visual ways.

Scientists have been training algorithms to decipher visual imagery since the 1970s, and in the 50 years since, the sophistication of what a computer can perceive has increased dramatically.

Leading the pack, Google recently unveiled its ‘multisearch’ functionality, which combines text and visual search capabilities through its Lens tool. 

And, with Alphabet CEO Sundar Pichai saying visual searches are already being conducted more than eight billion times a month, use is growing. According to research by Insider Intelligence, among US adults aged 34 and younger, 30% had used visual search for shopping as of August 2022, and 12% used it regularly.

AI and machine vision: powering search engines

“Visual search is helpful when you’re uncertain of the name of what you’re searching for,” explains Amanda Milberg, Data Scientist at Dataiku. “The power of visual search enables an individual to simply screenshot an image, upload it to a search engine, and retrieve results on where to purchase.

“Given the new generation of customers, this type of search is powerful for seamless product discovery. Today’s consumers shop with their mobile devices and are influenced to purchase through video reels on social media.”

The biggest role for AI in intelligent search is going to be contextualisation and personalisation, adds Prashant Natarajan, Vice President Strategy and Products at H2O.ai.

“Contextualisation is connecting your need for information or your access to information through the various ways in which that can be used responsibly and used to drive benefit,” he says. “Personalisation is being able to recognise that instead of presenting the same image or the same video or the same document to everyone, the platform will be able to personalise it based on their permissions to create a better experience.

“After all, people don't come to a search engine to waste their time; often, they want to find something, so we should want their search to be as useful as possible.”

“The question really is how we can improve our AI to make that a better experience – how can we personalise things more, how can we present things as the next best actions and what are options that haven’t been considered yet,” he adds.

“With machine vision, you can get meaningful information out of digital images, videos, and other visual inputs using artificial intelligence (AI),” says Kam Star, VP Product Portfolio at SS&C Blue Prism. “Using this information, computers can find specific insights, objects, entities or even actions within still or moving imaging. Machine vision lets computers see, observe, and understand the world as living things can, enabling humans to then quickly find that information using search.

“But applications are not just in retail or security; there are many use cases in industries ranging from manufacturing to automotive, energy, finance or utilities.”

AI and machine vision are powering search engines in a few different ways, adds Adnan Masood, Chief Architect – AI & Machine Learning at UST, with AI able to help identify the best match for users’ needs using computer vision capabilities along with the text results.

“With the growing compute power and advent of advanced computer vision approaches around transfer learning, image classification, object detection and tracking as well as semantic segmentation, AI can help sift through large amounts of multi-modal data to find what you’re looking for more quickly,” he comments.

“Machine vision also plays a role in recommendations and product image search; search engines use machine vision algorithms to analyse and understand the contents of the image so they can provide the most suitable and pertinent results.”

The uses of visual search

Visual search has a range of uses, but is particularly useful for a new generation of consumers who search in different ways.

“Already, multiple machine learning algorithms are running in the background to determine the optimal and safest way to serve the results wanted,” adds Natarajan. “For example, when you turn on a filter versus when you turn it off, it's machine learning that's looking at patterns and looking at publicly available data to decide what information should be presented.”

“As online stores multiply, e-commerce businessmen are seeking ways to outshine their competitors,” adds Star. “Visual search is one option. By presenting similar results based on an uploaded image, the technology massively simplifies the search process.”
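
One common way to present similar results for an uploaded image is to compare numeric "embedding" vectors and rank catalogue items by how closely they match. The sketch below is a minimal illustration of that idea, not a description of how any particular search engine works. The product names and the three-number vectors are hypothetical and hand-written; in practice the vectors would come from a trained vision model.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(query_vec, catalogue, top_k=2):
    """Return the names of the top_k catalogue items most similar to the query."""
    scored = [(cosine_similarity(query_vec, vec), name) for name, vec in catalogue.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [name for _, name in scored[:top_k]]


# Hypothetical embeddings: assumed to have been produced by a vision model.
CATALOGUE = {
    "red trainers": [0.9, 0.1, 0.0],
    "crimson running shoes": [0.7, 0.3, 0.2],
    "blue armchair": [0.0, 0.1, 0.9],
}

# Hypothetical embedding of the shopper's uploaded screenshot.
uploaded_image_vec = [0.85, 0.15, 0.05]

results = most_similar(uploaded_image_vec, CATALOGUE)
print(results)
```

Here the two shoe listings rank above the armchair because their vectors point in nearly the same direction as the uploaded image's vector.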

Better understanding users with NLP

By combining computer vision (CV) and natural language processing (NLP), visual search can overcome the inherent limitations of traditional keyword search.

“As humans, we use language to describe and understand the world,” adds Star. “Assigning attributes to visual information such as images or video is translating CV into higher-level descriptions of words and sentences. This is the combination of NLP and CV.

“By combining NLP with CV, summaries of the properties of the image or video can be extracted, and then later used to search. This enables users to more readily find what they’re looking for by describing it in natural language.”
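
The idea Star describes, extracting text summaries from images and then searching them in natural language, can be sketched in a few lines. This is a simplified illustration under stated assumptions: the file names and captions are hypothetical stand-ins for descriptions a computer vision model might generate, and the matching is plain word overlap rather than anything a production system would rely on.

```python
import re

# Hypothetical captions, standing in for descriptions a vision model might produce.
CAPTIONS = {
    "img_001.jpg": "a brown leather sofa in a bright living room",
    "img_002.jpg": "a child riding a red bicycle in a park",
    "img_003.jpg": "a red leather handbag on a wooden table",
}

# A tiny, illustrative list of words to ignore when matching.
STOPWORDS = {"a", "an", "the", "in", "on", "of", "with"}


def tokenize(text):
    """Lower-case the text and return its set of non-stopword words."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}


def search(query, captions):
    """Rank images by how many query words appear in their caption."""
    query_words = tokenize(query)
    scored = []
    for image_id, caption in captions.items():
        overlap = len(query_words & tokenize(caption))
        if overlap:
            scored.append((overlap, image_id))
    # Highest overlap first; ties broken alphabetically by image id.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [image_id for _, image_id in scored]


matches = search("red leather bag on a table", CAPTIONS)
print(matches)
```

Note that "bag" does not match "handbag" under exact word overlap, which is one reason real systems tend to use richer language models than this sketch.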

A machine's ability to understand human language allows it to better understand users, Milberg explains. 

“We can illustrate this by trying to understand customers’ purchasing patterns,” she says. “Often, we construct a customer view by analysing structured data - quantitative, highly organised data - like transaction history. 

“But what if we also analysed customer call centre data to understand customers’ sentiment prior to that purchase? Is there a correlation between good or poor customer service engagement and a given transaction? Can we analyse the words which have more positive influence on the customer, and coach our representatives to use that specific language?

“The power of unstructured, qualitative data allows us to harness new insights to complement the structured data that already exists. That is the power of NLP.”
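
As a rough illustration of pairing unstructured call data with structured transaction data, the sketch below scores each call transcript with a small word list and compares purchase rates between positive and other calls. Everything in it is an assumption made for illustration: the word lists, the transcripts and the purchase flags are invented toy data, and a real analysis would use a trained sentiment model and far more records.

```python
# Hypothetical word lists; a real system would use a trained sentiment model.
POSITIVE = {"thanks", "great", "helpful", "resolved"}
NEGATIVE = {"frustrated", "waiting", "unhelpful", "cancel"}


def sentiment_score(transcript):
    """Count positive words minus negative words in a transcript."""
    words = transcript.lower().split()
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)


# Invented toy data: (call transcript, did the customer go on to purchase?)
CALLS = [
    ("thanks that was really helpful", True),
    ("great my issue is resolved", True),
    ("i am frustrated and still waiting", False),
    ("this was unhelpful i want to cancel", False),
    ("thanks but i am still waiting", True),
]


def purchase_rate_by_sentiment(calls):
    """Share of calls followed by a purchase, split by call sentiment."""
    buckets = {"positive": [], "negative_or_neutral": []}
    for transcript, purchased in calls:
        key = "positive" if sentiment_score(transcript) > 0 else "negative_or_neutral"
        buckets[key].append(purchased)
    return {k: (sum(v) / len(v) if v else None) for k, v in buckets.items()}


rates = purchase_rate_by_sentiment(CALLS)
print(rates)
```

A gap between the two rates would only hint at a relationship worth investigating; it would not by itself show that the call caused the purchase.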

Masood adds: “Additionally, NLP can be used to automatically generate tags for images, which can improve the accuracy of image search results. NLP can also be used to analyse user feedback in order to constantly improve the quality of visual search results.”
