How Google fought search spam using AI in 2020

By Tilly Kenyon
Google has said that Artificial Intelligence (AI) offers ‘unprecedented potential to revolutionise’ spam fighting...

Last year Google built their own spam-fighting AI, which can catch both known and new spam trends.

Hacked spam was still widespread in 2020 as the number of vulnerable websites remained quite large, although Google has said they have improved their detection capability by more than 50% and removed most of the hacked spam from search results. They have also reduced sites with auto-generated and scraped content by more than 80% compared to a couple of years ago.

What is search engine spam? 

Search engine spam refers to measures that try to influence the position a website has in search engines, often for pages that contain little or no relevant content.

How Google prevents spam from reaching you 

A lot happens before Google delivers a set of search results. Every day they discover, crawl, and index billions of web pages, and in the process they discover 40 billion spammy pages.


This diagram shows how Google defends against spam.

Firstly, they have systems that can detect spam when they crawl pages or other content. Crawling is when their automatic systems visit content and consider it for inclusion in the index they use to provide search results. 

These systems also work for the content they discover through sitemaps and Search Console. For example, Search Console has a 'request indexing' feature so creators can let Google know about new pages that should be added quickly. Google has previously observed spammers hacking into vulnerable sites, pretending to be the owners of these sites, verifying themselves in Search Console, and using the tool to ask Google to crawl and index the many spammy pages they created. Using AI, Google was able to pinpoint suspicious verifications and prevent spam URLs from getting into the index this way.

Next, they have systems that analyse the content that is included in the index. When you issue a search, they double-check whether the content that matches might be spam. If so, that content won’t appear in the top search results.

The result is that very little spam actually makes it into the top results anyone sees for a search, thanks to the automated systems that are aided by AI. Google has estimated that these automated systems help keep more than 99% of visits from Search completely spam-free. As for the percentage left, their teams take manual action to further improve the automated systems.
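The layered approach described above can be pictured as two filters, one applied when pages are crawled and one applied when a search is issued. The short Python sketch below is purely illustrative and is not Google's implementation: the `Page` type, the keyword-stuffing rule in `looks_spammy`, and the `blocked_urls` set are all hypothetical stand-ins for the AI-aided systems the article mentions.

```python
from dataclasses import dataclass


@dataclass
class Page:
    url: str
    text: str


def looks_spammy(page: Page) -> bool:
    """Hypothetical stand-in for a spam classifier: flags keyword-stuffed text."""
    words = page.text.lower().split()
    if not words:
        return True  # treat empty pages as having no relevant content
    most_common = max(words.count(w) for w in set(words))
    # Assumed rule of thumb: one word making up over half the page is suspicious.
    return most_common / len(words) > 0.5


def build_index(crawled: list[Page]) -> list[Page]:
    """Stage 1: drop pages flagged at crawl time so they never enter the index."""
    return [p for p in crawled if not looks_spammy(p)]


def search(index: list[Page], query: str, blocked_urls: set[str]) -> list[str]:
    """Stage 2: re-check matching pages at query time before returning results."""
    q = query.lower()
    return [
        p.url
        for p in index
        if q in p.text.lower() and p.url not in blocked_urls
    ]


pages = [
    Page("a.example", "cheap cheap cheap cheap pills"),
    Page("b.example", "how to bake sourdough bread at home"),
    Page("c.example", "sourdough bread starter tips"),
]
index = build_index(pages)  # a.example is filtered out at crawl time
results = search(index, "sourdough", blocked_urls={"c.example"})
```

In this toy example a page can be stopped at either stage: the keyword-stuffed page never reaches the index, while a page flagged after indexing is held back when a matching search is issued.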

(Image: Google)

