RankBrain and the Google Algorithm: Latest Developments
Google's objective is to serve Internet users, adapting to their usage patterns (mobile use, local or voice searches, etc.) and to their search intent, whether informational or transactional.
Google's technical challenge is also to support, organize and prioritize an exponentially growing number of pages.
When you hire an SEO consultant, should you follow Google's updates?
A good consultant will normally have already anticipated future updates and complied with the expected standards. They will, however, measure the impact of each update and take action if needed.
Machine Learning, the algorithm that constantly learns
Machine learning is an application of artificial intelligence (AI) that enables systems to learn and improve automatically from experience without being explicitly programmed.
This means that the behavior of Internet users is measured and interpreted: CTR, pogo-sticking, dwell time... these signals form the basis of the SXO (Search Experience Optimization) approach.
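As a rough sketch of how such behavioral metrics are computed, here is a toy example. The log values and function names are invented for illustration, not Google's actual telemetry:

```python
# Toy computation of two behavioral signals mentioned above: CTR and
# dwell time. The numbers are invented for illustration.

def ctr(clicks, impressions):
    """Click-through rate: share of impressions that led to a click."""
    return clicks / impressions if impressions else 0.0

def average_dwell_time(dwell_seconds):
    """Mean time (seconds) spent on a result before returning to the SERP."""
    return sum(dwell_seconds) / len(dwell_seconds) if dwell_seconds else 0.0

# A result shown 1000 times, clicked 80 times, with sampled dwell times.
print(ctr(80, 1000))                         # 0.08
print(average_dwell_time([12, 45, 300, 8]))  # 91.25
```

A short dwell time followed by a return to the results page (pogo-sticking) would, in this simplified view, be a negative signal for the clicked result.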
Machine learning focuses on the development of computer programs that can access data and use it to learn on their own.
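The idea of a program that uses data to learn on its own, rather than being explicitly programmed, can be sketched in a few lines. This toy example (the data and learning rate are made up) fits a slope by gradient descent instead of being told the answer:

```python
# A minimal illustration of "learning from data": fitting y = w * x
# by gradient descent instead of hard-coding w.

def fit_slope(points, lr=0.01, steps=1000):
    """Learn the slope w that best maps x -> y over (x, y) pairs."""
    w = 0.0
    for _ in range(steps):
        # Gradient of the mean squared error with respect to w
        grad = sum(2 * (w * x - y) * x for x, y in points) / len(points)
        w -= lr * grad
    return w

# The program is never told that y = 3x; it infers it from examples.
data = [(1, 3), (2, 6), (3, 9)]
print(round(fit_slope(data), 2))  # converges close to 3.0
```

The key point is that the rule (multiply by 3) was never written into the code; it emerged from the examples, which is the essence of machine learning.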
What are the advantages of machine learning in computer programming?
Google's algorithms decide whether your website, including your blog posts, will rank first or last in search results. They can make your business thrive, or expose it to significant financial risk if its listing is penalized.
The main Google algorithm updates, listed by deployment date since 2010, are:
- Caffeine (June 2010)
- Panda (February 2011)
- Top Heavy (January 2012)
- Penguin (April 2012)
- Pirate (August 2012)
- Exact Match Domain (EMD) (September 2012)
- Payday (June 2013)
- Hummingbird (September 2013)
- Pigeon (July 2014)
- Mobile Friendly (April 2015)
- RankBrain (early 2015)
- Phantom (or Quality) (May 2015)
- Google doubles the length of descriptions (November 2017)
- August 2018 Core Update ("Medic")
- June 2019 Core Update
- Site Diversity Update (June 2019)
- September 2019 Core Update
- BERT (December 2019)
Below is some information on Google's algorithms; their full workings remain a closely guarded secret.
However, parts of Google's algorithm are regularly updated. Knowing them gives you a global view of SEO, lets you analyze the impact some of these updates have had and, perhaps, helps you understand a little better what a webmaster or SEO expert really means when talking about penguins or pandas. It may also surprise more than one reader, but Google's ranking is not based on a single algorithm: in reality, several algorithms work in parallel.
Deployed in June 2010, Caffeine is a redesign of Google's indexing system. The Caffeine algorithm allows Google to crawl and then index a page almost instantly. Before its implementation, Google could only index pages after extracting, analyzing and understanding their content, a process that could take up to 30 days.
Indeed, Carrie Grimes, a software engineer at Google, explains in an article that the old index was based on different layers that were not updated simultaneously. This caused delays between the discovery of a new page and its appearance in the search results. The new search index crawls the web in small portions, continuously, making it possible to add new resources without using layers.
Thanks to this new system, which simplifies and speeds up indexing, Google is now able to offer in its search results content that is 50% fresher than before the algorithm's implementation.
Impact of Caffeine on SEO
Officially, since Caffeine is not a modification of the search engine algorithm, it has no impact on SEO. However, when data is indexed faster and in a different form, the presentation of search results changes, and with it the placement of websites in the SERPs.
The Panda algorithm is a "search filter". Introduced for the first time in February 2011, it penalizes the ranking of websites with low-quality content. This filter is primarily aimed at combating content farms, sites created purely for SEO, and spam.
The Panda algorithm is updated regularly to allow previously penalized sites to recover their rankings after improving the quality of their content, or, on the contrary, to penalize sites that no longer comply with Google's guidelines. During its first deployment, Panda had a major impact on search results, altering 12% of SERPs in the United States!
After the 4.2 update on July 18, 2015, content quality became a ranking factor, and Panda's integration into Google's main algorithm was confirmed in January 2016. Google therefore no longer announces Panda updates: the algorithm is now continuously taken into account when ranking websites in the search results pages.
Panda's impact on SEO
Some categories of sites were particularly affected by the Panda filter:
- Price comparison sites
- Company directories
- Merchant listings (hotels, events, restaurants)
The Top Heavy algorithm was deployed in January 2012 to penalize the ranking of sites abnormally overloaded with advertisements, particularly above the fold.
However, this minor update only had a 1% impact on search results.
The Penguin algorithm is the bête noire of webmasters: each of its updates was heavily discussed on the web and led to waves of panic and soul-searching on social networks. Experts can confirm it: with each fluctuation of the SERPs, the SEO community feared that a new change in the Google algorithm was due to Penguin.
Like Panda, this algorithm is a search filter, first introduced in April 2012. It penalizes the ranking of websites that do not respect Google's guidelines on creating, buying or exchanging links.
Webmasters penalized by Penguin had to clean up their link profile by disavowing the offending links. If this cleanup was done correctly, they could expect to recover their original rankings in the next update. However, this tedious cleanup is not that simple: it could sometimes take months or even years before a site could hope to escape Google's penalties.
On September 23, 2016, with the release of update 4.0, Google announced that this update would be the last. Like Panda, the Penguin algorithm has been added to Google's core algorithm and now works in real time.
Monitoring the link profile must therefore be a constant job, to ensure a healthy set of links that does not risk penalizing the rankings of certain pages.
What's more, the addition to Google's core algorithm is good news, as webmasters no longer have to wait for a new update to recover their rankings. Indeed, almost two years passed between the penultimate update of the algorithm and the deployment of Penguin 4.0.
The Pirate algorithm is a search filter deployed in August 2012. It aims to remove from the SERPs sites that have received copyright infringement complaints through Google's DMCA system.
This filter is updated on a regular basis in order to remove the pages which offer illegal downloads of films, series or music.
The Exact Match Domain algorithm was deployed in September 2012. It prevents low-quality sites from ranking in the first search results simply because their domain name matches a query highly sought after by Internet users.
Indeed, the domain name has a strong influence on rankings, and some webmasters had found a shortcut to improve theirs: creating excessively optimized domain names.
For example, before this algorithm was implemented, a site with the domain name "www.logiciel-marketing-pas-cher.com" had a good chance of seeing its home page rank in the first search results for the query "cheap marketing software", even if the content of its pages did not necessarily meet users' needs. The deployment of this algorithm put an end to such situations.
This algorithm, known as Payday, was deployed in June 2013. It aims to improve the relevance of SERPs by suppressing results for queries very strongly associated with spam (online gambling sites, adult content, loans, counterfeiting, etc.).
Hummingbird was deployed in September 2013. This algorithm is one of Google's most important and has had a strong impact on the way we formulate our searches. Google chose the name Hummingbird because, thanks to it, search became precise and fast.
Thanks to this algorithm, Google can now understand a query or phrase as a whole, and no longer based on one or a few keywords. The proposed results are therefore of much better quality, and search could become more human thanks to the understanding of conversational queries.
Since the implementation of the new algorithm, it has been possible to obtain precise answers to queries such as "What is the nearest bakery?" or "Who is the doctor on call today?". This type of search was unthinkable before. Did Hummingbird open the door to artificial intelligence and voice assistants such as Alexa or Siri? We will see below that this update contains the artificial intelligence that completely changed the way Google understands queries ("things, not strings").
The Pigeon algorithm was deployed in July 2014 in the United States and in June 2015 internationally. This algorithm favors local search results to provide more precise answers to user queries. The changes it made are visible on Google and Google Maps.
The Pigeon algorithm has had an impact above all on local businesses such as restaurants, bars or doctors' offices.
On April 21, 2015, Google rolled out its Mobile Friendly algorithm, which favors the ranking of mobile-friendly websites.
This algorithm had an even greater impact than Penguin or Panda, and some SEO experts even nicknamed it "Mobilegeddon" (a reference to Armageddon): the Armageddon of mobile compatibility.
This algorithm was deployed in real time and page by page. A site could therefore maintain good overall SEO, even if some of its pages were not adapted to the mobile format.
Since 2015, mobile compatibility has been a priority for Google and a very important ranking factor. Moreover, in November 2016, Google announced that it would launch its Mobile First Index during 2017.
What is the Mobile First Index? Until then, Google ranked websites based on their desktop version. But user behavior is changing, and people now spend more time browsing the Internet on a mobile device than on a computer. Google therefore decided to take into account the mobile version of a website, rather than the desktop version, when ranking it.
RankBrain, launched in early 2015, is actually part of the Hummingbird search algorithm. RankBrain is quite peculiar and mysterious, as it is said to be an artificial intelligence capable of understanding the meaning of similar queries that are formulated differently.
For example, this artificial intelligence could understand, through its learning, that the queries "Barack" and "Michelle Obama's husband" must return a similar answer: "Barack Obama".
As an extension of Hummingbird, RankBrain aims to interpret and understand users' most abstract searches. More importantly, Google has stated that RankBrain is among the top three ranking factors (along with content quality and links).
RankBrain's learning applies to all searches, but it is done offline. Google feeds it with historical search data so that it learns to make predictions. These predictions are then tested, and applied if they prove correct.
An entire chapter below is dedicated to RankBrain. We will see in detail what it actually is, how it works, and more.
In May 2015, the SEO world was in a panic, as many webmasters noticed significant changes in the SERPs. However, when members of the Google team in charge of search quality were questioned on Twitter (as is very often the case), they replied that they had no update to announce.
The webmasters, convinced that something was happening, decided to name this update Phantom, because of Google's silence despite clear signs of change.
A few weeks later, Google confirmed that an update had indeed been deployed and that it related to the quality of website content. The Phantom update was then renamed "Quality" by Google. However, Google never clarified how this update differed from the Panda algorithm.
Periodically, updates are noticed by SEO experts but denied by Google. There are therefore several versions of the Phantom algorithm, named, for lack of anything better, Phantom 1, 2 or 3. Their importance, mechanisms and scope remain more or less unknown.
In November 2017, Google doubled the number of characters displayed in result descriptions, from a limit of 160 characters to a limit of 320.
With this update, Google continues to promote full sentences and descriptions that contain enough information to give the link context, in order to better guide the Internet user in his searches. It is therefore possible that the search engine ignores your meta-description tag and cuts or supplements certain descriptions.
Reminder: Meta descriptions do not count in the rankings of search engines, but remain essential to encourage your visitors to visit your site.
Launched on August 1, 2018, in the middle of summer, this Core Update is also called the "Medic Update", for several reasons. It is a broad algorithm update, whose changes can be more or less significant depending on the areas it targets and optimizes.
Here, Google gave no precise indication of what was changed in the engine. The only press release on the subject mentions following the same generic advice as for the previous Core Update of March 2018. Several specialists have studied this update, because rankings changed significantly for many sites, mainly in:
- Health, in the vast majority
- Finance and business
- The e-commerce sector
Later, Google affirmed that this Core Update does not only concern YMYL (Your Money Your Life) pages and the themes mentioned above, suggesting that it indeed concerns everyone.
A new major update of the algorithm, that of June 2019 is more precisely the first to have been officially announced to specialists through a tweet on Twitter. In fact, this change in the search engine was effective on June 3, 2019 .
This is an update whose objective is to strengthen the requirements in terms of overall quality regarding the results displayed in the SERPs, in particular concerning the following points:
- Loading speed and smooth navigation.
- Global and relevant coverage of the topic concerned.
- Switching to HTTPS and adopting a fully responsive UX/UI design.
This update was another milestone in the emphasis on quality content. Poor-quality sites now see their overall ranking decrease in favor of efficient sites that are regularly fed with well-constructed, quality content that better answers the user's query.
This update also stands out for promoting YouTube video suggestions above the search results in the SERPs.
This algorithm update was announced very soon after the June 2019 Core Update and also released during that same month of June 2019 . Its name explains the very principle of the new rules it brings: to reinforce the diversity of results in the search pages.
In practice, this update greatly limits the possibility of having several pages from the same domain in the first search results. Thus, without clearly stating it, Google promotes competition between sites, but also makes it easier for users to cross-check sources in order to obtain ever more reliable information.
This last point should therefore be linked to the ranking selection criteria of the two previous updates:
- A site with structured, relevant and reliable content.
- An optimal browsing experience (loading speed, etc.).
- A coherent and intuitive site tree structure.
With all these elements, Google definitely buries more or less fair competition techniques, like doorway pages. This method consisted of building a page just to attract traffic from a single keyword or query.
Less impactful than its predecessors, this change to the engine was announced on Google's dedicated Twitter account on September 24, 2019.
Among the main ranking fluctuations, changes were noted on sites that were previously poorly ranked. In other words, this update re-evaluated results previously relegated to low positions in the SERPs.
As a result, Google increasingly considers every existing result in order to keep offering relevant SERPs with quality, safe content. These poorly positioned sites may have suffered at a time when more abusive techniques were not yet penalized by the engine.
An acronym for Bidirectional Encoder Representations from Transformers, BERT was announced as the most important update to Google's search engine in five years. It was officially deployed in France on December 9, 2019, in parallel with its launch in many other countries.
BERT truly marks the arrival of artificial intelligence in the engine itself. The result is the contextualization of the keywords in a query, which are no longer considered individually by the engine, but as a whole.
BERT tends to weigh the terms and expressions of a query by importance in order to better understand what the user expects. Users, who increasingly search by voice or in the form of written questions, will then see results in the SERPs that are ever closer to what they initially expected.
In more detail, BERT is also used by Google for the following tasks:
- Understand textual cohesion and remove ambiguities from expressions or sentences, particularly when polysemous nuances could change the meaning of a search;
- Understand which entities pronouns refer to, which is especially useful in long paragraphs with multiple entities (voice search is a concrete application);
- Predict the next sentence;
- Answer questions directly in the SERPs;
- Resolve disambiguation issues.
BERT's algorithm is based on neural networks, and Google released it as open source in November 2018. Several more or less improved variants of the algorithm have since emerged:
- RoBERTa by Facebook
- CamemBERT, a French version developed by INRIA and derived from RoBERTa
- XLNet and ALBERT by Google and Toyota. Released in September 2019, ALBERT is already considered as the successor to BERT, which it surpasses in all areas (especially in terms of score on SQuAD 2.0)
- DistilBERT is a smaller, lighter and faster version of BERT
Impact of BERT on SEO
As Google had already indicated for RankBrain, it is not possible to optimize for BERT. That's why many SEOs see BERT as more of a step forward for Google than for SEO.
According to Google, BERT has an impact on 10% of searches (a figure from the launch, based on English-language queries in the US). It is probably less relevant for high-volume queries consisting of few words.
The impact on keyword rankings is probably much lower than 10%, because the queries you monitor are probably not formulated in natural language.
You now know a little better all the elements that can influence your research or the SEO of your website. However, this list is not exhaustive, because there are also Big Daddy, Florida or Bourbon, even older algorithm updates.
Google uses a machine-learning artificial intelligence system called "Rankbrain" to help it sort its search results. Its existence was publicly announced in a Bloomberg article on October 26, 2015, although its exact deployment date is not known. Wondering how it works and how it fits into Google's overall ranking system? Here's what we know about Rankbrain.
The information presented below comes from several original sources and has been updated over time, with notes indicating where the updates have taken place. These sources are:
First, the Bloomberg article that brought RankBrain to the fore. Second, additional information that Google has since provided directly to Search Engine Land. Third, our own knowledge and best guesses where Google doesn't provide answers, as well as Twitter and LinkedIn posts from industry references and Google engineers. We will clearly indicate where these sources are used, when necessary, apart from general information.
RankBrain is Google's name for a machine learning system used to help process search results, as Bloomberg reported and as Google also confirmed to us.
Machine learning enables computer programs to perform tasks that previously only humans could perform with their intelligence or mental processes.
According to Larousse, artificial intelligence is the set of theories and techniques implemented to create machines capable of simulating intelligence.
More simply, artificial intelligence, or AI for short, allows a computer to be as intelligent as a human being, at least in the sense that it acquires knowledge both by being taught and by building on what it knows to make new connections. Such an AI only exists in science fiction novels, of course. In practice, AI refers to computer systems designed to learn and make connections.
In other words, the goal of AI is to enable computers to become as intelligent as humans through mathematical and statistical approaches.
In other words, they will be able:
- To learn through experience;
- To organize their memory;
- To reason in order to solve problems on their own.
To put it simply, everything starts from a model that the machine uses for its training. Usually, this model is introduced by a human from certain data. The machine uses the model and the data to train on or solve practical tasks within the scope of its model. Depending on the feedback on the quality of its answers or results, the program readjusts its parameters and then the model.
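The train-feedback-readjust loop described above can be sketched as a toy simulation. Everything here (the page names, the click rates, the feedback function) is hypothetical and vastly simpler than any real ranking system:

```python
import random

# Toy feedback loop: the program proposes an answer, receives a quality
# score, and learns to prefer answers that score well.

def train(candidates, feedback, rounds=2000, seed=0):
    """Learn which candidate answer earns the best average feedback."""
    rng = random.Random(seed)
    totals = {c: 0.0 for c in candidates}
    counts = {c: 0 for c in candidates}
    for _ in range(rounds):
        choice = rng.choice(candidates)      # try an answer
        totals[choice] += feedback(choice)   # quality signal, e.g. a click
        counts[choice] += 1
    # The learned "model": prefer the answer with the best average score
    return max(candidates, key=lambda c: totals[c] / max(counts[c], 1))

random.seed(1)  # make the simulated clicks reproducible

def simulated_clicks(page):
    """Users click the relevant page 80% of the time, the others 10%."""
    return 1.0 if random.random() < (0.8 if page == "page_a" else 0.1) else 0.0

best = train(["page_a", "page_b", "page_c"], simulated_clicks)
print(best)  # page_a: the answer users reward most often
```

Nobody ever tells the program that "page_a" is the right answer; it converges on it purely from the feedback signal, which is the point of the paragraph above.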
Ultimately, machine learning is where a computer program learns to do something on its own, rather than being taught by humans or following detailed programming. It often happens that the designers no longer fully understand how the system works afterwards. This is reportedly what happened with RankBrain: Google's engineers no longer fully understand how it works.
You read that right: RankBrain is part of Google's overall Hummingbird search "algorithm", just as a car has an overall engine. The engine itself consists of different parts, such as an oil filter, fuel pump and radiator. Likewise, Hummingbird encompasses different parts, RankBrain being one of the more recent.
This conclusion comes from the Bloomberg article, in which Greg Corrado (the senior Google scientist quoted in the article revealing RankBrain's existence) made it clear that RankBrain initially handled only the 15% of queries that Google's system had never processed before.
It is therefore worth asking why Google launched its machine learning system.
Hummingbird also contains other parts whose names are familiar to those in the SEO space: Panda, Penguin and Payday, designed to fight spam; Pigeon, designed to improve local results; Top Heavy, designed to demote pages overloaded with advertising; Mobile Friendly, designed to reward mobile-friendly pages; and Pirate, designed to fight copyright infringement.
Rankbrain is different from PageRank
PageRank is part of the general algorithm that covers a specific way of giving pages credit based on links from other pages that point to them.
PageRank is special because it is the first name that Google gave to one of the parts of its ranking algorithm, when the search engine started, in 1998.
What about those “signals” that Google uses for ranking?
Signals are things Google uses to determine how to rank web pages. For example, it will read the words on a web page, so the words are a signal. If some words are in bold, this may be another signal that is noted (because that would mean it is important). The calculations used as part of PageRank give a page a PageRank score which is used as a signal. If a page is rated as mobile friendly, another signal is recorded.
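As a hedged illustration of how several such signals might combine into one score, here is a toy weighted sum. The signal names and weights are invented for this example; Google's actual signals and weighting are not public:

```python
# Toy model of combining ranking "signals" into a single score.
# Names and weights are hypothetical.

SIGNAL_WEIGHTS = {
    "content_relevance": 0.4,
    "backlink_score": 0.35,
    "mobile_friendly": 0.15,
    "page_speed": 0.1,
}

def rank_score(signals):
    """Combine per-page signal values (each in 0..1) into one score."""
    return sum(SIGNAL_WEIGHTS[name] * value for name, value in signals.items())

page_a = {"content_relevance": 0.9, "backlink_score": 0.7,
          "mobile_friendly": 1.0, "page_speed": 0.8}
page_b = {"content_relevance": 0.6, "backlink_score": 0.9,
          "mobile_friendly": 0.0, "page_speed": 0.9}

ranking = sorted([("a", rank_score(page_a)), ("b", rank_score(page_b))],
                 key=lambda t: t[1], reverse=True)
print(ranking[0][0])  # page "a" ranks first here
```

In this sketch, page "b" has stronger backlinks, but its lack of mobile friendliness and weaker content drag its combined score below page "a", which is the intuition behind multi-signal ranking.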
Today RankBrain is considered the third most important signal among Google's 200+ ranking factors. The three most important signals are therefore:
- Backlinks
- Content
- RankBrain
If you want a more visual guide to ranking signals, check out our periodic table of SEO success factors:
Periodic Table of Search Engine Optimization Success Factors 2015
It's a good guide, in our opinion, to the general things that search engines like Google use to help rank web pages.
Rankbrain is one of "hundreds" of signals that go into an algorithm that determines which results appear on a Google search page and where they rank, Corrado said. Within months of deployment, RankBrain became the third most important signal contributing to a search query result, he said.
Personally, we think the main reasons for its launch are:
- The difficulties of interpreting requests never dealt with before;
- The manual nature of coding existing algorithms to make changes
The difficulties of interpreting requests never dealt with before
In its early days, the search engine relied mainly on the presence, on web pages, of the words found in a query to display its results.
For example, if you searched for "lawyers", the search engine would return the pages containing that word.
In addition, the slightest variation in the expressions used could lead to different results.
For example, the search engine could return different results for close variants of the same word, such as its singular and plural forms. The same goes for the queries "best garden boots" and "best garden shoes".
But the problem does not end there. This behavior gave some "black hat" SEOs the opportunity to repeat words and expressions in their content in order to reach the top of the results, even when that content was of poor quality.
Google has evolved a lot since that time. The search engine now manages to detect and punish websites that use Black Hat SEO practices, notably with the Penguin and Panda algorithms.
On the search side, the firm has naturally also made great progress. Indeed, the engine now understands queries better and better, and can associate them with one another when they mean the same thing.
The Hummingbird, stemming and Knowledge Graph updates embodied Google's shift to seeing words as "entities" and not just strings of characters.
Stemming allows the engine to understand variations of the same word, such as "eat", "eats" and "eating". The Knowledge Graph, meanwhile, was a way for Google to better understand relationships between entities: when searching for "Paris", the user probably wants monuments, activities and people related to the capital of France.
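A deliberately naive stemmer gives a feel for the idea. Real stemmers, such as the Porter algorithm, handle far more cases; this sketch just strips a few common English suffixes:

```python
# Naive stemming sketch: collapse inflected forms onto one index term.
# Real stemmers are far more careful about exceptions.

SUFFIXES = ("ing", "ed", "es", "s")

def naive_stem(word):
    """Strip the first matching suffix, keeping a stem of at least 3 letters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

# "boots" and "booting" now map to the same stem, "boot".
print(naive_stem("boots"), naive_stem("booting"))  # boot boot
```

With stemming in place, a query for "boots" can match pages that only contain "boot", which is exactly the kind of variation the engine once missed.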
The manual nature of the coding of algorithm updates
Statistics show that the engine constantly has to deal with queries that no one has ever searched for before: about 15%, or nearly 870 million searches per day.
To modify the algorithm to produce better results for these queries, engineers had to work on it manually, and given the volume of such searches, you can easily understand that this is not easy. In fact, not at all.
What exactly does RankBrain do?
An example will make this easier. Imagine an intern who performs about 5.8 billion tasks per day. For each task, their superiors give feedback on the work:
- Perfect: This is exactly what I wanted!
- It's not perfect yet: There is still work to be done on it;
- No, you did not solve the task: I wanted this instead.
The intern remembers all their superiors' reactions in order to do better tomorrow, when only 15% of the tasks will be new. This is how RankBrain works in this analogy: it is the intern, and the users are the superiors assigning the search tasks.
RankBrain, in other words, learns directly from how we interact with its results. Gary Illyes of Google describes it this way: "[RankBrain] examines data about past searches and, based on what has worked well for those searches, tries to predict what will work best for a given query. It works best for long-tail queries and queries we've never seen."
Therefore, the system is completely autonomous and does not need to be told that such and such a result is bad and that the problem must be solved in such a way.
RankBrain already has criteria, including other ranking signals, that let it know whether a result is a good match for a query or not. It has a large store of past search results that allows it to make good decisions. So if you search for "sneakers", it might understand that you also mean "running shoes". It has even acquired some basic notions, such as understanding that there are pages about "Apple" the technology company and "apple" the fruit.
This is the main reason why RankBrain performs better than Google's engineers. RankBrain predicts what will work best, tests it, and if the change works, keeps it.
Is RankBrain really useful?
While the examples above are hardly enough on their own to prove RankBrain's greatness, I do believe it probably has a big impact, as Google claims. The company is quite conservative when it comes to its ranking algorithm. It runs small tests all the time, but only rolls out big changes when it has a high degree of confidence.
The integration of RankBrain, as it is supposed to be the third most important signal, is a huge change.
Is RankBrain still learning?
Everything RankBrain learns, it learns offline, Google tells us. It is fed batches of historical searches and learns to make predictions from them.
These predictions are tested and, if they prove correct, the latest version of RankBrain goes live. Then the cycle of offline learning and testing repeats.
Does RankBrain do more than refine queries?
Typically, the way a query is refined is not considered a ranking factor or signal.
Signals are usually content-related factors, such as the words on a page, the links pointing to a page, or whether a page is on a secure server. They can also be linked to the user, such as where the searcher is located or their search and browsing history.
So when Google talks about Rankbrain as the third most important signal, does that really mean a ranking signal? Yes. Google has confirmed to us that there is one element where Rankbrain contributes directly, in one way or another, to the ranking of a page.
Rankbrain tries to understand queries by gauging how well past SERPs have met searcher intent. Machine learning then uses that data to make predictions about what people are really looking for for the query.
These predictions come from RankBrain's vast understanding of how words relate to each other. Which brings us to the concept of word vectors.
We have already seen that Google uses the Knowledge Graph to relate words to concepts that exist in relation to each other.
But it only works with the information that is present in its database.
To go further with machine learning, Google turned to word vectors since it needed to learn the meaning behind words.
To make this work, Google developed an open-source tool called "Word2vec":
This tool uses machine learning and natural language processing to understand the real meaning of words on its own.
Example Word Vector
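The intuition behind word vectors can be shown with a toy cosine-similarity computation. The three-dimensional vectors below are invented for illustration; real embeddings such as Word2vec's have hundreds of dimensions learned from text:

```python
import math

# Toy word vectors: words that appear in similar contexts end up with
# nearby vectors. These values are invented for illustration.

VECTORS = {
    "sneakers":      [0.9, 0.8, 0.1],
    "running_shoes": [0.85, 0.75, 0.2],
    "apple_fruit":   [0.1, 0.2, 0.9],
}

def cosine(u, v):
    """Cosine similarity: 1.0 for identical directions, ~0 for unrelated."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

sim_shoes = cosine(VECTORS["sneakers"], VECTORS["running_shoes"])
sim_fruit = cosine(VECTORS["sneakers"], VECTORS["apple_fruit"])
print(sim_shoes > sim_fruit)  # True: "sneakers" sits near "running_shoes"
```

This is how a system can judge that "sneakers" and "running shoes" are close in meaning while "apple" the fruit is far away, without anyone hand-writing those relationships.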
Google is continuously improving its system and the consequences are often noticeable in the ranking of the results it offers. This is one of the reasons why it is unlikely to maintain a given position in the SERPs.
SEO specialists therefore strive to know the trends related to the various factors that can affect the positioning of their website.
This is the case with Rankbrain which continues to enjoy a certain mystery as to how it works and how it relates to other ranking factors.