Leaked Google documents have provided an unprecedented glimpse into Google Search, revealing around 14,000 potential ranking factors and features. This leak is invaluable for SEOs, as it offers a deeper understanding of the search algorithm, allowing them to optimize websites more effectively.
The leak consists of documentation that apparently originated from Google’s internal Content Warehouse API and was accidentally published by an automated bot in March 2024. However, these insights only started circulating at the end of May.
Key Insights from the Leaked Google Documents
It's important to note that the following information reflects the status as of March 2024 and may have since changed. Additionally, the weight of each ranking factor remains unclear. Below are what we consider the most important findings:
- Trust of the homepage plays a significant role in ranking
- Content freshness is important
- Link diversity and relevance are crucial
- New links seem to outweigh existing ones
- A function called titlematchScore measures how well the page title matches the search query
- The pageQuality function assesses the "effort" involved in creating article pages, determining whether they are easy to replicate. Elements like images, videos, and unique information can positively impact the "effort" calculation
- Originality is key for short content: OriginalContentScore decides whether the content is considered "thin content."
- Low-quality content on subpages can affect the entire website's ranking
- Google counts the number of tokens in documents. There may be a limit beyond which content is truncated. Therefore, important content should be placed as high as possible in the text
- The first paragraphs under headings should answer the search query briefly and precisely
- Google checks authorship: Authors who focus on a specific subject area and publish multiple works are considered more relevant
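Two of the points above, the possible token limit and the title match, lend themselves to a quick self-check on your own pages. The following is a minimal sketch, not Google's implementation: the tokenizer, the 500-token budget, and the function names phrase_position, within_budget, and title_overlap are all assumptions made for illustration. The leak does not state the actual limit or how titlematchScore is computed.

```python
import re
from typing import List, Optional


def tokenize(text: str) -> List[str]:
    """Rough word-level tokenizer (assumption: Google's tokenizer is not public)."""
    return re.findall(r"\w+", text.lower())


def phrase_position(text: str, phrase: str) -> Optional[int]:
    """Return the token index where `phrase` first starts, or None if absent."""
    tokens = tokenize(text)
    target = tokenize(phrase)
    if not target:
        return None
    for i in range(len(tokens) - len(target) + 1):
        if tokens[i:i + len(target)] == target:
            return i
    return None


def within_budget(text: str, phrase: str, budget: int = 500) -> bool:
    """Check that `phrase` ends within the first `budget` tokens.

    The default budget of 500 is a hypothetical placeholder, not a leaked value.
    """
    pos = phrase_position(text, phrase)
    return pos is not None and pos + len(tokenize(phrase)) <= budget


def title_overlap(title: str, query: str) -> float:
    """Share of distinct query terms that also appear in the title (0.0 to 1.0).

    A simple stand-in for the idea behind titlematchScore, not its formula.
    """
    query_terms = set(tokenize(query))
    if not query_terms:
        return 0.0
    return len(query_terms & set(tokenize(title))) / len(query_terms)


if __name__ == "__main__":
    page = "Ranking factors from the Google leak, explained step by step."
    print(within_budget(page, "ranking factors"))
    print(title_overlap("Google Leak: Ranking Factors", "google ranking factors"))
```

Running a check like this over a set of pages shows at a glance where the key phrase appears too late in the text or where the title shares few terms with the target query.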
These points are primarily useful for further content optimization. The leak also contains many technical details, a few of which are highlighted below:
- Google stores a copy of every version of every page it has indexed, meaning it can "remember" every change ever made to a page
- Google can determine how many results per content type (blog posts, news articles, etc.) are displayed on the SERPs
- A function called Navboost (click-related search metrics) is used to promote or demote results, distinguishing between badClicks, goodClicks, lastLongestClicks, and unsquashedClicks
- Google uses a siteAuthority metric to evaluate the authority of websites on specific topics
- There are at least seven different types of PageRank to determine a page's link popularity
- If subpages do not yet have their own PageRank, it is based on the PageRank of the homepage
- siteFocusScore, siteRadius, siteEmbeddings, and pageEmbeddings are used for ranking
- Poor site navigation harms ranking
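The PageRank fallback described above can be pictured as a simple lookup rule. This is only a sketch of that one statement: the Page class, its fields, and the effective_pagerank function are hypothetical names, and the leak does not reveal how Google actually stores or combines these values.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Page:
    """Hypothetical page record; `pagerank` is None until a value exists."""
    url: str
    pagerank: Optional[float] = None


def effective_pagerank(page: Page, homepage: Page) -> Optional[float]:
    """Use the page's own PageRank if present, otherwise the homepage's."""
    if page.pagerank is not None:
        return page.pagerank
    return homepage.pagerank


if __name__ == "__main__":
    home = Page("https://example.com/", pagerank=0.8)
    new_post = Page("https://example.com/blog/new-post")
    print(effective_pagerank(new_post, home))
```

For site owners, the practical takeaway is that a new subpage starts from the standing of the homepage rather than from zero.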
Leak Reveals Reasons for Content Demotion
The leaked documents also outline the criteria that can lead to content demotion, including:
- Links that do not match the target page
- Specific user behavior that indicates dissatisfaction
- Poor navigation and user experience
- Exact match domains
- Negative product reviews
- Global websites, which may be demoted in favor of local sites in specific regions
- Pornography
Implications and Consequences of the Leaked Information
"Google lied!" - This accusation is now being leveled at the tech giant, as some of the leaked information contradicts Google's official statements. For example, it was claimed that clicks are not a ranking factor, though it has long been an open secret that user behavior does impact ranking. It's important to understand that Google likely withheld many of these details intentionally to prevent manipulation of the rankings. While some SEOs have lost trust in Google, others see this positively, as the leak reveals many additional ranking factors that can help further improve content rankings on the SERPs.