
Decoded: Google Quality Rater Guidelines – Content Quality

Shaun Anderson

Top Global AI SEO Expert (2025), Searchable Advisor

9 min read
Nov 21, 2025
A deep dive into how Google's Search Quality Evaluator Guidelines map to leaked Content Warehouse attributes like contentEffort, OriginalContentScore, siteAuthority and more – and what that means for modern SEO.


Since Google's Search Quality Evaluator Guidelines (SQEG) were first leaked back in 2014, I and a few others, including Cyrus Shepard, Marie Haynes, Jennifer Slegg and Lily Ray, have worked (thanklessly, in some instances) to highlight their true significance, and this article aims to finally prove that significance.

I started my career in web development in the early 2000s and moved to specialist SEO in 2006, so my profession has always been the craft of reverse-engineering a black box: inferring how Google works from tests and observations.

The 2024 Google Content Warehouse data API leak, however, shattered that box.

For the first time, we have the blueprints to the actual system.

However, possessing the blueprints is not the same as understanding the machine.

The leak provided thousands of attributes, but it did not come with an instruction manual. It did not explicitly state, for instance, that contentEffort corresponds to the "Effort" section of the Quality Rater Guidelines, although some attributes did carry interesting annotations.

What makes this analysis distinct is the level of logical inference required to bridge that gap.

I had to make an experienced judgment call to link the guideline's wording on "originality" to the specific attribute OriginalContentScore.

While other analysts might note that these two things are related, this text standardises that relationship into a formal "canon of truth."

This work is what I call Creative Inference Optimisation. It weaves together the foundational principles from Google’s human-focused Search Quality Evaluator Guidelines with the hard-coded reality of the API leak to create a single, coherent framework.

Here's the relevant extract from the Quality Rater Guidelines, re-engineered below with the corresponding leaked attributes mapped in [brackets]:

[Diagram: Google content quality signals mapped to leaked attributes such as contentEffort, OriginalContentScore and siteAuthority]


Section 3.2: Quality of the Main Content (Re-Engineered)

The quality of the Main Content (MC) is one of the most important considerations for PQ rating [Q* (Leaked in DOJ Trial), quality_score, siteAuthority].

The MC plays a major role in determining how well a page achieves its purpose [titlematchScore, commercialScore].

The unifying theme for evaluating the quality of the MC is the extent to which the MC allows the page to achieve its purpose and offers a satisfying user experience [NavBoost, goodClicks, lastLongestClicks].

For most pages, the quality of the MC can be determined by the amount of effort [contentEffort], originality [OriginalContentScore], and talent or skill [isAuthor, authorReputationScore, productReviewPUhqPage, productReviewPPromotePage] that went into the creation of the content.

For informational pages and pages on YMYL topics [chard, orbit_medical_classifier_trigger], accuracy and consistency with well-established expert consensus are important [ymylHealthScore, ymylNewsScore, chard, is_debunking_query, consensus_score, consensus_num_passages_agree, consensus_num_passages_contradict, consensus_num_passages_neither].
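
To make the consensus mapping concrete: the attribute names suggest Google counts how many retrieved evidence passages agree with, contradict, or sit out a claim. Here is a minimal sketch of how a consensus_score could be derived from those counts. The attribute names are from the leak; the formula itself is purely my own inference:

```python
from dataclasses import dataclass

@dataclass
class PassageConsensus:
    # Hypothetical container mirroring the leaked attribute names.
    num_agree: int        # consensus_num_passages_agree
    num_contradict: int   # consensus_num_passages_contradict
    num_neither: int      # consensus_num_passages_neither

def consensus_score(p: PassageConsensus) -> float:
    """One plausible reading of consensus_score: the share of evidence
    passages that agree with the page's claim, out of those that take
    a position either way. This formula is my guess, not Google's."""
    decided = p.num_agree + p.num_contradict
    if decided == 0:
        return 0.5  # no expert passage takes a side; treat as neutral
    return p.num_agree / decided

# A claim that 8 passages support, 1 contradicts, and 3 sit out:
print(consensus_score(PassageConsensus(8, 1, 3)))  # ~0.889
```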


Effort

Consider the extent to which a human being actively worked to create satisfying content [contentEffort].

Effort may be direct, such as a person translating a poem from one language to another. Effort may go into designing page functionality or building systems that power a webpage, such as the creation of a page that offers machine translation as a service to users.

On the other hand, the automatic creation of thousands of pages [numOfUrlsByPeriods] by running existing freely available content through existing translation software without any oversight, manual curation, etc., would not be considered to have effort [spamScore, copycatScore].

For pages like social media posts or forum discussions, the level of participation and depth of conversation is an important part of effort [ugcDiscussionEffortScore]. Contributions from multiple individuals on such pages can add up to a significant amount of total human effort.
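
If you want to picture how a system might operationalise that distinction between genuine effort and scaled automation, here is a deliberately crude sketch. The attribute names referenced in the comments are real (from the leak); the thresholds and the combination logic are entirely my own illustration:

```python
def looks_like_scaled_low_effort(urls_per_day: float,
                                 duplicate_ratio: float,
                                 effort_estimate: float) -> bool:
    """Toy heuristic, not Google's actual logic: a site publishing
    URLs at high velocity (cf. numOfUrlsByPeriods, which tracks URL
    creation over time windows), whose pages mostly near-duplicate
    existing content (cf. copycatScore), and whose per-page effort
    estimate is low (cf. contentEffort, reportedly annotated in the
    leak as an LLM-based effort score), is a scaled-content candidate.
    Every threshold below is invented for illustration."""
    return (urls_per_day > 500
            and duplicate_ratio > 0.8
            and effort_estimate < 0.2)

# Example: 2,000 machine-translated pages a day, 95% duplicated,
# near-zero estimated effort -> flagged.
print(looks_like_scaled_low_effort(2000, 0.95, 0.05))  # True
```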


Originality

Consider the extent to which the content offers unique, original content [OriginalContentScore] that is not available on other websites.

If other websites have similar content [shingleInfo], consider whether the page is the original source [contentFirstCrawlTime].
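
Here is a sketch of how that originality check could work mechanically, assuming shingleInfo stores some kind of text fingerprint and contentFirstCrawlTime records when Google first saw the content. The shingling and first-seen tie-break below are standard information-retrieval techniques, not confirmed internals:

```python
import hashlib

def shingles(text: str, w: int = 8) -> set:
    """Hash every w-word window of the text (classic w-shingling).
    Something in this family is a plausible reading of shingleInfo."""
    words = text.lower().split()
    return {
        hashlib.md5(" ".join(words[i:i + w]).encode()).hexdigest()
        for i in range(max(1, len(words) - w + 1))
    }

def jaccard(a: set, b: set) -> float:
    """Overlap between two fingerprint sets, 0.0 to 1.0."""
    return len(a & b) / len(a | b) if (a | b) else 0.0

def likely_original(pages: list, threshold: float = 0.7) -> dict:
    """Among pages that near-duplicate the first page in the list,
    attribute originality to the one crawled earliest
    (cf. contentFirstCrawlTime). Each page is a dict with 'text' and
    'first_crawl_time' keys; threshold and tie-break are my assumptions."""
    fingerprints = [(p, shingles(p["text"])) for p in pages]
    _, base_fp = fingerprints[0]
    dupes = [p for p, fp in fingerprints if jaccard(base_fp, fp) >= threshold]
    return min(dupes, key=lambda p: p["first_crawl_time"])
```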


Talent or Skill

Consider the extent to which the content is created with enough talent and skill [authorReputationScore, isAuthor] to provide a satisfying experience for people who visit the page [goodClicks].


Accuracy

For informational pages, consider the extent to which the content is factually accurate.

For pages on YMYL topics [chard], consider the extent to which the content is accurate and consistent with well-established expert consensus [ymylHealthScore, ymylNewsScore, chard].


Additional Contextual Factors

The purpose of the page, topic of the page [siteFocusScore, siteRadius], and type of website [smallPersonalSite, isLargeChain] all play a role in how to evaluate the quality of the MC.
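
For readers who want the intuition behind siteFocusScore and siteRadius, here is a sketch of the interpretation most analysts (myself included) have settled on, assuming page-level embeddings exist. The exact maths below is my illustration, not a documented Google calculation:

```python
import numpy as np

def site_focus_and_radius(page_embeddings: np.ndarray) -> tuple:
    """Embed every page on a site, take the centroid, and measure how
    tightly the pages cluster around it. A tight cluster suggests a
    focused site; a sprawling one suggests topical drift.

    page_embeddings: (n_pages, dim) array of unit-normalised vectors.
    """
    centroid = page_embeddings.mean(axis=0)
    centroid /= np.linalg.norm(centroid)
    # siteRadius analogue: average distance from the topical centre.
    radius = float(np.mean(np.linalg.norm(page_embeddings - centroid, axis=1)))
    # siteFocusScore analogue: tighter cluster -> higher focus.
    focus = 1.0 / (1.0 + radius)
    return focus, radius
```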

For example, consistency with well-established expert consensus is important for medical advice [healthScore].

Skill is important for how-to videos. Talent and originality are important for artistic expression [nimaAva, styleAestheticsScore].

The amount of effort expected for a short video shared on social media is less than for a full-length, professionally produced documentary on a streaming video website, but both need sufficient effort to create satisfying content for their purpose.

Think about what effort [contentEffort], originality [OriginalContentScore], talent, or skill looks like for the type of page that you are evaluating.


Evaluating a Page

For each page you evaluate, spend a few minutes examining the MC before drawing a conclusion about it.

Read the article, watch the video [docVideos], examine the pictures [docImages], use the calculator, play the online game, etc.

Remember that MC also includes page features and functionality, so test the page out. For example, if the page is a product page on a store website [shoppingProductInformation], put at least one product in the cart to make sure the shopping cart is functioning.

If the page is an online game, try to play the game yourself. Do your best to imagine that you are someone who's very interested in the topic, functionality, or purpose served by the page, then think about how satisfying the MC would be for that person [NavBoost, goodClicks].

High and low quality MC [goodClicks, badClicks] comes in all formats (e.g., text, audio, video, images) [docImages, docVideos, nimaVq] and all lengths (e.g., short-form videos and full-length professional documentaries).

High and low quality content also exists on all types of websites, from small personal sites [smallPersonalSite] to large corporate sites [isLargeChain], from forums and social media [ugcScore] to websites that handle financial transactions.

Think carefully about what helps the page achieve its purpose and what makes the MC satisfying for users [lastLongestClicks].
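
To close the loop on the click attributes, here is a toy model of how goodClicks, badClicks and lastLongestClicks could combine into a satisfaction signal. The 70/30 weighting is invented; as I discuss below, the real weighting curves were never leaked:

```python
def click_satisfaction(good_clicks: int, bad_clicks: int,
                       last_longest_clicks: int) -> float:
    """Toy aggregation of the leaked click attributes: goodClicks
    (visits that appear to have satisfied the searcher), badClicks
    (quick bounces back to the results page) and lastLongestClicks
    (this result was the last, longest click of the session).
    Weighting the session-ending click is my assumption."""
    total = good_clicks + bad_clicks
    if total == 0:
        return 0.0
    satisfied = good_clicks / total
    session_ender = last_longest_clicks / total
    return min(1.0, 0.7 * satisfied + 0.3 * session_ender)

# 90 good clicks, 10 bad, 40 of which ended the searcher's session:
print(click_satisfaction(90, 10, 40))  # 0.75
```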


Disclaimer: The Necessity of “Creative Inference Optimisation”

Why this logical inference is required:

This analysis utilises a methodology I define as “Creative Inference Optimisation”. This approach is necessary because:


1. The Missing Instruction Manual

The Google Content Warehouse leak provided the “blueprints” (the raw code and attributes) but not the operating manual.

We have the variable names (e.g., contentEffort), but Google does not explicitly state, “This variable is used to measure E-E-A-T”, nor does it give us the weights and curves.


2. Bridging the Gap

There is a distinct gap between Google’s public-facing advice (the Search Quality Evaluator Guidelines) and its internal engineering reality.

Logical inference is the only tool available to bridge this gap by mapping the human concepts (like “Effort”) to their likely machine-readable counterparts (like [contentEffort]).

Furthermore, the public record confirms that the leak did not include specifics on the “curves and thresholds” (a key phrase referenced in the DOJ vs. Google antitrust litigation) that determine how factors are weighted, necessitating expert interpretation.


3. Evidence-Based Interpretation

While the documents are authentic, the specific connections drawn here (e.g., linking siteAuthority to Q*) represent my “expert interpretation” and “logical inference”.

This method moves the industry from pure guesswork to an “evidence-based framework”.


A Final Note

This work is intended to demystify the search engine’s operations, transforming the black box into a blueprint you can understand. Google will never confirm whether any of this is true; it is not official.

As special advisor to Searchable.com, I am continuing that work of demystifying Google Search.

I encourage you to embrace critical thinking in all your SEO analysis. The foundation for this entire discussion lies within that corpus of documents. If you have the drive to prove or refine these inferences, the material is there for you to explore for yourself.

This analysis represents a methodology I call Creative Inference Optimisation. It bridges the gap between the 'product specification' found in Google's human-focused Search Quality Evaluator Guidelines and the 'engineering blueprints' exposed by the Content Warehouse API leak. The result is a single, evidence-based framework for modern SEO.

I welcome feedback or edits to all my works, so please get in touch.

This is my art. I hope you enjoyed it.

Welcome to Searchable.com.

Shaun Anderson

About the Author: Shaun Anderson (AKA Hobo Web) is a primary source investigator of the Google Content Warehouse API Leak with over 25 years of experience in website development and SEO (search engine optimisation).

AI Usage Disclosure: Shaun uses generative AI when specifically writing about his own experiences, ideas, stories, concepts, tools, tool documentation or research. His tool of choice for this process is Google Gemini 2.5 Pro. All content was conceived, edited, and verified as correct by Shaun (and is under constant development). See the Searchable AI policy.