How to define SERP intent and ‘source type’ for better analysis

Posted On 04 Oct 2022

Understanding the source types that Google displays in the SERPs helps SEOs know how viable it is to effectively rank for certain queries.

SERP analysis coupled with your keyword research is a staple of any modern SEO campaign.

Analyzing search intent is already a process within this. But when it comes to SERP analysis, all too often I see reports that stop at classifying a result by its intent – and that’s it.

We know that for queries with multiple common interpretations, Google works to provide a diversified results page with differentiations often being:

Result intent (commercial, informational).
Business type (national result, local result).
Aggregators and comparison sites.
Page type (static or blog).

And then when planning content we might develop a strategy based on Google ranking some informational pieces on Page 1, so we’ll create informational pieces too.

We may also use a tool to “aggregate” metrics on the first page and create artificial keyword difficulty scores.

This is where this strategy falls down, and in my opinion, will continue to show diminishing returns in the future.

This is because the majority of these analyses don’t take source type into account. I personally believe this is because the Search Quality Rater Guidelines, which have led to E-A-T, YMYL, and page quality becoming a major part of our day-to-day work, don’t actually use the term “source type,” although they do talk about assessing and analyzing sources for things like misinformation or bias.

When we start to look at source types, we also need to look at and understand the concepts of quality thresholds and topical authority.

I’ve talked about quality thresholds, and how they relate to indexing, in previous articles I’ve written for Search Engine Land.

But when we relate this to SERP analysis, we can understand how and why Google chooses the websites and elements it does to form the results page, and also get an idea of how viable it may be to rank effectively for certain queries.

Having a better understanding of ranking viability helps with forecasting potential traffic opportunities and then estimating leads/revenue based on how your site converts.
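As a rough illustration of that forecasting step, the arithmetic can be sketched as below. The function name, its parameters, and every number in the example call are hypothetical placeholders rather than benchmarks; the expected click-through rate in particular should reflect a position you can realistically reach given your source type.

```python
def forecast_opportunity(monthly_searches, expected_ctr, conversion_rate, value_per_lead):
    """Estimate monthly visits, leads, and revenue for one search term.

    All inputs are assumptions supplied by the analyst:
    - expected_ctr: click-through rate at the position you consider viable
    - conversion_rate: share of visits that become leads on your site
    - value_per_lead: average revenue attributed to one lead
    """
    visits = monthly_searches * expected_ctr
    leads = visits * conversion_rate
    revenue = leads * value_per_lead
    return {"visits": visits, "leads": leads, "revenue": revenue}


# Hypothetical inputs, purely for illustration.
estimate = forecast_opportunity(
    monthly_searches=1000,
    expected_ctr=0.05,
    conversion_rate=0.02,
    value_per_lead=500,
)
print(estimate)
```

If the SERP analysis suggests your source type is unlikely to reach Page 1 at all, the honest input for the expected click-through rate is close to zero, which is the point of assessing viability before forecasting.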

Defining source types

Defining source types means going deeper than just classifying the ranking website as informational or commercial, as Google also goes deeper.

This is because Google compares websites based on their type, and not just the content being produced. This is particularly prevalent in search results pages for queries that can have mixed intent and return results of both commercial and informational intent.

If we look at the query [rotating proxy manager] we can see this in practice in the top 5 results:

#	Result Website	Intent Classification	Source Type Classification
1	Oxylabs	Commercial	Commercial, Lead Generation
2	Zyte	Commercial	Commercial, Lead Generation
3	Geek Flare	Informational	Informational, Commercial Neutral
4	Node Pleases Me	Informational	Open Source Code, Non-Commercial
5	Scraper API	Informational	Informational, Commercial Bias

Quality thresholds are determined by the website’s identity, general domain type (not just the blog subdomain or subfolder) and then context.

When Google retrieves information to compile a search results page, it will first compare the retrieved websites within their source type group. So in the example SERP, Oxylabs and Zyte will be compared against each other first, before the other source types that are selected for inclusion, or that rank highest based on weighting and annotation.

The SERP is then formed based on these retrieved rankings and then overlaid with user data, SERP features, etc.

At face value, by understanding the source types that Google is choosing to display (and where they rank) for specific queries we can know whether they are viable search terms to target given your source type.
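One way to make this check repeatable is to record each result with its source type and then look at how often, and how high, your own source type appears. The sketch below is my own illustration, not a description of how Google works: the `SerpResult` structure and the `source_type_profile` function are hypothetical names, and the data is the [rotating proxy manager] classification from the table above.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class SerpResult:
    position: int
    website: str
    intent: str
    source_type: str


# Top 5 results for [rotating proxy manager], as classified in the table above.
SERP = [
    SerpResult(1, "Oxylabs", "Commercial", "Commercial, Lead Generation"),
    SerpResult(2, "Zyte", "Commercial", "Commercial, Lead Generation"),
    SerpResult(3, "Geek Flare", "Informational", "Informational, Commercial Neutral"),
    SerpResult(4, "Node Pleases Me", "Informational", "Open Source Code, Non-Commercial"),
    SerpResult(5, "Scraper API", "Informational", "Informational, Commercial Bias"),
]


def source_type_profile(results, my_source_type):
    """Summarize how often, and how high, a given source type appears."""
    counts = Counter(r.source_type for r in results)
    matching = [r.position for r in results if r.source_type == my_source_type]
    return {
        # Share of the analyzed results that match the given source type.
        "share": counts[my_source_type] / len(results) if results else 0.0,
        # Highest-ranking position held by that source type, if any.
        "best_position": min(matching) if matching else None,
    }


profile = source_type_profile(SERP, "Commercial, Lead Generation")
print(profile)
```

A source type that is absent from the analyzed results comes back with a share of zero and no best position, which is a reasonable prompt to question whether the term is worth targeting.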

This is also common in SERPs for [x alternative] queries where the business may want to rank for competitor + alternative compounds.

For example, if we look at the top 10 blue link results for [pardot alternatives]:

#	Result Website	Intent Classification	Source Type Classification
1	G2	Informational	Informational, Non-Commercial Bias
2	Trust Radius	Informational	Informational, Non-Commercial Bias
3	The Ascent	Informational	Informational, Non-Commercial Bias
4	Capterra (Blog)	Informational	Informational, Non-Commercial Bias
5	Jotform	Informational	Informational, Non-Commercial Bias
6	Finances Online	Informational	Informational, Non-Commercial Bias
7	Gartner	Informational	Informational, Non-Commercial Bias
8	GetApp	Informational	Informational, Non-Commercial Bias
9	Demodia	Informational	Informational, Non-Commercial Bias
10	Software Suggest	Informational	Informational, Non-Commercial Bias

So if you are Freshmarketer or ActiveCampaign, while the business may see this as a relevant search term to target, and it aligns with your product positioning, as a commercial source type you’re unlikely to gain Page 1 traction.

This doesn’t mean that having the messaging and comparison pages on your website is unimportant. They remain valuable pieces of content for user education and conversion.

Different source types have different quality thresholds

Another important distinction to make is that different source types have different thresholds.

This is why third-party tools that produce keyword difficulty scores based on a metric such as backlinks for all results on Page 1 have issues, as not all source types on the majority of SERPs are judged in the same way.

This means that in order to ascertain the “benchmark” for what it will take for your website and content to get into a traffic-driving position, you need to compare against other websites of the same source type, and then against the type of content that they’re ranking with.
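To show why a page-wide figure can mislead, here is a minimal sketch that groups Page 1 results by source type before taking a benchmark. The domains, the choice of referring domains as the metric, and all of the counts are invented for illustration, and `benchmark_by_source_type` is a hypothetical helper rather than any tool’s actual method.

```python
from collections import defaultdict
from statistics import median

# (website, source type, referring domains).
# Every value here is made up purely to illustrate the grouping.
PAGE_ONE = [
    ("site-a.example", "Commercial", 400),
    ("site-b.example", "Commercial", 600),
    ("site-c.example", "Informational", 15000),
    ("site-d.example", "Informational", 22000),
    ("site-e.example", "Informational", 18000),
]


def benchmark_by_source_type(results):
    """Return the median metric per source type instead of one page-wide figure."""
    grouped = defaultdict(list)
    for _site, source_type, metric in results:
        grouped[source_type].append(metric)
    return {source_type: median(values) for source_type, values in grouped.items()}


benchmarks = benchmark_by_source_type(PAGE_ONE)
page_wide = median(metric for _site, _type, metric in PAGE_ONE)

print("Commercial benchmark:", benchmarks["Commercial"])
print("Page-wide benchmark:", page_wide)
```

With these invented numbers, the page-wide median sits far above the commercial-only median, so a commercial site judging itself against the whole page would overestimate what it needs to compete with its own source type.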

Topic clusters and frequency

Establishing good topic clusters and having easy-to-follow information trees allow search engines to understand your website source type and “usefulness depth” with greater ease.

This is also why, in my opinion, you are likely to see websites akin to G2 and Capterra ranking frequently across a broad range of queries in the same space (e.g., tech).

A search engine can have a greater level of confidence in returning these websites in the SERPs, regardless of the software/tech type, as these websites have:

High publishing frequencies.
A logical information tree.
A strong reputation for helpful, accurate information.

When developing webpages within the topic clusters, aside from semantics and good keyword research, it’s also important to understand the basics of natural language inference, particularly the Stanford Natural Language Inference (SNLI) corpus.

The basics of this are that you test a hypothesis against a text, and the conclusion is that the text either entails, contradicts, or is neutral toward the hypothesis.

For a search engine, if the webpage contradicts the hypothesis, then it will have low value and shouldn’t be retrieved or ranked. Whereas if the webpage entails, or is neutral toward, the query, then it can be considered for ranking, both to provide the answer and to offer a potentially unbiased perspective (depending on the query).
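The rule described above can be written down in a few lines. This is only a sketch of my reading of it, not documented search engine behavior: the three labels are the ones used in SNLI, while `eligible_for_ranking`, the page names, and the hand-assigned labels are hypothetical. No inference model is involved here.

```python
from enum import Enum


class NliLabel(Enum):
    ENTAILMENT = "entailment"
    CONTRADICTION = "contradiction"
    NEUTRAL = "neutral"


def eligible_for_ranking(label: NliLabel) -> bool:
    """Apply the rule above: only a contradiction rules a page out."""
    return label is not NliLabel.CONTRADICTION


# Hypothetical pages with labels assigned by hand, not by a model.
labelled_pages = {
    "page-that-answers-the-query": NliLabel.ENTAILMENT,
    "page-with-a-related-perspective": NliLabel.NEUTRAL,
    "page-that-conflicts-with-the-query": NliLabel.CONTRADICTION,
}

candidates = [
    page for page, label in labelled_pages.items() if eligible_for_ranking(label)
]
print(candidates)
```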

We do this to an extent through content hubs/content clusters that have become more popular in the past five years as ways of demonstrating E-A-T and creating linkable, high-authority assets for non-brand search terms.

This is achieved through good information architecture on the website, and being concise in our topical clusters and internal linking, making it easier for search engines, at scale, to digest.

Understand source types to inform your SEO strategy

By better understanding the source types ranking most prominently for the target search queries, we can produce better strategies and forecasting that yield more immediate results.

This is a better option than pursuing search terms that our source type simply isn’t appropriate for, and that are unlikely to return enough traffic to justify the resource investment.