Search engines aim to populate their indices with high quality results that satisfy user intent. As we’ve already established, there is no “correct” amount of content to include on a webpage for it to rank well in organic search. However, pages with a relatively low amount of content, known as thin pages, may not be indexed and can devalue the quality of your site as a whole. Let’s take a closer look at the problem of thin content and how you can combat it.
What is thin content?
Thin content can be defined as low quality pages that have little to no value for visitors.
Thin pages can’t simply be described as pages with a low volume of textual content because sometimes a small amount of content from a high authority site is all that is needed to satisfy user intent. Other forms of content like images and videos can also be engaging for visitors. However, instances of pages with thin content ranking well are the exception, not the rule.
What are search engines looking for when analysing content?
Search engines will typically look for some textual signals that indicate the subject of page content, such as H1s, titles, alt text, transcriptions, etc. Pages with relatively little content run the risk of being classed as doorway pages by search engines.
Back in 2011, Google released the Panda algorithm update, which aimed to surface high quality web pages in its index and reduce the presence of pages deemed to be of lower quality. Panda has been updated many times over the years and is now part of Google’s core ranking algorithm, and thin content is still seen as a signal of low quality.
The bottom line is that thin content should be avoided: search engines want to populate their index with high quality results, which in most cases means in-depth content that users will find useful.
Which types of pages typically include thin content?
Sites with a lot of thin content may receive a message in the Manual Actions page of Google Search Console, stating that low quality content has been detected. Google lists a number of different types of pages which typically include thin content:
- Automatically generated content – Text generated programmatically, for example by scraping RSS feeds, auto-translating text or producing nonsense text stuffed with keywords.
- Thin affiliate pages – Low value and duplicate pages referring someone to a product or service on another site.
- Thin syndication – Scraped content or low-quality guest blog posts.
- Doorway pages – Sets of very similar pages, each changed slightly to rank for a different phrase, e.g. a different location.
Pages like the ones listed above are considered webspam and go against Google’s webmaster guidelines, meaning that they may receive a manual action that removes part or all of a site from the index.
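As a rough illustration of how near-identical pages of the doorway kind might be spotted on your own site, the sketch below compares two blocks of text using word shingles and Jaccard similarity. This is not how Google detects doorway pages; the function names, the shingle size of three words and any threshold you apply to the score are assumptions made for this example.

```python
import re


def shingles(text, size=3):
    """Return the set of overlapping word sequences ("shingles") in text."""
    words = re.findall(r"\w+", text.lower())
    if len(words) < size:
        # Very short texts become a single shingle (or none if empty).
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard(text_a, text_b):
    """Share of shingles the two texts have in common, from 0.0 to 1.0."""
    set_a, set_b = shingles(text_a), shingles(text_b)
    if not set_a and not set_b:
        return 0.0
    return len(set_a & set_b) / len(set_a | set_b)
```

Pairs of pages that score close to 1.0 differ by only a few words and are worth reviewing by hand. The cut-off is a judgement call, and short texts score lower than long ones for the same number of changed words.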
How to detect pages with thin content
If Google doesn’t perceive that your site is using thin content in a way that is intended to deceptively influence organic rankings, you probably don’t need to worry about receiving a manual action. However, thin content is still an issue that you need to be aware of and keep on top of.
Thin content can be identified in a number of ways:
- Running a crawl – DeepCrawl analyses the content on a page and highlights thin pages which fall below a specified word count, empty pages with no content and duplicate pages with non-unique content.
- Finding low traffic pages – Using your web analytics platform to find pages with little traffic might help surface pages with thin content.
- Identifying high bounce rate pages – Looking at pages with a high bounce rate and poor engagement metrics may indicate that a page is not satisfying user intent because of thin content.
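To make the crawl-based checks more concrete, here is a minimal sketch that labels pages as empty, duplicate or thin from their HTML. It is not DeepCrawl's implementation. The names `visible_words` and `classify_pages` are hypothetical, and the 300-word threshold is an arbitrary assumption, since there is no "correct" amount of content.

```python
import hashlib
import re
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects visible text, skipping the bodies of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def visible_words(html):
    """Return the lower-cased words a visitor would see on the page."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return re.findall(r"\w+", " ".join(parser.chunks).lower())


def classify_pages(pages, min_words=300):
    """Label each page; `pages` maps URL -> HTML. min_words is an assumption."""
    first_seen = {}  # content hash -> first URL with that content
    labels = {}
    for url, html in pages.items():
        words = visible_words(html)
        if not words:
            labels[url] = "empty"
            continue
        digest = hashlib.sha256(" ".join(words).encode("utf-8")).hexdigest()
        if digest in first_seen:
            labels[url] = "duplicate of " + first_seen[digest]
        elif len(words) < min_words:
            labels[url] = "thin"
        else:
            labels[url] = "ok"
        first_seen.setdefault(digest, url)
    return labels
```

The hash only catches exact duplicates after lower-casing and whitespace normalisation. A word count is also only a starting point: a short page can still satisfy user intent, so flagged pages should be reviewed alongside traffic and engagement data.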
How to improve thin content pages
It should be clear by now that thin content is an issue you want to avoid, so let’s look at some of the steps you can take to improve it:
- Define problem areas – Start by detecting which parts of your site are suffering in organic search due to thin content.
- Prioritise the biggest wins – It might not be possible to improve the quality of all of a site’s thin pages, so prioritising resources for the pages that are likely to see the biggest improvements is a sound approach.
- Focus on solving user problems – Don’t write content to hit word quotas. Instead, think about how your web pages can solve user problems in a manner which engages the target audience.
- Improve E-A-T – Examine ways you can improve content quality according to Google’s Search Quality Raters’ Guidelines which centre around expertise, authoritativeness and trustworthiness.
- Add textual signals – Ensure that every page on the site is populated with textual signals that indicate the subject of its content: H1s, titles, alt text, transcriptions, etc.
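The last step can be checked programmatically. The sketch below is one possible way to flag pages that lack a title, an H1 or image alt text, using only Python's standard library. The `SignalAuditor` class and `missing_signals` function are hypothetical names for this example, and transcriptions are not covered.

```python
from html.parser import HTMLParser


class SignalAuditor(HTMLParser):
    """Records title text, H1 text and images lacking alt text."""

    def __init__(self):
        super().__init__()
        self._open = None  # "title" or "h1" while inside that element
        self.text = {"title": "", "h1": ""}
        self.images_missing_alt = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.text:
            self._open = tag
        elif tag == "img":
            # Note: an empty alt can be valid for purely decorative images,
            # so treat this count as a prompt for review, not an error.
            alt = dict(attrs).get("alt")
            if not (alt or "").strip():
                self.images_missing_alt += 1

    def handle_endtag(self, tag):
        if tag == self._open:
            self._open = None

    def handle_data(self, data):
        if self._open:
            self.text[self._open] += data


def missing_signals(html):
    """Return a list of human-readable problems found in the page."""
    auditor = SignalAuditor()
    auditor.feed(html)
    auditor.close()
    problems = []
    if not auditor.text["title"].strip():
        problems.append("missing title")
    if not auditor.text["h1"].strip():
        problems.append("missing h1")
    if auditor.images_missing_alt:
        problems.append(f"{auditor.images_missing_alt} image(s) without alt text")
    return problems
```

Running `missing_signals` over each crawled page gives a quick list of where signals are absent. It says nothing about whether the signals that are present describe the page well, which still needs a human check.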