An SEO’s work is never done… just when you thought you were finished with Penguin audits, backlink audits and endless rounds of testing, it’s time to catch up with Panda and make sure no thin or poor-quality content has appeared on your sites.
In October 2015, Google’s Gary Illyes confirmed that Panda 4.2 is still rolling out after two and a half months, and is expected to continue rolling out slowly for some time yet.
Why pander to the Panda?
In essence, thin content (or ‘shallow’ content as it’s sometimes called) is lacking in substance and will discourage human engagement with your site.
From Google’s point of view, thin content could mean duplicate or similar content (internally or externally) or pages with a high proportion of navigation/image/dynamic elements and not enough copy.
Panda is also designed to crack down on sites with too many blank pages, ad-stuffing and technical glitches that hinder a user’s experience.
If you suspect that your sites, or your clients’ sites, have indulged in this type of content, you’ll need to find it and get it off Google’s radar sharpish. And, by sharpish, we mean right now. As Glenn Gabe mentioned in his excellent Search Engine Watch Panda audit post, just because you might have recovered recently doesn’t mean that you won’t get hit again in the next update if your site carries on as before.
But, in order to remove it, first you’ll need to find it. As it happens, we know just the tool for the job…
Five steps to Panda perfection with DeepCrawl
Here’s how to use DeepCrawl to optimize any Panda audit:
1. Run a Universal Crawl for the full site
A Universal Crawl will crawl the site, XML Sitemaps and organic landing pages in a single crawl, and import Google Analytics data, to find every URL and identify gaps in the site architecture.
Make sure that you have Google Analytics integrated for additional engagement data to measure the quality of your pages.
2. Find low-quality sections of the site using Site Explorer
Open the Site Explorer report and set the dropdown to Analytics mode.
The Average Bounce Rate, Time on Site and Page Views per Visit metrics are a great way to identify any low-quality sections of your site.
Any sections which don’t drive any organic visits aren’t adding any SEO value, so consider removing them from Google’s index altogether by noindexing or canonicalizing where appropriate.
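If you prefer to triage the numbers outside the tool, the same check can be sketched in a few lines of Python. This is only an illustration: the SectionStats fields, the flag_low_quality helper and every threshold below are assumptions made for the example, not DeepCrawl or Google Analytics names, and you would need to map them to whatever columns your own export contains.

```python
from dataclasses import dataclass


@dataclass
class SectionStats:
    """One row of a hypothetical per-section analytics export."""
    path: str
    organic_visits: int
    bounce_rate: float        # fraction between 0 and 1
    avg_time_on_site: float   # seconds
    pages_per_visit: float


def flag_low_quality(sections, max_bounce=0.80, min_time=15.0, min_pages=1.2):
    """Return {path: [reasons]} for sections that look low quality.

    The thresholds are illustrative assumptions, not recommended values.
    """
    flagged = {}
    for s in sections:
        reasons = []
        if s.organic_visits == 0:
            # Engagement metrics mean nothing without visits, so skip them.
            reasons.append("no organic visits")
        else:
            if s.bounce_rate > max_bounce:
                reasons.append("high bounce rate")
            if s.avg_time_on_site < min_time:
                reasons.append("low time on site")
            if s.pages_per_visit < min_pages:
                reasons.append("low page views per visit")
        if reasons:
            flagged[s.path] = reasons
    return flagged


if __name__ == "__main__":
    # Made-up sample rows, purely to show the shape of the output.
    sample = [
        SectionStats("/blog/", 1200, 0.45, 95.0, 2.4),
        SectionStats("/tags/", 0, 0.0, 0.0, 0.0),
        SectionStats("/print/", 40, 0.92, 6.0, 1.0),
    ]
    for path, reasons in flag_low_quality(sample).items():
        print(path, "->", ", ".join(reasons))
```

Treat the flagged list as a starting point for a manual review rather than a verdict: a section with a high bounce rate may still be answering the user’s question perfectly well.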
3. Find thin pages using Site Speed mode
The Content Size option will show you all content that is regarded as ‘thin’ and that could cause a Panda penalty.
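For a quick spot check of a single page, a rough stand-in for this idea is to count the words of visible copy once navigation and scripts are stripped out. The sketch below uses only the Python standard library; it is not how DeepCrawl calculates Content Size, and the 200-word threshold and the list of skipped tags are assumptions chosen for the example.

```python
from html.parser import HTMLParser


class CopyWordCounter(HTMLParser):
    """Counts words of visible copy, ignoring navigation and script blocks."""

    # Tags whose contents are treated as non-copy (an assumption).
    SKIP = {"script", "style", "nav", "header", "footer"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.words = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Only count text that sits outside the skipped elements.
        if self._skip_depth == 0:
            self.words += len(data.split())


def word_count(html):
    parser = CopyWordCounter()
    parser.feed(html)
    parser.close()
    return parser.words


def is_thin(html, min_words=200):
    """True when the page has fewer words of copy than min_words."""
    return word_count(html) < min_words


if __name__ == "__main__":
    page = (
        "<html><nav>Home About Contact</nav>"
        "<p>Only five words of copy</p>"
        "<script>var x = 1;</script></html>"
    )
    print(word_count(page), is_thin(page))
```

Word count alone says nothing about duplication or usefulness, so a page that passes this check can still be thin in Panda’s eyes.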
4. Find non-indexable pages with the Architecture mode
This will show you the sections that are already non-indexable and won’t be causing you any issues.
5. Make thin content non-indexable on your site
Remove thin or low-quality content from Google’s index to prevent search engine users from being able to land on it from a search result.
Add a noindex directive and/or canonicalize where appropriate (if you’re unsure which option to choose, see our guide to noindex, disallow and nofollow).
You can then run another crawl to check that your changes have had the effect you intended.
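Between crawls, you can also spot-check an individual page’s HTML for the two signals discussed above: a robots meta tag containing noindex, and a canonical link pointing at a different URL. The following is a minimal sketch using Python’s standard library; check_page and the example URLs are hypothetical, and it deliberately ignores the X-Robots-Tag HTTP header, which can also carry a noindex directive.

```python
from html.parser import HTMLParser


class IndexabilityParser(HTMLParser):
    """Collects the robots meta directive and canonical link from a page."""

    def __init__(self):
        super().__init__()
        self.noindex = False
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        a = {k.lower(): (v or "") for k, v in attrs}
        if tag == "meta" and a.get("name", "").lower() == "robots":
            directives = {d.strip().lower() for d in a.get("content", "").split(",")}
            # "none" is shorthand for "noindex, nofollow".
            if "noindex" in directives or "none" in directives:
                self.noindex = True
        elif tag == "link" and "canonical" in a.get("rel", "").lower().split():
            self.canonical = a.get("href") or None


def check_page(url, html):
    """Summarise whether the HTML asks search engines to skip this URL."""
    parser = IndexabilityParser()
    parser.feed(html)
    parser.close()
    return {
        "noindex": parser.noindex,
        "canonical": parser.canonical,
        # True when the page points at a different URL as its canonical.
        "canonicalized_elsewhere": parser.canonical is not None
        and parser.canonical != url,
    }


if __name__ == "__main__":
    html = (
        '<head><meta name="robots" content="noindex, follow">'
        '<link rel="canonical" href="https://example.com/shoes/"></head>'
    )
    print(check_page("https://example.com/shoes/?sort=price", html))
```

The URL comparison here is a plain string match, so differences such as trailing slashes or letter case will count as a different canonical; normalize the URLs first if that matters for your site.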
Ongoing checks: useful DeepCrawl reports
Use these reports to help identify areas of improvement for user experience:
Fix broken links with the Validation > Internal Broken Links and Validation > External Broken Links reports.
Fix slow pages with the Validation > Max Load Time report.