The ABC of HTTP (Status Codes)

Sam Marsden

On 24th November 2015 • 7 min read

 

What is an HTTP status code?

A status code is a three-digit number returned by a web server at the start of its HTTP response, after it receives a request for a file, such as a web page.

If the server was able to process the request for the web page, the server will usually return a 200 HTTP status, along with the requested web page. This tells the web browser, or crawler, that the request was completed successfully.

If the requested page isn’t available, the server will usually return a 404 status, along with an error page. A human visitor can read the error page, while a search engine uses the 404 status to understand that the request was not completed.

The web server needs to be configured to return the correct status code with every web page it serves.
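As a rough illustration of a server returning the right status with each page, here is a minimal sketch using Python's built-in http.server module. The PAGES dictionary, the resolve function and the port number are hypothetical names invented for this example, not part of any real site configuration.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical site content: request path -> HTML body.
PAGES = {"/": "<h1>Home</h1>", "/about": "<h1>About</h1>"}


def resolve(path):
    """Return (status_code, body) for a requested path."""
    if path in PAGES:
        return 200, PAGES[path]
    # Unknown page: send an error page AND a 404 status, not a 200.
    return 404, "<h1>Page not found</h1>"


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        status, body = resolve(self.path)
        payload = body.encode("utf-8")
        self.send_response(status)  # writes the status line, e.g. "404 Not Found"
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def serve(port=8000):
    """Start the demo server on localhost (blocks until interrupted)."""
    HTTPServer(("127.0.0.1", port), Handler).serve_forever()
```

Calling serve() and requesting a path that is not in PAGES should show the error page in the browser, with a 404 status visible to any crawler.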

The official list of status codes is published on the W3C site.

Wikipedia provides a more comprehensive list of known status codes.

 

 

 

What are the important status codes for SEO?

 

200 OK

A page with a 200 status was successfully returned, and can be indexed.

This is the status code you should expect to see for every important page you’re expecting to be indexed and to drive traffic.

 

2xx (not 200)

Pages which return any 2xx status other than 200 will not be indexed.

 

 

301 Permanent redirect

A 301 response code is returned if the page has been moved to a new URL.

The ‘redirected to’ URL is also included in the header.

Although a page body is often returned with the response, it is not normally displayed to users or indexed by Google.

The client which made the request will usually make a subsequent request for the redirected URL.

If the final URL after all the redirects returns a 200, it will be indexed and the majority of PageRank will go to the target.

It’s possible to chain redirects together. Google will only follow up to 5 redirects in a single crawl. However, it will continue following the remaining redirects on subsequent crawls.
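To make the hop limit concrete, here is a small sketch of following a redirect chain with a cap on the number of redirects. The follow_redirects function and the fetch callback are hypothetical helpers for illustration; the default limit of 5 comes from the figure quoted above, and this is not a description of how Google's crawler is actually implemented.

```python
# Redirect status codes discussed in this article.
REDIRECT_CODES = {301, 302, 303, 307}


def follow_redirects(url, fetch, max_hops=5):
    """Follow redirects starting at url, stopping after max_hops redirects.

    fetch(url) is a caller-supplied function returning
    (status_code, location_or_None) for a single request.
    Returns (last_url, last_status, hops_followed).
    """
    hops = 0
    status, location = fetch(url)
    while status in REDIRECT_CODES and location and hops < max_hops:
        url = location
        hops += 1
        status, location = fetch(url)
    return url, status, hops
```

If the returned status is still a redirect, the chain was longer than the limit and the rest of it would only be discovered on a later crawl.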

301 redirects do not pass the full authority. However, if you redirect a full domain, e.g. www to non-www, or http to https, the full authority will be passed over.

Large numbers of 301 redirects, and expired or low-value URLs which return a 404, won’t directly cause any general ranking problems for the site (assuming you’re doing them properly and not losing important pages). But try to avoid chains of multiple redirects.

If you want Google to see your redirected URLs, such as after a URL change, it’s OK to submit the old URLs in a Sitemap to help Google re-crawl them more quickly.

 

302/303/307 Temporary redirect

These will be followed by Google and PageRank will flow, but the redirecting URL will stay indexed because the status code indicates this is temporary.

If a single step in the redirect chain is temporary, the entire redirect chain will be treated as temporary.
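The rule above can be sketched in a couple of lines. The chain_is_temporary function is a hypothetical helper that takes the list of status codes seen at each step of a redirect chain.

```python
# Temporary redirect codes covered in this section.
TEMPORARY_CODES = {302, 303, 307}


def chain_is_temporary(statuses):
    """Return True if any redirect step in the chain is temporary."""
    return any(code in TEMPORARY_CODES for code in statuses)
```

For example, a chain of 301, 302, 301 counts as temporary, while a chain made up only of 301s does not.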

After a long time, a 302 may be interpreted as a 301.

 

304 Not modified

This response indicates the page content hasn’t changed since the last crawl.

The content body is not usually returned, and will not be seen by Google.
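A 304 is normally the answer to a conditional request, where the client sends an If-Modified-Since header containing the date of its last visit. The sketch below shows one way a server could decide between 200 and 304; the conditional_response function and its parameters are hypothetical names for this example, and a real server would usually handle this for you.

```python
from datetime import timezone
from email.utils import parsedate_to_datetime


def conditional_response(last_modified, if_modified_since, body):
    """Return (status, body) for a GET with an optional If-Modified-Since header.

    last_modified is a timezone-aware datetime for the page's last change.
    if_modified_since is the raw header value, or None if it was not sent.
    """
    if if_modified_since:
        try:
            since = parsedate_to_datetime(if_modified_since)
        except (TypeError, ValueError):
            since = None  # unparseable header: treat as if it was not sent
        if since is not None:
            if since.tzinfo is None:
                since = since.replace(tzinfo=timezone.utc)
            # HTTP dates have one-second resolution, so drop microseconds.
            if last_modified.replace(microsecond=0) <= since:
                return 304, b""  # unchanged: no body is sent
    return 200, body
```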

 

 

 

401/403 Not authorised

These status codes are returned if the requested page requires authentication or permission that wasn’t provided.

These pages will not be indexed.

 

404 Not Found

These pages will be removed from Google’s index after multiple crawls.

Large numbers of 4xx won’t cause penalties or SEO issues, but linked 404s affect usability. 404s in search results could affect rankings if the user experience drops.

You can submit an XML Sitemap with 404 pages to help get them removed from the index more quickly. It’s best to put them into a separate sitemap so you can see them separately to other indexable URLs.
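As a sketch of what such a separate sitemap could look like, the following builds a minimal XML Sitemap from a list of URLs using Python's standard library. The build_sitemap function and the example.com URL are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a minimal XML Sitemap (as a string) listing the given URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical list of removed pages, kept apart from the indexable URLs.
removed_pages = ["https://example.com/discontinued-product"]
removed_sitemap = build_sitemap(removed_pages)
```

The resulting string can be saved to its own file, for example a sitemap used only for removed URLs, and submitted alongside the main sitemap.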

 

410 Permanently deleted

Pages which return this status code will be removed from Google’s index after the first crawl, so it is a better choice than a 404 if you know a page has permanently expired.

 

 

 

500 Server error

These will be removed from the index after one or more crawls.

Large numbers of 5xx may cause the crawl rate to drop temporarily.

Any URL which returns a 5xx status may be dropped from the index until it has been crawled with a 200 status.

 

503 Temporary Server Error

As with a short-lived 500 error, these will not be removed from the index immediately, but they will be if the problem persists.

 

Other Status Codes

Google will simply ignore any pages which return an unrecognised HTTP status code, including all 2xx which are not 200.
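The behaviour described in this section can be summarised as a simple lookup, which may be handy when reviewing crawl data. This is only a sketch of the claims made in this article, not an official Google specification; the indexing_outcome function and its wording are invented for the example, and any code not covered above falls back to "ignored".

```python
# Outcomes as summarised in this article (not an official specification).
OUTCOMES = {
    200: "indexable",
    301: "redirect followed; final URL indexed if it returns 200",
    302: "redirect followed; redirecting URL stays indexed",
    303: "redirect followed; redirecting URL stays indexed",
    307: "redirect followed; redirecting URL stays indexed",
    304: "not modified; body not seen",
    401: "not indexed",
    403: "not indexed",
    404: "removed from index after multiple crawls",
    410: "removed from index after first crawl",
    503: "kept for now; removed if the problem persists",
}


def indexing_outcome(status):
    """Return this article's summary of how a status code affects indexing."""
    if status in OUTCOMES:
        return OUTCOMES[status]
    if 500 <= status <= 599:
        return "may be dropped until crawled with a 200"
    # Anything else, including 2xx codes other than 200.
    return "ignored"
```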

 

 

How do you see the status code?

The HTTP status code is displayed in a wide variety of places.

Search Console shows the status codes for crawl errors.

The Chrome developer console shows the status of every file in the Network view.

Web Sniffer
http://web-sniffer.net/

Fetch as DeepCrawl
https://tools.deepcrawl.co.uk/fetch-as-deepcrawl/

Use Fetch as Googlebot
It’s possible that a server might return a different status code depending on who made the request. The only way to know exactly what status codes Google sees for your own sites is to use Fetch as Googlebot in Search Console.

DeepCrawl
DeepCrawl shows the status code in every report, and on the page details view.

Author

Sam Marsden

Sam Marsden is DeepCrawl's SEO & Content Manager. Sam speaks regularly at marketing conferences, like SMX and BrightonSEO, and is a contributor to industry publications such as Search Engine Journal and State of Digital.

 
