What is site architecture?
Site architecture is the foundation of the technical health of any website. Site performance optimisation is undoubtedly important. However, to reap the rewards of mobile usability enhancements, site speed improvements or better international targeting, you need to make sure your website is structured and organised correctly.
The concept of site architecture is made up of two key elements:
- Categorisation: The types of pages on your site.
- Linking: How the pages on your site are connected.
Site architecture is a combination of the different pages on a website, and the way they are linked to one another.
Analysing site architecture is vital for an SEO, because it fundamentally affects how search engines are able to crawl your website, and how users are able to navigate it in order to convert.
Your website lives or dies based on how well architected, designed and integrated it is. Every piece of content you publish, every article you write, every product you sell and every campaign you run is constrained, or set free, by your architecture.
Your primary navigation needs to reflect your users’ need groups. Your taxonomies need to intuitively connect your content. Your internal linking needs to allow users and search engines to explore contextually. Your lateral navigation needs to expose secondary routes without overwhelming users with choice. And all of this needs to change, evolve and adapt as your site continues to grow and change.
Design and manage a well-structured site and make a million tiny tweaks to your website’s content, structure and linking, every day, until you die. That’s how you win.
To start you off with some key areas to be thinking about when exploring site architecture, Bastian Grimm of Peak Ace AG has put together a list of the most common site architecture errors that he has found when auditing websites:
One major topic is always cannibalisation, in its multiple facets.
- Inconsistency in internal linking, with signals referring very differently within one domain. Examples include inconsistent use of naming regarding internal anchor texts (e.g. the same anchor text is used for multiple URLs, or a single URL has a different/unspecific anchor text). As anchor texts transmit relevancy, we often see improvements after implementing a consistent internal linking structure.
- When multiple URLs are targeting the same keywords. Besides the general duplicate content problem, what often happens is that none of the pages generate any top positions. Therefore, setting up a proper targeting/indexing strategy and monitoring it over time is essential.
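The anchor text inconsistencies described above can be spotted from a list of internal links. The following is a minimal sketch, assuming you have already exported links as (source, anchor text, target) tuples from a crawler; the function name `anchor_conflicts` and the sample URLs are hypothetical.

```python
from collections import defaultdict


def anchor_conflicts(links):
    """Find inconsistent internal anchor text.

    links: iterable of (source_url, anchor_text, target_url) tuples,
    assumed to come from a crawl export (hypothetical format).
    """
    targets_by_anchor = defaultdict(set)
    anchors_by_target = defaultdict(set)
    for _source, anchor, target in links:
        key = anchor.strip().lower()  # compare anchors case-insensitively
        targets_by_anchor[key].add(target)
        anchors_by_target[target].add(key)

    # The same anchor text pointing at more than one URL.
    shared_anchors = {
        anchor: sorted(targets)
        for anchor, targets in targets_by_anchor.items()
        if len(targets) > 1
    }
    # A single URL receiving more than one distinct anchor text.
    mixed_targets = {
        target: sorted(anchors)
        for target, anchors in anchors_by_target.items()
        if len(anchors) > 1
    }
    return shared_anchors, mixed_targets


# Illustrative data only.
links = [
    ("/", "Running shoes", "/shoes/running/"),
    ("/blog/post-1", "running shoes", "/sale/running-shoes/"),
    ("/blog/post-2", "click here", "/shoes/running/"),
]
shared, mixed = anchor_conflicts(links)
print(shared)
print(mixed)
```

Not every entry in these reports is a problem, since some variation in anchor text is natural. The output is best treated as a review list rather than a list of errors.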
Efficiency is also very important – you need to eliminate waste as best as possible and optimise your domain in such a way that Google can process it quickly and efficiently. Wasting Google’s resources is like throwing money out of the window (for them) – a situation Google doesn’t like at all.
- Multiple links from the same source URL pointing to one and the same destination. From an SEO perspective this isn’t necessary, and no additional link equity is passed on.
- Badly maintained sitemaps. Sitemaps that endeavour to serve everything are a waste of resources and should be rectified. They should only serve HTTP 200 indexable, non-canonicalised URLs. To ensure that your sitemaps are clean and are therefore encouraging Google to discover new pages, crawl them before uploading. If they are okay, submit them using GSC. The new GSC provides much better data with regards to correctly implemented sitemaps.
- The widespread lack of maintenance that still exists across all industries is shocking. Even now we still see 302s being used instead of 301s for permanent URL redirects, internal links that point to redirecting URLs, and even links to broken pages.
- Poor management of sorting and filtering URLs. Based on log files, we often see that Google runs into these generated, duplicate URLs for no reason. Letting Google crawl all your sorting and filtering URLs is simply a massive waste of resources. There are common and (still) functioning workarounds to prevent these cases (e.g. Post-Redirect-Get), but using the nofollow attribute is not a solution here.
- It is vital to be as fast as possible for both Google and the user. As your crawl budget is more or less based on computing time and Google wants to process your domain quickly, fast-loading websites have a clear advantage.
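The sitemap rule in the list above (only HTTP 200, indexable, non-canonicalised URLs) can be expressed as a simple filter over crawl data. This is a sketch under stated assumptions: the page records are hypothetical dictionaries with `url`, `status`, `noindex` and `canonical` keys, which is not the export format of any particular crawler.

```python
from xml.etree import ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def sitemap_candidates(pages):
    """Keep only URLs that return 200, are indexable and are self-canonical."""
    keep = []
    for page in pages:
        if page["status"] != 200:
            continue  # redirects and errors do not belong in a sitemap
        if page.get("noindex"):
            continue  # page asks not to be indexed
        canonical = page.get("canonical")
        if canonical and canonical != page["url"]:
            continue  # canonicalised to a different URL
        keep.append(page["url"])
    return keep


def build_sitemap(urls):
    """Serialise the URLs as a minimal XML sitemap string."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        ET.SubElement(ET.SubElement(urlset, "url"), "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


# Illustrative crawl records only.
pages = [
    {"url": "https://example.com/", "status": 200},
    {"url": "https://example.com/old", "status": 301},
    {"url": "https://example.com/thanks", "status": 200, "noindex": True},
    {"url": "https://example.com/shoes?sort=price", "status": 200,
     "canonical": "https://example.com/shoes"},
    {"url": "https://example.com/shoes", "status": 200,
     "canonical": "https://example.com/shoes"},
]
clean_urls = sitemap_candidates(pages)
print(build_sitemap(clean_urls))
```

Running a check like this before submitting a sitemap is one way to catch redirected, noindexed or canonicalised URLs early.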
Depending on the number of URLs a domain has, correctly setting up crawling directives is essential.
- As Google favours well-structured information architecture, you need to understand the situation or problem you want to solve with your pages and then make an informed decision. However, directives like robots.txt and the robots meta noindex tag are often misused, or combined in ways that cannot work properly (e.g. you should never mix these two, because the crawler can’t process the noindex tag within the <head> if the page is blocked by robots.txt).
- As we know, the canonical tag is more of a hint than a directive and it is often the case that Google ignores it. Therefore, over-reliance on canonical tags can lead to messing up the index. Rectifying such situations can take months and a lot of re-crawling by Google. Therefore, check and monitor your canonical tags frequently to see if they are functioning as desired.
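The robots.txt and noindex conflict described above can be checked mechanically. The sketch below uses Python's standard `urllib.robotparser` and assumes a hypothetical list of page records with a `noindex` flag. Note that this parser uses simple prefix matching, which may differ from how Googlebot interprets wildcard rules, so treat the result as a first pass.

```python
from urllib.robotparser import RobotFileParser


def blocked_noindex_pages(robots_txt, pages, user_agent="Googlebot"):
    """Return URLs that carry noindex but are disallowed in robots.txt.

    A crawler that respects the disallow rule never fetches these pages,
    so it cannot see the noindex tag they contain.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [
        page["url"]
        for page in pages
        if page.get("noindex") and not parser.can_fetch(user_agent, page["url"])
    ]


# Illustrative robots.txt and crawl records only.
robots_txt = """User-agent: *
Disallow: /filter/
"""
pages = [
    {"url": "https://example.com/filter/red", "noindex": True},
    {"url": "https://example.com/thanks", "noindex": True},
    {"url": "https://example.com/filter/blue", "noindex": False},
]
conflicts = blocked_noindex_pages(robots_txt, pages)
print(conflicts)
```

Any URL in the result needs a decision: either allow crawling so the noindex can be read, or rely on the robots.txt block alone.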
The last big field that is always among the top topics for site architecture is prioritisation.
- Surprisingly, we often see a lack of planning (or even no planning at all) that ultimately leads to “chaos” (e.g. the most relevant pages being only partially linked from the homepage). Special attention should be paid to internal prioritisation and the accessibility of important pages. A method of consistently monitoring your setup is also vital. If you are planning to change important navigational elements or you are just restructuring your template, you should always check the condition of your site both before and after the changes go live on a staging system.
- Another common example of incorrect prioritisation is when product pages receive the same number of links as top category pages. As category pages are often much more valuable, you should analyse and monitor the proportion of links referring to these different URL types. If needed, you should think about changing your internal linking in favour of category pages.
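The proportion of links going to category versus product pages, as mentioned above, can be monitored with a small script. This sketch assumes internal links are available as (source, target) pairs and that page types can be inferred from hypothetical `/category/` and `/product/` URL prefixes; real sites will need their own classification rule.

```python
from collections import Counter


def classify(url):
    """Hypothetical URL-prefix rule for page types; adapt to your own site."""
    if url.startswith("/category/"):
        return "category"
    if url.startswith("/product/"):
        return "product"
    return "other"


def average_inlinks_by_type(links, classify_url):
    """Average number of internal inlinks per linked page, grouped by type."""
    inlinks = Counter(target for _source, target in links)
    totals, page_counts = Counter(), Counter()
    for url, count in inlinks.items():
        kind = classify_url(url)
        totals[kind] += count
        page_counts[kind] += 1
    return {kind: totals[kind] / page_counts[kind] for kind in totals}


# Illustrative link list only.
links = [
    ("/", "/category/shoes/"),
    ("/blog/a", "/category/shoes/"),
    ("/", "/category/bags/"),
    ("/category/shoes/", "/product/1"),
    ("/blog/a", "/product/1"),
    ("/category/shoes/", "/product/2"),
]
print(average_inlinks_by_type(links, classify))
```

In this sample, category and product pages receive the same average number of inlinks, which is the pattern the bullet above warns about. Pages with no inlinks at all do not appear in the link list, so they are not counted here.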
Let’s take a closer look into some of the key considerations for search engines and users with regards to site architecture.
How site architecture impacts search engines
Search engines will do their best to crawl any websites they find without any guidance. However, we can make the search engines’ jobs much easier if we can organise our websites in an understandable way for their crawlers. Without the clear structuring of content, your most important pages could be completely overlooked by Google; is that a risk you want to take?
Site architecture is crucial for SEO because it impacts search engine crawlers in terms of crawling and indexing. Not only does site structure affect a search engine’s ability to navigate a website and find pages to add to its index, but it also helps in demonstrating the importance and topical relevance of different pages, which is key for ranking.
It’s impossible to overstate the importance of a good site architecture for SEO. Search engines take a lot of signals from the way a site is structured and how information is categorised. With a good site structure, you can send strong semantic signals to search engines and also help your users find the right content quickly as they navigate through your site. It’s especially important to look at aspects like click depth, content hierarchy, and proper labelling of links and sections. Following best practices for information architecture tends to result in an SEO-friendly website that often has an edge over competitors.
Internal linking is particularly important because it determines how deep a page is within a site’s architecture, which directly impacts Googlebot’s crawl rate.
Depth of content affects crawl rates.
-John Mueller, Google Webmaster Hangout
The way a site is internally linked also affects which pages get crawled and how frequently. If an important page is only linked to a couple of times, Google will crawl it less.
Internal linking affects crawl frequency.
-John Mueller, Google Webmaster Hangout
The structure and internal linking of a website will decide how often and how efficiently Google will be able to crawl it.
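Click depth, the number of clicks needed to reach a page from the homepage, can be measured with a breadth-first search over the internal link graph. This is a minimal sketch, assuming the graph is a dictionary mapping each URL to the URLs it links to; the function name `click_depths` and the sample URLs are hypothetical.

```python
from collections import deque


def click_depths(link_graph, home="/"):
    """Return the minimum number of clicks from the homepage to each page.

    link_graph: dict mapping a URL to the list of URLs it links to.
    Pages that cannot be reached from the homepage are left out.
    """
    depths = {home: 0}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in link_graph.get(page, []):
            if target not in depths:  # first visit is the shortest path
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


# Illustrative link graph only.
link_graph = {
    "/": ["/category/", "/about/"],
    "/category/": ["/category/page-2/", "/product-a/"],
    "/category/page-2/": ["/product-b/"],
    "/orphan/": ["/"],
}
depths = click_depths(link_graph)
print(depths)

# Pages that link out but are never reached from the homepage.
unreachable = sorted(set(link_graph) - set(depths))
print(unreachable)
```

Here `/product-b/` sits three clicks deep because it is only linked from a paginated category page, and `/orphan/` is never reached from the homepage at all.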
It’s important to consolidate the pages on your website and monitor the links between them to keep search engines happy and able to continue crawling with ease. This means keeping on top of ‘dead-end’ pages.
Crawlability is a major function of site architecture. Broken links hurt Google’s ability to index your website and recommend its content.
To keep Google happy (and provide a great user experience), periodically crawl your website for errors. Also, see issues exactly as Google does with the Index Coverage report in Google Search Console.
If you find any broken links or determine that you need to delete outdated content, create a redirect for each issue. Redirects can help Google understand things like whether you’ve just moved content to a different URL or deleted it completely.
If you use WordPress, there are plenty of plugins (like Yoast SEO Premium) that can help you do this without having to go into your website’s backend.
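Whatever tool you use to create the redirects, the first step is finding the links that need attention. The sketch below assumes you have a link graph and a status code for each crawled URL, both in hypothetical dictionary form. It reports internal links to broken pages, internal links that pass through a redirect, and dead-end pages with no outgoing links.

```python
REDIRECT_CODES = {301, 302, 307, 308}


def audit_links(link_graph, status):
    """Classify internal links and pages from crawl data.

    link_graph: dict mapping a URL to the list of URLs it links to.
    status: dict mapping a URL to the HTTP status code it returned.
    """
    broken, redirected = [], []
    for source, targets in link_graph.items():
        for target in targets:
            code = status.get(target)
            if code in REDIRECT_CODES:
                redirected.append((source, target, code))
            elif code is None or code >= 400:
                broken.append((source, target, code))
    # Dead ends: live pages that do not link anywhere else.
    dead_ends = sorted(
        url for url, code in status.items()
        if code == 200 and not link_graph.get(url)
    )
    return broken, redirected, dead_ends


# Illustrative crawl data only.
link_graph = {"/": ["/a", "/old"], "/a": ["/gone"], "/b": []}
status = {"/": 200, "/a": 200, "/old": 301, "/gone": 404, "/b": 200}

broken, redirected, dead_ends = audit_links(link_graph, status)
print(broken)
print(redirected)
print(dead_ends)
```

Links in the redirected list are usually best updated to point straight at the final URL, while broken targets need either a redirect or removal of the link.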
We’ll go into more detail and share some best practice advice on utilising internal linking to help improve site architecture later on in this guide.
How site architecture impacts users
The quality of your site architecture doesn’t just impact how search engines can crawl and index your content; it also affects users. This is arguably a more pressing issue as user experience becomes more and more closely tied to rankings, especially for Google.
A site’s structure will determine a user’s journey, which pages they are more likely to land on and are able to navigate to, and whether or not they can complete their primary goal for visiting the website. This will impact user engagement, which will ultimately determine whether or not the user has a positive or negative experience with the brand. That’s a big deal. To keep users happy, you need to consistently match their expectations by mapping pages on a site to their intent.
Focus on structuring your site so that users are able to find what they’re looking for as quickly and as often as possible.
At the foundational level, your architecture should be guided by understanding users and the way that they think about your offerings. The more your categories and site taxonomy match up with your users’ mental maps, the more intuitive navigating the site will be. Some of those learnings can come through keyword research (helping to match the terminology that people are using). What’s more instructive is gathering people from your target demographics and watching them perform different tasks on your site. Your user experience has to be the foundation of the architecture you choose.
Just like the other aspects of SEO today, focusing on improving a website’s site architecture for humans first and foremost will lead to a winning search strategy.
The most common pitfall of information architecture is piling on content, images, and links ‘for SEO’. Your website is a window into how your organization runs. When a user feels that it is smart and sophisticated, they tend to stick around. When their next step doesn’t seem clear and intuitive, decision fatigue sets in and abandonment rates rise.
When investing in flashy features that don’t satisfy user intent, ask yourself, “Does this create momentum?” If the effort to create and maintain a resource doesn’t move a user toward the next macro or micro-conversion, be curious and question how the investment can better serve your business and audience.
The difference between URL structure and site architecture
There can often be confusion in the SEO industry about the difference between site structure and URL structure, and the two terms are sometimes used interchangeably. It’s important to note that there are distinct differences between the two.
Site structure refers to the entire architecture of a website and how all of its different pages are connected, whereas URL structure refers to the content of an individual URL string, as well as how it is formatted.
URL structure should be treated as supplementary to a site’s architecture and hierarchy. It is a useful signal for users, because the words in a URL can convey meaning and context to them while browsing a website. Site architecture, by contrast, focuses on internal linking and the findability of pages.
Click depth determines page importance more than URL structure.
-John Mueller, Google Webmaster Hangout
Google sees pages that are one click from the homepage as the most relevant pages on a site and gives them more weight in the search results. This is a much more important signal for search engines than the structure of a URL.
Site architecture and URL structure are different concepts. URL structure should simply be used as a signal for conveying context to users.