Ask the Expert: Kevin Indig Answers Your Internal Linking Questions
For our first DeepCrawl webinar of 2019, we were joined by Kevin Indig of German Accelerator, who sat down with Jon Myers to talk about the important topic of internal linking. During the webinar with Kevin, so many questions were submitted by our audience that there wasn’t time to answer all of them live.
Luckily, Kevin went away and wrote up his thoughts on the remaining questions, which we’ve included in this post. Read on to find out what he has to say.
Can you go deeper into the business case for centralized internal linking?
Sites with decentralized internal linking have multiple conversion touchpoints or different formats for signing up. Centralized sites have a single user flow and funnel which points to one key page. Apply the model of either focus or balance depending on how many and what types of conversion pages you have. Remember to focus on the pages that are most important to the business.
Which model would you promote, centralized or decentralized internal linking?
There is no better or worse option. Both models are different but you are bound to one of the two depending on your business model.
Take a look at our webinar recap which explains what centralized and decentralized linking strategies are in more detail.
Have you found differences in the performance of links based on their placement?
The ideas from Kevin’s presentation are based on optimization at scale, which means changes to the top navigation, secondary navigation and footer. In-body links move the needle too but these need to be optimized manually which is less efficient.
Can you explain what CheiRank is?
CheiRank is an inverse PageRank that refers to the outgoing link equity passed internally. PageRank is the link power a page receives, whereas CheiRank is the link power it gives away. You have to look at both to work out what the balance is.
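To make the relationship concrete, here is a minimal Python sketch that computes both scores for a tiny, made-up site. It assumes the common definition of CheiRank as PageRank run on the same graph with every link reversed. The URLs, the damping factor of 0.85 and the iteration count are illustrative assumptions, not values from the webinar.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank over a dict {page: [pages it links to]}."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Split this page's rank evenly across its outgoing links.
                share = damping * rank[page] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # Page without outgoing links: spread its rank over all pages.
                share = damping * rank[page] / n
                for p in pages:
                    new_rank[p] += share
        rank = new_rank
    return rank


def cheirank(links, **kwargs):
    """CheiRank: PageRank computed on the graph with every link reversed."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    reversed_links = {p: [] for p in pages}
    for source, targets in links.items():
        for t in targets:
            reversed_links[t].append(source)
    return pagerank(reversed_links, **kwargs)


# Hypothetical five-page site used purely for illustration.
site = {
    "/": ["/category", "/about"],
    "/category": ["/product-a", "/product-b", "/"],
    "/product-a": ["/"],
    "/product-b": ["/"],
    "/about": ["/"],
}

pr = pagerank(site)
cr = cheirank(site)
for page in sorted(site):
    print(f"{page:12} PageRank={pr[page]:.3f} CheiRank={cr[page]:.3f}")
```

Comparing the two columns page by page shows which URLs receive more than they give away and which do the opposite. On a real site the link graph would come from a crawl export rather than a hand-written dictionary.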
Does having a mega menu increase CheiRank?
If you have all of your categories included in a mega menu this increases outgoing links. However, this doesn’t mean it’s good or bad as it depends on how much PageRank you get to start with. This isn’t a problem for Amazon as they get so much PageRank that they can afford to have a lot of outgoing links and links in their navigation. Make sure you have the data around how much PageRank you have to start with.
Does CheiRank factor in links from the different navigations or not?
Yes, CheiRank should factor in all internal links.
How can you improve internal linking to increase how quickly new pages, such as ecommerce products, are crawled and indexed?
First, get the data to verify that crawling could be an issue for your site. Get a list of URLs that aren’t getting crawled and use log files to see how quickly new URLs are crawled by Google after going live. Then work out the correlation between infrequently crawled URLs and indexing.
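As a rough illustration of the log file step, the sketch below measures how long each new URL waited for its first Googlebot request. It assumes logs in the common Apache/Nginx "combined" format and a hypothetical `published` mapping of URL to go-live time. The sample lines are made up. It also matches on the user agent string only, which can be spoofed, so treat the result as an estimate.

```python
import re
from datetime import datetime, timezone

# Assumes the "combined" access log format; adjust for your server setup.
LOG_PATTERN = re.compile(
    r'\S+ \S+ \S+ \[(?P<time>[^\]]+)\] "(?:GET|HEAD) (?P<path>\S+) [^"]*" '
    r'\d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def first_googlebot_hits(log_lines):
    """Return {path: datetime of the earliest request with a Googlebot user agent}."""
    first_seen = {}
    for line in log_lines:
        match = LOG_PATTERN.match(line)
        if not match or "Googlebot" not in match.group("agent"):
            continue
        seen_at = datetime.strptime(match.group("time"), "%d/%b/%Y:%H:%M:%S %z")
        path = match.group("path")
        if path not in first_seen or seen_at < first_seen[path]:
            first_seen[path] = seen_at
    return first_seen


def crawl_delays(published, first_hits):
    """Return {path: time between going live and first crawl, or None if never crawled}."""
    return {
        path: (first_hits[path] - went_live) if path in first_hits else None
        for path, went_live in published.items()
    }


# Made-up sample data for illustration only.
GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
sample_log = [
    '203.0.113.9 - - [10/Jan/2019:07:00:00 +0000] "GET /products/new-shoe HTTP/1.1" 200 5120 "-" "Mozilla/5.0 Chrome/71.0"',
    f'66.249.66.1 - - [10/Jan/2019:08:15:00 +0000] "GET /products/new-shoe HTTP/1.1" 200 5120 "-" "{GOOGLEBOT_UA}"',
    f'66.249.66.1 - - [11/Jan/2019:09:00:00 +0000] "GET /products/new-shoe HTTP/1.1" 200 5120 "-" "{GOOGLEBOT_UA}"',
]
published = {
    "/products/new-shoe": datetime(2019, 1, 10, 6, 0, tzinfo=timezone.utc),
    "/products/new-hat": datetime(2019, 1, 10, 6, 0, tzinfo=timezone.utc),
}

delays = crawl_delays(published, first_googlebot_hits(sample_log))
for path, delay in delays.items():
    print(path, delay if delay is not None else "not crawled yet")
```

URLs that come back as `None`, or with long delays, are the candidates to compare against your indexing data.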
Try linking new products on the homepage or prominent category pages. Think about how to tease these products on pages that are already getting crawled a lot. For example, could you show 40 products on the homepage instead of 20? Also, try reducing the number of clicks from the homepage to new product pages by using an HTML sitemap.
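Click depth is easy to check before and after a change like this. The sketch below is a plain breadth-first search over an internal link graph. The graph, the URLs and the `/sitemap` page are hypothetical and only show how adding an HTML sitemap shortens the path to a new product.

```python
from collections import deque


def click_depths(links, start="/"):
    """Breadth-first search: minimum number of clicks from `start` to each page."""
    depths = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


# Hypothetical site: the new product is only reachable through two category levels.
before = {
    "/": ["/shoes"],
    "/shoes": ["/shoes/running"],
    "/shoes/running": ["/p/new-shoe"],
}

# Same site with an HTML sitemap linked from the homepage.
after = dict(before)
after["/"] = ["/shoes", "/sitemap"]
after["/sitemap"] = ["/shoes", "/shoes/running", "/p/new-shoe"]

depth_before = click_depths(before)["/p/new-shoe"]
depth_after = click_depths(after)["/p/new-shoe"]
print(f"Clicks from homepage: {depth_before} before, {depth_after} after")
```

Pages that never appear in the result are not reachable from the homepage through internal links at all, which is worth flagging separately.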
Which tools would you recommend for checking crawl rates?
Look at server logs for this, but how you analyse log files depends on what kind of server you use for log file hosting. Kevin uses Splunk as his log file tool of choice.
Which tools would you recommend for collecting data on incoming and outgoing links?
Kevin’s tool of choice is DeepCrawl because it gives you a full crawl of a website showing the number of URLs there are and what their link value is.
How do you convince others to give you access to log files?
Beer and pizza works for developers! In all seriousness, it depends on what the person’s concerns are around giving the log files. For example, log files can sometimes contain personal identifiers which could cause an issue with GDPR. In this instance, tell them that you only need anonymized data and only need to see the search engine bot user agents and IP addresses. You may need to sign an NDA in some cases, but it should be harmless to have this data.
How often should you be changing the navigation, especially for ecommerce sites?
To be able to grasp the impact of internal linking changes, you should wait 2 to 4 weeks until Google has had time to process these changes. This will also help cancel out any noise from seasonality.
Roll out internal linking changes in batches to make sure you can attribute these changes to performance impact. The rate of changes you can make depends on how frequently your site gets crawled. If 90% of your pages get crawled every day then making changes fairly frequently should be fine.
Is having a large number of internal links in the footer an issue if they do not get cached?
Caching and internal links should not be related. I assume you mean that when you look at the cached, indexed version of a page, you don’t see the footer links. That is indeed a bit weird and you should use the fetch and render tool in the old search console to find out more.
You should be able to see whether Google follows your footer links in your log files, and I recommend verifying that. If Google follows the links, you should be fine. If not, you should find out why.
How do you manage internal links for large websites?
That’s a broad question but let me try to give you an actionable answer. In the webinar, I explained how to distinguish between centralized and decentralized internal linking. The larger a site, the sharper that model gets.
There are two types of internal links: manually set and programmatic links. Programmatic links are those from link modules like top navigation or footer. To optimize internal linking at scale, you want to focus on programmatic links. That doesn’t mean in-body or content links aren’t important – they very much are – but it’s hard to modify them at scale.
So, you should adjust programmatic internal links to your type of site, centralized or decentralized, so that the pages facing the strongest competition from a keyword perspective get the most PageRank.
Is there such a thing as too many outlinks on a page? How do you typically deal with menus for large ecommerce sites, do you show all subcategories?
There is such a thing as too many outgoing links on a page but it highly depends on the incoming PageRank. Years ago, I helped a site in the real estate industry that was “dying” by tremendously decreasing the number of outgoing links. It helped them recover and ultimately made them stronger than before. However, the answer to that cannot be given universally. In each case, I’d look at how much internal PageRank the pages with many outgoing links get and then make an adjustment based on that.
For large ecommerce sites, I sometimes link only to high-level categories and sometimes include the sub-categories as well. It depends on how often each page is crawled. I start with one setup and then iterate by testing different variations and degrees. A healthy medium is to link to the most important products (from a $$$ perspective) in the top nav and then link to main categories. On top of that, I like to add links to sub-categories in the footer if I don’t have them in the top nav.
We use a lot of third-party vendors for our products so we link to a lot of third-party sites in our navigation. How might this affect our link balance?
That means you have lots of links to other sites on each of your pages. I wonder if that’s really necessary or if a single page with outgoing links wouldn’t suffice? I’d have to see your site and how it’s set up, but generally, I would question that model.
Page Authority could be different, but should Domain Popularity or Domain Authority be the same for each internal URL?
Right, you want an equivalent to domain popularity on a page-level, meaning the number of links from unique domains to a page.
What kind of changes would you make to reduce internal linking to topic or category pages that might be hoarding PageRank?
You have two options: either add more outgoing links to topic/category pages so they pass on more PageRank, or remove links on other pages that point at topic/category pages.
How do 5xx errors impact crawl rate?
Great question! What I have seen is that crawl rate increases a bit. It seems to me that Google understands that most outages are temporary, so it comes back to see when the site is available again. However, if 5xx persists too long the crawl rate decreases. Would love to hear what you and others have noticed!
Do you suggest directing PageRank to product pages so that those are more likely to show up in search results than category pages?
In most cases, yes, but that often means that PageRank goes through category pages, which strengthens them as well. The fact is that you want category pages to rank as well. But PageRank shouldn’t be focused solely on category pages. When you aim it at product pages, category pages benefit as well.
How do you score a page for external links if it has a lot of links but only a small number are good value, do you just count the unique domains?
Yes, I would suggest counting the unique linking domains. However, you can also use a proprietary metric like Domain Authority and such. The most important point is to stay consistent.
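Counting unique linking domains per page takes only a few lines once you have a backlink export. The sketch below assumes a list of (source URL, target path) pairs. The data is made up, and it treats each hostname (minus a leading `www.`) as a domain, which is a simplification because subdomains of the same site are counted separately.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def unique_linking_domains(backlinks):
    """Return {target: number of distinct linking hostnames} from (source_url, target) pairs."""
    domains = defaultdict(set)
    for source_url, target in backlinks:
        host = urlsplit(source_url).hostname
        if host is None:
            continue  # Skip rows without a parseable source URL.
        if host.startswith("www."):
            host = host[4:]
        domains[target].add(host)
    return {target: len(hosts) for target, hosts in domains.items()}


# Made-up backlink export for illustration.
backlinks = [
    ("https://www.example.com/post-1", "/page-a"),
    ("https://example.com/post-2", "/page-a"),
    ("https://example.com/post-3", "/page-a"),
    ("https://other.org/review", "/page-a"),
    ("https://other.org/review", "/page-b"),
]

scores = unique_linking_domains(backlinks)
print(scores)
```

Here `/page-a` has four links but only two linking domains, which is the number you would feed into the model. Whatever metric you pick, apply it the same way to every page.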
Have you seen any large sites use dynamic or automatically changing internal links to promote trending topics?
I haven’t seen it in the wild because it’s really hard to get right. But I’ve seen some experiments and tests around it. Dynamic internal linking is tough because if you overdo it, Google will penalize you. I’ve also seen some attempts where sites automatically linked every word or every second word with certain anchor text to a specific page. It didn’t go well.
What can you use instead of log files if it’s very difficult to get them from your clients or if you can only get daily data because the site is so big?
I don’t think there’s an alternative to log files. But you can filter them for Google user agent to trim down the size. I also suggest using BigQuery to query and analyze them, or use a dedicated tool.
How important is internal link anchor text diversification for ranking impact nowadays?
There are two approaches: either stay consistent with internal anchor text or use synonyms when pointing at the same page. I’ve had good success with the latter. However, I can’t really give a degree or metric to quantify how diversified anchor text should be. If I were you, I’d just crawl the site and check for any nonsensical anchor text.
If a marketplace also has microsites, should it have a centralized or decentralized internal linking structure?
The concept is driven by where your conversions happen. So, in your case, it would still be a decentralized model because users book freelancers on the respective category page for specific freelancer industries. You’re probably asking because you’re building two sides of the market, supply and demand. Conversions on the supply side (freelancers) probably happen on a few landing pages, so this would theoretically qualify for a centralized model. However, since those landing pages will probably not attract a lot of links or PageRank, SEO will be driven by your category and search pages. Thus, the decentralized model is the right way to go for you.
What are your suggestions for implementing the centralized model for higher education websites?
You don’t have to have a “features” or “solutions” page for this. However, it helps to zoom in on various value aspects of your product. For a college, it could be the curriculum for different subjects or what it’s like living on campus.
Do you often find that the homepage is the largest hoarder of links?
In most cases, yes. However, at Atlassian, for example, it was a product landing page that received the most PageRank, and it’s tough to just add 100 links to a product landing page. In my view, the homepage is a distribution page and landing page for new visitors, unless it’s also the login for a product.
Do you think that contextual links hold more weight than links in the navigation such as the footer?
I think they hold a bit more weight but if you want to factor that in, the model becomes exponentially more complicated. I think that’s where it needs a tool or solution that can give in-content links more weight.
Which internal linking model works best for news sites, centralized or decentralized?
Great question! As the business model for most news sites is ad-driven, I recommend a decentralized model. However, if it’s about subscriptions, the model changes to centralized. So, it ultimately depends on your business model. I’m super thankful for this question, as I haven’t covered this use case yet.
Do you test improved internal linking on a subset of pages against a control group?
This is only possible when you have a highly homogenized set of page templates, say for an ecommerce site, social network, marketplace or news site. And even then, it’s not possible to test with 100% accuracy. So, in this case, I’d prefer to test against a version of the site on a staging or test environment instead.
When you have over a million pages and have calculated the different metrics, how do you digest this amount of data and action internal link changes?
Fantastic question! The answer is that you want to look for patterns around page templates that are ranking worse and crawled less often than they should. Say, you identify that product pages in a certain category are crawled once a week (arbitrary value) and perform worse on average than product pages in other categories. Then, your goal would be to link them better and see if that moves the needle.
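One way to surface such patterns is to group URLs by page template and compare averages. The sketch below assumes you already have Googlebot hit counts per URL for a fixed period, for example from log files. The URL prefixes, template names and numbers are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical mapping from URL prefix to page template; adapt to your own URL structure.
TEMPLATE_PREFIXES = [
    ("/products/garden/", "product: garden"),
    ("/products/kitchen/", "product: kitchen"),
    ("/category/", "category"),
]


def template_for(path):
    """Return the template name for a URL path, or 'other' if no prefix matches."""
    for prefix, template in TEMPLATE_PREFIXES:
        if path.startswith(prefix):
            return template
    return "other"


def crawl_summary(crawl_counts):
    """Aggregate {path: Googlebot hits} into page count and average hits per template."""
    grouped = defaultdict(list)
    for path, hits in crawl_counts.items():
        grouped[template_for(path)].append(hits)
    return {
        template: {"pages": len(hits), "avg_hits": mean(hits)}
        for template, hits in grouped.items()
    }


# Made-up weekly Googlebot hit counts per URL.
weekly_hits = {
    "/products/garden/hose": 1,
    "/products/garden/rake": 1,
    "/products/kitchen/kettle": 7,
    "/products/kitchen/toaster": 6,
    "/category/garden": 14,
}

summary = crawl_summary(weekly_hits)
for template, stats in sorted(summary.items(), key=lambda item: item[1]["avg_hits"]):
    print(f"{template:18} pages={stats['pages']} avg_hits={stats['avg_hits']}")
```

Sorting by average hits puts the least-crawled templates first. The same grouping works for rankings or traffic, so you can check whether the templates that are crawled least are also the ones performing worst.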
We’d like to say another big thank you to Kevin for taking the time to answer all of these questions and providing his expert insights on this topic.
Learn more about internal linking with our site architecture white paper
If you want to learn even more about internal linking, make sure you read our ultimate guide to site architecture which goes into more detail on internal linking structures and how to best organize and categorize your site.