On the road to URL nirvana, there is a power struggle that must first be overcome. Your URLs must be keyword-rich for SEO and descriptive for your users, but social media needs them to be shorter than short.
And while we can’t tell you how to placate your social media team when your keyword-rich URL has too many characters to fit into a tweet (if in doubt, Open Graph tags and 301s are your friend), we do know a thing or two about writing, formatting and future-proofing your URLs…
The ideal URL: short, descriptive and efficient
A URL that is as short as possible is best for users, search results and social media. In search results, users can see the whole URL (and the keywords in bold) before they click. On social media, a short URL is easier to fit into a tweet. For users in general, the URL gives another clue to what the page is about.
The ideal URL uses keywords that are consistent with the page’s content, to give search engines and users clues to what’s included.
Some content management systems, like WordPress, give you the option to use post IDs instead of words at the end of your URLs. This option isn’t recommended: it causes confusion when you come to organise your posts (it’s impossible to tell from the URL which post is which) and gives search engines no indication of the topic of your content. If you currently have this set-up, consider changing your URL structure (using 301 redirects carefully to avoid negative SEO or user experience effects).
If you want to use a page ID for any reason (for example, if you’re planning to redirect large groups of URLs) then consider using words followed by a page ID to satisfy both requirements.
A descriptive URL with a unique set of paths at the root can help you filter down to individual page types or parts of the site.
Filterable URLs make life easier when it comes to analytics. For example, Google Analytics filters are based on URLs (as opposed to custom dimensions) and the real time feature only provides basic filtering on fields like URLs.
When planning your URL design, also consider which parts of the site should not be crawled. This makes back-end work easier, such as disallowing sections in robots.txt, which can only be controlled by matching URL patterns.
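To illustrate why URL patterns matter here, the following is a minimal sketch that checks URLs against a set of robots.txt rules using Python’s standard urllib.robotparser module. The /search/ and /checkout/ paths are hypothetical examples, not sections the article prescribes blocking, and Python’s parser only does simple prefix matching, so it is not a perfect stand-in for Googlebot.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: /search/ and /checkout/ are assumed path prefixes
# chosen purely for illustration.
ROBOTS_TXT = """\
User-agent: *
Disallow: /search/
Disallow: /checkout/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_crawlable(url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if the rules above allow user_agent to fetch url."""
    return parser.can_fetch(user_agent, url)
```

Because each section sits under its own path prefix, one Disallow line covers every URL in that section.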
URL design: the technical basics
Separate words with hyphens
Google might have trouble understanding your URLs if the words run together (e.g. example.com/tennisequipment), so separate words with punctuation to help search engines read them.
Word separation is best done with a hyphen (example.com/tennis-equipment), as Google treats this like a space. It does not treat the underscore as a word separator, so example.com/tennis_equipment may be read as a single term.
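As a rough sketch of how a page title could be turned into a hyphen-separated URL slug, the function below lowercases the text and replaces every run of other characters (including spaces and underscores) with a single hyphen. The function name slugify is our own, and folding accented letters to plain ASCII is an assumption that will not suit every language.

```python
import re
import unicodedata


def slugify(title: str) -> str:
    """Turn a title into a lowercase, hyphen-separated slug."""
    # Fold accented letters to their closest ASCII form where one exists;
    # characters with no ASCII equivalent are dropped.
    ascii_text = (
        unicodedata.normalize("NFKD", title)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    # Replace each run of non-alphanumeric characters with one hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", ascii_text.lower())
    # Remove any hyphens left at the start or end.
    return slug.strip("-")
```

For example, slugify("Tennis Equipment") and slugify("tennis_equipment") both give "tennis-equipment".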
Consider using dates in URLs
It can make sense to include the publish date in the URL for some time-sensitive publications like newspapers and blogs.
However, keep in mind that if you update the date stamp on the post (to post it to the top of a chronological list, for example), then the URL will change as well and you will have to make sure the old URL is redirected to the new one.
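The sketch below shows one way this could be handled: build the dated path from the publish date, and produce a redirect mapping whenever the date changes. The /YYYY/MM/DD/slug/ layout and both function names are assumptions made for illustration, not a format the article prescribes.

```python
from datetime import date


def dated_path(published: date, slug: str) -> str:
    """Build a path like /2016/03/05/my-post/ (assumed layout)."""
    return f"/{published:%Y/%m/%d}/{slug}/"


def redirect_if_redated(old: date, new: date, slug: str) -> dict:
    """Return {old_path: new_path} if the date stamp changed, else {}."""
    old_path = dated_path(old, slug)
    new_path = dated_path(new, slug)
    if old_path == new_path:
        return {}
    # The old URL should 301 to the new one so links keep working.
    return {old_path: new_path}
```

The returned mapping can then be fed into whatever redirect mechanism your server or CMS uses.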
Don’t worry about file extensions
Google ignores file extensions except for a few specific known types, so most formats are fine, including clean URLs without any extension or trailing slash.
Design with pagination best practice in mind
Google say that every URL in a paginated set should have the same URL format, with a single changing variable (eg. example.com/category/page-1, example.com/category/page-2). Read more at our pagination best practice guide.
Keep mobile and desktop URLs consistent
If you have a separate website for mobile, it’s best to keep its URLs as consistent with the desktop URLs as possible. This makes it easier to add rel="alternate" tags on the desktop pages that point to the mobile pages (more information on Google Developers). Hosting your mobile site on a subdomain (such as m.example.com) will allow you to keep the URL paths identical.
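When the paths are identical, the mobile URL can be derived from the desktop URL by swapping the hostname, as in this minimal sketch. The function names are our own, m.example.com is taken from the example above, and the media query value is an assumption that you should check against Google’s current documentation.

```python
import html
from urllib.parse import urlsplit, urlunsplit


def mobile_url(desktop_url: str, mobile_host: str = "m.example.com") -> str:
    """Swap the hostname and keep the path and query string unchanged."""
    parts = urlsplit(desktop_url)
    return urlunsplit(parts._replace(netloc=mobile_host))


def alternate_tag(desktop_url: str) -> str:
    """Build the tag for the desktop page that points to the mobile page."""
    href = html.escape(mobile_url(desktop_url), quote=True)
    # Assumed media query; adjust the breakpoint to suit your own site.
    return (
        '<link rel="alternate" '
        'media="only screen and (max-width: 640px)" '
        f'href="{href}">'
    )
```

If the mobile paths differed from the desktop ones, this would need a lookup table instead, which is harder to maintain.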
URL design: Further in-depth
Make your canonical URL format easy to remember
Simple, intuitive URL design can help your developers identify the canonical URL for any page as easily as possible.
Match your social URLs to your canonical ones
If you choose to specify special URLs for sharing (eg. example.com/seo-software redirects to example.com/best-seo-software-for-large-businesses), typically these would be 301 redirected. Otherwise canonicalizing them will help avoid splitting authority and social signals across separate URLs.
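As a rough spot check that a page’s sharing URL matches its canonical URL, the sketch below reads the canonical link and the og:url meta tag from a page’s HTML and compares them. It uses only Python’s standard html.parser module; the class and function names are our own, and the exact string comparison is a simplifying assumption.

```python
from html.parser import HTMLParser


class UrlTagParser(HTMLParser):
    """Collect the canonical URL and the Open Graph URL from a page."""

    def __init__(self):
        super().__init__()
        self.canonical = None
        self.og_url = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "link" and (attributes.get("rel") or "").lower() == "canonical":
            self.canonical = attributes.get("href")
        elif tag == "meta" and attributes.get("property") == "og:url":
            self.og_url = attributes.get("content")


def og_matches_canonical(page_html: str) -> bool:
    """Return True only if both tags exist and hold the same URL."""
    parser = UrlTagParser()
    parser.feed(page_html)
    return parser.canonical is not None and parser.canonical == parser.og_url
```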
Give your Sitemap a friendly name
Giving your Sitemaps friendly names makes them much easier to work with. Most sites use the format example.com/sitemap.xml for the main Sitemap, or Sitemap Index. Google seems to accept any URL, including any file extension, providing the Sitemap contents are valid (more information on our Sitemap guide).
Include site search keywords as URL parameters
Including your site search keywords as URL parameters means you can create reports from them in Google Analytics and access important information about common phrases your users use to find their way around your site. You can then use this data to identify gaps or opportunities in your internal linking structure, or check that the language you use matches that of your users’ vocabulary.
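To show why a URL parameter makes this data easy to work with, here is a minimal sketch that counts search phrases from a list of visited URLs. The parameter name q and the function name are assumptions; use whatever parameter your site search actually produces.

```python
from collections import Counter
from urllib.parse import parse_qs, urlsplit


def search_terms(urls, param: str = "q") -> Counter:
    """Count the site search phrases found in the given URLs."""
    counts = Counter()
    for url in urls:
        query = parse_qs(urlsplit(url).query)
        for term in query.get(param, []):
            # Normalise case and whitespace so variants are counted together.
            counts[term.strip().lower()] += 1
    return counts
```

URLs without the parameter are simply ignored, so the same function can be run over a full list of page views.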
Encode non-ASCII characters
Non-ASCII characters within URLs can be percent-encoded when they are linked in the HTML, although it’s important to avoid accidentally re-encoding these URLs a second time.
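The sketch below shows how double encoding happens and one way to guard against it, using Python’s standard urllib.parse functions. The helper names are our own, and the double-encoding check is a heuristic rather than a definitive test, since a URL can legitimately contain an encoded percent sign.

```python
import re
from urllib.parse import quote, unquote


def encode_path(path: str) -> str:
    """Percent-encode a path as UTF-8, leaving the slashes intact."""
    return quote(path, safe="/")


def encode_path_once(path: str) -> str:
    """Decode first, then encode, so an already-encoded path is unchanged.

    Caveat: this also decodes an encoded slash (%2F) into a real slash.
    """
    return quote(unquote(path), safe="/")


def looks_double_encoded(path: str) -> bool:
    """Heuristic: %25 followed by two hex digits suggests a re-encoded escape."""
    return re.search(r"%25[0-9A-Fa-f]{2}", path) is not None
```

Here encode_path("/café") gives "/caf%C3%A9", but running encode_path on that result gives "/caf%25C3%25A9", which is the double-encoded form to avoid.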
Use redirectable URL structures
If you ever need to redirect various URLs, like during a site migration, the process will be easier if you include unique paths so that groups of URLs can be redirected in bulk. Including a page ID as well as a descriptive element in each URL (for example: example.com/category/best-washing-machines-1234/) can make this easier to manage, as an ID can be looked up in a database more efficiently than a full URL can be matched.
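As a sketch of the ID-based lookup, the code below pulls the trailing page ID out of a URL and uses it to find the redirect target. The NEW_URLS dictionary stands in for a database table and its contents are hypothetical, as are the function names.

```python
import re
from urllib.parse import urlsplit

# Matches a numeric ID at the end of the path, e.g. "-1234" or "-1234/".
ID_PATTERN = re.compile(r"-(\d+)/?$")

# Hypothetical stand-in for a database table keyed by page ID.
NEW_URLS = {
    1234: "https://example.com/appliances/best-washing-machines-1234/",
}


def page_id(url: str):
    """Return the trailing page ID as an int, or None if there isn't one."""
    match = ID_PATTERN.search(urlsplit(url).path)
    return int(match.group(1)) if match else None


def redirect_target(old_url: str):
    """Look up where old_url should redirect to, or None if unknown."""
    pid = page_id(old_url)
    return NEW_URLS.get(pid) if pid is not None else None
```

Because the lookup depends only on the ID, the descriptive part of the URL can change during a migration without breaking the redirect.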
Considerations for international sites
Domains, subdomains and subfolders can all be configured for international SEO using Search Console geographic targeting and hreflang tags. However, some tools like Alexa don’t distinguish data at a subdomain or subfolder level, so unique domains can sometimes offer an advantage here.
Remember that you need to enter all related URLs across all international sites in your hreflang tags; if your international URLs don’t directly translate between websites, this process is a lot more complicated and mistakes can slip through.
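To make the bookkeeping concrete, here is a minimal sketch that generates the full set of hreflang tags for one page from a mapping of language codes to URLs. The function name, language codes and URLs in the usage note are hypothetical, and the same complete set would need to appear on every version of the page.

```python
import html


def hreflang_tags(alternates: dict) -> list:
    """Build one <link> tag per language version of a page.

    alternates maps an hreflang value (e.g. "en-gb") to that version's URL.
    """
    return [
        f'<link rel="alternate" hreflang="{html.escape(lang, quote=True)}" '
        f'href="{html.escape(url, quote=True)}">'
        for lang, url in alternates.items()
    ]
```

For instance, calling hreflang_tags({"en-gb": "https://example.co.uk/tennis-equipment", "de-de": "https://example.de/tennisausruestung"}) returns two tags. When the paths differ between sites, as they do here, the mapping has to be maintained by hand or stored per page.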
Possible problems caused by incorrect URL design
URL duplication occurs when multiple URLs return the same page within the same site. While your website might be set up to allow users to access the same page through slightly different URLs, Google will treat these as different pages containing identical content. Common causes include:
- Case inconsistency: /news and /NEWS
- Trailing slash inconsistency: /news/ and /news
- Default path duplicates: example.com and example.com/index.html
- Duplicate URL formats (unfriendly and friendly format): example.com/category/subcategory/page.html and example.com/page
- Inconsistent ordering: example.com/category-1/subcategory/page-1 and example.com/category-2/page-1
- HTTP, HTTPS and aliased domains: the same page served on more than one protocol or hostname
- Path repeated more than twice due to broken internal links: Google may not crawl or index these URLs because they are such a common cause of duplication.
- Session IDs: these will reduce crawl efficiency as Google treats them as separate URLs and will try to crawl every single version as a separate page. Keep anything dynamic or session-based out of your URLs. If this isn’t possible, ensure they’re excluded from Googlebot’s crawls using URL Parameter settings in Search Console (under Crawl > URL Parameters).
These issues can all be managed with correct redirect, canonical tag and hreflang tag implementation: for more information, see our guides on duplicate content issues you need to fix, URL duplication and hreflang.
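Several of the duplicates listed above can be collapsed by normalising each URL to a single canonical form before redirecting or setting a canonical tag. The sketch below is one possible policy, not a standard: forcing HTTPS, lowercasing the path, dropping the trailing slash and the session parameter names are all assumptions you would adapt to your own site.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed session parameter names; replace with the ones your site uses.
SESSION_PARAMS = {"sessionid", "sid", "phpsessid"}


def canonicalise(url: str) -> str:
    """Map duplicate URL variants to one canonical form (assumed policy)."""
    parts = urlsplit(url)
    path = parts.path.lower()  # /NEWS -> /news
    if path.endswith("/index.html"):
        path = path[: -len("index.html")]  # /index.html -> /
    if len(path) > 1 and path.endswith("/"):
        path = path.rstrip("/")  # /news/ -> /news
    if not path:
        path = "/"  # an empty path means the home page
    # Drop session parameters but keep everything else in order.
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in SESSION_PARAMS
    ]
    # Force HTTPS, lowercase the hostname and drop any fragment.
    return urlunsplit(("https", parts.netloc.lower(), path, urlencode(kept), ""))
```

With this policy, http://Example.com/NEWS/ and https://example.com/news both become https://example.com/news, so any URL that differs from its canonical form is a candidate for a 301 redirect.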
URLs that are too long
Anything over 1024 characters may not be crawled, although no sensible URL would be this long except in an error.
Relying on hash fragments to be indexed
Google won’t index URLs with hash fragments, but will render URLs with hash fragments to find other URLs to crawl. So make sure you’re not relying on URLs with hash fragments to be indexed.
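To see which of your URLs would collapse into a single indexable page, the sketch below strips the hash fragment from each URL and groups the originals by what remains. It uses urldefrag from Python’s standard library; the function name is our own.

```python
from collections import defaultdict
from urllib.parse import urldefrag


def group_by_indexable_url(urls) -> dict:
    """Group URLs by their fragment-free form."""
    groups = defaultdict(list)
    for url in urls:
        # urldefrag splits off everything from the "#" onwards.
        groups[urldefrag(url).url].append(url)
    return dict(groups)
```

Any group with more than one entry contains URLs that differ only by fragment, so their content needs its own fragment-free URL if it is meant to be indexed separately.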
Including anything that might change in the life of the page
Avoid including anything which might change, e.g. the number of people killed in an earthquake.
URL management using DeepCrawl: useful reports
1. All URLs
Review every URL on the site and check your canonical URL format using Indexation > All URLs in your DeepCrawl report.
2. Duplicate Pages
Any duplication caused by having session IDs or other parameters in your URLs will show in Indexation > Indexable Pages > Duplicate Pages. Use this information to identify parameters that need to be excluded from Google’s crawls via Search Console.
If you have already managed your parameter settings via Search Console, you can reflect these changes by adding the excluded parameters in Project > Advanced Settings > Remove Parameters before running your crawl. This will ensure that the report reflects how Googlebot will behave, providing you have Googlebot set as your User Agent.
4. Inconsistent Open Graph and Canonical URLs
You can see any Open Graph URLs that don’t match the canonical URL in Content > Social Tagging > Inconsistent Open Graph and Canonical URLs.
5. Max URL Length
Any URLs that exceed the 1,024 character limit will show under Validation > Other > Max URL Length.
6. Double Encoded URLs
Use Validation > Other > Double Encoded URLs to catch any URLs that have been percent-encoded a second time.