Google Only Needs to Crawl Facet Pages That Include Otherwise Unlinked Products
On ecommerce sites, if Google can reach and crawl every product through the main category pages, it won’t need to crawl any of the facets. However, facets should be made crawlable if they contain products that aren’t linked from anywhere else on the site.
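As a rough illustration of the check described above, the sketch below compares the products linked from category pages with the products linked from each facet. The function name, the URLs and the shape of the input data are all assumptions made for the example; in practice the link sets would come from a crawl of your own site.

```python
def facets_needing_crawl(category_links, facet_links):
    """Return facets that link to products not reachable from category pages.

    category_links: set of product URLs linked from the main category pages.
    facet_links: dict mapping each facet URL to the set of product URLs it links to.
    Both inputs are hypothetical crawl exports.
    """
    needed = {}
    for facet_url, products in facet_links.items():
        # Products that appear on this facet but on no category page.
        orphans = products - category_links
        if orphans:
            needed[facet_url] = orphans
    return needed


# Hypothetical example data.
category_links = {"/p/red-shoe", "/p/blue-shoe"}
facet_links = {
    "/shoes?colour=red": {"/p/red-shoe"},
    "/shoes?colour=green": {"/p/green-shoe", "/p/blue-shoe"},
}
result = facets_needing_crawl(category_links, facet_links)
```

Here only the green facet would need to stay crawlable, because it is the only place that links to `/p/green-shoe`.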
Ensure All Product Pages Can Be Crawled With Considered Use of Noindex
Ecommerce sites with facets should be careful about which pages they noindex, because noindexing too broadly (for example, noindexing all category pages) can make it difficult for Googlebot to reach individual product pages. Webmasters might consider noindexing only specific facets, or noindexing everything beyond a certain page number in a paginated set.
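One way to express such a selective rule is sketched below. The parameter names in `NOINDEX_FACETS`, the `page` parameter and the cut-off in `MAX_INDEXED_PAGE` are hypothetical values chosen for illustration, not recommendations from Google.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical settings: which facet parameters to noindex, and how deep
# into a paginated set pages stay indexable.
NOINDEX_FACETS = {"sort", "price_max"}
MAX_INDEXED_PAGE = 5


def should_noindex(url):
    """Decide whether a listing URL should carry a noindex directive."""
    query = parse_qs(urlsplit(url).query, keep_blank_values=True)
    # Noindex only the specific facets named above, not every category page.
    if NOINDEX_FACETS.intersection(query):
        return True
    # Noindex pages beyond the chosen depth in a paginated set.
    page_value = query.get("page", ["1"])[0]
    page = int(page_value) if page_value.isdigit() else 1
    return page > MAX_INDEXED_PAGE
```

With these assumed settings, `should_noindex("https://example.com/shoes?page=6")` is true while the plain category URL and its first few pages remain indexable, so product links on them stay reachable.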
Googlebot Can Recognise Faceted Navigation & Slow Down Crawling
Googlebot understands URL structures well and can recognise faceted navigation. It will slow down crawling once it works out where the primary content is and realises it has strayed from it. Parameter handling settings in Google Search Console (GSC) help with this.
Canonicalization For Filter Results Pages Isn’t Recommended
Canonicalization shouldn’t be used for filter pages. Canonical tags can be ignored, and filter pages aren’t always duplicates, as they can return different types of results.
Canonicalise Faceted Pages to Non-filtered Version
Google recommends allowing faceted pages to be crawled and canonicalising them to the non-filtered version of the page, rather than blocking them with robots.txt.
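A minimal sketch of deriving the non-filtered URL for a canonical tag is shown below. It assumes facets are expressed as query parameters and that the names in `FILTER_PARAMS` are the filtering ones; both are assumptions for the example, and your site's URL scheme may differ.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical list of query parameters that only filter or sort a listing.
FILTER_PARAMS = {"colour", "size", "sort"}


def canonical_url(url):
    """Strip filter parameters so the URL points at the non-filtered page."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in FILTER_PARAMS
    ]
    # Rebuild the URL without the filter parameters and without a fragment.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


def canonical_link_tag(url):
    """Render the canonical link element for the <head> of a faceted page."""
    return f'<link rel="canonical" href="{escape(canonical_url(url), quote=True)}">'
```

For example, `canonical_url("https://example.com/shoes?colour=red&size=9")` gives `https://example.com/shoes`, while parameters outside the assumed filter list are kept.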
Indexable Product Variations Should Reflect Search Behaviour
Page variations that people actually search for should be made indexable; all other variations should be folded together.
Prevent Excessive Crawling on Filters, Sort Orders and Pagination with Nofollow
Add nofollow to filtered, sorted and paginated results pages to prevent excessive crawling.
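As an illustration, a template helper could add `rel="nofollow"` to links that point at filtered, sorted or paginated results. The helper name and the parameter names in `CRAWL_HEAVY_PARAMS` are hypothetical, and the sketch assumes these states are carried in the query string.

```python
from html import escape
from urllib.parse import urlsplit, parse_qs

# Hypothetical query parameters that mark filtered, sorted or paginated results.
CRAWL_HEAVY_PARAMS = {"filter", "sort", "page"}


def render_link(href, text):
    """Render an anchor tag, adding rel="nofollow" for crawl-heavy result pages."""
    query = parse_qs(urlsplit(href).query, keep_blank_values=True)
    rel = ' rel="nofollow"' if CRAWL_HEAVY_PARAMS.intersection(query) else ""
    return f'<a href="{escape(href, quote=True)}"{rel}>{escape(text)}</a>'
```

Under these assumptions, a link to `/shoes?sort=price` is rendered with `rel="nofollow"`, while a link to `/shoes` is rendered as a normal followed link.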
Use Noindex or Canonical on Faceted URLs Instead of Disallow
John recommends against using a robots.txt disallow to stop facet URLs from being crawled, because they may still be indexed. Instead, allow them to be crawled and use a noindex or canonical tag, unless crawling them is causing a server performance issue.
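To audit for this, a short script can flag facet URLs that a robots.txt file disallows. The sketch below uses Python's standard `urllib.robotparser`; the function name, robots.txt content and URLs are made up for the example. Note that this parser does simple prefix matching, so it may not interpret wildcard rules the way Googlebot does.

```python
from urllib.robotparser import RobotFileParser


def blocked_facets(robots_txt, facet_urls, user_agent="Googlebot"):
    """Return the facet URLs that the given robots.txt text disallows."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in facet_urls if not parser.can_fetch(user_agent, url)]


# Hypothetical robots.txt and facet URLs.
robots_txt = """\
User-agent: *
Disallow: /filter/
"""
facet_urls = [
    "https://example.com/filter/red",
    "https://example.com/shoes?colour=red",
]
blocked = blocked_facets(robots_txt, facet_urls)
```

Any URL returned here cannot be crawled, so a noindex or canonical tag on that page would never be seen.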