Description
Caution
Blocked by #417
Brief definition of what a canonical tag does:
A canonical tag tells a search engine which page in a collection of similar or identical pages is the important one.
What Google says:
Canonicalization is the process of selecting the representative –canonical– URL of a piece of content. Consequently, a canonical URL is the URL of a page that Google chose as the most representative from a set of duplicate pages. Often called deduplication, this process helps Google show only one version of the otherwise duplicate content in its search results.
The practical outcome is that if the canonical is not set correctly, none of my pages is considered truly relevant: they all share relevance among each other, and search engines usually penalize all of them.
What's the issue
Current canonical tag setup
Any product is reachable under multiple URLs:
1. https://example.com/products/{slug}
2. https://example.com/products/{ID}
3. https://example.com/products/{historyslug} (if present)
One URL is current; all the others are history URLs, not search friendly, or only kept for legacy reasons (off-site links that still perform well). 1 could rank well, but has to share visibility with 2 (which doesn't rank well because its URL contains no keywords) and 3 (which doesn't receive proper internal linking, as all internal links point to the current slug).
The slug is set on 1, which is therefore the intended canonical URL. 2 should either be redirected or return a 404, depending on how you want to see it, and 3 should 301-redirect to 1 to avoid losing backlinks previously created on other websites.
The commit message here contains
Generates a simple canonical tag based on the request path,…
which is exactly the opposite of what canonical tags are made for: they should indicate the URL of the primary HTML page, not the request path, so that search engines know which page is dominant in a collection of pages derived from the primary one and duplicate content is avoided.
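The difference can be illustrated with a minimal plain-Ruby sketch. It is not Solidus code: the `Product` struct, `STORE_URL`, both helper methods and the sample slugs are hypothetical and only serve to contrast the two approaches.

```ruby
# Hypothetical stand-in for a product record; not the actual Solidus model.
Product = Struct.new(:id, :slug, :history_slugs)

STORE_URL = "https://example.com" # assumed store URL

# Request-path approach: the canonical echoes whatever URL was requested.
def canonical_from_request(request_path)
  "#{STORE_URL}#{request_path}"
end

# Resource approach: the canonical is derived from the record's current slug.
def canonical_from_resource(product)
  "#{STORE_URL}/products/#{product.slug}"
end

product = Product.new(42, "blue-shirt", ["navy-shirt"])
paths = ["/products/blue-shirt", "/products/42", "/products/navy-shirt"]

request_based  = paths.map { |path| canonical_from_request(path) }.uniq
resource_based = paths.map { |_path| canonical_from_resource(product) }.uniq

puts request_based.size  # one distinct canonical per requested URL
puts resource_based.size # a single canonical shared by all URLs
```

With the request-path approach every alias of the product advertises itself as canonical; with the resource approach all aliases point at the same URL.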
What should be done?
Throw out all current canonical logic and reduce the canonical
A sane default would be for the canonical to always render the correct current {storeurl}/{language}/{resourceroute}/{ressource-slug}. Globalize should therefore probably override something here when translations are involved.
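As a rough sketch of that default, the canonical could be assembled from the four parts named above. The `canonical_url` helper and its keyword arguments are hypothetical and simply mirror the placeholders; how Solidus or Globalize would actually supply the language is left open.

```ruby
# Hypothetical helper: builds {storeurl}/{language}/{resourceroute}/{slug}.
# The language segment is optional and skipped when nil or empty.
def canonical_url(store_url:, resource_route:, slug:, language: nil)
  segments = [language, resource_route, slug]
             .compact
             .map { |segment| segment.to_s.gsub(%r{\A/+|/+\z}, "") } # trim stray slashes
             .reject(&:empty?)
  "#{store_url.sub(%r{/+\z}, '')}/#{segments.join('/')}"
end

puts canonical_url(store_url: "https://example.com/", language: "en",
                   resource_route: "products", slug: "blue-shirt")
puts canonical_url(store_url: "https://example.com",
                   resource_route: "products", slug: "blue-shirt")
```

The important property is that the result depends only on the resource and the store configuration, never on the path of the incoming request.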
What should also be done?
We have mitigated the problem through #413 (which you should approve :)), which redirects FriendlyId history URLs and IDs (as in example 2) to the current slug. So while the construction of the canonical is still not great, its effect is mitigated.
We are working on getting the same behavior for taxons and for content / blog pages.
Solidus Version:
Any
To Reproduce
Create a product and navigate to that product via
1. https://example.com/products/{slug}
2. https://example.com/products/{ID}
3. https://example.com/products/{historyslug} (if present)
Current behavior
All three URLs return distinct canonical links despite pointing to the same resource.
Expected behavior
2 and 3 respond with a 301 redirect to 1, and 1 has a canonical link that matches the configured slug.
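The expected behavior can be summarized as a small decision function. This is a plain-Ruby sketch under stated assumptions, not the implementation from #413: the `Product` struct, `resolve_product_request` and the sample data are hypothetical, and a real Solidus controller would perform the lookup through FriendlyId instead.

```ruby
# Hypothetical stand-in for a product record with its slug history.
Product = Struct.new(:id, :slug, :history_slugs)

# Returns [status, path]:
#   200 when the current slug was requested,
#   301 (with the current-slug path) for an ID or history slug,
#   404 when nothing matches.
def resolve_product_request(param, products)
  # A current slug always wins over another product's ID or history slug.
  product = products.find { |p| p.slug == param } ||
            products.find { |p| p.id.to_s == param || p.history_slugs.include?(param) }
  return [404, nil] if product.nil?

  canonical_path = "/products/#{product.slug}"
  status = param == product.slug ? 200 : 301
  [status, canonical_path]
end

products = [Product.new(42, "blue-shirt", ["navy-shirt"])]

p resolve_product_request("blue-shirt", products) # current slug
p resolve_product_request("42", products)         # ID
p resolve_product_request("navy-shirt", products) # history slug
p resolve_product_request("missing", products)    # unknown
```

Whether an ID request should redirect or return a 404 is a design choice; the sketch redirects, in line with the expected behavior stated above.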