Canonical URL of "redirect=no" view must include "redirect=no" query
Open, Medium, Public

Description

If I understand correctly, redirect pages are telling search engines that they themselves are the canonical page, rather than their targets. For instance,

meta:Help:Redirection is a page that redirects to meta:Help:Redirect. The redirect page itself includes the following HTML:

<link rel="canonical" href="https://meta.wikimedia.org/wiki/Help:Redirection"/>

Shouldn't that instead read like the following:

<link rel="canonical" href="https://meta.wikimedia.org/wiki/Help:Redirect"/>

...in order to advise search engine spiders that Help:Redirect is where the authoritative info lives?
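For anyone who wants to check what a given page declares, here is a minimal sketch that pulls the rel="canonical" href out of an HTML string using only the Python standard library. The class and function names are my own, and the sample markup is just the line quoted above wrapped in a bare document; nothing here reflects how MediaWiki itself emits the tag.

```python
from html.parser import HTMLParser


class CanonicalLinkParser(HTMLParser):
    """Collects the href of every <link rel="canonical"> element (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.canonical_hrefs = []

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <link .../> are also routed here by HTMLParser.
        if tag != "link":
            return
        attr_map = dict(attrs)
        rel_tokens = (attr_map.get("rel") or "").lower().split()
        if "canonical" in rel_tokens and attr_map.get("href"):
            self.canonical_hrefs.append(attr_map["href"])


def find_canonical(html_text):
    """Return a list of canonical hrefs declared in html_text."""
    parser = CanonicalLinkParser()
    parser.feed(html_text)
    parser.close()
    return parser.canonical_hrefs


# Sample markup built from the tag quoted in this task description.
sample = (
    "<html><head>"
    '<link rel="canonical" href="https://meta.wikimedia.org/wiki/Help:Redirection"/>'
    "</head><body></body></html>"
)
print(find_canonical(sample))
```

Feeding it the saved HTML of a redirect page and of its target makes it easy to compare what each one declares.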

For what it's worth, this question on Quora is what got me thinking about this, and my (admittedly limited) understanding of canonical links derives from this page on Medium.
https://www.quora.com/How-does-Google-search-index-treat-wikipedia-redirect-articles
https://help.medium.com/hc/en-us/articles/217991468-Duplicate-Content-and-SEO

Event Timeline

If this is valid and leads to a change, we should also consider cases where a redirect page carries additional information. I would imagine we'd want to treat these like any other redirect page, but more expert folks than myself should probably think it over. Example:

https://en.wikipedia.org/w/index.php?title=Czechia&redirect=no

This already works the way you're describing for me. When I visit https://meta.wikimedia.org/wiki/Help:Redirection, I see <link rel="canonical" href="https://meta.wikimedia.org/wiki/Help:Redirect"/>.

That's after you've been redirected, though. The page I'm talking about is this one:
https://meta.wikimedia.org/w/index.php?title=Help:Redirection&redirect=no

(note the "&redirect=no" parameter at the end.)

I don't know much about how search engine spiders work. I don't know whether they ever reach the "&redirect=no" version of the page or not. But if they do, I believe they are being given incomplete/inaccurate information.

Oh, I see. I think this is intentional though. Spiders can reach https://meta.wikimedia.org/w/index.php?title=Help:Redirection&redirect=no the same way humans do (by following the "Redirected from" link), but that page is not equivalent to https://meta.wikimedia.org/wiki/Help:Redirection (it displays "metadata" about the redirect; it isn't the redirect). https://meta.wikimedia.org/wiki/Help:Redirection is equivalent to https://meta.wikimedia.org/wiki/Help:Redirect, and it seems that this is correctly indicated.

... but that page is not equivalent ...

True, but as I understand it, equivalency is not the right standard.

For a much more exaggerated example, this page:
https://medium.com/wikisignpost/22-december-2016-45fefc2204d1

is clearly not equivalent to this page:
https://en.wikipedia.org/wiki/Wikipedia:Wikipedia_Signpost/2016-12-22/Year_in_review

However, at least according to Medium's interpretation of the best use of the "canonical" label (see link in my first post), the first is supposed to list the second as the "canonical" version.

Here's a question: Is there any circumstance you can imagine where a Google search should return the redirect page, including the "&redirect=no" parameter? If so, what are the circumstances? And if not, why should a redirect page list itself as the canonical link, rather than its target?

Here's a question: Is there any circumstance you can imagine where a Google search should return the redirect page, including the "&redirect=no" parameter? If so, what are the circumstances?

That's kind of a moot point, since we disallow robots from indexing everything under /w/ (we want them to index /wiki/). https://meta.wikimedia.org/robots.txt

But if you search for the URL, Google will acknowledge that it knows the URL exists, but it won't show you anything else: https://www.google.pl/search?q=https%3A%2F%2Fmeta.wikimedia.org%2Fw%2Findex.php%3Ftitle%3DHelp%3ARedirection%26redirect%3Dno
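To illustrate the robots.txt point, here is a rough sketch using Python's standard urllib.robotparser. The two-line rule set is an assumed excerpt that only encodes "everything under /w/ is disallowed"; the real file at https://meta.wikimedia.org/robots.txt is much longer and may differ in detail. The helper name is hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Assumed excerpt, not the real file: it only models the /w/ rule described above.
ASSUMED_ROBOTS_TXT = """\
User-agent: *
Disallow: /w/
"""


def is_crawlable(url, user_agent="*"):
    """Return True if the assumed rules allow user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(ASSUMED_ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


redirect_view = "https://meta.wikimedia.org/w/index.php?title=Help:Redirection&redirect=no"
article_view = "https://meta.wikimedia.org/wiki/Help:Redirection"

print(is_crawlable(redirect_view))
print(is_crawlable(article_view))
```

Under these assumed rules the "&redirect=no" URL is off limits to crawlers while the /wiki/ URL is not, which matches the behaviour described in this comment.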

And if not, why should a redirect page list itself as the canonical link, rather than its target?

Because it contains completely different content.

Krinkle renamed this task from Redirects should (?) set canonical URL differently to Should canonical URL of "redirect=no" view about redirects be something else?. Jul 31 2017, 9:08 PM
Krinkle triaged this task as Medium priority.

https://en.wikipedia.org/w/index.php?title=Czechia&redirect=no is the URL for viewing the redirect page itself: the redirect's own information, history, and discussion, plus a way to modify the redirect. There is no other or better URL for viewing this information.

The canonical URL for https://en.wikipedia.org/w/index.php?title=Czechia&redirect=no should be https://en.wikipedia.org/w/index.php?title=Czechia&redirect=no (the same).

Its canonical URL should definitely not be https://en.wikipedia.org/wiki/Czech_Republic, which is the destination of the redirect and an entirely different page (nothing about the redirect, no link to discussion about the redirect, no edit history of the redirect, no action links to edit the redirect, etc.).

However, there is a bug here. Right now the canonical URL for <https://en.wikipedia.org/w/index.php?title=Czechia&redirect=no> is shown as <https://en.wikipedia.org/wiki/Czechia>, which is also wrong, because that URL responds with the content of Czech_Republic.
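As a sketch of the expected result only (this is not MediaWiki code, and the function name and parameters are made up for illustration), the canonical URL of the "redirect=no" view would be built from the title plus the redirect=no query, roughly like this:

```python
from urllib.parse import quote, urlencode


def redirect_view_canonical(host, title):
    """Build the URL this task expects as canonical for the redirect=no view.

    Hypothetical helper: host and title are plain strings, and ':' and '/'
    are left unescaped so titles such as Help:Redirection stay readable.
    """
    query = urlencode({"title": title, "redirect": "no"}, safe=":/", quote_via=quote)
    return f"https://{host}/w/index.php?{query}"


print(redirect_view_canonical("en.wikipedia.org", "Czechia"))
print(redirect_view_canonical("meta.wikimedia.org", "Help:Redirection"))
```

The point is simply that the canonical URL of this view keeps the redirect=no query, rather than collapsing to /wiki/Czechia or to the redirect target.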

Krinkle renamed this task from Should canonical URL of "redirect=no" view about redirects be something else? to Canonical URL of "redirect=no" view must include "redirect=no" query. Jul 31 2017, 9:13 PM
Krinkle moved this task from Untriaged to General on the MediaWiki-Redirects board.