
Investigate and fix <link rel="alternate"/> tags in page source
Open, Medium, Public

Description

We (Kate and Mikhail) were investigating canonical vs. alternate links for T234340, based on the information in https://webmasters.googleblog.com/2019/02/consolidating-your-website-traffic-on.html. We were confused to see clicks and impressions reported for en.m.wikipedia.org, since the blog post says:

In the current version, some of your traffic is attributed to the canonical property and some to the alternate property. The new version should attribute all of your traffic to the canonical property.

So based on our system and that blog post, we should be seeing 0 clicks and 0 impressions for .m. domains in the Google Search Console (GSC), but that's not the case. Looking at the page source revealed something incredibly weird:

From view-source:https://en.wikipedia.org/wiki/Wikipedia:

<link rel="alternate" href="android-app://org.wikipedia/http/en.m.wikipedia.org/wiki/Wikipedia"/>
<link rel="alternate" type="application/x-wiki" title="Edit this page" href="/w/index.php?title=Wikipedia&amp;action=edit"/>
<link rel="edit" title="Edit this page" href="/w/index.php?title=Wikipedia&amp;action=edit"/>
<link rel="apple-touch-icon" href="/static/apple-touch/wikipedia.png"/>
<link rel="shortcut icon" href="/static/favicon/wikipedia.ico"/>
<link rel="search" type="application/opensearchdescription+xml" href="/w/opensearch_desc.php" title="Wikipedia (en)"/>
<link rel="EditURI" type="application/rsd+xml" href="//en.wikipedia.org/w/api.php?action=rsd"/>
<link rel="license" href="//creativecommons.org/licenses/by-sa/3.0/"/>
<link rel="canonical" href="https://en.wikipedia.org/wiki/Wikipedia"/>
<link rel="dns-prefetch" href="//meta.wikimedia.org" />

and from view-source:https://en.m.wikipedia.org/wiki/Wikipedia:

<link rel="alternate" href="android-app://org.wikipedia/http/en.m.wikipedia.org/wiki/Wikipedia"/>
<link rel="manifest" href="/w/api.php?action=webapp-manifest"/>
<link rel="alternate" type="application/x-wiki" title="Edit this page" href="/w/index.php?title=Wikipedia&amp;action=edit"/>
<link rel="edit" title="Edit this page" href="/w/index.php?title=Wikipedia&amp;action=edit"/>
<link rel="apple-touch-icon" href="/static/apple-touch/wikipedia.png"/>
<link rel="shortcut icon" href="/static/favicon/wikipedia.ico"/>
<link rel="search" type="application/opensearchdescription+xml" href="/w/opensearch_desc.php" title="Wikipedia (en)"/>
<link rel="EditURI" type="application/rsd+xml" href="//en.wikipedia.org/w/api.php?action=rsd"/>
<link rel="license" href="//creativecommons.org/licenses/by-sa/3.0/"/>
<link rel="canonical" href="https://en.wikipedia.org/wiki/Wikipedia"/>
<link rel="dns-prefetch" href="//meta.wikimedia.org" />

For some reason, we're using href="android-app://org.wikipedia/http/en.m.wikipedia.org/wiki/Wikipedia". Shouldn't that be href="https://en.m.wikipedia.org/wiki/Wikipedia", so that indexing correctly understands which version of the article is canonical and which is alternate?

It is our suspicion that this is why we're not seeing what we expect to see in GSC.

QA

NOTE: The patch will be deployed on Thursday, 20th February
  1. Navigate to any page in a content namespace (e.g. mainspace)
  2. Observe that there's no alternate link whose href attribute begins with android-app://org.wikipedia, e.g. android-app://org.wikipedia/http/en.m.wikipedia.org/wiki/Cult_of_Luna
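
The check in step 2 could be scripted roughly as follows. This is a minimal sketch, not part of the patch: it assumes the page HTML has already been fetched into a string, and the names AlternateLinkCollector and android_app_alternates are hypothetical helpers invented for this illustration.

```python
from html.parser import HTMLParser


class AlternateLinkCollector(HTMLParser):
    """Collect the href of every <link> tag whose rel includes "alternate"."""

    def __init__(self):
        super().__init__()
        self.alternate_hrefs = []

    def handle_starttag(self, tag, attrs):
        # Self-closing <link .../> tags are also routed here by HTMLParser.
        if tag != "link":
            return
        attr_map = dict(attrs)
        rel_tokens = (attr_map.get("rel") or "").lower().split()
        if "alternate" in rel_tokens and attr_map.get("href"):
            self.alternate_hrefs.append(attr_map["href"])


def android_app_alternates(html, prefix="android-app://org.wikipedia"):
    """Return alternate-link hrefs that still start with the given prefix.

    An empty list means the page passes the QA step above.
    """
    collector = AlternateLinkCollector()
    collector.feed(html)
    return [href for href in collector.alternate_hrefs if href.startswith(prefix)]


if __name__ == "__main__":
    # Sample markup copied from the task description (pre-patch output).
    sample = (
        '<link rel="alternate" href="android-app://org.wikipedia/http/'
        'en.m.wikipedia.org/wiki/Wikipedia"/>'
        '<link rel="canonical" href="https://en.wikipedia.org/wiki/Wikipedia"/>'
    )
    leftovers = android_app_alternates(sample)
    print("FAIL" if leftovers else "PASS", leftovers)
```

Running the function against the saved source of a content page after deployment should give an empty list if the link is no longer emitted.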

Details

Related Gerrit Patches:
mediawiki/extensions/MobileFrontend : master : No longer emit alternate link with android-app scheme.

Event Timeline

mpopov created this task. Fri, Feb 7, 10:19 PM
Restricted Application added a subscriber: Aklapper. Fri, Feb 7, 10:19 PM
mpopov updated the task description. Fri, Feb 7, 10:35 PM
Dbrant added a subscriber: Dbrant. Tue, Feb 11, 6:56 PM

@mpopov This URL format is in fact correct for the purpose it was supposed to serve. But that purpose is quite outdated and probably no longer supported on Android. In the old days, this tag basically enabled the following logic:

"if the user has this particular app installed on the device (defined by the package name org.wikipedia) then launch that app explicitly, and don't navigate to this link in the browser."

The thing is that Android stopped "honoring" this scheme a long time ago, so we can probably remove this alternate link quite safely.
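
To make the structure of that href concrete, here is a small sketch that splits an android-app URI into the package name and the web URL it wraps. It assumes the layout android-app://{package}/{scheme}/{host and path}, which is what the href in this task looks like; the names AndroidAppLink and parse_android_app_uri are hypothetical and exist only for this illustration.

```python
from typing import NamedTuple


class AndroidAppLink(NamedTuple):
    package: str  # app package that was meant to handle the link
    web_url: str  # the ordinary web URL encoded inside the URI


def parse_android_app_uri(uri: str) -> AndroidAppLink:
    """Split android-app://{package}/{scheme}/{host_path} into its parts.

    Assumption: the URI follows the three-part layout seen in this task.
    """
    prefix = "android-app://"
    if not uri.startswith(prefix):
        raise ValueError(f"not an android-app URI: {uri!r}")
    # Split into at most three pieces; the last one keeps any further slashes.
    parts = uri[len(prefix):].split("/", 2)
    if len(parts) < 3 or not all(parts):
        raise ValueError(f"expected package/scheme/host_path in {uri!r}")
    package, scheme, host_path = parts
    return AndroidAppLink(package=package, web_url=f"{scheme}://{host_path}")


if __name__ == "__main__":
    link = parse_android_app_uri(
        "android-app://org.wikipedia/http/en.m.wikipedia.org/wiki/Wikipedia"
    )
    print(link.package)
    print(link.web_url)
```

Read this way, the tag names an app (org.wikipedia) plus the page that app should open, which is why it does not act as a plain mobile-web alternate URL for indexing purposes.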

Change 571560 had a related patch set uploaded (by Dbrant; owner: Dbrant):
[mediawiki/extensions/MobileFrontend@master] No longer emit alternate link with android-app scheme.

https://gerrit.wikimedia.org/r/571560

Thank you for explaining the motivation and submitting the patch to fix it, @Dbrant!

@Niedzielski @Jdlrobson @ovasileva: Will a separate patch be required to add <link rel="alternate" href="[mobile url]"/> in so that Google correctly indexes mobile pages as alternate to their canonical versions?

phuedx added a subscriber: phuedx.

I'm moving this into Needs Code Review after a discussion with @ovasileva.

Change 571560 merged by jenkins-bot:
[mediawiki/extensions/MobileFrontend@master] No longer emit alternate link with android-app scheme.

https://gerrit.wikimedia.org/r/571560

For history, this was added here: https://github.com/wikimedia/mediawiki-extensions-MobileFrontend/commit/677aa8935ff497674265f2581accc25487d4a977
I didn't see any reason not to remove it at this point.

I think this can skip QA (at least I'm not sure how to QA this other than verify the link is removed from the output HTML when it hits production).

Neither config variable removed in https://gerrit.wikimedia.org/r/571560 is used anywhere other than WMF production 👍

<snip /> at least I'm not sure how to QA this other than verify the link is removed from the output HTML when it hits production

LGTM

phuedx reassigned this task from Dbrant to Edtadros. Mon, Feb 17, 10:22 AM
phuedx updated the task description.
ovasileva triaged this task as Medium priority. Mon, Feb 17, 10:39 AM

Thank you for explaining the motivation and submitting the patch to fix it, @Dbrant!
@Niedzielski @Jdlrobson @ovasileva: Will a separate patch be required to add <link rel="alternate" href="[mobile url]"/> in so that Google correctly indexes mobile pages as alternate to their canonical versions?

@mpopov: From my understanding, this is the same as the test we are doing in T234807: Determine impact of $wgMFNoindexPages search traffic to arwiki, zhwiki, hiwiki, itwiki, nlwiki, kowiki (although please correct me if I'm wrong here). I discussed raising the priority of this analysis with @kzimmerman. If we don't see any decreases, we can go ahead and do this for all wikis.