Investigate and fix <link rel="alternate"/> tags in page source
Closed, Resolved · Public

Description

We (Kate and Mikhail) were investigating canonical vs. alternate links for T234340, based on the information in https://webmasters.googleblog.com/2019/02/consolidating-your-website-traffic-on.html. We were really confused to see clicks and impressions reported for en.m.wikipedia.org, since the post says:

In the current version, some of your traffic is attributed to the canonical property and some to the alternate property. The new version should attribute all of your traffic to the canonical property.

So, based on our setup and that blog post, we should be seeing 0 clicks and 0 impressions for .m. domains in the Google Search Console (GSC), but that's not the case. Looking at the page source revealed something incredibly weird:

From view-source:https://en.wikipedia.org/wiki/Wikipedia:

<link rel="alternate" href="android-app://org.wikipedia/http/en.m.wikipedia.org/wiki/Wikipedia"/>
<link rel="alternate" type="application/x-wiki" title="Edit this page" href="/w/index.php?title=Wikipedia&amp;action=edit"/>
<link rel="edit" title="Edit this page" href="/w/index.php?title=Wikipedia&amp;action=edit"/>
<link rel="apple-touch-icon" href="/static/apple-touch/wikipedia.png"/>
<link rel="shortcut icon" href="/static/favicon/wikipedia.ico"/>
<link rel="search" type="application/opensearchdescription+xml" href="/w/opensearch_desc.php" title="Wikipedia (en)"/>
<link rel="EditURI" type="application/rsd+xml" href="//en.wikipedia.org/w/api.php?action=rsd"/>
<link rel="license" href="//creativecommons.org/licenses/by-sa/3.0/"/>
<link rel="canonical" href="https://en.wikipedia.org/wiki/Wikipedia"/>
<link rel="dns-prefetch" href="//meta.wikimedia.org" />

and from view-source:https://en.m.wikipedia.org/wiki/Wikipedia:

<link rel="alternate" href="android-app://org.wikipedia/http/en.m.wikipedia.org/wiki/Wikipedia"/>
<link rel="manifest" href="/w/api.php?action=webapp-manifest"/>
<link rel="alternate" type="application/x-wiki" title="Edit this page" href="/w/index.php?title=Wikipedia&amp;action=edit"/>
<link rel="edit" title="Edit this page" href="/w/index.php?title=Wikipedia&amp;action=edit"/>
<link rel="apple-touch-icon" href="/static/apple-touch/wikipedia.png"/>
<link rel="shortcut icon" href="/static/favicon/wikipedia.ico"/>
<link rel="search" type="application/opensearchdescription+xml" href="/w/opensearch_desc.php" title="Wikipedia (en)"/>
<link rel="EditURI" type="application/rsd+xml" href="//en.wikipedia.org/w/api.php?action=rsd"/>
<link rel="license" href="//creativecommons.org/licenses/by-sa/3.0/"/>
<link rel="canonical" href="https://en.wikipedia.org/wiki/Wikipedia"/>
<link rel="dns-prefetch" href="//meta.wikimedia.org" />

For some reason we're using href="android-app://org.wikipedia/http/en.m.wikipedia.org/wiki/Wikipedia". Shouldn't that be href="https://en.m.wikipedia.org/wiki/Wikipedia", so that indexing correctly understands which version of the article is canonical and which is alternate?

It is our suspicion that this is why we're not seeing what we expect to see in GSC.

QA

NOTE: The patch will be deployed on Thursday, 20th February
  1. Navigate to any page in a content namespace (e.g. mainspace)
  2. Observe that there's no alternate link whose href attribute begins with android-app://org.wikipedia, e.g. android-app://org.wikipedia/http/en.m.wikipedia.org/wiki/Cult_of_Luna
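
The check in step 2 can also be scripted rather than done by eye. The following is a minimal sketch, not part of the patch: it assumes the page HTML has already been fetched into a string, and the names LinkCollector and android_app_alternates as well as the two sample strings are hypothetical, made up for illustration.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the attributes of every <link> element in a document."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <link ... /> are also routed here
        # by HTMLParser's default handle_startendtag.
        if tag == "link":
            self.links.append(dict(attrs))


def android_app_alternates(html):
    """Return the hrefs of rel="alternate" links that use the app scheme."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return [
        link["href"]
        for link in parser.links
        if "alternate" in (link.get("rel") or "").lower().split()
        and (link.get("href") or "").startswith("android-app://org.wikipedia")
    ]


# Illustrative page fragments (hypothetical, modelled on the task description).
before_patch = (
    '<link rel="alternate" href="android-app://org.wikipedia/http/'
    'en.m.wikipedia.org/wiki/Cult_of_Luna"/>\n'
    '<link rel="canonical" href="https://en.wikipedia.org/wiki/Cult_of_Luna"/>'
)
after_patch = (
    '<link rel="canonical" href="https://en.wikipedia.org/wiki/Cult_of_Luna"/>'
)

print(android_app_alternates(before_patch))
print(android_app_alternates(after_patch))
```

An empty list for a production page, on both the desktop and the mobile domain, would correspond to the QA step passing.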

QA Results

AC   Status   Details
1             T244614#5908962
2             T244614#5908962

Event Timeline

@mpopov This URL format is in fact correct for the purpose it was supposed to serve. But that purpose is quite outdated and probably no longer supported on Android. In the old days, this tag basically enabled the following logic:

"if the user has this particular app installed on the device (defined by the package name org.wikipedia) then launch that app explicitly, and don't navigate to this link in the browser."

The thing is that Android stopped "honoring" this scheme a long time ago, so we can probably remove this alternate link quite safely.
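
To make the URL format concrete, here is a minimal sketch of how such a link can be taken apart. It assumes the layout android-app://{package}/{scheme}/{host and path}, which is inferred from the example in the task description rather than taken from MobileFrontend or Android code. AndroidAppLink and parse_android_app_uri are hypothetical names.

```python
from typing import NamedTuple


class AndroidAppLink(NamedTuple):
    package: str    # app that was meant to handle the link, e.g. org.wikipedia
    scheme: str     # scheme of the wrapped web URL, e.g. http
    host_path: str  # host and path of the wrapped web URL

    def web_url(self):
        """Rebuild the ordinary web URL wrapped inside the app link."""
        return f"{self.scheme}://{self.host_path}"


def parse_android_app_uri(uri):
    """Split an android-app:// URI into package, scheme, and host/path.

    Assumed layout: android-app://{package}/{scheme}/{host_path}
    """
    prefix = "android-app://"
    if not uri.startswith(prefix):
        raise ValueError(f"not an android-app URI: {uri!r}")
    parts = uri[len(prefix):].split("/", 2)
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"unexpected android-app URI layout: {uri!r}")
    return AndroidAppLink(*parts)


link = parse_android_app_uri(
    "android-app://org.wikipedia/http/en.m.wikipedia.org/wiki/Wikipedia"
)
print(link.package)    # org.wikipedia
print(link.web_url())  # http://en.m.wikipedia.org/wiki/Wikipedia
```

Read this way, the tag names an app to launch and the web URL to hand to it, which is why it never served as a desktop-to-mobile alternate for search indexing.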

Change 571560 had a related patch set uploaded (by Dbrant; owner: Dbrant):
[mediawiki/extensions/MobileFrontend@master] No longer emit alternate link with android-app scheme.

https://gerrit.wikimedia.org/r/571560

Thank you for explaining the motivation and submitting the patch to fix it, @Dbrant!

@Niedzielski @Jdlrobson @ovasileva: Will a separate patch be required to add <link rel="alternate" href="[mobile url]"/> so that Google correctly indexes mobile pages as alternates of their canonical versions?

phuedx subscribed.

I'm moving this into Needs Code Review after a discussion with @ovasileva.

Change 571560 merged by jenkins-bot:
[mediawiki/extensions/MobileFrontend@master] No longer emit alternate link with android-app scheme.

https://gerrit.wikimedia.org/r/571560

For history this was added here: https://github.com/wikimedia/mediawiki-extensions-MobileFrontend/commit/677aa8935ff497674265f2581accc25487d4a977
I didn't see any reason not to remove it at this point.

I think this can skip QA (at least I'm not sure how to QA this other than verify the link is removed from the output HTML when it hits production).

<snip /> at least I'm not sure how to QA this other than verify the link is removed from the output HTML when it hits production

LGTM

phuedx updated the task description.
ovasileva triaged this task as Medium priority. Feb 17 2020, 10:39 AM

@mpopov - from my understanding, this is the same as the test we are doing in T234807: Determine impact of $wgMFNoindexPages search traffic to arwiki, zhwiki, hiwiki, itwiki, nlwiki, kowiki (please correct me if I'm wrong here). I discussed raising the priority of this analysis with @kzimmerman. If we don't see any decreases, we can go ahead and do this for all wikis.

Edtadros subscribed.
Test Result

Status: ✅ PASS
OS: macOS Catalina
Browser: Chrome
Device: MBP
Emulated Device: iPhoneX

Test Artifact(s):

QA Steps

Navigate to any page in a content namespace (e.g. mainspace)
Observe that there's no alternate link whose href attribute begins with android-app://org.wikipedia, e.g. android-app://org.wikipedia/http/en.m.wikipedia.org/wiki/Cult_of_Luna

✅ AC1: Desktop

Screen Shot 2020-02-21 at 5.00.38 PM.png (2×1 px, 1 MB)

✅ AC2: Mobile

Screen Shot 2020-02-21 at 5.04.55 PM.png (2×1 px, 1 MB)