Enable $wgMFNoindexPages for: Italian, Dutch, Korean, Arabic, Chinese, and Hindi Wikipedias
Closed, Resolved · Public · 2 Estimated Story Points

Description

Background

In T205495 (Enable $wgMFNoindexPages for beta), we added alternate tags for the mobile versions of pages on the beta cluster so that the site can be indexed properly. We would like to test these on more wikis to determine their effect on traffic from mobile search engines.

Acceptance criteria:

  • add link rel="alternate" to the following projects (as in T205495):
  • arwiki
  • zhwiki
  • hiwiki
  • add link rel="alternate" to the following projects (as in T205495):

Event Timeline

Can we please put this on hold until the other interventions (T208755 and T209720) are done and analyzed?

A patch for T206497 is in the works for several wikis to point mobile to desktop. Will this impact the A/B test in any way if enabled?

Theoretically, this would affect the control and test pages roughly equally, so the sameAs A/B test shouldn't be affected much, since it is a true randomized controlled experiment. The sitemaps test, however, would be greatly contaminated by this, so it would be best to hold off for the time being.

We're ok here. We actually took this into account when planning the sitemaps launch with 3 wikis with sitemaps, 3 with $wgMFNoindexPages, and 3 with both. @mpopov is checking to confirm wikis that we can use as a control.

@ovasileva just reminded me that we did discuss this before, but I forgot with everything else that was going on.

Alright, I got mixed results re: controls so we'll do the best we can with what we have. This indexing thing isn't so much about more traffic as just fixing errors, so we should be okay anyway.

So never mind what I said earlier; it's alright to go ahead with this :)

Change 473889 merged by jenkins-bot:
[operations/mediawiki-config@master] wmf-config: Enable wgMFNoindexPages for 6 wikis

https://gerrit.wikimedia.org/r/473889

Change 476060 had a related patch set uploaded (by Imarlier; owner: Imarlier):
[operations/mediawiki-config@master] config: move wgMFNoindexPages to InitialiseSettings-labs

https://gerrit.wikimedia.org/r/476060

Legoktm subscribed.

+Wikimedia-Site-requests since this is a request for a configuration change.

The premise on T205495 was:

We are not currently providing Google the metadata that they request in order to properly index mobile sites...

To date, no one has responded to cscott on T198970#4706473 about how the Google indexing pipeline works. What indications do we have that this will make any difference for Google? What are the contingency plans in case this experiment goes wrong?

It doesn't seem like this is ready for a production deployment yet.

We have an explicit indication that this will make a difference for Google, provided by Google: https://developers.google.com/search/mobile-sites/mobile-seo/separate-urls. What we are doing here is implementing Google's recommendation, in order to validate that it actually does improve mobile site results.

The contingency plan is to revert the change (set wgMFNoindexPages=false for all wikis).
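
For anyone who has not read the linked guidance: a minimal sketch of the annotation pair it describes, where the desktop page links to its mobile equivalent with rel="alternate" and the mobile page links back with rel="canonical". The helper name, the example URLs, and the reuse of the media query from Google's example are assumptions for illustration; this is not the markup MobileFrontend actually emits.

```python
from html import escape

# Media query as it appears in Google's separate-URLs example.
# Assumption: the value used on a real wiki may differ.
MOBILE_MEDIA = "only screen and (max-width: 640px)"


def separate_url_annotations(desktop_url: str, mobile_url: str) -> dict:
    """Return the <link> tag each variant of a page would carry.

    Hypothetical helper, written only to illustrate the tag pair.
    """
    return {
        # The desktop page points at its mobile equivalent.
        "desktop": '<link rel="alternate" media="{}" href="{}">'.format(
            escape(MOBILE_MEDIA, quote=True), escape(mobile_url, quote=True)
        ),
        # The mobile page points back at the desktop URL as canonical.
        "mobile": '<link rel="canonical" href="{}">'.format(
            escape(desktop_url, quote=True)
        ),
    }


if __name__ == "__main__":
    # Illustrative URLs only.
    tags = separate_url_annotations(
        "https://ar.wikipedia.org/wiki/Example",
        "https://ar.m.wikipedia.org/wiki/Example",
    )
    print(tags["desktop"])
    print(tags["mobile"])
```

The two tags are meant to be deployed together, so that a crawler can pair the desktop and mobile URLs of the same page instead of treating them as separate documents.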

I expect that's true for most websites, but we already know that Google has a special pipeline for Wikipedia content, and they aren't scraping our webpages. What I'm asking is whether anyone has explicitly asked our Google contacts if this change would be helpful, and whether it will even make a difference. Maybe @cscott could help here?

Change 476060 merged by jenkins-bot:
[operations/mediawiki-config@master] config: move wgMFNoindexPages to InitialiseSettings-labs

https://gerrit.wikimedia.org/r/476060

kchapman subscribed.

Ian had taken this on, but it isn't really in Performance's remit. Is this something Readers Web wants to take on?

I'm trying to understand what's missing before we can start working on this task.
@Legoktm It looks like you would like to check with the Google team to find out whether this task is worth doing, right? If something goes wrong (fewer pageviews, etc.) we will revert this patch.
@cscott I read your comment (https://phabricator.wikimedia.org/T198970#4706473). Is it still up to date? Did we contact Google? Is there anything I can help with?
@mpopov are you still happy to continue with this patch?

I don't think anything has changed since my last comment (T206497#4783049), but I have not discussed this topic since then.

Also note that since then, proposals like T214998: RFC: Remove .m. subdomain, serve mobile and desktop variants through the same URL have been raised, which would get rid of the need for this entirely.

Just wanted to clarify that this task would be a test only. Based on the results we would either consider making the change for more wikis or reverting it back to the current state.

It looks like no one knows for sure what will happen if we enable $wgMFNoindexPages. I propose enabling it on a couple of wikis and then checking the results. It won't hurt us.

pmiazga moved this task from Needs Prioritization to Upcoming on the Web-Team-Backlog board.
pmiazga subscribed.

Not knowing what enabling something is going to do seems like a bad reason to just do it, unless I'm misunderstanding your comment. I don't really understand the resistance to just asking our contacts for advice (though cscott kind of already did that).

We will reach out to our contacts from Google. However, we know that they cannot share details of their ranking algorithms or confirm/reject that we are getting special preference in any way, so we are not expecting to get much detail from them on this subject. What we do know is what they say on their public documentation. This task is based on these public recommendations around the requested data for properly indexing the mobile site and the analysis done in T198970. Based on this, we devised this test, similar to the tests we already completed for sitemaps T206496 and the sameAs schema.org property T208755.

ovasileva set the point value for this task to 2. (Jul 9 2019, 4:49 PM)

I find it puzzling that "we know they cannot". We know for a fact that they are using a special pipeline for handling wiki content, and we have worked with them to build it. We can see their User-Agent in our logs ("MediaWikiCrawler-Google/2.0"). I've talked to Google engineers personally. They come to Wikimania. I can understand that Google is a big company, and sometimes it can be hard to find the right person at Google who actually knows how things work. But talking to the wrong person at Google isn't proof of anything.

Per T206497#5274358 probably belongs in PO backlog but ready to work on if and when we need to.

We discussed this again with Google and, based on that conversation, think that we can go ahead with the test, although we still do not have any information on whether the indexing change would cause any improvements. In terms of their crawler, it looks like MediaWikiCrawler-Google/2.0 and other related crawlers only make up a fraction of the hits we're getting from the generic Googlebot. It’s possible that even if this test has no effects on the latter, we can still hope to see it affecting the former.
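
A rough sketch of how such a crawler split could be estimated from logged User-Agent strings. Only the "MediaWikiCrawler-Google" token comes from this thread. The substring match on "googlebot", the bucket names, and the sample strings are assumptions made up for illustration, not real log data.

```python
from collections import Counter
from typing import Dict, Iterable


def classify_google_agent(user_agent: str) -> str:
    """Bucket a User-Agent string (hypothetical bucket names)."""
    ua = user_agent.lower()
    # Check the wiki-specific crawler first so it is never counted as generic.
    if "mediawikicrawler-google" in ua:
        return "mediawiki-crawler"
    if "googlebot" in ua:
        return "googlebot"
    return "other"


def crawler_shares(user_agents: Iterable[str]) -> Dict[str, float]:
    """Return the fraction of hits that falls into each bucket."""
    counts = Counter(classify_google_agent(ua) for ua in user_agents)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {bucket: n / total for bucket, n in counts.items()}


if __name__ == "__main__":
    # Made-up sample, one string per hit.
    sample = [
        "MediaWikiCrawler-Google/2.0",
        "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
        "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
        "Mozilla/5.0 (X11; Linux x86_64)",
    ]
    print(crawler_shares(sample))
```

Comparing these shares before and after the change, per wiki, would be one way to see which crawler's behavior actually moves.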

Change 538295 had a related patch set uploaded (by Pmiazga; owner: Pmiazga):
[operations/mediawiki-config@master] Enable alternate mobile link for ar,zh,hi,it,nl and ko wikis.

https://gerrit.wikimedia.org/r/538295

Change 538295 merged by jenkins-bot:
[operations/mediawiki-config@master] Enable alternate mobile link for ar,zh,hi wikis.

https://gerrit.wikimedia.org/r/538295

Mentioned in SAL (#wikimedia-operations) [2019-09-24T11:16:27Z] <urbanecm@deploy1001> Synchronized wmf-config/VariantSettings.php: SWAT: 8bf6aae: Enable alternate mobile link for ar,zh,hi wikis (T206497) (duration: 00m 54s)

Mentioned in SAL (#wikimedia-operations) [2019-09-30T11:54:03Z] <pmiazga@deploy1001> Synchronized wmf-config/InitialiseSettings.php: SWAT: [[gerrit:538296|Enable alternate mobile link for it, nl, ko wikis. (T206497)]] (duration: 00m 57s)

Change 540193 had a related patch set uploaded (by Pmiazga; owner: Pmiazga):
[operations/mediawiki-config@master] Do not set wgMFNoindexPages config flag in mobile.php

https://gerrit.wikimedia.org/r/540193

Change 540193 merged by jenkins-bot:
[operations/mediawiki-config@master] Do not set wgMFNoindexPages config flag in mobile.php

https://gerrit.wikimedia.org/r/540193

Mentioned in SAL (#wikimedia-operations) [2019-10-02T11:12:35Z] <pmiazga@deploy1001> Synchronized wmf-config/mobile.php: SWAT: [[gerrit:540193|Do not set wgMFNoindexPages config flag in mobile.php (T206497)]] (duration: 01m 14s)

ovasileva claimed this task.
ovasileva updated the task description.

Results to be tracked in T234807