Page MenuHomePhabricator

If MobileFrontend auto detects, if the mobile view should be delivered, it should set a vary header
Closed, ResolvedPublic

Description

According to Google's best practices for mobile sites[1], a page that serves different html to different devices, should set the Vary header to User-Agent (or add it to the list of existing Vary headers) to allow the google bot (and possibly other third party cache infrastructures) to know, that the content (or the delivered html) varies when different user agents/devices are used. Currently, MobileFrontend doesn't set this header at all, but should, if the $wgMFAutodetectMobileView config variable is set to true and $wgMobileUrlTemplate is '' or false (with that configuration, the same url is used for different html, mobile and desktop one).

[1] https://developers.google.com/webmasters/mobile-sites/mobile-seo/dynamic-serving?hl=en#the-vary-http-header

Event Timeline

Florian created this task.Jan 10 2016, 2:57 PM
Florian claimed this task.
Florian raised the priority of this task from to Normal.
Florian updated the task description. (Show Details)
Florian added projects: MobileFrontend, SEO.
Florian added a subscriber: Florian.
Restricted Application added a subscriber: Aklapper. · View Herald TranscriptJan 10 2016, 2:57 PM

Change 263228 had a related patch set uploaded (by Florianschmidtwelzow):
Add Vary header User-Agent, if appropriate

https://gerrit.wikimedia.org/r/263228

BBlack added a subscriber: BBlack.Jan 10 2016, 8:55 PM

I'd like to understand this more before we push the change above. In general, "Vary: User-Agent" is very harmful to caches, as there are far more unique User-Agent strings than there are unique variants of our content...

But as far as I know, Vary doesn't provide a setting like "specialdevices", right? :/ Btw.: This doesn't apply to wmf wikis. Furthermore, I only try to use the instructions from Google, at they make sense from my perspective. However I haven't thought about your point :(

Jdlrobson changed the task status from Open to Stalled.Jan 19 2016, 6:13 PM
Jdlrobson added a subscriber: Jdlrobson.

In theory Florian's right this shouldn't impact Wikimedia instances given we use the config variables it's hidden behind.

@BBlack given this would just apply to 3rd parties, can you comment specifically whether this is the right thing to do for them?

Pong - we should discuss here rather than in the patch.

Perhaps recapping things said elsewhere, but to sum it up from my POV:

The reason Google wants to see Vary:UA is so that their crawler knows if it fetches with differing desktop-v-mobile UAs, it will get different content for desktop-v-mobile. That much makes sense.

In the presence of a reverse-proxy cache like our Varnish setup, though, having the application emit Vary:UA causes massive cache fragmentation: there are many distinct UAs within each of the "mobile" and "desktop" bins, and they'd all be cached separately. This is horrible for site performance, and nobody with a content cache in their edge infrastructure would want the applayer to emit that header. It would only make sense for the app to emit that header for e.g. Google's consumption if you're sure there's no shared cache in the way, and I don't think the code can reliably know that about the user's (user as in other non-WMF site using MW+MFE) infrastructure setup.

IMHO, at least if you're going to offer this as a feature, it should be toggle-able via config and come with a big warning in the release notes about effects on content caches. Maybe even off-by-default would make more sense.

Pong - we should discuss here rather than in the patch.
Perhaps recapping things said elsewhere, but to sum it up from my POV:
The reason Google wants to see Vary:UA is so that their crawler knows if it fetches with differing desktop-v-mobile UAs, it will get different content for desktop-v-mobile. That much makes sense.
In the presence of a reverse-proxy cache like our Varnish setup, though, having the application emit Vary:UA causes massive cache fragmentation: there are many distinct UAs within each of the "mobile" and "desktop" bins, and they'd all be cached separately. This is horrible for site performance, and nobody with a content cache in their edge infrastructure would want the applayer to emit that header. It would only make sense for the app to emit that header for e.g. Google's consumption if you're sure there's no shared cache in the way, and I don't think the code can reliably know that about the user's (user as in other non-WMF site using MW+MFE) infrastructure setup.
IMHO, at least if you're going to offer this as a feature, it should be toggle-able via config and come with a big warning in the release notes about effects on content caches. Maybe even off-by-default would make more sense.

Yup, I think we are in complete agreement and only considering this for 3rd party support.
[[ https://gerrit.wikimedia.org/r/#/c/263228/3/includes/MobileFrontend.hooks.php

Florian's patch does this ]]
$config->get( 'MFAutodetectMobileView' ) && !$config->get( 'MobileUrlTemplate'  )

On production MobileUrlTemplate is set, as this is used to set a separate domain for mobile.
MFAutodetectMobileView is usually false.

The only exception to the above is wikitech.wikimedia.org which I suspect this change would help cc @bd808

bd808 added a comment.Mar 4 2016, 12:41 AM

I think there should be an additional feature flag beyond the config detection. The instructions from Google are honestly gross. They should at least offer a meta tag alternative.

That's fair. @Florian care to make it more explicit?

@Jdlrobson @bd808 I'm totally fine with feature flagging it. Do you agree, that it makes sense to turn it on by default and simply set the default for the Wikimedia cluster to false?

So the if condition would be extended like:

$config->get( 'MFAutodetectMobileView' ) && !$config->get( 'MobileUrlTemplate'  ) && $config->get( 'MobileVaryOnUA' )

(the naming can be changed, of yourse ;))

Florian changed the task status from Stalled to Open.Mar 4 2016, 2:00 PM
bd808 added a comment.Mar 4 2016, 4:40 PM

@Jdlrobson @bd808 I'm totally fine with feature flagging it. Do you agree, that it makes sense to turn it on by default and simply set the default for the Wikimedia cluster to false?

I personally don't agree, but getting SEO boost from Google hasn't ever been high on my list of important criteria for a wiki/website. The harm that Vary: User-Agent causes to any unaware intermediate caching tier is catastrophic. It really only makes sense if the cache layer is made to munge the User-Agent header as suggested in this Fastly blog post:

if (req.http.User-Agent ~ "(Mobile|Android|iPhone|iPad)" {
  set req.http.User-Agent = "mobile";
} else {
  set req.http.User-Agent = "desktop";
}

So the if condition would be extended like:

$config->get( 'MFAutodetectMobileView' ) && !$config->get( 'MobileUrlTemplate'  ) && $config->get( 'MobileVaryOnUA' )

(the naming can be changed, of yourse ;))

The feature flag should be the first check in the chain, not the last I think.

@Jdlrobson @bd808 I'm totally fine with feature flagging it. Do you agree, that it makes sense to turn it on by default and simply set the default for the Wikimedia cluster to false?

I personally don't agree, but getting SEO boost from Google hasn't ever been high on my list of important criteria for a wiki/website. The harm that Vary: User-Agent causes to any unaware intermediate caching tier is catastrophic. It really only makes sense if the cache layer is made to munge the User-Agent header as suggested in this Fastly blog post:

Ok, thanks for your input! I'll set it to false, it can be enabled by users they want it whenever they want :)

So the if condition would be extended like:

$config->get( 'MFAutodetectMobileView' ) && !$config->get( 'MobileUrlTemplate'  ) && $config->get( 'MobileVaryOnUA' )

(the naming can be changed, of yourse ;))

The feature flag should be the first check in the chain, not the last I think.

Yep, was just an example to indicate, that I don't want to remove the other conditions :)

Change 263228 merged by jenkins-bot:
Add Vary header User-Agent, if appropriate

https://gerrit.wikimedia.org/r/263228

Jdlrobson closed this task as Resolved.Mar 4 2016, 10:13 PM
Deskana moved this task from Tag to Done on the SEO board.Jul 6 2018, 10:28 AM