Investigate the discrepancy between pageview counts and banner impressions for WLM 2016
Closed, Duplicate (Public)

Description

Per Adam's investigation, the hourly wlm_2016 banner impressions for traffic from South Korea are 2-3 orders of magnitude lower than hourly pageviews from South Korea. (South Korea is just one example that we have looked into more closely.) This may mean that we are not showing banners on all pageviews across the board, which could substantially reduce WLM submissions. Let's fix this asap, please.

Event Timeline

Just a breadcrumb--this might be as simple as a failure to set pgehres.bannerimpressions.count. These values should all be divisible by 1000, but are divisible by 10 instead, which indicates that we're not compensating for the record impression sample rate of 0.01.

Thanks so much for reporting this and for starting to dig in!!

I've reviewed the pageview and impression numbers in Hive for two sample hours and, generally, everything checks out [1]:

| Time | Pageviews [2] | Banner requests [3] | /beacon/impression [4] |
| --- | --- | --- | --- |
| 2016/09/12 10:00-11:00 hrs | 94080 | 69635 | 744 |
| 2016/09/13 14:00-15:00 hrs | 112713 | 84289 | 855 |

There could still be a problem with banners actually displaying that we don't know about. Maybe it's possible to contact the Korean community of one of the projects to verify that the banners have really been showing all the time?

Thanks again!!!

[1] Full queries here: P4040
[2] Pageviews on ko.wikipedia and ko.m.wikipedia from South Korea, except Special pages.
[3] Web requests to Special:BannerLoader from South Korea with uselang=ko and banner=wlm_2016 from pageviews on ko.wikipedia and ko.m.wikipedia.
[4] Web requests to /beacon/impression from South Korea with uselang=ko and banner=wlm_2016 from pageviews on ko.wikipedia and ko.m.wikipedia. Sampled at 1:100 on the client.
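
As a rough sanity check on the 1:100 sampling described in [4], the sampled /beacon/impression count can be scaled by the inverse of the sample rate and compared with the banner request count. The sketch below is illustrative only: the function names are hypothetical, and the figures assume the 2016/09/12 row reads as 69635 banner requests and 744 sampled beacon hits.

```python
def estimate_total(sampled_count: int, one_in: int) -> int:
    """Scale a 1-in-N sampled count up to an estimate of the unsampled total."""
    if one_in < 1:
        raise ValueError("one_in must be a positive integer")
    return sampled_count * one_in


def relative_gap(estimate: float, reference: float) -> float:
    """Relative difference between an estimate and a reference count."""
    return abs(estimate - reference) / reference


# Assumed reading of the 2016/09/12 10:00-11:00 row (not confirmed by the queries).
banner_requests = 69635
sampled_beacons = 744

# Client-side sampling is 1:100 per footnote [4].
estimated_impressions = estimate_total(sampled_beacons, one_in=100)
gap = relative_gap(estimated_impressions, banner_requests)

print(f"estimated impressions: {estimated_impressions}")
print(f"relative gap vs banner requests: {gap:.1%}")
```

Under that reading, the scaled estimate is 74400, which is the same order of magnitude as the banner request count rather than 2-3 orders of magnitude below it.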

AndyRussG lowered the priority of this task from Unbreak Now! to High. (Sep 14 2016, 5:55 AM)
AndyRussG raised the priority of this task from High to Unbreak Now!. (Sep 14 2016, 6:02 AM)

@LilyOfTheWest
While we're looking at impression counts, I notice that the Wiki Loves Monuments campaign is set to spam 100% of readers, all of the time. I want to make a small appeal to your people to take advantage of one of the CentralNotice methods of reducing the annoyance caused by banners--and maybe even improving the clickthrough rate as a happy consequence.

The simplest method is to throttle the banner allocation down to a smaller percentage of pageviews:
https://www.mediawiki.org/wiki/Extension:CentralNotice/Allocation_system#Campaign_maximum_allocation
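
To make the idea of throttling concrete, here is a minimal sketch of percentage-based allocation, assuming each pageview is decided by an independent random draw. This is not CentralNotice's actual code; `should_show_banner` and `max_allocation_pct` are hypothetical names used only for illustration.

```python
import random


def should_show_banner(max_allocation_pct: float, rng: random.Random) -> bool:
    """Return True for roughly max_allocation_pct percent of calls."""
    if not 0 <= max_allocation_pct <= 100:
        raise ValueError("max_allocation_pct must be between 0 and 100")
    # rng.random() is uniform on [0, 1), so 100% always shows and 0% never does.
    return rng.random() * 100 < max_allocation_pct


# Simulate 10,000 pageviews with the campaign throttled to 25%.
rng = random.Random(2016)  # fixed seed so the simulation is repeatable
shown = sum(should_show_banner(25, rng) for _ in range(10_000))
print(f"banner shown on {shown} of 10000 simulated pageviews")
```

Because each draw is independent, a given reader still sees the banner on some pageviews; the throttle reduces overall volume rather than capping exposure per person.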

A more interesting alternative is the "impression diet", where a campaign can be configured to only show e.g. 3 times to each individual, followed by a long rest period in which we don't show this person the same banner.
https://www.mediawiki.org/wiki/Extension:CentralNotice/Impression_diet
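
The logic of an impression diet can be sketched as a small per-reader counter with a rest period. This is only an illustration of the idea described above, not the extension's implementation: `DietState`, `maybe_show`, and the one-week rest length are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class DietState:
    """Per-reader state; a hypothetical stand-in for whatever the client stores."""
    seen_in_cycle: int = 0
    rest_until: float = 0.0  # Unix timestamp; the banner is hidden before this time


def maybe_show(state: DietState, now: float, max_impressions: int = 3,
               rest_seconds: float = 7 * 24 * 3600) -> bool:
    """Decide whether to show the banner and update the reader's state."""
    if now < state.rest_until:
        return False  # reader is still in the rest period
    state.seen_in_cycle += 1
    if state.seen_in_cycle >= max_impressions:
        # Allowance used up: start the rest period and reset the counter.
        state.seen_in_cycle = 0
        state.rest_until = now + rest_seconds
    return True


# One reader loading five pages a second apart, starting at t=0.
state = DietState()
decisions = [maybe_show(state, now=float(t)) for t in range(5)]
print(decisions)
```

In this run the reader sees the banner on the first three pageviews and not on the next two; once the assumed one-week rest has elapsed, the cycle starts again.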

In Fundraising, we've found that the majority of our responses come from the first handful of banner impressions (the first two!), and the clickthrough rate for subsequent impressions quickly dwindles down to a narrow tail.

Also, +1 to @AndyRussG's work. Hive is the more authoritative data source. The discrepancy between webrequests and the pgehres.bannerimpressions table is a downstream reporting bug on our end, and won't affect your CentralNotice campaign.

I also just checked HTTP status in logs for Special:BannerLoader over the same two sample hours I mentioned above. 100% of the requests had a 200 (OK) response.

Wondering if this might not be the same as T144952, where we saw the name of the banner injected into the page, rather than the banner itself...

Thanks, @AndyRussG. I'll contact the South Korean organizers and will let you know.

@awight Thanks for bringing this up. I've made a note that we come back to this after the end of the contest this year (in November). I need to ask the team, but if we have never experimented with such changes in the past in the context of the WLM, I'd rather we don't change the way we do things in the middle of the contest. This being said, I hear you that banner fatigue can be a problem in this case.

It seems quite likely that this was an instance of the same issue we saw in T144952, so I've merged this task into that one. FWIW the problem had to do with the i18n message infrastructure that banners rely on, and has been partly resolved. (It only seems to happen now for the first 1-2 minutes after a banner was created.) Please don't hesitate to re-open if there are indications that this is wrong. Thanks much! :)