tl;dr: Compared to last year, there are more banner impressions, yet fewer page views, in the NL and ES campaigns. Please see the initial (last) e-mail in the thread copied below for details.
> Andrew Green
> 3:29 PM (21 minutes ago)
>
> to Maximilian, Jessica, Peter, me, Elliott, fr-tech, Data
> Hi!
>
> Wow... Thanks for the detailed info... I'm just digging into this now.
>
> Katie's suggestion of a new proxy would make sense, except that this is happening at the same time in two countries, which doesn't sound too likely (though not impossible).
> It could be a change or issue with how pageviews are counted (for example, a change in how bots are filtered out?)... Or also some new JavaScript issue, maybe due to the refactoring, or an update in another part of MediaWiki code.
>
> If I understand the reports you sent, it looks like the amounts raised have declined, but the total number of donations increased on mobile, and in NL, they increased on desktop, while in ES, they decreased on desktop? While the donation rate (by time?) decreased?
> I guess I'll first try looking at the campaign configuration and checking for JS issues, then try to understand the data more, and maybe bother you with some questions and/or queries we might try to dig deeper...
> Thanks much!!! TTY soon, cheers,
> Andrew
>
> On 09/05/17 01:37 PM, Katie Horn wrote:
> Hmm. Could this be explained by some kind of internet "speed accelerator" coming into use, which largely functions by caching initial pageloads somewhere outside of our systems, but the JavaScript still fires, causing those users viewing cached articles to see fresh banners?
>
> This is, of course, a complete shot in the dark.
>
> -Katie
>
> On 09/05/17 01:05 PM, Maximilian Pany wrote:
> Hi Andrew (CC Jessica, Peter, David, Elliot, and fr-tech),
>
> I hope all is well!
>
> Now that both the NL and ES campaigns are finished, we are looking more closely at some of the trends in banner impressions. One that particularly struck us is a large increase in banner impressions with a simultaneous decrease in page views from last year to this year for both NL and ES campaigns (see tables 3-6 in the attached monitoring reports, each just for one language of the campaigns).
>
> This trend is particularly pronounced for desktop, but also exists for mobile and iPad (although page views have increased for mobile devices, they are far outpaced by an increase in impressions). While we found some difference in the central notice settings from last year to this year for maximum banners seen (last year went from unlimited to a max of three impressions while this year stayed at a constant max of 10) potentially accounting for some differences, we are also seeing a big increase in large banners (table 6 in the reports), which shouldn’t be due to differences in max amount seen.
>
> Specifically, we see:
> - nlNL: 7.4% decrease in desktop page views with a 91% increase in desktop large banner impressions
> - esES: 22.4% decrease in desktop page views with a 69% increase in desktop large banner impressions
>
> These numbers seem so discordant (it’s hard to believe that we really have that many more users and that they each, on average, visit Wikipedia less often!) that we are contemplating alternative explanations. Our list of hypotheses includes (in no particular order):
>
> [1] More non-user impressions this year than last — scrapers/crawlers/spiders generating more impressions this year (perhaps because of a change on our side or on the crawler side)
> - How could we best check this in the data?
> - After talking to Elliot, this sounds less likely given that many crawlers don’t run JavaScript and, if they do, tend to respect robots.txt. Did anything change in this respect?
>
> [2] Some impressions that are not actually user impressions get recorded as such in the lutetium/frdev1001 data, and this number has increased for some reason (perhaps a technical change?) from last year to this year
> - Where exactly does the data in the pgehres.bannerimpressions table come from?
> - Is there any chance that it might include banner impressions that the user never saw, perhaps because they were hidden or previously closed (i.e., had a “hidden” or “close” reason code in banner history)?
>
> [3] There was a change in the proportion of page views getting impressions from last year to this year, perhaps because not all eligible page views last year received impressions
> - My understanding is that this is not randomly sampled but that 100% of non-logged in users in a country in which a campaign is running get banners delivered (unless they closed it or have reached max) — is this correct?
> - Peter mentioned that some changes might have been made to the refactor code, but thought that this is unlikely to affect things. Is that right, Peter?
>
> [4] There is a hiccup in how we analyze these data
> - Unlikely, given that we process impression data from this year and last year in the same way; if we had accidentally inflated these data, both years should be affected equally and there shouldn’t be an artificial increase
> - We have triple-checked our code from pulling these data to processing them, but a mistake hiding somewhere always remains a possibility
>
> [5] These increases in impressions are real and represent new users
>
> [6] What potential explanation did I forget?
>
> We were hoping that you might be able to answer our questions above and/or might have an idea of what’s going on.
>
> Thank you very much!!!
>
> Best,
> Max
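For reference, the two desktop figures quoted in Max's e-mail can be combined into one number: the year-over-year change in large banner impressions per page view. The snippet below is only a rough sketch and not part of the original analysis. The function name is made up for illustration, and it assumes both percentages compare the same two campaign periods.

```python
def impressions_per_pageview_change(pageview_change_pct, impression_change_pct):
    """Return the year-over-year % change in impressions per page view.

    Both arguments are year-over-year percentage changes, e.g. -7.4 for a
    7.4% decrease and 91.0 for a 91% increase.
    """
    pageview_factor = 1 + pageview_change_pct / 100
    impression_factor = 1 + impression_change_pct / 100
    # Ratio of this year's impressions-per-page-view to last year's, minus 1.
    return (impression_factor / pageview_factor - 1) * 100


# Desktop figures quoted in the thread (large banner impressions only).
campaigns = {
    "nlNL": (-7.4, 91.0),
    "esES": (-22.4, 69.0),
}

for name, (pageviews, impressions) in campaigns.items():
    change = impressions_per_pageview_change(pageviews, impressions)
    print(f"{name}: impressions per page view changed by {change:+.1f}%")
```

Under those assumptions, desktop large banner impressions per page view roughly doubled in both campaigns (about +106% for nlNL and +118% for esES), which is the discrepancy the hypotheses above try to explain.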