
Make banner impression counts available somewhere public
Open, Needs TriagePublic

Description

CentralNotice should display how many impressions each banner and campaign has received. The counts should also be available through an API.

The hope is that public numbers will make it easier for CN admins to do A/B testing and other impact and effectiveness analysis. It will also enable discussions between stakeholders about relative shares of impressions.

(Might be a duplicate card?)

Event Timeline

awight created this task.Oct 8 2015, 7:09 PM
awight raised the priority of this task from to Needs Triage.
awight updated the task description.
awight added subscribers: awight, KTC, Jalexander, ellery.
Restricted Application added a subscriber: Aklapper.Oct 8 2015, 7:09 PM
atgo added a subscriber: atgo.
Restricted Application added a subscriber: StudiesWorld.Nov 13 2015, 6:08 PM
awight renamed this task from Make banner impression accounts available somewhere public to Make banner impression counts available somewhere public.Nov 13 2015, 6:08 PM
awight set Security to None.
atgo removed a subscriber: atgo.Mar 30 2016, 10:03 PM
awight added a comment.Jul 7 2016, 1:02 AM

One possible visualization: pacman-sized pie charts in the campaign's summary table line, with a pie for the share of each affected wiki's traffic, or a golden freaking pie when it's using a noteworthy share of traffic over all sites.

awight added a comment.Jul 7 2016, 1:05 AM

@Jseddon
Feel free to triage this task some day!

Jseddon triaged this task as Low priority.Jul 7 2016, 11:32 PM
awight created subtask Restricted Task.Jul 25 2016, 11:27 PM
Jseddon raised the priority of this task from Low to Normal.Aug 16 2016, 11:10 PM

@Nuria fr-tech, @Jseddon and I would like to discuss this topic. We understand analytics would need to work on part or most of this. An API would be a great start but visualizations similar to pivot would be a good P2 or 3.

Can we discuss how much effort this would take and where it might fit in your roadmap?

Nuria added a comment.Apr 19 2017, 8:31 PM

@DStrine: banner counts are available on Pivot, see: http://bit.ly/2pgY9vN

Pivot is available under NDA, so I believe the bulk of FR staff has access to it.

@Nuria there are many CentralNotice admins who are not under NDA and most campaigns in CN are not for fundraising. We are interested in getting some data to them to judge the effectiveness of their campaigns. Obviously this needs to be appropriately aggregated/anonymized too.

Nuria added a comment.EditedApr 19 2017, 8:40 PM

We are interested in getting some data to them to judge the effectiveness of their campaigns. Obviously this needs to be appropriately aggregated/anonymized too.

Let me understand:

  • The data available now in pivot only contains fundraising banners or does it contain all banners?
  • What data do you need to measure effectiveness of a campaign?
  • What data is of private nature and needs to be anonymized?

Please edit the ticket description with this information; it is a bit too meager to understand what this project entails.

We are interested in getting some data to them to judge the effectiveness of their campaigns. Obviously this needs to be appropriately aggregated/anonymized too.

Let me understand:

  • The data available now in pivot only contains fundraising banners or does it contain all banners?

It seems it contains all banners as you have a choice to filter based on different banner types.

  • What data do you need to measure effectiveness of a campaign?

I can speak to the needs of one community project and that is Wiki Loves Monuments (this applies to the majority of other Wiki Loves contests):
At the moment, it's very hard for the admin who sets up the banner or the people who are responsible for banners to understand how well the banner is doing. In the case of Wiki Loves Monuments, we have ~40 different banners shown to traffic from different countries. It has happened in the past that the contributions from one country have been significantly lower than expected, and as community members with no access to banner view data, it's impossible to know, for example, whether the banner is being viewed by the traffic from that country or not. Some hourly information about the number of times a banner has been shown (the number of impressions) per country can be really helpful for diagnosing these problems at scale. This is of course only one example. We are always interested in learning how effective the banners we use for Wiki Loves Monuments are. If we want to start testing these banners, we need to know more about the impressions these banners create. There are more examples, of course.

Basically, because banners are heavily used by the community, it makes sense to empower the community to analyze the data around it to improve community projects. :)

  • What data is of private nature and needs to be anonymized?

Let's say you want to release banner impression by country on an hourly basis. Would it be a problem from the privacy point of view if you had only 1 pageview from the country in a specific hour and the data shows that the one pageview has clicked on the banner? (The answer may be yes, especially when it comes to logged in users, but I'm not sure if we show banners to logged in users at all, so this may not be relevant.)
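To make the small-count concern concrete, here is a minimal sketch of threshold-based suppression. It assumes hypothetical hourly per-country rows, made-up banner names, and an illustrative minimum bucket size of 100. None of these come from CentralNotice or from any existing sanitization process; a real threshold would have to come out of a privacy review.

```python
from typing import NamedTuple, Optional

# Hypothetical threshold, chosen only for illustration.
MIN_BUCKET_SIZE = 100


class HourlyRow(NamedTuple):
    hour: str         # e.g. "2017-04-19T20"
    country: str      # two-letter country code
    banner: str       # banner name (made up in the example below)
    impressions: int  # impressions counted in that hour


def suppress_small_buckets(rows, min_size=MIN_BUCKET_SIZE):
    """Return (hour, country, banner, count) tuples; counts below min_size become None."""
    sanitized = []
    for row in rows:
        count: Optional[int] = row.impressions if row.impressions >= min_size else None
        sanitized.append((row.hour, row.country, row.banner, count))
    return sanitized


rows = [
    HourlyRow("2017-04-19T20", "DE", "example_banner_de", 5400),
    HourlyRow("2017-04-19T20", "LI", "example_banner_li", 1),
]
sanitized = suppress_small_buckets(rows)
```

In this sketch the single impression from the second country is withheld while the large bucket is published unchanged. Suppression alone may not be enough for a real release, but it shows the trade-off: the smaller the geography and time window, the more rows drop out.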

Nuria added a comment.Apr 20 2017, 4:23 AM

Basically, because banners are heavily used by the community, it makes sense to empower the community to analyze the data around it to improve community projects. :)

@LilyOfTheWest From your description I gather you need banner impressions per country. However, it is not clear how that measures whether a campaign was effective. I see how it could help you diagnose that a campaign "malfunctions" (it is not shown in the country that it was supposed to) but not much otherwise. I do not know much about how these banners are set up, so I might be missing something obvious. Are banner impressions driven by clickthrough?

Let's say you want to release banner impression by country on an hourly basis.

Any type of country-level pageview data has many privacy implications, and the level of anonymization you need to render that data safe might actually do away with the findings you can extract from the data. So it is unlikely that we will ever be able to release any disaggregated data (banner or per-article pageviews) for small geographical "entities". Please see: https://wikitech.wikimedia.org/wiki/Analytics/Data_Lake/Traffic/Pageview_hourly/Sanitization

We are due to work on this problem next year.

@Nuria I'm wondering if we can meet and talk about options. Community members need to know a few base things like:
  • Are we getting any impressions in a campaign?
  • Impressions per banner in a campaign
  • Impressions per country

I think there are a couple other basic ones but I'd like to start with something. They currently have no idea if their banners are reaching an audience.

Also I want to acknowledge that you have a bunch of other priorities. I'm happy to scope out a super lightweight version and see where it fits in your schedule.

Nuria added a comment.EditedApr 21 2017, 9:23 PM

@DStrine we can meet at your convenience. Before meeting I would like to understand what we want to measure. Is there any documentation anywhere as to how banner impressions are triggered? (As I said, I do not know much about this; I am trying to understand the context a bit more.) If our objective is to measure the effectiveness of a campaign, we should define what that is. We try to map out the questions we want the data to answer beforehand, to make sure that the data you are requesting actually answers those questions. This is the way we work with any data request we have for the creation of a new dataset.

Nuria added a comment.Apr 24 2017, 4:07 PM

Ping @LilyOfTheWest please let me know if you can provide more info. We can certainly compile a dataset of banners/country, but please note that this measures "eyes on campaign", which is quite different from effectiveness of campaign (you could have tons of impressions but no clicks).

Nuria moved this task from Incoming to Dashiki on the Analytics board.Apr 24 2017, 4:08 PM

Pitching in from WMDE Fundraising. Our tracking requirements are two-fold:

  • General banner impression data, as fine-grained as possible. At the moment we're using the "Legacy hiding and impression counting support" campaign mixin, which provides us with impression data at 15-minute granularity for each banner in the current campaigns.
  • Troubleshooting data: Comparing page impressions to banner impressions to detect if the banners were blocked or have an error that prevents them from being displayed.
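The troubleshooting comparison described above could look roughly like the following sketch. The function names, the banner names, and the 0.5 cut-off are all hypothetical; the ratio to expect in practice depends on how each campaign is configured, so the threshold here is only a placeholder.

```python
def display_ratio(page_impressions: int, banner_impressions: int) -> float:
    """Fraction of page impressions that also recorded a banner impression."""
    if page_impressions <= 0:
        return 0.0
    return banner_impressions / page_impressions


def flag_suspect_banners(stats, min_ratio=0.5):
    """Return banner names whose display ratio falls below min_ratio.

    stats maps banner name -> (page_impressions, banner_impressions).
    min_ratio is an arbitrary illustrative cut-off, not a known good value.
    """
    return sorted(
        name
        for name, (pages, banners) in stats.items()
        if display_ratio(pages, banners) < min_ratio
    )


# Made-up numbers for two hypothetical banners.
stats = {
    "banner_a": (10_000, 9_200),
    "banner_b": (10_000, 1_200),
}
suspects = flag_suspect_banners(stats)
```

A banner that shows up in `suspects` is not necessarily broken; it only marks where the gap between page views and banner views is large enough to be worth a closer look.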
Nuria added a comment.Apr 24 2017, 5:00 PM

Pitching in from WMDE Fundraising. Our tracking requirements are two-fold:

@gabriel-wmde: Let's please not mix use cases; fundraising data for banner counts is already available internally and has been for a while.

Troubleshooting data: Comparing page impressions to banner impressions to detect if the banners were blocked or have an error that prevents them from being displayed.

Again, this is already available. Have you used pivot? http://pivot.wikimedia.org

Banner and pageview data are available at different granularities.

Nuria added a comment.EditedApr 24 2017, 5:27 PM

@gabriel-wmde: Now that I think about it, my comment above assumes WMDE Fundraising works just like overall fundraising. Do take a look at the data in Pivot and let us know what you find is missing.

@Nuria I don't think WMDE has access to Pivot or any LDAP-related systems. That may be very complicated to set up.

Nuria added a comment.Apr 24 2017, 6:19 PM

@DStrine: that is incorrect, we have several users of pivot from WMDE office. @gabriel-wmde: do check with your peers, this is how you can file a ticket for access: https://wikitech.wikimedia.org/wiki/Analytics/Data_access#Pivot

mforns added a subscriber: mforns.May 10 2017, 5:27 PM

Hi folks!

Following up on the meeting that we just had (meeting notes), a couple questions:

  1. We spoke about the target segment of a campaign, and gave as an example: readers from a given country. Apart from country, what other target segments are used?
  2. Regarding clicks on banners, wouldn't it be possible to fire a JavaScript event when the banner is clicked, and send click-related information to the same beacon, just before the browser requests the page linked in the banner?
  1. We spoke about the target segment of a campaign, and gave as an example: readers from a given country. Apart from country, what other target segments are used?

CentralNotice has features to target users by any combination of the following criteria:

  • country
  • user (interface) language
  • wiki (project)
  • logged-in vs. anonymous
  • device category (desktop, ipad, iphone, mobile or unknown)

Also, some data stored on the client that's used to determine which banner to show could potentially "leak" information about users' habits into impression records. For example, since large Fundraising banners are only shown once to any user (within a certain time period), frequent users have a lower likelihood of seeing a large banner on any given page view. And users who have donated within the last year won't see Fundraising banners on the device they used to make the donation.

Finally, since the banners can contain arbitrary JavaScript, they can show or hide based on any criteria visible to that code. For example, there have been campaigns that target only users with a minimum number of edits (for community elections). There was also a test Fundraising campaign that targeted a specific set of articles. CentralNotice does expose an API to in-banner JavaScript which allows it to control what is sent to the servers via beacon/impression. If the in-banner code is written as it should be, decisions to show or hide banners like this will be available in impression data.

  2. Regarding clicks on banners, wouldn't it be possible to fire a JavaScript event when the banner is clicked, and send click-related information to the same beacon, just before the browser requests the page linked in the banner?

Beyond a few API methods it exposes, CentralNotice does not interact with in-banner code. From our point of view, banners are just blobs of stuff (HTML, JS, CSS) that we inject into the page. There's no standard interaction with any of our code when a banner is clicked. We could set up a mechanism like the one you describe to provide click-through rates... but that'd be a new feature. Maybe it'd be easier to limit the current task to sanitizing and publishing existing data? Also, I don't know that we have any requests for such a feature... I think normally raw banner-click totals are handled directly by the campaign organizers (and FR has its own way for that)... So, by providing them impression totals, we'll give them the means to calculate click-through rates on their own, maybe? I can certainly see how it'd be nice to have unified impression and click-through data, but again, it might be best to leave that as a separate step?
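As a rough illustration of how an organizer might combine their own click totals with published impression totals, here is a minimal sketch. It assumes the published impression figure is a sampled count accompanied by a sampling rate; that assumption, the function name, and the numbers are all hypothetical and not a description of what CentralNotice provides today.

```python
def click_through_rate(clicks: int, sampled_impressions: int, sampling_rate: float) -> float:
    """Estimate CTR from organizer-side clicks and sampled impression counts.

    sampling_rate is the fraction of impressions that were recorded
    (assumed to be published alongside the counts).
    """
    if not 0 < sampling_rate <= 1:
        raise ValueError("sampling_rate must be in the interval (0, 1]")
    # Scale the sampled count back up to an estimate of all impressions.
    estimated_impressions = sampled_impressions / sampling_rate
    if estimated_impressions == 0:
        return 0.0
    return clicks / estimated_impressions


# Made-up example: 150 clicks counted by the organizer,
# 1,000 impressions recorded at a 1% sampling rate.
ctr = click_through_rate(clicks=150, sampled_impressions=1_000, sampling_rate=0.01)
```

Because the impression count is scaled up from a sample, the result is an estimate; for low-traffic banners the sampling error can easily exceed the differences an A/B test is trying to detect.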

Thanks so much!!!!!!! :D

mmodell removed a subscriber: awight.Jun 22 2017, 9:33 PM
Nuria moved this task from Dashiki to Wikistats Production on the Analytics board.Jul 17 2017, 4:19 PM

@Nuria: Have you read the meeting notes mentioned above? To my naive eye, they seem to contain some of the types of questions you asked for earlier. If they're not what you're looking for, then it might be helpful to have a meeting with an analyst and/or researcher, to help others understand the kinds of questions that would be particularly meaningful.

Is there enough information now to schedule a meeting? Or should we continue the conversation here for now?

@mforns : Can you say who was at that meeting?

Milimetric raised the priority of this task from Normal to Needs Triage.Apr 2 2018, 3:43 PM
Milimetric moved this task from Wikistats Production to Blocked on the Analytics board.
Milimetric closed this task as Declined.Jul 5 2018, 4:39 PM
Milimetric added a subscriber: Milimetric.

Closing for inactivity, please re-open if publishing this somewhere public is still a priority.

@Milimetric I can say that this would still be helpful from the community point of view. Not sure if that warrants reopening, and no idea what happened at the meeting - so don't have the full picture.

DStrine reopened this task as Open.Jul 5 2018, 5:20 PM

I have reopened this. This is still a fairly important project. I have conversations about how to prioritize this at least once a quarter.

@DStrine: ok, let us know when you have a solid plan for it and how it fits in with other work.

LilyOfTheWest added a comment.EditedJul 19 2018, 8:24 PM

@Milimetric Hashing out the first step of what's needed from the community organizer end is relatively straightforward if we can understand better the limitations from the privacy perspective on your end. On the community organizer end, we can think of a relatively simple model to start from: On a daily basis output something along these lines: date, banner_name, project, sampling rate, number of impressions.
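A minimal sketch of what producing such a daily record could involve, assuming hypothetical raw events of the form (ISO timestamp, banner name, project) and a single known sampling rate. The event shape, the field names, and the example values are illustrative only, and the impressions column here holds the sampled count rather than a scaled-up estimate.

```python
import csv
import io
from collections import Counter

FIELDS = ["date", "banner_name", "project", "sampling_rate", "impressions"]


def daily_rows(events, sampling_rate):
    """Aggregate (iso_timestamp, banner_name, project) events into daily rows."""
    # The first 10 characters of an ISO 8601 timestamp are the calendar date.
    counts = Counter((ts[:10], banner, project) for ts, banner, project in events)
    return [
        {
            "date": date,
            "banner_name": banner,
            "project": project,
            "sampling_rate": sampling_rate,
            "impressions": n,  # sampled count, not scaled up
        }
        for (date, banner, project), n in sorted(counts.items())
    ]


def to_csv(rows):
    """Serialize the daily rows as CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


# Made-up events for one hypothetical banner.
events = [
    ("2018-07-19T08:15:00Z", "example_banner", "en.wikipedia"),
    ("2018-07-19T21:40:00Z", "example_banner", "en.wikipedia"),
    ("2018-07-20T00:05:00Z", "example_banner", "en.wikipedia"),
]
rows = daily_rows(events, sampling_rate=0.01)
csv_text = to_csv(rows)
```

Keeping the sampling rate in every row lets a reader scale the counts themselves, and aggregating per day and per project (without country) keeps the buckets coarse.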

Do you see any limitation for releasing such a data string? If yes, what are those? If not, give us a bit more time and we will finalize the string we're asking.

There are a bunch of privacy issues. This requires a lengthy research project and some level of community involvement. @Jseddon is the community liaison and he and I are pretty much on the same page. We just need to find the time to tackle this.

@DStrine what are the timelines you're looking into for setting aside time for this? Knowing how long you expect to need time before you can pick up this task can help us understand if we have to look into other means to answer the questions we have in the meantime. We don't want to start new endeavors to get answers to our basic campaign related questions unless they're absolutely needed as they will take a lot of volunteer resources. Given that this task has been open since 2015, I'm concerned that if we don't set some timeline on our ends, this will be left open for at least another year or two.

Untagging Analytics as there hasn't been any input in a while. When you know more details, tag us again.