
Document that wikimedia pageviews API is blocked by ad blockers
Open, Medium · Public

Description

Ad blockers such as uBlock and AdBlock Plus block all AJAX requests to the pageviews API. To reproduce, enable uBlock in Firefox and make requests to RESTBase. These ad blockers are very popular and cause tools like Pageviews Analysis to fail.

See comment from @MusikAnimal below and update the on-wiki docs + the hyperswitch documentation so that others who run into this problem don't get stuck.
Consider making a proxy on the front-end from /pv to /pageviews to avoid this issue, but if ad-block software starts blocking /pv just give up the fight :)

Possible alternative layouts that aren't blocked

/metrics/page/*

  • /metrics/page/readers/
  • /metrics/page/views/
  • possibly, later:
    • /metrics/page/edits/
    • /metrics/revision/score/

The exact routes that most ad blockers are currently targeting can be found here.

Event Timeline

MusikAnimal raised the priority of this task from to Needs Triage.
MusikAnimal updated the task description.
MusikAnimal added a project: Pageviews-API.
MusikAnimal added a subscriber: MusikAnimal.
Restricted Application added a project: Analytics. · Feb 15 2016, 3:38 AM
Restricted Application added subscribers: StudiesWorld, Aklapper.

We'll look into this but we don't really control adblock software

Milimetric moved this task from Incoming to Event Platform on the Analytics board. · Feb 18 2016, 6:14 PM

Interestingly, AdBlock only has a problem with https://tools.wmflabs.org/pageviews/ but not with http://tools.wmflabs.org/pageviews-test/ (running the exact same thing) which might indicate it's not the AdBlockers which are a problem.

I believe that might only be true with AdBlock Plus. The behaviour I'm seeing on Safari OSX with uBlock:

So it seems the /pageviews route itself is certainly problematic. A wild guess is if the pageviews API route was changed to something else, it might just work...

What I might try to do is set up a router to the API through my Ruby app running on http://tools.wmflabs.org/musikanimal. This is not a solution I plan to use for Pageviews Analysis (the Ruby app would eventually get overloaded by all those requests), but it would tell us definitively whether the /pageviews route is in fact the problem.

I can confirm the ad blockers are blacklisting routes with /pageviews/*. You'll see this in the list of routes that uBlock uses here. Additionally, I did the aforementioned experiment in routing API requests through a middleman service that I quickly wrote, under a different route. That worked. Finally, I had to rename my app to something other than /pageviews in order for the CSS/JS to load.
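As a rough illustration of the behaviour described above, the sketch below checks a URL's path against substring-style rules. The rule list is a hypothetical stand-in containing only the `/pageviews/` pattern mentioned in this thread, and the `example.org` URLs are made up; real filter lists use a richer syntax that this does not attempt to implement.

```python
from urllib.parse import urlparse

# Hypothetical stand-in for a filter list; only "/pageviews/" comes from this thread.
BLOCKED_SUBSTRINGS = ["/pageviews/"]


def is_blocked(url, rules=BLOCKED_SUBSTRINGS):
    """Return True if the URL's path contains any of the blocked substrings."""
    path = urlparse(url).path
    return any(rule in path for rule in rules)


if __name__ == "__main__":
    # Example URLs are illustrative only.
    for url in (
        "https://example.org/api/metrics/pageviews/per-article/Foo",
        "https://example.org/api/pv/per-article/Foo",
    ):
        print(url, "blocked" if is_blocked(url) else "allowed")
```

Under this simplified model, renaming the route segment is enough to avoid the rule, which is consistent with the middleman experiment working under a different route.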

Apparently any kind of pageviews statistics stuff is considered an ad, as I see they have several other similar routes blocked. That being said, I'm unsure if there's a fitting name we could use that isn't already blocked. Even if we use one that isn't currently blocked, there's no assurance it won't be blocked moving forward.

Yeah, I agree. Since we can't control what ad-block software does, are you ok with me just adding the information you found to the documentation / help text of the API itself?

@Milimetric Of course! We're going to try to use our own backend server to make the requests and get around the ad blockers using the route /pv, which I don't think will be blocked. You could do the same, but surely you'd prefer a more descriptive route. Not expecting you to change it.
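A minimal sketch of the path mapping such a backend proxy might perform, assuming requests arrive under `/pv` and are forwarded to an upstream pageviews endpoint. `UPSTREAM_BASE` and the function name are hypothetical placeholders, not the actual service configuration, and the HTTP forwarding itself is left out.

```python
# Hypothetical upstream base URL; the real endpoint is not specified here.
UPSTREAM_BASE = "https://upstream.example/metrics/pageviews"
# Short route served by our own backend, as proposed in the thread.
PROXY_PREFIX = "/pv"


def upstream_url(request_path):
    """Map a path under /pv to the corresponding upstream pageviews URL."""
    if request_path != PROXY_PREFIX and not request_path.startswith(PROXY_PREFIX + "/"):
        raise ValueError("not a proxied path: " + request_path)
    # Keep everything after the /pv prefix and append it to the upstream base.
    return UPSTREAM_BASE + request_path[len(PROXY_PREFIX):]


if __name__ == "__main__":
    print(upstream_url("/pv/per-article/Foo"))
```

The browser only ever sees the `/pv` route, so the server-side request to the upstream URL is outside the ad blocker's reach.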

Milimetric renamed this task from Wikimedia pageviews API blocked by ad blockers to Document that wikimedia pageviews API is blocked by ad blockers. · Mar 7 2016, 4:24 PM
Milimetric triaged this task as Medium priority.
Milimetric updated the task description.
GWicke added a subscriber: GWicke. · Edited Mar 16 2016, 5:00 PM

My preference would be to find a new primary URL, and alias / redirect requests in the transition period. I have added some possible alternatives in the task description.
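The alias / redirect idea for a transition period could look roughly like the sketch below. It assumes the old prefix is `/metrics/pageviews` and uses `/metrics/page/views` purely as a placeholder for whichever new name is chosen; the 301 status and the function name are likewise assumptions for illustration.

```python
OLD_PREFIX = "/metrics/pageviews"
# Placeholder taken from the candidate list in the task description; not a decided name.
NEW_PREFIX = "/metrics/page/views"


def redirect_for(path):
    """Return (status, location) for requests under the old prefix, else None."""
    if path == OLD_PREFIX or path.startswith(OLD_PREFIX + "/"):
        # Preserve the rest of the path so existing clients land on the same resource.
        return 301, NEW_PREFIX + path[len(OLD_PREFIX):]
    return None


if __name__ == "__main__":
    print(redirect_for("/metrics/pageviews/per-article/Foo"))
    print(redirect_for("/metrics/page/views/per-article/Foo"))
```

A redirect like this would presumably only help clients whose requests to the old route are not blocked in the first place; browsers running an affected ad blocker would need to call the new route directly.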

Closely related: T119094: Expose pageview data in each project's REST API

Now could be a good time to consider the layout with per-project entry points.

GWicke updated the task description. · Mar 16 2016, 5:17 PM
MusikAnimal updated the task description. · Mar 16 2016, 5:22 PM

Hm, good suggestions, @GWicke, let the bike-shedding begin :)

/metrics/views/

That would let per-article, aggregate, and top fit nicely into that hierarchy, and would still allow /metrics/edits and other similar extensions.

Just going to state my opinion here again, that I worry about the potential of these proposed names getting blocked at some point. For instance /tracking/views/* is blocked, and /metrics/views/* isn't too far off. Similarly /tracking/visits and /tracking/visitors are deemed ads by this wretched software.

The terms readers and reads are not listed in any variation, and to me seem like fitting alternatives.

MusikAnimal updated the task description. · Mar 16 2016, 6:11 PM

Yeah, if it's truly ad-blockers' intention to ban all analytics-related content, it won't be long before they ban any meaningful name we come up with. So it's definitely a frustrating problem. Is there any way to ask them to exclude wikimedia from all their regexes? They could revisit that policy when hell freezes over and we start running ads :P

The only worry I have would be that '/views/' makes it into one of those lists, but at the same time I'm optimistic that the term is general enough to avoid a blanket rule.

I also think that it's worth reaching out to block list maintainers. We should have broad support as an ad-free entity, so I wouldn't be surprised if they accommodated us when alerted to issues. Actually, how about trying this for 'pageviews'?

Reaching out to the ad block maintainers certainly wouldn't hurt, and I think they'd be inclined to believe we are in fact ad-free :)

However, I don't think these browser add-ons necessarily update automatically, and we can't expect people to do that either. So if we are restructuring the routes to be on a per-domain basis, I'd argue we should also go ahead and rename /pageviews to something currently not blocked. My preference is still /metrics/page/readers/. It's descriptive and fitting for our project.

Does anyone have any contact with the block list maintainers? I can poke randomly around the internet otherwise :)

See T128974. @kaldari got them to whitelist tools.wmflabs.org, but not the RESTBase API. Maybe we could get them to whitelist not only RESTBase but all *.wikipedia.org and sister projects, since it seems we might be changing the routes to be on a per-project basis. If we are successful, then we can keep /metrics/pageviews moving forward.

I think Kaldari reached out to them on the EasyList forums

hm, that exception for tools.wmflabs.org makes me a little hopeful that we can convince them. Do you think it'd be useful for more of us to weigh in there?

Nuria added a subscriber: Nuria. · Oct 24 2016, 3:57 PM

Removing analytics as I do not see an actionable item for us here.

GWicke set Security to None.