Page MenuHomePhabricator

A combination of Special:MyPage redirects and pagecounts allows an external site to know the wikipedia login of an user
Closed, ResolvedPublic

Description

A request to Special:MyPage/ccvhjhdkjvkvkhjhkvkjvdh appears in the pagecounts as a visit to User:Xavier_Combelle/ccvhjhdkjvkvkhjhkvkjvdh by making a browser calling such a special page, (for example via an iframe or an embedded image or an ajax call) and by afterwards consulting the pagecounts an external site can know that my login is Xavier_Combelle and eventually correlate it with the ip or any personal information it has on me.

As I thought about a way to the best way to solve this issue would be to make a soft redirect instead of an hard one.


patches:

  • master: +
  • 1.23 - included in
  • 1.24 - included in
  • 1.25 - included in
  • 1.26 - included in

CVE: CVE-2015-8628

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes

Do you mean that in the envision setting the visits are not tracked? What I thought is that they will simply not appear in public page counts?

I'm not sure what you mean by this. Currently Special:MyLanguage/foo redirects to foo/en-gb or foo/fr, etc based on a setting in Special:Preferences. This yields a small amount of private info being leaked by the public stats, which could be exploited, but seems like it would be much more difficult to exploit, compared to Special:MyPage.

My question is when using the X-analytics:dnt=1 header, where does and where does not appear the visit on the different log and stats? To avoid the leak, the only thing it needs is not appear in page counts. It can still appear on all wikimedia logs and stats. So I thought that setting the header would just remove it from page counts and so would not impact other internal stats as the ones needed for "tracking visiting stats of the target of Special:MyLanguage" but maybe I'm wrong.

My question is when using the X-analytics:dnt=1 header, where does and where does not appear the visit on the different log and stats? To avoid the leak, the only thing it needs is not appear in page counts. It can still appear on all wikimedia logs and stats. So I thought that setting the header would just remove it from page counts and so would not impact other internal stats as the ones needed for "tracking visiting stats of the target of Special:MyLanguage" but maybe I'm wrong.

Oh my bad, I was using page stats and the public page counts as synonyms. I meant just the public page counts (e.g.the stuff on dumps.wikimedia.org , stats.grok.se etc)

Wait, that sounds like you're talking about a query parameter called dnt=1. I was talking about adding that to the X-Analytics header of the response, as you did in your patch, but ignoring whether or not the target of the redirect exists as a user page. That seems like it would solve the more specific targeted attack you were worried about.

The problem is the header in the response won't be sent back with the next request from the client (following the 302). The next request is what leaks the private data. So either a query parameter or cookie would have to be used for this method.

Well this solution does seem kind of icky due to all the special casing very early in the MW setup process, I will admit its elegant in that

  • Not specific to our setup. Would work with other analytic setups that third parties might use.
  • Does not require extra url parameters that user's could forge, or really long ugly tokens.
  • Would allow us to do something like "redirected from Special:MyPage", although this patch does not do that (or other things like setting rel=canonical link)

I'm starting to think this might be the way to go, since I think this would also prevent the attack on very low traffic wikis where there might only be a handful of visits to User: namespace pages in an hour, so even sending the user to just Special:MyPage could leak that the user was one of small subset of users.

I'm starting to think this might be the way to go, since I think this would also prevent the attack on very low traffic wikis where there might only be a handful of visits to User: namespace pages in an hour, so even sending the user to just Special:MyPage could leak that the user was one of small subset of users.

Even on a big wiki, if you can convince the user to load special:mypage via say a hidden iframe, you can just as easily convince them to load it 500 times to get above the noise of normal traffic.

My question is when using the X-analytics:dnt=1 header, where does and where does not appear the visit on the different log and stats? To avoid the leak, the only thing it needs is not appear in page counts. It can still appear on all wikimedia logs and stats. So I thought that setting the header would just remove it from page counts and so would not impact other internal stats as the ones needed for "tracking visiting stats of the target of Special:MyLanguage" but maybe I'm wrong.

Oh my bad, I was using page stats and the public page counts as synonyms. I meant just the public page counts (e.g.the stuff on dumps.wikimedia.org , stats.grok.se etc)

But this public page counts don't really need the count of Special:MyLanguage redirection. Does they ?

How do I attach a patch to a bug in phab? I think I'm missing something obvious

Same way as you attached other files: Drag it into the comment area.

See also T165: Provide a way to upload a file in Phabricator if drag'n'drop is not available or not wanted. If you're affected by that bug, the alternative is to go to https://phabricator.wikimedia.org/file/upload/ and upload the file (being sure to set the permissions to "No One") and then copy-paste the resulting F-number into the textbox like {F123}.

Well this solution does seem kind of icky due to all the special casing very early in the MW setup process, I will admit its elegant in that

  • Not specific to our setup. Would work with other analytic setups that third parties might use.
  • Does not require extra url parameters that user's could forge, or really long ugly tokens.
  • Would allow us to do something like "redirected from Special:MyPage", although this patch does not do that (or other things like setting rel=canonical link)

I'm starting to think this might be the way to go, since I think this would also prevent the attack on very low traffic wikis where there might only be a handful of visits to User: namespace pages in an hour, so even sending the user to just Special:MyPage could leak that the user was one of small subset of users.

  • The Special:MyPage problem would be solved equally with the url parameter+X-analytic:dnt=1 response header method and this method.
  • If another installation than wikimedia ones is also using cache, their cache method must also be updated to proper work. So the public analytics work become a parametrize cache work for them. And cache is known to be a hard problem.
  • I wonder if some mediawiki extensions don't keep track of visit page counts and if they should not be updated too if necessary.

About url parameter forging, I don't think it's a concern because the meaning of this parameter is don't publish stats about my visit on this page. A pretty similar behavior could be done with an api request

I think I see the problem you guys are trying to solve, but it sounds like it's mostly in your domain now. If you think I can help, please grab me in IRC or set up a meeting.

@csteipp when we say soft redirect, that means the user has to click again? If so, then soft redirects are not a workable solution. This is a pretty bad user experience, so unless there was a direct breach or urgent need, I hope an alternative will work.

@JKatzWMF, from the description at https://phabricator.wikimedia.org/T109724#1610437, so no extra click required, does that seem palatable from a product perspective?

@DarTar, if we make this a soft redirect, so we only store a hit to Special:MyPage/whatever instead of the specific User:whatever, is that OK for research?

If so, I'll keep working on bawolff's patch, and try to get that deployed this week.

@JKatzWMF and @DarTar, ping? Let me know if I should schedule 20 mins to chat / demo, and I can walk you through it.

@csteipp I reviewed the thread and cannot think of any implication whatsoever of this change for Research, thanks for checking.

@csteipp thank you for the offline explanation. This is fine from a reading perspective. have you checked with @Jdforrester-WMF? Since this impacts logged in users, it is mostly his jurisdiction from an audience perspective.

Confirming that I'm content with a app-level (non-HTTP) redirect being used for Special:MyPage in the same way as we changed #REDIRECT to work last year.

I think that instead of "(redirected from Special:MyPage)" if we said "(This is your account's version of Special:MyPage.)" or somesuch we could reduce confusion completely, but that's a bonus, and not necessary to make this change.

Note however that discarding page counts of Special:MyLanguage redirects would be a real pity as those are the principal entry point for some policy pages. I think for example of https://www.mediawiki.org/wiki/Help:VisualEditor/User_guide (declaring my interest! ;-)).

Note however that discarding page counts of Special:MyLanguage redirects would be a real pity as those are the principal entry point for some policy pages. I think for example of https://www.mediawiki.org/wiki/Help:VisualEditor/User_guide (declaring my interest! ;-)).

@Jdforrester-WMF, do you need to track non-existent translations too?

In testing @Bawolff's patch, realized that is redirecting a lot of pages that we don't need to redirect. Things like Special:ListAdmins, Special:CreateAccount, Special:BetaFeatures. Should we restrict the soft-redirect to classes that contain "my" in the name? Or probably a little less fragile, have each of the private ones implement a "PrivateRedirect" interface, and test for that instead of the relatively generic RedirectSpecialPage?

Built on @Bawolff's patch, but let individual classes specify if they should use this behavior. And added site config so it can be disabled for sites that don't make their stats public.

@Bawolff, have time for a code review at some point?

I did leave out Special:MyLanguage redirects explicitly. That could let an attacker narrow down who they sent to the page, but that seems both harder to get useful information from, and would hurt the editing team (among others, I suspect). So I'm on the fence if it's worth it.

I think leaving out Special:MyLanguage open isn't a good option. Updated patch. @Jdforrester-WMF, we'll have to find your team another way to count hits to the language versions of that page.

could we make graphite metrics count those hits?

We're trying to make graphite data more public, so probably not.
Eventlogging would work, and be pretty trivial to implement.

Mostly looks good. I changed the following:

  • set $wgTitle for older code (including the likes of EditPage.php)
  • Reset the cached version of what the action is, in case it changes after redirect (e.g. Special:MyPage?action=edit should work)
  • Disable squid cache for these pages (They vary by what user is accessing them, even for anons, so not safe to store in squid).

Other thoughts:

  • Maybe the default should be that the redirect is personally identifiable. If someone forgets to set it, better to err on the side of secure. Also most of them seem to be of this nature.
  • I'm not really sure if we have to include Special:MyLanguage. The threat of linking someone's IP address to the language set in Special:Preferences seems rather limited. But I go back and forth on that in my head.

One other thing to note about this patch, is that it disables varnish cache for those pages. I don't think this is much of an issue since anons are unlikely to view Special:MyPage in great number, but its always good to note up-front anytime a patch messes with caching.

Changes from my patch look good to me.

  • Changing getAction to use a private property instead of a static should be safe, and shouldn't hurt performance much. There should only be one MediaWiki object created for each requests.
  • Setting the cache with lowerCdnMaxage(0) I think is what we want. Although we would usually use setSquidMaxage(0), using lowerCdnMaxage will prevent the target from setting a non-zero squid maxage, which is what we want in this case. Although @aaron, if you think that's abusing the functionality, let me know.
  • Setting $wgTitle. Yep, I forgot that.

So do we know if this needs analytics attention or changes to our pipeline anymore? Or does the patch work ok with the current flow?

So do we know if this needs analytics attention or changes to our pipeline anymore? Or does the patch work ok with the current flow?

It should work fine with current flow.

Milimetric moved this task from Paused to Done on the Analytics-Kanban board.
Milimetric removed a project: Analytics-Kanban.

Patch is deployed
20:45 csteipp: deployed patch for T109724 to wmf4/5

I wonder if we should advertise that stats are handling these types of special pages differently (Not the reason for doing so, just that we are doing so). Some users might be confused by this change. Perhaps a line in tech news would be appropriate. Thoughts?

Mmmmm I mean if you mentioned this on analytics-l, where you're likely to hit most people that would be potentially confused by the change, everyone will ask questions. So my suggestion is that if we don't want to talk about the potential hole we shouldn't mention anything.

The patch incorrectly drops query parameters from the URL set in wgInternalRedirectTargetUrl, causing T118823.

+						$output->addJsConfigVars( array(
+							'wgInternalRedirectTargetUrl' => $target->getFullURL(),
+						) );

This should probably use $target->getFullURL( $query )?

Tested and patchified @matmarex's fix above.

The GuidedTour parameters (and any other whitelisted parameters) are now preserved in the JS replaceState: T118823: Special pages won't pass The Wikipedia Adventure query parameters

LGTM. Deployed.

22:31 csteipp: deployed followup for T109724

demon changed the visibility from "Custom Policy" to "Public (No Login Required)".Dec 18 2015, 12:39 AM
demon changed the edit policy from "Custom Policy" to "All Users".
demon changed Security from Software security bug to None.

Change 259951 had a related patch set uploaded (by Chad):
SECURITY: Make Special:MyPage and friends fake redirect to prevent info leak

https://gerrit.wikimedia.org/r/259951

Change 259952 had a related patch set uploaded (by Chad):
Add $query to JavaScript redirect info

https://gerrit.wikimedia.org/r/259952

Change 259920 merged by Chad:
SECURITY: Make Special:MyPage and friends fake redirect to prevent info leak

https://gerrit.wikimedia.org/r/259920

Change 259921 merged by Chad:
Add $query to JavaScript redirect info

https://gerrit.wikimedia.org/r/259921

Change 259942 merged by Chad:
SECURITY: Make Special:MyPage and friends fake redirect to prevent info leak

https://gerrit.wikimedia.org/r/259942

Change 259909 merged by Reedy:
SECURITY: Make Special:MyPage and friends fake redirect to prevent info leak

https://gerrit.wikimedia.org/r/259909

Change 259943 merged by Chad:
Add $query to JavaScript redirect info

https://gerrit.wikimedia.org/r/259943

Change 259910 merged by Reedy:
Add $query to JavaScript redirect info

https://gerrit.wikimedia.org/r/259910

Change 259933 merged by Reedy:
SECURITY: Make Special:MyPage and friends fake redirect to prevent info leak

https://gerrit.wikimedia.org/r/259933

Change 259934 merged by Reedy:
Add $query to JavaScript redirect info

https://gerrit.wikimedia.org/r/259934

A request to Special:MyPage/ccvhjhdkjvkvkhjhkvkjvdh appears in the pagecounts as a visit to User:Xavier_Combelle/ccvhjhdkjvkvkhjhkvkjvdh

I'm not convinced of the proposed solution. Why downgrade MediaWiki functionality because of an information leak in Wikimedia Foundation infrustructure? I'd rather reverse the burden: we could just stop counting one visit to "User:Xavier_Combelle/ccvhjhdkjvkvkhjhkvkjvdh" and instead add 1 to
"Special:MyPage/ccvhjhdkjvkvkhjhkvkjvdh".

We stopped counting the 302'ing special pages after https://meta.wikimedia.org/wiki/Research_talk:Page_view/Archive_1#Special_namespace_and_actual_problems . It doesn't seem impossible to do the opposite (under some condition):

how would the definition count redirects? Only towards the redirect source, only to the target, both, or some other scheme? --QChris (WMF) (talk) 11:33, 23 October 2014 (UTC)

Also, the description of the problem lacks one step: the attacker first needs to create "User:Xavier_Combelle/ccvhjhdkjvkvkhjhkvkjvdh" (presumably by creating such a subpage for all users of a wiki?) because 404 are excluded: https://meta.wikimedia.org/wiki/Research:Page_view#Step_1:_apply_general_filters

Change 259951 merged by jenkins-bot:
SECURITY: Make Special:MyPage and friends fake redirect to prevent info leak

https://gerrit.wikimedia.org/r/259951

Change 259952 merged by jenkins-bot:
Add $query to JavaScript redirect info

https://gerrit.wikimedia.org/r/259952

A request to Special:MyPage/ccvhjhdkjvkvkhjhkvkjvdh appears in the pagecounts as a visit to User:Xavier_Combelle/ccvhjhdkjvkvkhjhkvkjvdh

I'm not convinced of the proposed solution. Why downgrade MediaWiki functionality because of an information leak in Wikimedia Foundation infrustructure?

I don't think we downgraded anything here. If it wasn't for the $query bug, I wouldn't have noticed anything changed in the weeks this was deployed to WMF production. If anything, it's probably more efficient to do it this way.

Also, the description of the problem lacks one step: the attacker first needs to create "User:Xavier_Combelle/ccvhjhdkjvkvkhjhkvkjvdh" (presumably by creating such a subpage for all users of a wiki?) because 404 are excluded: https://meta.wikimedia.org/wiki/Research:Page_view#Step_1:_apply_general_filters

That depend on which stats you are looking http://en.wikipedia.org/wiki/User:Xavier_Combelle/TriggerInLog doesn't exist and it appears in http://stats.grok.se/en/201508/User:Xavier_Combelle/TriggerInLog

Moreover an existing "User:Xavier_Combelle" would be enough, just by requesting enough time would appear as an anomaly in attack

A request to Special:MyPage/ccvhjhdkjvkvkhjhkvkjvdh appears in the pagecounts as a visit to User:Xavier_Combelle/ccvhjhdkjvkvkhjhkvkjvdh

I'm not convinced of the proposed solution. Why downgrade MediaWiki functionality because of an information leak in Wikimedia Foundation infrustructure? I'd rather reverse the burden: we could just stop counting one visit to "User:Xavier_Combelle/ccvhjhdkjvkvkhjhkvkjvdh" and instead add 1 to
"Special:MyPage/ccvhjhdkjvkvkhjhkvkjvdh".

We stopped counting the 302'ing special pages after https://meta.wikimedia.org/wiki/Research_talk:Page_view/Archive_1#Special_namespace_and_actual_problems . It doesn't seem impossible to do the opposite (under some condition):

how would the definition count redirects? Only towards the redirect source, only to the target, both, or some other scheme? --QChris (WMF) (talk) 11:33, 23 October 2014 (UTC)

Apart the fact that it is not really a downgrade of functionality, as the redirect still work as @matmarex pointed, It is not possible to log the target of a redirect differently unless you add a special parameter. (There is no difference between a http request from a redirect than from a direct one) This solution was envisioned and eventually discarded,

Yes, that used to happen but no longer does AFAIK. See the link I provided.

Yes, that used to happen but no longer does AFAIK. See the link I provided.

Even if it's right, http://en.wikipedia.org/wiki/User:Xavier_Combelle or http://en.wikipedia.org/wiki/Talk_User:Xavier_Combelle probably exists and by doing enough request to these pages a similar leak could be reached.

To confirm that this is working as intended:

  1. Open private/incognito window.
  2. Disable JavaScript (e.g. Dev Tools, cmd+shift+P, Disable J.., accept).
  3. Load https://en.wikipedia.org/wiki/Special:Mypage/abc.

Expected result: Confirm that you see User:.../abc in the title, but address bar still shows Special:Mypage/abc.

(When JavaScript is enabled, we use the History API to dynamically rewrite the address bar to be what we'd previously redirect to.)