Images of private wikis are publicly accessible if attacker knows the URL or the filename
Closed, Resolved | Public | Security

Description

Details

Risk Rating
High
Author Affiliation
WMF Technology Dept

Event Timeline

Found it while debugging T338765.

The ACL has access to public but not the private:
T338765#8958173

Mmm, that's not ideal, but is it intended?

A few questions:

  1. How many wikis are thus affected? And how many buckets (see below)?
  2. How are these set up? Presumably some tool has incorrect behaviour
  3. Why do we have both thumbor and thumbor-private?
  4. If we turn off global read, won't that make it impossible for anyone to download the image?

My last question arises because, picking a public example, https://commons.wikimedia.org/wiki/File:Geraldine_Ulmar_in_Gilbert_and_Sullivan%27s_The_Mikado.jpg, if you visit any of the images and thumbs you get an upload.w.o page without auth, e.g. https://upload.wikimedia.org/wikipedia/commons/thumb/a/a0/Geraldine_Ulmar_in_Gilbert_and_Sullivan%27s_The_Mikado.jpg/530px-Geraldine_Ulmar_in_Gilbert_and_Sullivan%27s_The_Mikado.jpg

I don't see how that could ever work for a private wiki unless those URLs work without auth?

As to the number of buckets question, e.g. for office-wiki:

root@ms-fe2009:~# swift list | grep office
wikipedia-office-local-deleted
wikipedia-office-local-public
wikipedia-office-local-temp
wikipedia-office-local-thumb
wikipedia-office-local-transcoded
wikipedia-office-timeline-render

For answers to most of your questions, go to a page in office wiki, e.g. https://office.wikimedia.org/wiki/Contact_list

Read access must go through mw with thumb.php

Sorry, I am too stupid. All the images on the contact page are from commons anyway aren't they? And in any case if I want to e.g. download the CEO's picture, the "download" button in media viewer is just a link to the upload.wm.o page https://upload.wikimedia.org/wikipedia/commons/f/fb/Maryana_Iskander.jpg

Note that the image URL depends only on the file name and can easily be reconstructed by anyone: the /f/fd/ part in the example from the task description is just the first character, then the first two characters, of the MD5 hash of the file name (underscores, no File: prefix).
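To make the point above concrete, here is a minimal sketch of how the hashed path can be derived from nothing but the file name. The function name hash_path is made up for illustration, and it assumes md5sum from GNU coreutils is available and that the name is already in its stored form (underscores, no File: prefix).

```shell
# Sketch: derive the hashed storage path for a file name.
# hash_path is a hypothetical helper, not part of MediaWiki or swift.
# The body runs in a subshell so its variables do not leak.
hash_path() (
  name=$1
  # md5sum prints "<32 hex chars>  -"; keep only the digest.
  digest=$(printf '%s' "$name" | md5sum | cut -c1-32)
  first=$(printf '%s' "$digest" | cut -c1)
  first_two=$(printf '%s' "$digest" | cut -c1-2)
  # First hex character, then first two hex characters, then the name.
  printf '%s/%s/%s\n' "$first" "$first_two" "$name"
)
```

For instance, hash_path 'CA_KPIs_-_Q2.pdf' should reproduce the e/e6/ prefix seen in the office wiki URLs in this task.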

Example:

not all of them:

<figure class="mw-halign-center" typeof="mw:File"><a href="/wiki/File:Mo_abualruz_picture.jpg" class="mw-file-description"><img src="/w/thumb.php?f=Mo_abualruz_picture.jpg&amp;width=134" decoding="async" class="mw-file-element" srcset="/w/thumb.php?f=Mo_abualruz_picture.jpg&amp;width=201 1.5x, /w/thumb.php?f=Mo_abualruz_picture.jpg&amp;width=268 2x" data-file-width="800" data-file-height="800" width="134" height="134"></a><figcaption></figcaption></figure>

Alternatively, look at the HTML of https://office.wikimedia.org/wiki/Special:NewFiles

Ladsgroup renamed this task from "Images of private wikis are publicly accessible if attacker knows the URL" to "Images of private wikis are publicly accessible if attacker knows the URL or the filename". Jun 23 2023, 11:28 AM
Ladsgroup added a project: Vuln-Infoleak.

I think that if we take the global-read off wikipedia-office-local-public it will no longer be possible to download original images from office wiki at all.

Is that wrong? [I mean, maybe we should do so anyway and fix it later, but...]

It is wrong :D MediaWiki in private wikis serves the images with this url not the swift one:
https://office.wikimedia.org/w/thumb.php?f=Mo_abualruz_picture.jpg&width=134

Try this logged in and logged out.

That means as long as mw:media has access to those containers, it should be able to retrieve the images and pass them to the authorized user.

And for non-thumb images, the image page links to mw too:
e.g. go to
https://office.wikimedia.org/wiki/File:CA_KPIs_-_Q2.pdf

The link in the page is https://office.wikimedia.org/w/img_auth.php/e/e6/CA_KPIs_-_Q2.pdf which means mw first authorizes the request and then proxies the file to the user.

Thank you for your patience!

Of the following containers, does thumbor-private need write access to anything other than wikipedia-office-local-thumb and wikipedia-office-local-transcoded?

wikipedia-office-local-deleted
wikipedia-office-local-public
wikipedia-office-local-temp
wikipedia-office-local-thumb
wikipedia-office-local-transcoded
wikipedia-office-timeline-render

Probably -deleted: an admin should be able to see the thumbnail of a deleted image for undeletion or other reasons. I'm not sure about timeline-render; regardless, it should be really low priority, so we can skip it for now.

Current state:

wikipedia-office-local-deleted:
              Read ACL: mw:thumbor-private,mw:media
             Write ACL: mw:thumbor-private,mw:media
wikipedia-office-local-public:
              Read ACL: mw:thumbor,mw:media,.r:*
             Write ACL: mw:thumbor,mw:media
wikipedia-office-local-temp:
              Read ACL: mw:thumbor-private,mw:media
             Write ACL: mw:thumbor-private,mw:media
wikipedia-office-local-thumb:
              Read ACL: mw:thumbor,mw:media,.r:*
             Write ACL: mw:thumbor,mw:media
wikipedia-office-local-transcoded:
              Read ACL: mw:thumbor,mw:media,.r:*
             Write ACL: mw:thumbor,mw:media
wikipedia-office-timeline-render:
              Read ACL: mw:thumbor,mw:media,.r:*
             Write ACL: mw:thumbor,mw:media

So I think we want:

for c in wikipedia-office-local-public wikipedia-office-local-thumb wikipedia-office-local-transcoded wikipedia-office-timeline-render ; do
  swift post "$c" --read-acl 'mw:thumbor-private,mw:media' --write-acl 'mw:thumbor-private,mw:media'
done

?

i.e. remove global-read everywhere, and grant R/W only to thumbor-private and mw
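As a rough aid for spotting which containers still carry global read, the sketch below inspects the text that swift stat prints for one container and reports whether its Read ACL contains the .r:* element. The helper name is_world_readable is made up, and it only recognises the exact .r:* spelling shown in the listings in this task; other referrer-style ACL forms are not handled.

```shell
# Sketch: flag a container whose Read ACL grants anonymous (world) read.
# Reads the output of `swift stat <container>` on stdin.
# Exit status 0 = world-readable, non-zero = not world-readable.
# is_world_readable is a hypothetical helper name.
is_world_readable() {
  # Keep only the value of the "Read ACL:" line, trim trailing blanks,
  # split on commas, and look for an element that is exactly ".r:*".
  sed -n 's/^[[:space:]]*Read ACL:[[:space:]]*//p' \
    | sed 's/[[:space:]]*$//' \
    | tr ',' '\n' \
    | grep -qxF '.r:*'
}

# Hypothetical usage on a swift frontend:
#   for c in $(swift list | grep office); do
#     swift stat "$c" | is_world_readable && echo "WORLD-READABLE: $c"
#   done
```

Splitting the ACL on commas and matching whole elements avoids false positives from substrings of longer ACL entries.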

LGTM. Once that's applied and works, we can switch to other private wikis.

Done, and I think working correctly - https://office.wikimedia.org/w/thumb.php?f=Abbrev-bot.png&width=120 gives me a thumb if logged in or an error otherwise:

Error generating thumbnail

Access denied. You do not have permission to access the source file.

Oh, poop, I need to do this in both eqiad and codfw, sorry!

I think upload.wm.o should now be DTRT:

mvernon@ms-fe2012:~$ curl -o /tmp/foo -v -H "Host: upload.wikimedia.org" http://ms-fe2012.codfw.wmnet/wikipedia/office/e/e6/CA_KPIs_-_Q2.pdf
[...]
* Connected to ms-fe2012.codfw.wmnet (10.192.48.44) port 80 (#0)
> GET /wikipedia/office/e/e6/CA_KPIs_-_Q2.pdf HTTP/1.1
> Host: upload.wikimedia.org
> User-Agent: curl/7.74.0
> Accept: */*
> 
* Mark bundle as not supporting multiuse
< HTTP/1.1 401 Unauthorized

Let's fix collab so we can close the public task:

for c in wikipedia-collab-local-public wikipedia-collab-local-thumb wikipedia-collab-local-transcoded wikipedia-collab-timeline-render ; do
  swift post "$c" --read-acl 'mw:thumbor-private,mw:media' --write-acl 'mw:thumbor-private,mw:media'
done

look good? Current state:

root@ms-fe2009:~# for i in $(swift list | grep collab); do echo "$i:" ; swift stat "$i" | grep "ACL" ; done
wikipedia-collab-local-deleted:
              Read ACL: mw:thumbor-private,mw:media
             Write ACL: mw:thumbor-private,mw:media
wikipedia-collab-local-public:
              Read ACL: mw:thumbor,mw:media,.r:*
             Write ACL: mw:thumbor,mw:media
wikipedia-collab-local-temp:
              Read ACL: mw:thumbor-private,mw:media
             Write ACL: mw:thumbor-private,mw:media
wikipedia-collab-local-thumb:
              Read ACL: mw:thumbor,mw:media,.r:*
             Write ACL: mw:thumbor,mw:media
wikipedia-collab-local-transcoded:
              Read ACL: mw:thumbor,mw:media,.r:*
             Write ACL: mw:thumbor,mw:media
wikipedia-collab-timeline-render:
              Read ACL: mw:thumbor,mw:media,.r:*
             Write ACL: mw:thumbor,mw:media

Okay, for that I suggest we make a ticket for traffic but that should be pretty low priority given that logged in users wouldn't load the file directly (by hitting https://upload.wikimedia.org/wikipedia/office/f/fd/401k_Rollover_Contribution_Form.pdf) so it won't get cached in edges (unless mw does it internally...)

There is some sadness here because we're not consistent in naming, but...

# First download https://noc.wikimedia.org/conf/dblists/private.dblist
# Drop comment lines, strip the trailing "wiki"/"wikimedia", turn "_" into "-".
for i in $( sed -re '/^#/d;s/wiki(media)?$//;s/_/-/' private.dblist ); do
  # Container naming is inconsistent: try the wikimedia- prefix, then wikipedia-.
  if swift list | grep -q "wikimedia-${i}-local-public" ; then
    prefix="wikimedia-$i"
  elif swift list | grep -q "wikipedia-${i}-local-public" ; then
    prefix="wikipedia-$i"
  else echo "$i not found"; continue
  fi
  echo "$prefix"
  # Remove global read: R/W for thumbor-private and mw:media only.
  for suffix in local-public local-thumb local-transcoded timeline-render ; do
    swift post "${prefix}-${suffix}" --read-acl 'mw:thumbor-private,mw:media' --write-acl 'mw:thumbor-private,mw:media'
  done
done

?

I'll tweak to add an echo and paste the output for plausibility checking...

P49474 is output of the above (with an echo before the swift post).
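For plausibility checking of the name mangling on its own, the substitutions from the loop above can be pulled out into a small function and tried on a few database names without touching swift. The name db_to_infix is made up for this sketch; it uses sed -E, which should behave the same as the -r spelling used in the loop.

```shell
# Sketch: turn a database name from private.dblist into the middle part
# of a container name, using the same substitutions as the loop above.
# db_to_infix is a hypothetical helper name.
db_to_infix() {
  # Strip a trailing "wiki" or "wikimedia", then turn the first "_" into "-".
  printf '%s\n' "$1" | sed -E -e 's/wiki(media)?$//' -e 's/_/-/'
}

# Hypothetical usage:
#   db_to_infix officewiki        # office
#   db_to_infix arbcom_enwiki     # arbcom-en
#   db_to_infix ilwikimedia       # il
```

Only the first underscore is replaced, so a database name with two or more underscores would need a global substitution instead.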

That would work. One fun aspect of this: if thumbor doesn't have the wiki in https://gerrit.wikimedia.org/r/plugins/gitiles/operations/deployment-charts/+/refs/heads/master/charts/thumbor/values.yaml#87 then we'd change the containers to be accessible only by thumbor-private, but thumbor itself would still try to access them via the public account.

@Ladsgroup do you want to try and fix that values.yaml file? and/or have me restrict the set of private wikis I'm trying to fix now?

that being said, it really can't be more broken than it is right now...

I can fix the thumbor charts.

OK, I've run that script in both eqiad and codfw, so this is fixed for now.

I think @Ladsgroup is kindly fixing the thumbor charts.

New private wikis are created thus:
https://wikitech.wikimedia.org/wiki/Add_a_wiki#IMPORTANT:_For_Private_Wikis

which, per IRC, ends up calling https://github.com/wikimedia/mediawiki-extensions-WikimediaMaintenance/blob/master/filebackend/setZoneAccess.php

which to my untrained eye ought to be doing something half-way plausible with ACLs ?

[though, does it know to use thumbor-private rather than thumbor?]

sbassett changed the task status from Open to In Progress. Jun 23 2023, 2:12 PM
sbassett triaged this task as High priority.
sbassett added a project: SecTeam-Processed.
sbassett moved this task from Incoming to Watching on the Security-Team board.

Thanks @Ladsgroup and @MatthewVernon for the quick action on this. I'd guess that attempting to analyze potential exposure for something like this would be a bit of a nightmare.

We could query hadoop for anything like https://upload.wikimedia.org/wikipedia/office/... and that should work™ given that private wikis have a different URL for loading ( https://office.wikimedia.org/w/img_auth.php/...), but that only goes back three months, and this has probably been broken for years.

Yeah, but this is not the first time the file backend code in MediaWiki has had serious and critical bugs (we fixed several just in the last couple of months), and it's not written in a readable or understandable manner either. Until further notice, any new private wiki created should be double-checked by SRE if it has local upload enabled, IMHO.

I've updated the notes on creating new wikis to note the need to get container permissions fixed, and also added https://wikitech.wikimedia.org/wiki/Swift/How_To#Checking_/_Fixing_container_ACLs_for_private_wikis so that future-us will know what needs doing.

So what are the next steps for this?

The immediate issue from this task seems resolved - is that correct? I'm not sure what follow-up work still needs to happen.

A couple of follow ups would be useful:

  • Doing the hadoop query to see the number in the past ninety days for at least officewiki (which should be easy)
  • Notifying legal possibly?
  • Some follow up on how to have tests for this to have some way of alerting in case ACL breaks again
  • Fixing the setZone part in mw filebackend
  • They also got cached in edges, maybe some follow ups in this part.
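On the alerting item in the list above, one possible shape for such a test is sketched here: request a known private-wiki file anonymously from the upload domain and treat anything other than a refusal as a problem. The function names and the Nagios-style exit codes are assumptions for illustration, not an existing check; the 401 case matches the response seen after the fix in this task.

```shell
# Sketch of an alerting check: an anonymous request for a private-wiki
# file on the upload domain should be refused.
# classify_status and anonymous_status are hypothetical helper names.

# Map an HTTP status code to a Nagios-style result
# (0 = OK, 2 = CRITICAL, 3 = UNKNOWN).
classify_status() {
  case "$1" in
    401|403) echo "OK: anonymous access refused ($1)"; return 0 ;;
    2??)     echo "CRITICAL: anonymous access allowed ($1)"; return 2 ;;
    *)       echo "UNKNOWN: unexpected status ($1)"; return 3 ;;
  esac
}

# Fetch only the status code for a URL, sending no credentials.
anonymous_status() {
  curl -s -o /dev/null -w '%{http_code}' "$1"
}

# Hypothetical usage; $url must point at a known private-wiki file:
#   classify_status "$(anonymous_status "$url")"
```

Keeping the fetch separate from the classification means the decision logic can be exercised without network access.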

Tagging Privacy Engineering for guidance on at least some of the above.

We definitely should run a hadoop query (or a set of queries) to get a sense of access over the past 90 days. I pulled database codes / domain names from canonical_data.wikis where status = "open" and visibility = "private" and got the following list:

database_code              domain
advisorswiki               advisors.wikimedia.org
arbcom_cswiki              arbcom-cs.wikipedia.org
arbcom_dewiki              arbcom-de.wikipedia.org
arbcom_enwiki              arbcom-en.wikipedia.org
arbcom_fiwiki              arbcom-fi.wikipedia.org
arbcom_nlwiki              arbcom-nl.wikipedia.org
arbcom_ruwiki              arbcom-ru.wikipedia.org
auditcomwiki               auditcom.wikimedia.org
boardgovcomwiki            boardgovcom.wikimedia.org
boardwiki                  board.wikimedia.org
chairwiki                  chair.wikimedia.org
chapcomwiki                affcom.wikimedia.org
checkuserwiki              checkuser.wikimedia.org
collabwiki                 collab.wikimedia.org
ecwikimedia                ec.wikimedia.org
electcomwiki               electcom.wikimedia.org
execwiki                   exec.wikimedia.org
fdcwiki                    fdc.wikimedia.org
grantswiki                 grants.wikimedia.org
id_internalwikimedia       id-internal.wikimedia.org
iegcomwiki                 iegcom.wikimedia.org
ilwikimedia                il.wikimedia.org
legalteamwiki              legalteam.wikimedia.org
movementroleswiki          movementroles.wikimedia.org
noboard_chapterswikimedia  noboard-chapters.wikimedia.org
officewiki                 office.wikimedia.org
ombudsmenwiki              ombuds.wikimedia.org
otrs_wikiwiki              vrt-wiki.wikimedia.org
projectcomwiki             projectcom.wikimedia.org
stewardwiki                steward.wikimedia.org
sysop_itwiki               sysop-it.wikipedia.org
techconductwiki            techconduct.wikimedia.org
wg_enwiki                  wg-en.wikipedia.org
wikimaniateamwiki          wikimaniateam.wikimedia.org

Could we run queries to get unauthorized access for these wikis? We should take up the question of notifying legal once we have a sense of the scale of the problem.

Re: cached data — is there any way of getting all potentially vulnerable files and forcing them to be purged from the cache?

Change 934654 had a related patch set uploaded (by QChris; author: Christian Aistleitner):

[integration/config@master] Zuul: Follow IncidentReporting -> ReportIncident extension rename

https://gerrit.wikimedia.org/r/934654

Yeah, it should be rather easy to do: query webrequest with uri_host = 'upload.wikimedia.org' and uri_path like '/wikipedia/office/%' (off the top of my head, not sure it works 100%)
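A rough sketch of what that query could look like, generated per wiki so the same text can be reused for each private wiki. The table name wmf.webrequest, the webrequest_source = 'upload' filter, the uri_path and http_status column names, and the helper name build_query are all assumptions rather than anything confirmed in this task. A real run would presumably also need date or partition predicates to avoid scanning everything.

```shell
# Sketch: print a query counting upload-domain requests for one wiki,
# grouped by HTTP status. build_query is a hypothetical helper name.
# Assumed (not confirmed here): table wmf.webrequest, the
# webrequest_source value 'upload', and the column names used below.
# The body runs in a subshell so its variables do not leak.
build_query() (
  family=$1   # first path component, e.g. "wikipedia"
  wiki=$2     # second path component, e.g. "office"
  cat <<SQL
SELECT http_status, COUNT(*) AS requests
FROM wmf.webrequest
WHERE webrequest_source = 'upload'
  AND uri_host = 'upload.wikimedia.org'
  AND uri_path LIKE '/${family}/${wiki}/%'
GROUP BY http_status;
SQL
)

# Hypothetical usage:
#   build_query wikipedia office
```

Grouping by status separates requests that were actually served from those that were refused.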

So they get cached for seven days, and now it's easier to just let them expire. My thinking was along the lines of possibly having swift set the cache header to private in case something similar happens in the future. But it's honestly so low-prio that we can simply not do it.

Change 934654 abandoned by Hashar:

[integration/config@master] Zuul: Follow IncidentReporting -> ReportIncident extension rename

Reason:

Kosta went to do the same via https://gerrit.wikimedia.org/r/c/integration/config/+/935044/ which I have merged before noticing your change :)

https://gerrit.wikimedia.org/r/934654

Given that more than ninety days have passed since this bug got fixed, we don't have any logs of who might have accessed the private files. I suggest closing this and filing follow ups for fixing setZone and other issues?

sbassett claimed this task.
sbassett moved this task from Watching to Our Part Is Done on the Security-Team board.

Sounds fine. Is there anything keeping this task from becoming public? After a quick glance, I'm not seeing any obvious PII?

Sounds fine with me. Maybe @MatthewVernon might have objections though?

No complaints from me (after all, the docs update at least hints that there has been a problem in this area).

sbassett changed Author Affiliation from N/A to WMF Technology Dept. Jun 21 2024, 5:05 PM
sbassett changed the visibility from "Custom Policy" to "Public (No Login Required)".
sbassett changed the edit policy from "Custom Policy" to "All Users".
sbassett changed Risk Rating from N/A to High.