OpenRefine stores passwords in plain text. These are now publicly accessible in PAWS public.
Risk Rating: Medium
Author Affiliation: Wikimedia Communities
To be more specific, the files workspace.json and workspace.old.json both contain the username and password in clear text ("wikidata_credentials":["username", "password"]), and these files are now public on PAWS (here is mine, for instance: https://public.paws.wmcloud.org/1200/).
Is it possible to have non-public files on PAWS?
Otherwise, I suppose direct Wikidata editing should be disabled, and users would have to make edits through the QuickStatements export.
Side note, @Bouzinac has all but guessed this vulnerability over at https://www.wikidata.org/wiki/Wikidata_talk:Tools/OpenRefine#OpenRefine_now_available_on_PAWS%21, I suggest reaching out to prevent any further public speculation.
It should be possible. But we'd probably need to automate file permissions on the workspace.json file.
I have already sent him an email (and this is indeed how I found the problem).
This is not a security issue with PAWS. The file is world-readable. I changed that one to not be world readable. If you open a shell in PAWS and use chmod 600 workspace.json it will fix this.
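For anyone who would rather script the fix than type it in a shell, here is a minimal Python sketch of the same `chmod 600` step, plus a check for the world-readable bit. The helper names `make_private` and `is_world_readable` are made up for illustration; only `os.chmod` and `os.stat` are real standard-library calls.

```python
import os
import stat
import tempfile


def make_private(path):
    """Equivalent of `chmod 600 path`: read/write for the owner only."""
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)
    return stat.S_IMODE(os.stat(path).st_mode)


def is_world_readable(path):
    """True if the 'other' read bit is set on the file."""
    return bool(os.stat(path).st_mode & stat.S_IROTH)


if __name__ == "__main__":
    # Demonstrate on a throwaway file rather than a real workspace.json.
    with tempfile.TemporaryDirectory() as home:
        path = os.path.join(home, "workspace.json")
        with open(path, "w") as f:
            f.write("{}")
        os.chmod(path, 0o644)  # simulate the exposed state
        print("before:", is_world_readable(path))
        make_private(path)
        print("after:", is_world_readable(path))
```

This only repairs the file at the moment it is run. If the application rewrites the file with looser permissions later, the check has to be repeated.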
@VIGNERON OpenRefine is somehow changing the files to world-readable. I don't know why yet, but it should not be used until this is fixed. Please stop using it. We will remove it from PAWS until this is sorted out.
Yes, I saw that and I'm not using it (I never used OpenRefine itself, I only looked at the files in PAWS; I'll stop entirely to avoid any problem).
At a glance I don’t see any special permission stuff in saveWorkspace(). As far as I can tell, Java programs (or at least the java.io.File part) seem to respect the umask; maybe we can change the umask to something like 0027 before launching OpenRefine?
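To illustrate why the umask suggestion would help, here is a small Python sketch of how the umask masks the mode a program requests when it creates a file. It assumes a POSIX system; `create_with_umask` is a hypothetical helper, and the requested mode of 0o666 is an assumption about what a typical program asks for, not something taken from the OpenRefine source.

```python
import os
import stat
import tempfile


def create_with_umask(path, umask):
    """Create a new file requesting mode 0o666 under the given umask.

    Returns the permission bits the file actually ends up with.
    """
    previous = os.umask(umask)  # os.umask returns the old value
    try:
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o666)
        os.close(fd)
    finally:
        os.umask(previous)  # restore the umask for the rest of the process
    return stat.S_IMODE(os.stat(path).st_mode)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        loose = create_with_umask(os.path.join(d, "a.json"), 0o022)
        tight = create_with_umask(os.path.join(d, "b.json"), 0o027)
        print(oct(loose), oct(tight))
```

Child processes inherit the umask, so setting it in the launcher before starting the application affects every file the application creates afterwards. It does not change the permissions of files that already exist.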
Ok, I've forced a pull of the new image without OpenRefine and deleted all user servers.
Ok, at this point we are working on fixing the whole issue with OpenRefine, and PAWS now definitely respects filesystem permissions (I found ways around them, so I created https://github.com/toolforge/paws/pull/73, which is already deployed).
Grepping for wikidata_credentials I only found three affected users. I've changed permissions on the files to limit the damage but credentials need to be rotated.
PKM@PKMbot, bouzinac, Martin Urbanec@OpenRefinePAWS
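A search like the one described above could be sketched in Python roughly as follows. The directory layout (one subdirectory per user under a common root) and the function name `find_exposed_workspaces` are assumptions for illustration, not the actual PAWS storage layout or the command that was run.

```python
import os
import stat
import tempfile

NEEDLE = b"wikidata_credentials"
WORKSPACE_FILES = ("workspace.json", "workspace.old.json")


def find_exposed_workspaces(root):
    """Return sorted (path, world_readable) pairs for workspace files
    under root that contain the credentials key."""
    hits = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            if name not in WORKSPACE_FILES:
                continue
            path = os.path.join(dirpath, name)
            try:
                with open(path, "rb") as f:
                    data = f.read()
            except OSError:
                continue  # unreadable files are skipped, not reported
            if NEEDLE in data:
                mode = os.stat(path).st_mode
                hits.append((path, bool(mode & stat.S_IROTH)))
    return sorted(hits)


if __name__ == "__main__":
    # Build a tiny fake tree so the sketch runs without real user data.
    with tempfile.TemporaryDirectory() as root:
        user_dir = os.path.join(root, "example-user")
        os.makedirs(user_dir)
        path = os.path.join(user_dir, "workspace.json")
        with open(path, "w") as f:
            f.write('{"wikidata_credentials":["username", "password"]}')
        os.chmod(path, 0o644)
        for hit, exposed in find_exposed_workspaces(root):
            print(hit, "world-readable" if exposed else "private")
```

Tightening permissions on the files found this way only limits further exposure. Any credential that was readable still has to be treated as compromised.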
I talked to @Urbanecm on IRC and provided a list of all folks who used it to rotate creds in case they'd logged in with wiki credentials.
Bot passwords PKM@PKMbot and Martin Urbanec@OpenRefinePAWS invalidated, user account Bouzinac disabled temporarily as compromised. Users informed.
I've pushed out the umask change for OpenRefine. The actual installation in the image is now at https://github.com/toolforge/paws/pull/74
It is perhaps possible to test that in minikube with everything in place before re-releasing it. I have not done so yet.
Deploying the new image with OpenRefine on board (with a new umask trick). Hopefully, this time all will be well. You may need to restart PAWS servers to see the change (since I'm not forcibly blasting them all away like I did initially).
Ok, it's up and ready for testing. It may not be as useful now that all OpenRefine files are private (I hope), but if you try it out, please check PAWS public to be sure that it isn't doing anything bad before putting credentials you care about in there.
Thank you all for working on this!
I think it makes sense to keep all OpenRefine files private. Users can go through the UI if they want to download project data in various formats.
I do not think it is appropriate. Private files in PAWS should be the exception: OpenRefine should keep the files containing the password private, and then we can remove the umask. Ideally, we could even use OAuth tokens from PAWS to authenticate and not connect with passwords at all.
Anything here that would keep us from making this task public? I'm not seeing anything obvious.
Not at this point. When it was created, this task would have called attention to the plaintext-stored passwords of several users.