
OAuth Support for Pywikibot
Closed, Duplicate (Public)

Description

OAuth Support for Pywikibot
Secure Authentication for Wikimedia bots
March 17, 2015

  1. Synopsis

The Python MediaWiki Robot Framework (Pywikibot) [3] is a collection of tools built for maintenance work on Wikipedia, and it can also be used on other MediaWiki sites. Like other bots on Wikimedia's Tool Labs, Pywikibot stores account passwords on shared servers. This leaves the accounts vulnerable, and the aim of this project is to add OAuth support to Pywikibot to make it more secure.
In this project I propose integrating the Python library python-oauth2 [2] into pywikibot-core.

  2. The Project

OAuth is an open standard for authorization. Many apps and websites today are adopting OAuth as the standard way to grant third-party applications access to an account. It lets a user grant an application only the permissions it needs, without handing over the account password.

2.1. How it Works
OAuth provides client applications 'secure delegated access' to server resources on behalf of a resource owner. Designed specifically to work with the Hypertext Transfer Protocol (HTTP), OAuth essentially follows these steps to let an authorization server issue access tokens to third-party clients, with the approval of the resource owner, or end user. [1]

  1. The client submits an authorization request to the server, which validates that the client is a legitimate client of its service.
  2. The server redirects the client to the content provider to request access to its resources.
  3. The content provider validates the user's identity, and often requests their permission to access the resources.
  4. The content provider redirects the client back to the server, notifying it of success or failure. This request includes an authorization code on success.
  5. The server makes an out-of-band request to the content provider and exchanges the authorization code for an access token.

The server can now make requests to the content provider on behalf of the user by passing the access token. Each exchange (client->server, server->content provider) includes validation of a shared secret. Because OAuth 1 can run over an unencrypted connection, the secret itself is never passed over the wire; each request is signed with it instead.
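
The shared-secret validation described above can be sketched as an OAuth 1.0a HMAC-SHA1 signature. This is a minimal illustration using only the Python 3 standard library, not the code python-oauth2 runs internally; the function names and the example parameters are assumptions made for this sketch.

```python
import base64
import hashlib
import hmac
from urllib.parse import quote


def percent_encode(value):
    # OAuth 1.0a leaves only unreserved characters (letters, digits, "-._~") unescaped.
    return quote(str(value), safe="-._~")


def sign_request(method, url, params, consumer_secret, token_secret=""):
    """Return the base64-encoded HMAC-SHA1 signature for a request (hypothetical helper)."""
    # Encode every parameter, sort the pairs, and join them as key=value.
    pairs = sorted((percent_encode(k), percent_encode(v)) for k, v in params.items())
    normalized = "&".join("%s=%s" % pair for pair in pairs)
    base_string = "&".join(
        [method.upper(), percent_encode(url), percent_encode(normalized)]
    )
    # The secrets only form the HMAC key; they are never part of the request.
    key = "%s&%s" % (percent_encode(consumer_secret), percent_encode(token_secret))
    digest = hmac.new(
        key.encode("utf-8"), base_string.encode("utf-8"), hashlib.sha1
    ).digest()
    return base64.b64encode(digest).decode("ascii")
```

The receiver recomputes the same signature from the request and its own copy of the secrets, so a request that was altered or signed with a different secret fails validation.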

  3. Goal:

The goal of this project is to add OAuth support to Pywikibot. Pywikibot would then store only a token in the shared environment. In the worst case, someone who obtains the token can make a few edits with it, but the token can be revoked at any time and the malicious user cannot lock the rightful user out of the account.

  4. Benefits

OAuth Security Provisions
• MediaWiki users can allow other websites to edit and perform other actions through the MediaWiki API on their behalf.
• The user's password is not shared with the attached website; instead, the website is issued a unique token and secret to make calls on behalf of the user.
• The access is limited to explicit sets of permissions (“grants”) for the application. [1]
• Users can revoke their authorization of an attached application at any time.
• Administrators can reject entire applications at any time.

  5. Details of the project

5.1. Current Status
The current Pywikibot authentication mechanism works by storing passwords on shared servers. This is not a secure way to authenticate: anyone who obtains the password can lock the user out of their own account by changing the e-mail address, the password, etc.

The current login configuration for Pywikibot is:

mylang = 'en'
family = 'wikinews'
usernames['wikinews']['en'] = u'My Bot Name'

The user needs to have a bot account on some project.

5.2. Proposed Support
To support OAuth in Pywikibot we can use the Python package python-oauth2 [2]. It is a Python implementation of OAuth, forked from an earlier OAuth library, and its README lists the following modifications:
• 100% unit test coverage.
• The DataStore object has been completely removed. While creating unit tests for the library, its maintainer found several substantial bugs in that implementation and confirmed with Andy Smith that it was never fully baked.
• Classes are no longer prefixed with OAuth.
• The Request class now extends from dict.
• The library is likely no longer compatible with Python 2.3.
• The Client class works and extends from httplib2. It's a thin wrapper that handles automatically signing any normal HTTP request you might wish to make.

5.3. Example of the proposed work:
Here is a simple three-legged OAuth example, adapted from a Twitter example, which can be incorporated into Pywikibot.
The snippets below can be included (with some modifications) in the LoginManager class of login.py. This will allow the bot to authenticate using access keys securely generated by python-oauth2 with the HMAC-SHA1 signature method.
Note: the code below only illustrates the workflow that will be incorporated into Pywikibot.

			
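		try:
		    import urlparse  # Python 2
		except ImportError:
		    import urllib.parse as urlparse  # Python 3
		import oauth2 as oauth

		# Placeholder credentials and endpoints; replace them with the values
		# issued by the OAuth provider.
		consumer_key = 'user_key_for_pywikimedia'
		consumer_secret = 'my_secret_from_twitter'

		request_token_url = 'http://google.com/oauth/request_token'
		access_token_url = 'http://google.com/oauth/access_token'
		authorize_url = 'http://google.com/oauth/authorize'

		consumer = oauth.Consumer(consumer_key, consumer_secret)
		client = oauth.Client(consumer)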

Step 1: Get a temporary request token, which the user will authorize and which is used to sign the access token request.

		resp, content = client.request(request_token_url, "GET")
		if resp['status'] != '200':
		    raise Exception("Invalid response %s." % resp['status'])

		# The response body is a URL-encoded string of token fields.
		request_token = dict(urlparse.parse_qsl(content.decode('utf-8')))
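
The parsing in Step 1 can be tried without a network connection. The following is a minimal sketch using only the Python 3 standard library; parse_token_response is a hypothetical helper name, and the check for missing fields is an addition that is not part of the original snippet.

```python
from urllib.parse import parse_qsl


def parse_token_response(body):
    """Turn a URL-encoded token response into a dict (hypothetical helper)."""
    if isinstance(body, bytes):
        body = body.decode("utf-8")
    fields = dict(parse_qsl(body))
    # Both fields are needed later to sign the access token request.
    missing = {"oauth_token", "oauth_token_secret"} - set(fields)
    if missing:
        raise ValueError("token response is missing: %s" % ", ".join(sorted(missing)))
    return fields
```

Failing early on an incomplete response gives a clearer error than a KeyError raised later when the token is built.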

Step 2: Redirect to the provider. Since this is just a demo, I have not written the redirect method, but it can be added easily. After the user has granted access to the consumer, the provider redirects the user back to Pywikibot. The return URL is defined in the oauth_callback argument.
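
The redirect in Step 2 amounts to sending the user to the provider's authorization URL with the request token attached. A minimal sketch, assuming the token is passed as an oauth_token query parameter; build_authorize_url is a hypothetical helper name:

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def build_authorize_url(authorize_url, request_token):
    """Append the request token to the authorization URL (hypothetical helper)."""
    parts = urlsplit(authorize_url)
    # Keep any query parameters the authorization URL already carries.
    query = parse_qsl(parts.query)
    query.append(("oauth_token", request_token["oauth_token"]))
    return urlunsplit(parts._replace(query=urlencode(query)))
```

For a command-line bot the resulting URL could simply be printed so that the user opens it in a browser and approves the request there.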

Step 3: Once the provider has redirected the user back to the oauth_callback URL, Pywikibot can request the access token the user has approved. It uses the request token to sign this request. After this is done, it throws away the request token and uses the access token returned. Pywikibot should store this access token safely, in a database, for future use.

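		# oauth_verifier is the verification code returned with the callback.
		token = oauth.Token(request_token['oauth_token'],
		    request_token['oauth_token_secret'])
		token.set_verifier(oauth_verifier)
		client = oauth.Client(consumer, token)

		# Exchange the authorized request token for the access token.
		resp, content = client.request(access_token_url, "POST")
		access_token = dict(urlparse.parse_qsl(content.decode('utf-8')))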
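
Storing the access token safely, as Step 3 asks, could look like the sketch below. It writes the token to a JSON file readable only by its owner, which is a simplification swapped in for the database mentioned above; the helper names and the file layout are assumptions.

```python
import json
import os


def save_access_token(path, access_token):
    """Write the token dict to a file only the owner can read (hypothetical helper)."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as handle:
        json.dump(access_token, handle)
    # The mode given to os.open only applies to new files, so tighten existing ones too.
    os.chmod(path, 0o600)


def load_access_token(path):
    """Read the token dict back from disk (hypothetical helper)."""
    with open(path) as handle:
        return json.load(handle)
```

Even with restrictive permissions the token is still a credential, but unlike a password it can be revoked without changing the account itself.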

It's probably a good idea to put your consumer's OAuth token and OAuth secret into your project's settings.

		# settings stands for the project's own configuration module.
		consumer = oauth.Consumer(settings.TWITTER_TOKEN, settings.TWITTER_SECRET)
		client = oauth.Client(consumer)
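
One way to keep these values out of the source code is to read them from the environment when the bot starts. This is only a sketch: the variable prefix PYWIKIBOT_OAUTH_ and the four field names are hypothetical and do not correspond to existing Pywikibot settings.

```python
import os
from collections import namedtuple

OAuthCredentials = namedtuple(
    "OAuthCredentials",
    "consumer_key consumer_secret access_token access_secret",
)


def load_credentials(environ=None, prefix="PYWIKIBOT_OAUTH_"):
    """Collect the four OAuth values from environment variables (hypothetical names)."""
    environ = os.environ if environ is None else environ
    values = []
    for field in OAuthCredentials._fields:
        name = prefix + field.upper()
        if not environ.get(name):
            raise KeyError("missing environment variable %s" % name)
        values.append(environ[name])
    return OAuthCredentials(*values)
```

Passing a plain dict as environ makes the function easy to exercise in a unit test without touching the real environment.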

5.4. Test Plan
To complete this task, a unit test will be added to the test suite that performs a login and logout using OAuth, with assertions that verify APISite._userinfo is correct. A second unit test will log in, edit a user page, and confirm that the edit was performed using the OAuth-authenticated account. The unit tests will be configured to run on Travis CI when the secret key is available in the Travis configuration, and to be skipped when it is not.
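
The run-or-skip behaviour described in the test plan can be sketched with the standard unittest module. The environment variable name and the colon-separated token format are assumptions, and the test body is a placeholder: a real test would log in through Pywikibot and check the reported user.

```python
import os
import unittest

# Hypothetical variable that the CI configuration would set to
# "consumer_key:consumer_secret:access_token:access_secret".
TOKEN_ENV_VAR = "PYWIKIBOT_TEST_OAUTH_TOKENS"


class OAuthLoginTestCase(unittest.TestCase):

    def setUp(self):
        raw = os.environ.get(TOKEN_ENV_VAR)
        if not raw:
            # Without the secret the test is reported as skipped, not failed.
            self.skipTest("%s is not set" % TOKEN_ENV_VAR)
        self.tokens = tuple(raw.split(":"))

    def test_tokens_are_complete(self):
        # Placeholder assertion standing in for the real login check.
        self.assertEqual(len(self.tokens), 4)
        self.assertTrue(all(self.tokens))


def run_suite():
    """Run the test case and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(OAuthLoginTestCase)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Checking for the secret in setUp rather than at import time means the decision is made when the test runs, so the same test module works both locally and on the CI server.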

  6. Required Deliverables:
    1. Pywikibot should be able to perform login and logout using python-oauth2.
    2. By the end of this project Pywikibot should be able to use OAuth for secure authentication and successful login.
    3. Pywikibot should pass the regression and unit tests proposed above.
  7. Timeline

This is a tentative timeline:
7.1. For Mid-evaluation (3 July 2015):

•	Implement OAuth support to Pywikibot.
•	Create successful login and log out sessions using OAuth.

7.2. For Final Evaluation (21 August 2015) :

•	Further Modifications if asked by mentors.
•	Unit and Regression Tests.
  8. Additional Info:

Project Wiki Page : http://www.mediawiki.org/wiki/Manual:Pywikibot
Codebase : https://github.com/wikimedia/pywikibot-core
Py-oauth2 : https://github.com/simplegeo/python-oauth2

  9. References

[1]. “OAuth/For Developers”, https://www.mediawiki.org/wiki/OAuth/For_Developers
[2]. “python-oauth2”, https://github.com/simplegeo/python-oauth2
[3]. “Pywikibot”, http://en.wikiversity.org/wiki/Pywikipediabot
[4]. “RFC 6749: The OAuth 2.0 Authorization Framework”, http://tools.ietf.org/html/rfc6749

  10. Past Experience

I have worked on a number of open-source projects and have sound knowledge of version control systems like Git.

10.1. I interned at Samsung Research Institute Bangalore, where my tasks included the following:

10.1.1.	Integrated OAuth into one of the core modules of their intranet system. It was done in Python, and I used python-oauth2 for it (the same library I am proposing here).
10.1.2.	Wrote a Python version of Android Debug Bridge (ADB), named pyadb.

I cannot disclose the links because of Samsung's non-disclosure agreement.

10.2. An e-commerce website built with Django and Python. The project is here.
10.3. I have also worked with a social network group called AroundYoga, based in Philadelphia, where I used OmniAuth for authentication. OmniAuth is a Ruby gem that supports OAuth. Here is the link.
10.4. I have also fixed a bug in gedit. Here is the link.

  11. About Me

Email : sampadmedda@gmail.com
Phone No : (+91) 9739570362
College : National Institute of Technology Karnataka
Class : B.Tech, Computer Engineering, Final Year 2015
Location: India
Time Zone: UTC+5:30
Typical working hours: 6PM to 2AM before 9th May, 12PM to 1AM after 9th May (Indian Standard Time)

Event Timeline

Sampadmedda claimed this task.
Sampadmedda raised the priority of this task from to High.
Sampadmedda updated the task description.
Sampadmedda subscribed.
Aklapper added a project: Pywikibot-General.

This particular problem has already been reported into our bug tracking system. Further handling of the reported issue happens in T74065.

Restricted Application added a subscriber: Unknown Object (MLST). Mar 18 2015, 1:16 PM
Qgil set Security to None.
Qgil subscribed.

This is in fact a stub of a Google-Summer-of-Code (2015) application.

Ricordisamoa renamed this task from OAuth Support for PyWikiBot to OAuth Support for Pywikibot. Mar 18 2015, 1:48 PM

@Sampadmedda , have you started or finished a microtask described in T74065?

You mention that you have a merged bugfix to gedit, and link to https://github.com/ndbroadbent/dotfiles , which is a project rather than your change. Could you link to the diff of your bugfix? Do you have any Python code which is open source? Or can you link to your profile on any Q&A site where you have contributed answers to Python-related questions? etc.

Could not upload file.
Please find my proposed solution at this following link
GSOC-15-Proposal_sampad

Regards
Sampad

@Sampadmedda what do you mean with "Could not upload file"? What kind of file is it?
I think this is your commit to dotfiles, isn't it?

@Sampadmedda, your proposal is not public. Could you add it to this task instead so everyone can comment on it and give feedback? Thank you!

@NiharikaKohli , I updated the description part.

@NiharikaKohli I'd say a page on mediawiki.org is more suitable for such a long document that is supposed to be edited frequently. The task description would then simply link to it.

@Ricordisamoa, Yes that is the link I'm referring to.
Thanks

@Ricordisamoa, this is the new proposal system we are trying out this year. Allows more people to subscribe, drop comments, get notified. I don't think editing here is significantly different from wiki editing either. Phabricator supports wikitext. Proposals on wiki sometimes get lost in the abyss and miss out on feedback. We could always revert back to the wiki page system next year if this doesn't turn out to be more successful. :)

A bit offtopic: I think @Ricordisamoa's main concern is that for each edit a “foobar edited the task description. (Show Details)” appears in the history here so comments get interrupted by it. And depending on the user's settings each subscriber gets notified for each revision.

@NiharikaKohli I'm afraid a system requiring screen-long descriptions and continuous "edited the task description" notifications isn't going to work :-\

@Sampadmedda, you have edited the description 16 times in 3h30, and you could probably have ended up in the same place by doing those edits locally and then posting one update here. Also, if you need a wiki page to draft your proposal, we can offer you all the wiki pages you need under your User page in mediawiki.org.

We have good reasons to think that Phabricator beats mediawiki.org when it comes to presenting a proposal, gathering interest, and driving community discussion. We are asking GSoC students to use one tool in addition to Google Melange to share their proposals publicly, and I don't think requesting them to use three tools to handle their proposals is a better idea.

Then again this is a new process, and you are among the first applicants. Some rough edges are expected. Thank you for your understanding.

@Qgil, @Ricordisamoa , I have made my final draft and there would be no further changes from my side unless and until asked specifically by the members or mentors.

Please do understand that I am new to Phabricator and wanted my proposal to be as complete as possible.
This is my first attempt at GSoC, and even though I know that good content speaks for itself, I just don't want to leave any stone unturned. I was actually learning from other proposals and incorporating the necessary edits.
Now, this is the final edit from my side.
Thank you for understanding.

@Sampadmedda , in your proposal (10.1) you mention that you interned at 'Samsung Research Institute Bangalore'. When was that, and how long were you an intern there?
In 10.3, your link to https://bitbucket.org/SampadMedda/aroundyogatransfer is reporting "You do not have access to this repository." and https://bitbucket.org/SampadMedda says "Nothing to see here These repositories are private. Or they don't exist." Also http://www.aroundyoga.com/ is offline. Is there any public information about this project? Any chance that the code could be open sourced?

@jayvdb, I was an intern at Samsung from May 2014 to July 2014. I can send in proof of employment if u need :).

For aroundyoga the repo is private, my employer is cautious about this, but I can give u access to a cloned repo if u need.

Aroundyoga is offline as we are migrating it to Clot.co

How does your pyadb compare with this codebase, created in 2012?

I ported adb to python from scratch for automating Samsung devices as part of my project. I was given their own HQ tweaked codebase, I had no idea about this one....

But going through adb.py, it seems to match most of the functionality that I had incorporated in my version, except that I added some more, like integrating uiautomator capability.