OAuth Support for Pywikibot
Secure Authentication for Wikimedia bots
March 17, 2015
1. Synopsis
The Python MediaWiki Robot Framework (Pywikibot) [3] is a collection of tools built for maintenance work on Wikipedia that can also be used on other MediaWiki sites. Like other bots on Wikimedia's Tool Labs, Pywikibot stores account passwords on shared servers. This leaves the accounts vulnerable, and the aim of this project is to add OAuth support to Pywikibot to make it more secure.
To achieve this, I propose integrating the Python library python-oauth2 [2] into pywikibot-core.
2. The Project
OAuth is an open standard for authorization. Many applications and websites have adopted it as a standard way to handle delegated access. It lets a user grant a third-party application only the account permissions it needs, without sharing a password.
2.1. How it Works
OAuth provides client applications with 'secure delegated access' to server resources on behalf of a resource owner. Designed specifically to work with the Hypertext Transfer Protocol (HTTP), OAuth essentially follows these steps to let an authorization server issue access tokens to third-party clients with the approval of the resource owner, or end user. [1]
- The client submits an authorization request to the server, which validates that the client is a legitimate client of its service.
- The server redirects the client to the content provider to request access to its resources.
- The content provider validates the user's identity, and often requests their permission to access the resources.
- The content provider redirects the client back to the server, notifying it of success or failure. This request includes an authorization code on success.
- The server makes an out-of-band request to the content provider and exchanges the authorization code for an access token.
The server can now make requests to the content provider on behalf of the user by passing the access token. Each exchange (client->server, server->content provider) includes validation of a shared secret. Because OAuth 1 can run over an unencrypted connection, the secret itself is never passed over the wire; it is used to sign each request instead.
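The secret can stay off the wire because each request carries a signature computed from it. The following is a minimal sketch of HMAC-SHA1 signing using only the Python 3 standard library. It is simplified compared with the full OAuth 1.0a rules (for example, it does not normalize the URL or handle repeated parameters), and the helper names, the example URL, and the key values are assumptions made for illustration, not part of Pywikibot or python-oauth2.

```python
import base64
import hashlib
import hmac
from urllib.parse import quote


def percent_encode(value):
    # Only unreserved characters (letters, digits, "-", ".", "_", "~") stay as-is.
    return quote(str(value), safe="-._~")


def signature_base_string(method, url, params):
    # Sort the encoded parameters and join them into one normalized string.
    pairs = sorted((percent_encode(k), percent_encode(v)) for k, v in params.items())
    normalized = "&".join("%s=%s" % pair for pair in pairs)
    return "&".join([method.upper(), percent_encode(url), percent_encode(normalized)])


def sign_request(method, url, params, consumer_secret, token_secret=""):
    # The signing key combines both secrets; it never leaves the client.
    key = "%s&%s" % (percent_encode(consumer_secret), percent_encode(token_secret))
    base = signature_base_string(method, url, params)
    digest = hmac.new(key.encode("utf-8"), base.encode("utf-8"), hashlib.sha1).digest()
    return base64.b64encode(digest).decode("ascii")


# Hypothetical request parameters, for illustration only.
params = {
    "oauth_consumer_key": "example_consumer_key",
    "oauth_nonce": "abc123",
    "oauth_signature_method": "HMAC-SHA1",
    "oauth_timestamp": "1426550400",
    "oauth_version": "1.0",
}
signature = sign_request("GET", "https://example.org/oauth/initiate", params,
                         "example_consumer_secret")
print(signature)
```

Only the signature and the public parameters are transmitted. The receiving side, which also knows the secret, recomputes the signature and compares the two values.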
3. Goal
The goal of this project is to integrate OAuth into Pywikibot, so that only a token needs to be stored in the shared environment. In the worst case, someone who obtains the token can make a few edits with it, but the token can be revoked at any time and the malicious user cannot lock the rightful user out of the account.
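What "storing only a token" could look like in a bot's configuration is sketched below. This is an illustration under assumptions, not the final design: the `authenticate` mapping, the `OAuthCredentials` container, the `get_oauth_credentials` helper, and all placeholder strings are hypothetical names chosen for this sketch.

```python
from collections import namedtuple

# Hypothetical container for the four values an OAuth 1.0a client needs.
OAuthCredentials = namedtuple(
    "OAuthCredentials",
    ["consumer_key", "consumer_secret", "access_token", "access_secret"],
)

# Hypothetical config mapping: site hostname -> credentials (placeholders only).
authenticate = {
    "en.wikinews.org": OAuthCredentials(
        "consumer-key-placeholder",
        "consumer-secret-placeholder",
        "access-token-placeholder",
        "access-secret-placeholder",
    ),
}


def get_oauth_credentials(hostname, config=authenticate):
    """Return the credentials for hostname, or None if OAuth is not configured."""
    return config.get(hostname)


print(get_oauth_credentials("en.wikinews.org").access_token)
print(get_oauth_credentials("example.org"))
```

A site without an entry returns None, which would let the bot fall back to the existing password login.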
4. Benefits
OAuth Security Provisions
• MediaWiki users can allow other websites to edit and perform other actions using the MediaWiki API on their behalf.
• The attached website never receives the user's password; instead it is issued a unique token and secret to make calls on behalf of the user.
• The access is limited to explicit sets of permissions (“grants”) for the application.[1]
• Users can revoke their authorization of an attached application at any time.
• Administrators can reject entire applications at any time.
5. Details of the Project
5.1. Current Status
The current Pywikibot authentication mechanism works by storing passwords on shared servers. This is not a secure way to authenticate: anyone who obtains the password can lock the user out of their own account by changing the e-mail address, password, etc.
The current login configuration for Pywikibot is:
    mylang = 'en'
    family = 'wikinews'
    usernames['wikinews']['en'] = u'My Bot Name'
The user needs to have a bot account on some project.
5.2. Proposed Support
To support OAuth in Pywikibot we can use the Python package python-oauth2 [2]. It is a Python implementation of OAuth. According to its README, it differs from the earlier Python OAuth library it was forked from in the following ways:
• 100% unit test coverage.
• The DataStore object has been completely ripped out. While creating unit tests for the library, its author found several substantial bugs in the implementation and confirmed with Andy Smith that it was never fully baked.
• Classes are no longer prefixed with OAuth.
• The Request class now extends from dict.
• The library is likely no longer compatible with Python 2.3.
• The Client class works and extends from httplib2. It's a thin wrapper that handles automatically signing any normal HTTP request you might wish to make.
5.3. Example of the Proposed Work
Here I present a simple Twitter three-legged OAuth example, which can be easily incorporated into Pywikibot.
Below are some python-oauth2 snippets which can be included (with some modifications) in the LoginManager class of login.py. They will allow the bot to authenticate using access keys securely generated through OAuth with the HMAC-SHA1 signature method.
Note: The code below only illustrates the workflow that will be incorporated into Pywikibot; the keys and URLs are placeholders.
    import urlparse  # Python 2 module, used below to parse the token responses

    import oauth2 as oauth

    consumer_key = 'user_key_for_pywikimedia'
    consumer_secret = 'my_secret_from_twitter'
    request_token_url = 'http://google.com/oauth/request_token'
    access_token_url = 'http://google.com/oauth/access_token'
    authorize_url = 'http://google.com/oauth/authorize'

    consumer = oauth.Consumer(consumer_key, consumer_secret)
    client = oauth.Client(consumer)
Step 1: Get a temporary request token which will be used for the user to authorize an access token and to sign the request.
    resp, content = client.request(request_token_url, "GET")
    if resp['status'] != '200':
        raise Exception("Invalid response %s." % resp['status'])
    request_token = dict(urlparse.parse_qsl(content))
Step 2: Redirect to the provider. Since this is just a demo, I have not written the redirect method, but it can be added easily. After the user has granted access to the consumer, the provider redirects the user back to Pywikibot. The return URL can be defined in the oauth_callback argument.
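The redirect in Step 2 could be sketched as follows, using only the Python 3 standard library. The helper names `build_authorize_url` and `extract_verifier`, the example URLs, and the token values are assumptions made for illustration; a real provider may require additional parameters.

```python
from urllib.parse import parse_qs, urlencode, urlparse


def build_authorize_url(authorize_url, request_token):
    # Append the temporary request token so the provider knows which request to approve.
    query = urlencode({"oauth_token": request_token["oauth_token"]})
    return "%s?%s" % (authorize_url, query)


def extract_verifier(callback_url):
    # After approval, the provider appends oauth_verifier to the callback URL.
    query = parse_qs(urlparse(callback_url).query)
    return query["oauth_verifier"][0]


# Hypothetical values standing in for the response from Step 1.
request_token = {"oauth_token": "temp-token", "oauth_token_secret": "temp-secret"}

print(build_authorize_url("https://example.org/oauth/authorize", request_token))
print(extract_verifier(
    "https://bot.example.org/callback?oauth_token=temp-token&oauth_verifier=v123"))
```

The user opens the first URL in a browser to approve the request; the verifier read from the callback URL is what the next step passes to the token.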
Step 3: Once the provider has redirected the user back to the oauth_callback URL, Pywikibot can request the access token the user has approved. It uses the request token to sign this request. After this is done, it throws away the request token and uses the access token returned. Pywikibot should store this access token safely, in a database, for future use.
    # oauth_verifier is the verification code the provider returned in Step 2.
    token = oauth.Token(request_token['oauth_token'], request_token['oauth_token_secret'])
    token.set_verifier(oauth_verifier)
    client = oauth.Client(consumer, token)
    resp, content = client.request(access_token_url, "POST")
    access_token = dict(urlparse.parse_qsl(content))
It's probably a good idea to put your consumer's OAuth token and OAuth secret into your project's settings.
    consumer = oauth.Consumer(settings.TWITTER_TOKEN, settings.TWITTER_SECRET)
    client = oauth.Client(consumer)
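Keeping the access token safe for future use could be sketched as below. For brevity this sketch swaps the database for a JSON file that only its owner can read, and it assumes a POSIX system where file modes are enforced. The function names, file name, and token values are hypothetical.

```python
import json
import os
import stat
import tempfile


def save_access_token(path, access_token):
    # Create the file with owner-only read/write permissions (mode 0600).
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as handle:
        json.dump(access_token, handle)


def load_access_token(path):
    with open(path) as handle:
        return json.load(handle)


# Hypothetical token values; a real token would come from the access token request.
example_token = {"oauth_token": "placeholder", "oauth_token_secret": "placeholder-secret"}
token_path = os.path.join(tempfile.mkdtemp(), "oauth_token.json")

save_access_token(token_path, example_token)
print(load_access_token(token_path)["oauth_token"])
print(oct(stat.S_IMODE(os.stat(token_path).st_mode)))
```

Passing the mode to `os.open` means the file is never created with wider permissions, which matters on a shared server.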
5.4. Test Plan
To complete this task, one unit test would be added to the test suite to log in and log out using OAuth, with assertions that verify APISite._userinfo is correct. A second unit test should log in, edit a user page, and confirm that the edit was made by the OAuth-authenticated account. The unit tests would be configured to run on Travis CI when the secret key is available in the Travis configuration, and to be skipped when it is not.
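The "run when the secret is available, skip otherwise" behaviour could be sketched with the standard unittest module as below. The environment variable name `PYWIKIBOT_OAUTH_TOKENS`, its colon-separated format, and the test body are assumptions for illustration; the real tests would log in through Pywikibot.

```python
import os
import unittest

# Hypothetical environment variable that the CI service would set from its secrets.
OAUTH_ENV_VAR = "PYWIKIBOT_OAUTH_TOKENS"


def oauth_tokens_available(environ=os.environ):
    # Expect four colon-separated values: consumer key/secret and access token/secret.
    value = environ.get(OAUTH_ENV_VAR, "")
    return len(value.split(":")) == 4


class TestOAuthLogin(unittest.TestCase):

    @unittest.skipUnless(oauth_tokens_available(), "OAuth tokens not configured")
    def test_login_logout(self):
        # Placeholder body: the real test would log in via Pywikibot and compare
        # the user info reported by the site with the bot account name.
        tokens = os.environ[OAUTH_ENV_VAR].split(":")
        self.assertEqual(len(tokens), 4)


# Run the suite explicitly so the example does not exit the interpreter.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestOAuthLogin)
result = unittest.TextTestRunner().run(suite)
```

On a machine without the variable set, the test is reported as skipped rather than failed, so contributors without credentials still get a passing run.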
6. Required Deliverables
- Pywikibot should be able to perform login and logout using python-oauth2.
- By the end of this project Pywikibot should be able to use OAuth for secure authentication and successful login.
- Pass regression and unit tests proposed above.
7. Timeline
This is a tentative timeline:
7.1. For Mid-evaluation (3 July 2015):
• Implement OAuth support in Pywikibot.
• Create successful login and logout sessions using OAuth.
7.2. For Final Evaluation (21 August 2015):
• Further modifications if asked by mentors.
• Unit and regression tests.
8. Additional Info
Project Wiki Page : http://www.mediawiki.org/wiki/Manual:Pywikibot
Codebase : https://github.com/wikimedia/pywikibot-core
Py-oauth2 : https://github.com/simplegeo/python-oauth2
9. References
[1]. "OAuth/For Developers", https://www.mediawiki.org/wiki/OAuth/For_Developers
[2]. "python-oauth2", https://github.com/simplegeo/python-oauth2
[3]. "Pywikibot", http://en.wikiversity.org/wiki/Pywikipediabot
[4]. "RFC 6749: The OAuth 2.0 Authorization Framework", http://tools.ietf.org/html/rfc6749
10. Past Experience
I have worked on a number of open-source projects and have sound knowledge of version control systems like Git.
10.1. I interned at Samsung Research Institute Bangalore, where my tasks included the following:
10.1.1. Integrated OAuth into one of the core modules of their intranet system. It was done in Python using python-oauth2 (the same library I am proposing here).
10.1.2. Wrote a Python version of the Android Debug Bridge (ADB), called pyadb.
I cannot disclose the links because of Samsung's non-disclosure agreement.
10.2. E-commerce website using Django and Python. The project is here.
10.3. I have also worked with a social network group called AroundYoga, based in Philadelphia, where I used OmniAuth, a Ruby gem that supports OAuth, for authentication. Here is the link.
10.4. I have also fixed a bug in Gedit. Here is the link.
11. About Me
Email : sampadmedda@gmail.com
Phone No : (+91) 9739570362
College : National Institute of Technology Karnataka
Class : B.Tech, Computer Engineering, Final Year 2015
Location: India
Time Zone: UTC+5:30
Typical working hours: 6PM to 2AM before 9th May, 12PM to 1AM after 9th May (Indian Standard Time)