Pywikibot OAuth/BotPassword authentication fails when logging in to third-party Wikimedia sites (Superset, Commons Query)
Open, Medium | Public | BUG REPORT

Description

Pywikibot authentication with username and password works as expected when users need to authenticate to third-party sites such as superset.toolforge.org and commons-query.wikimedia.org. However, when the OAuth or BotPassword authentication methods are used, the follow-up login to these third-party sites fails.

This issue is not Pywikibot-specific and also affects other tools like PAWS. The root cause is likely related to how MediaWiki OAuth and BotPassword logins are implemented: the follow-up Superset and Commons Query logins appear to require an active web login session for the third-party OAuth login to work.

Other related tickets

  • T395664 TestSupersetWithAuth.test_login_and_oauth_permission tests fails after moving to Botpassword

Steps to Reproduce:

Working scenario (username/password):

  1. User logs into https://meta.wikimedia.org using web browser
  2. User logs into https://superset.toolforge.org using web browser
  3. User configures Pywikibot with plain username/password:

user-config.py:

usernames["meta"]["meta"] = "WIKIMEDIA_USERNAME"
  4. User runs the following script:

superset_test.py:

import pywikibot
from pywikibot.data.superset import SupersetQuery

sql_query = "SELECT page_title FROM page LIMIT 1"
site = pywikibot.Site('meta', 'meta')
site.login()
superset = SupersetQuery(site=site)
pages = superset.query(sql_query)
print(pages)

Expected output:

[{'page_title': '!vote'}]

Failing scenario example (BotPassword):

  1. User logs into https://meta.wikimedia.org using web browser
  2. User logs into https://superset.toolforge.org using web browser
  3. User registers BotPassword credentials at https://meta.wikimedia.org/wiki/Special:BotPasswords
  4. User configures user-config.py to use the BotPassword (see https://www.mediawiki.org/wiki/Manual:Pywikibot/BotPasswords)
  5. User runs superset_test.py

user-config.py:

usernames["meta"]["meta"] = "WIKIMEDIA_USERNAME"
password_file = "user-password.py"

user-password.py:

('zache-test', BotPassword('BOTNAME', 'BOTPASSWORD'))

Actual Result:

  • Script enters an HTTP redirect loop
  • When opening the redirect URL in a browser, it shows a login form with the message: "The request to sign in was denied" (see screenshot)

*Error log*

ERROR: An error occurred for uri https://meta.wikimedia.org/w/index.php?title=Special:OAuth/approve&returnto=%2Fw%2Frest.php%2Foauth2%2Fauthorize&returntoquery=client_id%3D__ID_REMOVED__%26redirect_uri%3Dhttps%253A%252F%252Fsuperset.wmcloud.org%252Foauth-authorized%252Fmediawiki%26response_type%3Dcode%26scope%3Dmwoauth-authonlyprivate%26state%3D__STATE_REMOVED__&client_id=__CLIENT_ID_REMOVED__&oauth_version=2&scope=mwoauth-authonlyprivate
Traceback (most recent call last):
  File "/Users/wiki/79/PendingChangesBot-ng/app/../foo.py", line 16, in <module>
    superset.login()
  File "/Users/wiki/79/PendingChangesBot-ng/venv/lib/python3.9/site-packages/pywikibot/data/superset.py", line 88, in login
    self.last_response = http.fetch(url)
  File "/Users/wiki/79/PendingChangesBot-ng/venv/lib/python3.9/site-packages/pywikibot/comms/http.py", line 460, in fetch
    callback(response)
  File "/Users/wiki/79/PendingChangesBot-ng/venv/lib/python3.9/site-packages/pywikibot/comms/http.py", line 346, in error_handling_callback
    raise response from None
  File "/Users/wiki/79/PendingChangesBot-ng/venv/lib/python3.9/site-packages/pywikibot/comms/http.py", line 451, in fetch
    response = session.request(method, uri,
  File "/Users/wiki/79/PendingChangesBot-ng/venv/lib/python3.9/site-packages/requests/sessions.py", line 589, in request
    resp = self.send(prep, **send_kwargs)
  File "/Users/wiki/79/PendingChangesBot-ng/venv/lib/python3.9/site-packages/requests/sessions.py", line 724, in send
    history = [resp for resp in gen]
  File "/Users/wiki/79/PendingChangesBot-ng/venv/lib/python3.9/site-packages/requests/sessions.py", line 724, in <listcomp>
    history = [resp for resp in gen]
  File "/Users/wiki/79/PendingChangesBot-ng/venv/lib/python3.9/site-packages/requests/sessions.py", line 191, in resolve_redirects
    raise TooManyRedirects(
requests.exceptions.TooManyRedirects: Exceeded 30 redirects.
CRITICAL: Exiting due to uncaught exception TooManyRedirects: Exceeded 30 redirects.
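
The failing URI is easier to read once its nested, percent-encoded query is decoded. The following is a minimal sketch using only the Python standard library. The URI is copied from the log above with the redacted placeholders left as they are; the helper name `decode_oauth_redirect` is an illustrative assumption and not part of Pywikibot.

```python
from urllib.parse import parse_qsl, urlsplit

# URI copied from the error log; redacted placeholders are left untouched.
FAILING_URI = (
    "https://meta.wikimedia.org/w/index.php?title=Special:OAuth/approve"
    "&returnto=%2Fw%2Frest.php%2Foauth2%2Fauthorize"
    "&returntoquery=client_id%3D__ID_REMOVED__"
    "%26redirect_uri%3Dhttps%253A%252F%252Fsuperset.wmcloud.org"
    "%252Foauth-authorized%252Fmediawiki"
    "%26response_type%3Dcode%26scope%3Dmwoauth-authonlyprivate"
    "%26state%3D__STATE_REMOVED__"
    "&client_id=__CLIENT_ID_REMOVED__&oauth_version=2"
    "&scope=mwoauth-authonlyprivate"
)


def decode_oauth_redirect(uri):
    """Return (outer, inner) query parameters of a Special:OAuth/approve URI."""
    outer = dict(parse_qsl(urlsplit(uri).query))
    # returntoquery is itself a percent-encoded query string.
    inner = dict(parse_qsl(outer.get("returntoquery", "")))
    return outer, inner


outer, inner = decode_oauth_redirect(FAILING_URI)
print(outer["title"], "->", outer["returnto"])
for key, value in inner.items():
    print(f"  {key} = {value}")
```

Decoded this way, the URI shows Special:OAuth/approve on meta.wikimedia.org being asked to return to /w/rest.php/oauth2/authorize with the Superset redirect_uri and the mwoauth-authonlyprivate scope.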

Expected Result:
Third-party site authentication should work with OAuth and BotPassword the same way it works with username/password authentication.

Environment:

  • Pywikibot version: 10.6.0
  • Python version: Python 3.9.6
  • Authentication methods tested: Username+Password, OAuth, BotPassword
  • Affected third-party sites: superset.toolforge.org, commons-query.wikimedia.org

Event Timeline

Zache updated the task description.

Seems this is the same as T395664 found during tests.

Yes, afaik it is the same. Do you have any idea whether it would be possible to get OAuth/BotPassword working for third-party OAuth logins?

I was not able to find any easy way to do it, so I wrote the ticket mostly to document the issue. The practical task would be to add error handling and error messages to the Superset/WCQS logins so that the user will know why it fails.
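
One possible shape for that error handling is sketched below, as an assumption rather than actual Pywikibot code. The names `ThirdPartyLoginError`, `LOGIN_HINT` and `with_login_hint` are hypothetical. The wrapper takes the redirect exception class as a parameter, so the sketch runs without `requests` installed; in real use one would pass `requests.exceptions.TooManyRedirects`, the exception seen in the traceback.

```python
from typing import Callable, TypeVar

T = TypeVar("T")

# Hypothetical message; the wording is only a suggestion.
LOGIN_HINT = (
    "Third-party OAuth login ended in a redirect loop. This is known to "
    "happen when the wiki session was created with OAuth or BotPassword "
    "credentials; try logging in with username and password instead."
)


class ThirdPartyLoginError(RuntimeError):
    """Hypothetical error: the third-party OAuth login could not complete."""


def with_login_hint(action: Callable[[], T], redirect_error: type) -> T:
    """Run action() and turn a redirect-loop error into a readable one."""
    try:
        return action()
    except redirect_error as exc:
        # Keep the original exception as the cause for debugging.
        raise ThirdPartyLoginError(LOGIN_HINT) from exc


# Possible usage with the script from the description (not executed here):
#   import requests
#   pages = with_login_hint(lambda: superset.query(sql_query),
#                           requests.exceptions.TooManyRedirects)
```

Passing the exception class in keeps the wrapper independent of the HTTP library; inside Pywikibot itself the same `try`/`except` could presumably sit directly around the fetch call in the Superset login.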

Zache updated the task description.

@Zache: I tested some accounts. First I had to log the account out of the Wikimedia site, then log in to Wikimedia via Superset and grant access when shown this message:

In order to complete your request, superset needs permission to access information about you, including your email address, on all projects of this site. No changes will be made with your account.

The results:

account type    | access
Regular account | (not preserved)
2FA             | (not preserved)
Oauth account   | (not preserved)
Bot password    | (not preserved)

I'll try bot password again later. Maybe I have overlooked something.

@Xqt If you have time, would you like to write a step-by-step guide on how you got OAuth working? (I am trying to figure out what I am doing differently.)

Bot Password accounts get the following response:

looks like https://meta.wikimedia.org/w/rest.php/oauth2/authorize?response_type=code&client_id=…&redirect_uri=…&scope=mwoauth-authonlyprivate&state=…

There is another problem with login: if a site other than meta is given, Superset uses it, and access may fail if the other site has a different user registered. As an example, take this test:

def test_overriding_site(self):
    """Test overriding schema using site"""
    sql = 'SELECT page_id, page_title FROM page LIMIT 2;'
    superset = SupersetQuery(schema_name='enwiki_p', database_id=2)
    testsite = pywikibot.Site('wikipedia:fi')
    rows = superset.query(sql, site=testsite)
    self.assertLength(rows, 2)

@Zache: I have checked the cookie files. I think BotPassword accounts may not work for web or OAuth logins because they do not receive a CentralAuth session (the centralauth_Session cookie). Can you (or someone) confirm?
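
A check like this can be scripted. The sketch below assumes the cookie file is in LWP format, readable by `http.cookiejar.LWPCookieJar`. The helper names and the example file path are hypothetical; only the cookie name comes from the comment above.

```python
from http.cookiejar import CookieJar, LWPCookieJar

# Cookie name taken from the comment above.
SESSION_COOKIE = "centralauth_Session"


def load_cookie_file(path):
    """Load an LWP-format cookie file, keeping session and expired cookies."""
    jar = LWPCookieJar(path)
    jar.load(ignore_discard=True, ignore_expires=True)
    return jar


def has_cookie(jar: CookieJar, name: str = SESSION_COOKIE) -> bool:
    """Return True if any cookie in the jar has the given name."""
    return any(cookie.name == name for cookie in jar)


# Possible usage (the path is an assumption; adjust to the local setup):
#   jar = load_cookie_file("pywikibot.lwp")
#   print("central session present:", has_cookie(jar))
```

Running it once after a username/password login and once after a BotPassword login would show whether the session cookie is really the difference between the two cases.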

This is how I understood it too, but I do not understand it well enough to know whether anything can be done about it.

I fear we cannot do anything useful on the Pywikibot side. Maybe a clientlogin for a BotPassword account would help, as long as you can enter the email auth code, but this is not practical for automatically running bots or the test suite.

taavi subscribed.

Superset is not intended for non-interactive SQL queries like this, please see https://wikitech.wikimedia.org/wiki/Help:Wiki_Replicas for supported ways to query the wiki replicas. Thus untagging us.

@taavi Adding Superset support to Pywikibot was pretty much one of the first use cases for Superset when it published replica access, and the point of it is that it doesn't require a Toolforge account, because managing new Toolforge accounts doesn't scale. For example, there are security reasons not to create Toolforge accounts for random workshop participants or, in our current case, for Outreachy contribution-period volunteers. OAuth login plus an HTTP API scales to that use case.

Another thing is that this affects other tools too, not just Superset. One example is the Wikimedia Commons Query Service, which is used by bots and is meant to be. Also, if the general idea is to put tools behind OAuth, this will become a problem pretty fast for bots and other server-side tools, and it will need to be addressed (see https://meta.wikimedia.org/wiki/Third-party_resources_policy).

In that case, I’m wondering why Superset exposes API endpoints such as https://superset.wmcloud.org/api/v1/me/ if it’s not meant for non-interactive use.
