
_getUserDataOld call from low-level getUrl
Closed, Resolved (Public)


Originally from:
Reported by: valhallasw
Created on: 2013-04-13 20:51:59
Subject: _getUserDataOld call from low-level getUrl
Original description:
From :

I just wanted to put() a simple page on a MediaWiki 1.16
instance, where I have to use screen scraping (use_api=False).

There is something strange however:

There is an API call invoked by _getBlock:


Here's my backtrace:

File "pywikipedia/", line 693, in get
expandtemplates = expandtemplates)

File "pywikipedia/", line 743, in _getEditPage
return self._getEditPageOld(get_redirect, throttle, sysop, oldid, change_edit_time)

File "pywikipedia/", line 854, in _getEditPageOld
text =().getUrl(path, sysop = sysop)

File "pywikipedia/", line 5881, in getUrl
self._getUserDataOld(text, sysop = sysop)

File "pywikipedia/", line 6016, in _getUserDataOld
blocked = self._getBlock(sysop = sysop)

File "pywikipedia/", line 5424, in _getBlock
data = query.GetData(params, self)

File "pywikipedia/", line 146, in GetData
jsontext = site.getUrl( path, retry=True, sysop=sysop, data=data)

getUrl(), which is also called from the API code, seems to always
call _getUserDataOld(text), even when text is API output,
so it tries to do strange things with that output and gives warnings:

Note: this language does not allow global bots.

WARNING: Token not found on wikipedia:pl. You will not be able to edit any page.

which is nonsense, since the analyzed text is not HTML, only API output.

If getUrl() is supposed to be a low-level call, why call _getUserDataOld() there?

has introduced this call there.
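One way to picture the separation asked for above is a guard that only scrapes user data when the response looks like a rendered HTML page. This is a minimal sketch, not the compat code: `looks_like_html_page`, `fetch`, `raw_fetch` and `scrape_user_data` are hypothetical names, with `raw_fetch` standing in for the low-level HTTP request and `scrape_user_data` for `_getUserDataOld`. The doctype check is also an assumption and would miss pages that start with an XML declaration.

```python
import re

# Hypothetical helper, not part of pywikipedia: decide whether a response
# body looks like a rendered HTML page (as opposed to API output).
_HTML_START = re.compile(r'\s*(?:<!DOCTYPE\s+html|<html\b)', re.IGNORECASE)


def looks_like_html_page(text):
    """Return True if text appears to start an HTML document."""
    return bool(_HTML_START.match(text))


def fetch(path, raw_fetch, scrape_user_data):
    """Fetch path and scrape user data only from HTML pages.

    raw_fetch and scrape_user_data are stand-ins (assumptions) for the
    low-level HTTP request and for _getUserDataOld respectively.
    """
    text = raw_fetch(path)
    if looks_like_html_page(text):
        # Only HTML pages carry the user-data markup the scraper expects.
        scrape_user_data(text)
    return text
```

With a guard like this, a JSON response from the API would be returned untouched, so the scraper would never see it and could not emit the misleading token warning.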

It's easily reproducible with this:

import wikipedia
import config
config.use_api = False
wikipedia.verbose = True
s = wikipedia.getSite("pl", "wikipedia")
p = wikipedia.Page(s, u"User:Saper")
c = p.get()
c += "<!-- test -->"
p.put(c, u"Testing wiki", botflag=False)


Version: compat-(1.0)
Severity: minor
See Also:

Event Timeline

bzimport raised the priority of this task from to Low. Nov 22 2014, 2:10 AM
bzimport set Reference to bz54548.
bzimport added a subscriber: Unknown Object (????).

Reproduced. Since it's in compat, I set the priority to minor.

jayvdb set Security to None.
jayvdb added a subscriber: saper.

I looked at the wpEditToken HTML on the action=edit page and at line 7511 in the current compat code: the regex does not match, so _getUserDataOld() cannot get the edit token.
I fixed it in my local copy, but I don't have permission to push to Gerrit.
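For illustration only, here is a sketch of a token lookup that does not depend on the attribute order inside the `<input>` tag, which is a common reason for a scraping regex to stop matching. This is not the patch from Change 215587; `find_edit_token` is a hypothetical name, and the sketch assumes double-quoted attribute values and does not decode HTML entities.

```python
import re

# Match any <input ...> tag first, then inspect its attributes separately,
# so the order of type/name/value does not matter.
_INPUT_TAG = re.compile(r'<input\b[^>]*>', re.IGNORECASE)
_ATTR = re.compile(r'([\w-]+)\s*=\s*"([^"]*)"')


def find_edit_token(html):
    """Return the wpEditToken value from an edit form, or None if absent."""
    for tag in _INPUT_TAG.findall(html):
        attrs = dict(_ATTR.findall(tag))
        if attrs.get('name') == 'wpEditToken':
            return attrs.get('value')
    return None
```

Returning None instead of raising lets the caller decide whether a missing token is worth a warning, for example only when the text was an edit form in the first place.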

Change 215587 had a related patch set uploaded (by Gerrit Patch Uploader):
Fix T56548, regex failure in _getUserDataOld().

This comment was removed by Alexsh.

Change 215587 had a related patch set uploaded (by Xqt):
Fix T56548, regex failure in _getUserDataOld().

Aklapper lowered the priority of this task from Low to Lowest. Jun 5 2015, 1:41 PM
Aklapper added a subscriber: Aklapper.

Pywikibot has two versions: Compat and Core. This task was filed about the older version, called Pywikibot-compat, which is not under active development anymore. Hence I'm lowering the priority of this task to reflect the reality. Unfortunately, the Pywikibot team does not have the manpower to retest every single bug report / feature request against the (maintained) Pywikibot code base. Furthermore, the code base of Pywikibot-Compat has changed a lot compared to the code base of Pywikibot-Core, so there is a chance that the problem described in this task might not exist anymore.

Please help: Unfortunately manpower is limited and does not allow testing every single reported task again. If you have time and interest in Pywikibot, please upgrade to Pywikibot-Core and add a comment to this task if the problem in this task still happens in Pywikibot-Core (or directly edit the task by removing the Pywikibot-compat project and adding the Pywikibot project to this task).

To learn more about Pywikibot and to get involved in its development, please check out

Thank you for your understanding.

Change 215587 merged by jenkins-bot:
Fix T56548, regex failure in _getUserDataOld().