
Bulk loading pages using API broken due to namespace aliases
Closed, Declined (Public)

Description

It seems this user (User:Elvire) has gender specified as "female", and it changes the default User namespace alias (Portuguese language: default is Utilizador; female is Utilizadora).
The bot cannot find the gender-mapped page in the list of fetched pages:

touch.py -family:wiktionary -lang:pt Utilizador_discussão:Elvire

'git' is not recognized as an internal or external command,
operable program or batch file.
Getting 1 page via API from wiktionary:pt...
BUG>> title Utilizadora Discussão:Elvire ([[pt:Utilizador Discussão:Elvire]]) not found in list
Expected one of: [[pt:Utilizadora Discussão:Elvire]]
Traceback (most recent call last):

File "D:\Work\pywikipedia\pagegenerators.py", line 1234, in __iter__
  for loaded_page in self.preload(somePages):
File "D:\Work\pywikipedia\pagegenerators.py", line 1253, in preload
  pywikibot.getall(site, pagesThisSite)
File "D:\Work\pywikipedia\wikipedia.py", line 5512, in getall
  _GetAll(site, pages, throttle, force).run()
File "D:\Work\pywikipedia\wikipedia.py", line 5128, in run
  self.oneDoneApi(vals)
File "D:\Work\pywikipedia\wikipedia.py", line 5406, in oneDoneApi
  raise PageNotFound

PageNotFound

Pywikibot: wikipedia.py (r-1 (unknown), 1b27881, 2013/10/14, 11:11:46, OUTDATED)

Release version: 1.0b1
Python: 2.7.5 (default, May 15 2013, 22:44:16) [MSC v.1500 64 bit (AMD64)]
config-settings:
use_api = True
use_api_login = True
unicode test: ok


Version: compat-(1.0)
Severity: normal
OS: Windows 7
Platform: PC

Details

Reference
bz55996

Event Timeline

bzimport raised the priority of this task to Lowest. Nov 22 2014, 2:23 AM
bzimport set Reference to bz55996.
bzimport added a subscriber: Unknown Object (????).

The same problem occurs with other scripts (replace.py, template.py) and with
[[diskuse s wikipedistkou:Username]] (a female user's talk page) on cs.wiki.

This should be fixed (inadvertently) by https://gerrit.wikimedia.org/r/#/c/92068/ - pwb will use Special:Export again.

Keeping this bug open, however, as it's *really* strange to trigger behavior depending on a -debug flag...

Script to reproduce:

import pywikibot
# Debug logging must be on; the failure reportedly depends on it.
pywikibot.logger.setLevel(pywikibot.DEBUG)
pywikibot.getall(pywikibot.Site('pt', 'wikipedia'), [pywikibot.Page('pt', u'Utilizador_discussão:Elvire')])

Getting 1 page via API from wikipedia:pt...
BUG>> title Usuária Discussão:Elvire ([[pt:Usuário(a) Discussão:Elvire]]) not found in list
Expected one of: [[pt:Usuária Discussão:Elvire]]
Traceback (most recent call last):

File "<stdin>", line 1, in <module>
File "wikipedia.py", line 5956, in getall
  _GetAll(site, pages, throttle, force).run()
File "wikipedia.py", line 5528, in run
  self.oneDoneApi(vals)
File "wikipedia.py", line 5829, in oneDoneApi
  raise PageNotFound

pywikibot.exceptions.PageNotFound
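
Both tracebacks come down to the same mismatch: the title the API returns (the gendered namespace form) is not the title that was requested, so the lookup in the list of requested pages fails. A minimal sketch of one way to reconcile the two, assuming the `action=query` JSON response carries a `normalized` list of `from`/`to` pairs for the rewritten title. The function name `match_requested_titles` and the sample data (including the page id) are hypothetical and are not taken from the compat code:

```python
# -*- coding: utf-8 -*-
# Sketch only: pair requested titles with returned pages by following the
# "normalized" mapping instead of comparing raw title strings.


def match_requested_titles(requested, query):
    """Return {requested title: page dict} for every title the API resolved."""
    # Each entry records one title rewrite performed by the server.
    normalized = {n["from"]: n["to"] for n in query.get("normalized", [])}
    # Index the returned pages by the title the server reports for them.
    by_title = {p["title"]: p for p in query.get("pages", {}).values()}

    matched = {}
    for title in requested:
        final = normalized.get(title, title)  # unchanged if not rewritten
        if final in by_title:
            matched[title] = by_title[final]
    return matched


# Illustrative data shaped like the "query" part of a JSON response.
# The page id is made up; the titles are the ones from this report.
sample_query = {
    "normalized": [
        {"from": "Utilizador_discussão:Elvire",
         "to": "Utilizadora Discussão:Elvire"},
    ],
    "pages": {
        "12345": {"pageid": 12345, "ns": 3,
                  "title": "Utilizadora Discussão:Elvire"},
    },
}

result = match_requested_titles(["Utilizador_discussão:Elvire"], sample_query)
```

With this approach the caller never needs to know the gender-specific namespace names for each language; it only trusts the mapping the server sends back. Whether the gendered form is always reported in `normalized` should be checked against a live response before relying on it.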

Pywikibot has two versions: Compat and Core. This task was filed about the older version, Pywikibot-compat, which is no longer under active development, so I'm lowering the priority of this task to reflect reality.

Unfortunately, the Pywikibot team does not have the manpower to retest every bug report or feature request against the maintained Pywikibot code base. The Pywikibot-compat code base also differs substantially from Pywikibot-Core, so the problem described in this task might no longer exist.

Please help: if you have time and interest in Pywikibot, please upgrade to Pywikibot-Core and add a comment to this task if the problem still happens there (or edit the task directly by removing the Pywikibot-compat project and adding the Pywikibot project).

To learn more about Pywikibot and to get involved in its development, please check out https://www.mediawiki.org/wiki/Manual:Pywikibot/Development

Thank you for your understanding.

Xqt subscribed.

use core instead