
Account required in wiki being processed (starting wiki)
Closed, ResolvedPublic

Description

In compat, one could run a script starting in a wiki for which no account was specified in user-config.py. The starting wiki just wouldn't be updated, but the others would.
In core, it seems you immediately get an error if you didn't specify the user name for it:

pwb.py interwiki -lang:ru -recentchanges

Traceback (most recent call last):
  File "pwb.py", line 239, in <module>
    if not main():
  File "pwb.py", line 233, in main
    run_python_file(filename, [filename] + args, argvu, file_package)
  File "pwb.py", line 111, in run_python_file
    main_mod.__dict__)
  File ".\scripts\interwiki.py", line 2641, in <module>
    main()
  File ".\scripts\interwiki.py", line 2591, in main
    site.login()
  File "D:\Work\pywikipedia-core\pywikibot\site.py", line 1826, in login
    user=self._username[sysop])
  File "D:\Work\pywikipedia-core\pywikibot\tools\__init__.py", line 1248, in wrapper
    return obj(*__args, **__kw)
  File "D:\Work\pywikipedia-core\pywikibot\login.py", line 104, in __init__
    'wiki_code': self.site.code})
**pywikibot.exceptions.NoUsername: ERROR: Username for wiktionary:ru is undefined.**
If you have an account for that site, please add a line to user-config.py:
 
usernames['wiktionary']['ru'] = 'myUsername'
<class 'pywikibot.exceptions.NoUsername'>

Event Timeline

Malafaya raised the priority of this task from to Needs Triage.
Malafaya updated the task description. (Show Details)
Malafaya subscribed.

P.S. In case it matters, I do have a global account, so in fact I could have an account specified, but I don't want to update pages in that wiki as I don't have permission to do so.

This is not a general difference between compat and core but an interwiki.py issue. That script and some others force a login, but other scripts do not. I remember such lines were added with r10404 [1] as a kind of auto-login, whereas compat needs an explicit login.py run. But not all scripts have that forced login. On the other hand, core also has a login.py, and the given implementation might not be an appropriate one.

I guess removing line 2591 would let the script keep running, except where write access is denied.

[1] https://mediawiki.org/wiki/Special:Code/pywikipedia/10404

I think I understood what you're saying, and my latest findings also support that.
As I have a global account, I am "implicitly" logged in to ru.wikt, so I can retrieve pages even without an explicit username in user-config.py. And because there is no explicit username, there will be no edits either, so everything is fine, just as intended.
But with the explicit site.login() in the code, the script is explicitly trying to log in to ru.wikt without a username for it. Hence the error.
What was the purpose of site.login() in every script? To avoid having to run login.py beforehand? If a login is required in that wiki, how does the bot retrieve pages from other linked wikis for which it doesn't have an explicit login?
I removed the line you mentioned and everything ran smoothly. The bot doesn't try to edit ru.wikt pages as its username is not configured, so no "write access error" occurs.

I guess this is a leftover from compat times? In core, read operations usually work without being logged in; only certain operations, like editing a page, require a username and will log in if not already logged in.

Also, as far as I know, pywikibot does not understand that when you have a login on Wikipedia you also have one for Wiktionary. You can tell pywikibot that your login works on all languages of a wiki family by using * as the language:

usernames['wikipedia']['*'] = '…'

This obviously won't solve your problem on Wiktionary by itself, but you could define the same kind of wildcard entry for the wiktionary family as a workaround.
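
To illustrate how the * wildcard is meant to behave, here is a minimal, self-contained sketch. resolve_username is a hypothetical helper written for this comment, not pywikibot's actual lookup code; it only assumes that an exact language entry takes precedence over *.

```python
def resolve_username(usernames, family, code):
    """Return the configured user name for family:code, or None.

    Hypothetical helper for illustration only; it assumes an exact
    language entry wins over the '*' wildcard.
    """
    per_family = usernames.get(family, {})
    return per_family.get(code) or per_family.get('*')


# Same shape as the usernames mapping in user-config.py.
usernames = {
    'wikipedia': {'*': 'myUsername'},
    'wiktionary': {'pt': 'myUsername'},
}

print(resolve_username(usernames, 'wikipedia', 'ru'))   # wildcard applies
print(resolve_username(usernames, 'wiktionary', 'ru'))  # nothing configured
```

With this mapping, ru.wikipedia gets a user name through the wildcard, while ru.wiktionary has none until a wiktionary entry is added.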

PS:

You can search for revisions in git and get the commit hash (16b7fb3) which is linked in phab automatically:

$ git log --grep=pywikipedia/10404 --oneline 
16b7fb3 Add explicit Site().login() calls to all scripts that write to the wiki.
XZise set Security to None.

Change 230995 had a related patch set uploaded (by Merlijn van Deen):
interwiki: log in to all configured sites in family

https://gerrit.wikimedia.org/r/230995

As I see it, these options are acceptable:

  • ignore the pywikibot.exceptions.NoUsername in site.login() when starting the script
  • login to all configured wikis (and only those) instead of site.login() for the starting wiki
  • login to site only when editing the first page in that site
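
The first and third options can be sketched roughly as follows. FakeSite and the local NoUsername class are stand-ins so the snippet runs without pywikibot installed; site.login() and pywikibot.exceptions.NoUsername appear in the traceback above, but try_login and save_page are hypothetical names, not existing pywikibot functions.

```python
class NoUsername(Exception):
    """Stand-in for pywikibot.exceptions.NoUsername."""


class FakeSite:
    """Hypothetical stand-in for a pywikibot site object."""

    def __init__(self, name, username=None):
        self.name = name
        self.username = username
        self.logged_in = False

    def login(self):
        # Mirrors the failure in the traceback: no user name, no login.
        if self.username is None:
            raise NoUsername('Username for %s is undefined.' % self.name)
        self.logged_in = True


def try_login(site):
    """Option 1: attempt a login, treat a missing user name as read-only."""
    try:
        site.login()
    except NoUsername:
        return False
    return True


def save_page(site, title):
    """Option 3: log in only when the first edit on a site is attempted."""
    if not site.logged_in and not try_login(site):
        return 'skipped %s on %s (no user name)' % (title, site.name)
    return 'saved %s on %s' % (title, site.name)


ru = FakeSite('wiktionary:ru')                # no account configured
pt = FakeSite('wiktionary:pt', 'myUsername')  # account configured

print(try_login(ru))         # the script can still read from ru
print(save_page(ru, 'Foo'))  # skipped instead of raising
print(save_page(pt, 'Foo'))  # logs in lazily, then saves
```

Either way, a starting wiki without a configured user name is treated as read-only instead of aborting the run.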

I just realized that logging in to all configured wikis at startup creates some delay. I use custom scripts to invoke interwiki.py several times, and incurring this delay on every run hurts performance badly. Could there be a configuration parameter that selects startup login vs. lazy login?
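
Such a switch might look like the sketch below. The login_at_startup flag, the StubSite class, and the helper functions are assumptions made for illustration, not existing pywikibot names.

```python
class StubSite:
    """Hypothetical site object that counts login round trips."""

    def __init__(self, name):
        self.name = name
        self.login_calls = 0
        self.logged_in = False

    def login(self):
        self.login_calls += 1
        self.logged_in = True


def prepare_sites(sites, login_at_startup):
    """Log in to every configured site up front only if asked to."""
    if login_at_startup:
        for site in sites:
            site.login()


def edit(site):
    """Lazy path: log in just before the first edit on a site."""
    if not site.logged_in:
        site.login()
    return 'edited ' + site.name


sites = [StubSite('wiktionary:pt'), StubSite('wiktionary:en')]
prepare_sites(sites, login_at_startup=False)  # no startup delay
print(sum(s.login_calls for s in sites))      # logins performed so far
print(edit(sites[0]))
print(sum(s.login_calls for s in sites))      # only the edited site logged in
```

With the flag off, a run that edits a single wiki pays for a single login; with it on, every configured site is logged in before any page is processed.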

Change 230995 merged by jenkins-bot:
interwiki: do not automatically log in

https://gerrit.wikimedia.org/r/230995

jayvdb assigned this task to valhallasw.
jayvdb added a project: Pywikibot-interwiki.py.

Change 243040 had a related patch set uploaded (by John Vandenberg):
interwiki: do not automatically log in

https://gerrit.wikimedia.org/r/243040

Change 243040 merged by jenkins-bot:
interwiki: do not automatically log in

https://gerrit.wikimedia.org/r/243040