User name normalization does not take underscores into account
Open, Needs Triage, Public

Description

After a fresh checkout from git,
python pwb.py login -family:i18n -lang:i18n
asks for a password, says the bot was logged in,
but subsequent actions ask for a password again.

The files mentioned in the documentation are not
created. There is a file pywikibot.lwp that has
the cookies related to the wiki.

purodha@tools-dev:~/pywikibot$ python pwb.py version
Pywikibot: [ssh] pywikibot-core.git (e553f36, g2767, 2014/02/23, 15:09:44, ok)
Release version: 2.0b1
Python: 2.7.3 (default, Sep 26 2013, 20:03:06)
[GCC 4.6.3]
unicode test: ok
purodha@tools-dev:~/pywikibot$ python pwb.py login -family:i18n -lang:i18n
Password for user Purbo_T on i18n:i18n (no characters will be shown):
Logging in to i18n:i18n as Purbo_T
Logged in on i18n:i18n as Purbo T.
purodha@tools-dev:~/pywikibot$ python pwb.py pagefromfile -file:/tmp/purodha-pagefromfile-test -appendbottom -nocontent:/try
Reading '/tmp/purodha-pagefromfile-test'...

User:Purodha/try <<<

Password for user Purbo_T on i18n:i18n (no characters will be shown):


Version: core-(2.0)
Severity: normal

Details

Reference
bz61832

Could you post your user-config.py?

Thanks for the hint. Here's the problem:

user-config.py:
usernames['i18n']['i18n'] = u'Purbo_T'

pywikibot.lwp:
Set-Cookie3: translatewiki_net_bw_UserName="Purbo+T"; path="/"; domain="translatewiki.net"; path_spec; expires="2014-08-22 21:18:35Z"; httponly=None; version=0

The equivalence of " ", "_", and "+" inside the user name in various contexts
is not properly taken into account.

Altering user-config.py to:
usernames['i18n']['i18n'] = u'Purbo T'

finds the user logged in.
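
To illustrate the mismatch, here is a minimal sketch that reduces the three spellings seen above (config file, cookie, login message) to one form. The helper name canonical_username and its from_cookie flag are made up for this example and are not part of Pywikibot; the import assumes Python 3.

```python
from urllib.parse import unquote_plus  # Python 3 location of unquote_plus


def canonical_username(name, from_cookie=False):
    """Hypothetical helper: reduce a user name to its space-separated form."""
    if from_cookie:
        # Cookie values are URL-encoded, so "+" stands for a space.
        name = unquote_plus(name)
    # Underscores and spaces are treated as equivalent in user names.
    name = name.replace("_", " ")
    # Collapse runs of whitespace and trim both ends.
    return " ".join(name.split())


config_name = canonical_username(u"Purbo_T")                    # user-config.py
cookie_name = canonical_username("Purbo+T", from_cookie=True)   # pywikibot.lwp
login_name = canonical_username(u"Purbo T")                     # login message
print(config_name == cookie_name == login_name)
```

With all three spellings reduced to "Purbo T", the comparison no longer depends on which context the name came from.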

It's just the _ in the user name -- we determine user names like this:

if not self.nocapitalize:
    if user:
        user = user[0].upper() + user[1:]
    if sysop:
        sysop = sysop[0].upper() + sysop[1:]

(site.py)

and compare that to what the API returns (which is a capitalized-or-not name with spaces, not underscores).

The better solution would be to either
a) normalize the username (with a Page object -- which I guess is OK because Page objects are used more often in Site)
or
b) compare the usernames with site.sametitle (which currently does not take underscores into account, but should)

Marking as 'easy' for anyone willing to pick up option a)
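
As a rough sketch of what the comparison could look like once both sides are normalized: the functions normalize_username and same_user below are hypothetical stand-ins, not the Page-based normalization or site.sametitle themselves, and the nocapitalize parameter only mirrors the flag used in the site.py snippet above.

```python
def normalize_username(name, nocapitalize=False):
    """Hypothetical stand-in for option a); not the Pywikibot implementation."""
    # Treat underscores as spaces and collapse repeated whitespace.
    name = " ".join(name.replace("_", " ").split())
    # Capitalize the first letter unless the wiki keeps names as typed.
    if name and not nocapitalize:
        name = name[0].upper() + name[1:]
    return name


def same_user(configured, reported, nocapitalize=False):
    """Compare a configured name with the name reported by the wiki."""
    return (normalize_username(configured, nocapitalize)
            == normalize_username(reported, nocapitalize))


print(same_user(u"Purbo_T", u"Purbo T"))
print(same_user(u"purbo_T", u"Purbo T", nocapitalize=True))
```

The first call treats the configured name and the reported name as the same user; the second does not, because with nocapitalize set the differing first letter is significant.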

Change 117689 had a related patch set uploaded by Purodha:
Bug: 61832 - fixed.

https://gerrit.wikimedia.org/r/117689

Change 119882 had a related patch set uploaded by Tim Landscheidt:
become: Add --help option

https://gerrit.wikimedia.org/r/119882

scfc added a comment. Mar 20 2014, 9:51 PM

Sorry, comment #5 was a typo of mine.

Change 150872 had a related patch set uploaded by John Vandenberg:
WIP: Introduce static method Link.normalize(title)

https://gerrit.wikimedia.org/r/150872