Page MenuHomePhabricator

Bug: Page.botMayEdit() returns incorrect result if "allow=" syntax is used
Closed, InvalidPublic

Description

Page.botMayEdit() should return False if the page contains {{nobots|allow=Foo}} unless "Foo" matches the bot's username. But it will always return True in this situation.

The bug is in the following code snippet in page/__init__.py:

for template, params in templates:
    title = template.title(with_ns=False)
    if restrictions:
        if title in restrictions:
            return False
    if title == 'Nobots':
        if not params:
            return False
        else:
            bots = [bot.strip() for bot in params[0].split(',')]  ### BUG
            if 'all' in bots or pywikibot.calledModuleName() in bots \
               or username in bots:
                return False

On the line tagged with ### BUG, the value of bots in the example given will be the string "allow=Foo"; this should then be split on the '=' to get the actual list of allowed bots. Because it is not split, the if statement on the next line always matches when the "allow" parameter is present. After splitting the arguments, the method should use logic similar to that in the next section (under elif title == 'Bots':) to parse the parameters and return the appropriate value.

Details

Related Changes in Gerrit:

Event Timeline

{{nobots}} should not have parameters per https://en.wikipedia.org/wiki/Template:Bots#Syntax, so it should be {{bots|allow=Foo}} instead of {{nobots|allow=Foo}}.

If we follow that, then the code could be:

[...]
        for template, params in templates:
            title = template.title(with_ns=False)
            if restrictions and title in restrictions:
                return False
            if title == 'Nobots': # Could add `and not params` condition too
                return False
[...]

If we're going (or intend) to parse {{nobots}} parameters the same as {{bots}} parameters, then the code could be:

[...]
        for template, params in templates:
            title = template.title(with_ns=False)
            if restrictions and title in restrictions:
                return False
            if title not in {'Bots', 'Nobots'}:
                continue
            if not params:
                return title == 'Bots'
            (ttype, bots) = [part.strip() for part in params[0].split('=', 1)]
            bots = [bot.strip() for bot in bots.split(',')]
            if ttype == 'allow':
                return 'all' in bots or username in bots
            if ttype == 'deny':
                return not ('all' in bots or username in bots)
            if ttype == 'allowscript':
                return ('all' in bots or pywikibot.calledModuleName() in bots)
            if ttype == 'denyscript':
                return not ('all' in bots
                            or pywikibot.calledModuleName() in bots)
        # no restricting template found
        return True

I'd go for the first one. Other bots that implement this as shown at https://en.wikipedia.org/wiki/Template:Bots#Example_implementations will not support {{nobots}} with parameters. On the English Wikipedia such cases are reported as broken.

So you suggest we can merge the behavior? {{nobots}} will do one thing, {{bots}} another thing, but attributes will be shared?

So we can merge the behavior? {{nobots}} will do one thing, {{bots}} another thing, but attributes will be shared?

We can, but I don't think we should.

Other bots that implement this as shown at https://en.wikipedia.org/wiki/Template:Bots#Example_implementations will not support {{nobots}} with parameters. On the English Wikipedia such cases are reported as broken.

Better have a look at this doc.
{{nobots}} must not have any named parameters.
And on the other hand {{bots}} always must have a named parameter if any.
{{bots|allow=Foo}} is a valid directive but {{nobots|allow=Foo}} wasn't ever.
{{nobots}} is not an alias for {{bots}} because it has an opposite meaning.
Years ago I introduced the parameterless option for {{nobots}} (which wasn't included to the en-wiki doc I guess) like {{nobots|botname}} which is similar to {{bots|deny=botname}} and {{nobots|scriptname}} which is similar to {{bots|denyscript=scriptname}}. See https://www.mediawiki.org/wiki/Special:Code/pywikipedia/10774 for this.

Probably we should always return False if {{nobots}} has a named parameter or {{bots}} has not a named parameter. The wrong usage wouldn't have any impact than; mabe print a warning in such cases.

We can, but I don't think we should.
On the English Wikipedia such cases are reported as broken.

It's not just about enwiki. People on Czech Wikipedia complained before about this meaningless custom of having two templates doing the same thing. They suggested to have just one template (nobots) with all the params (allowbot, denybot, allowscript, denyscript) available to use with {{nobots}} being used as a shortcut for {{nobots|deny = all}}. I would also consider this a better solution than the current one.

Other bots that implement this as shown at https://en.wikipedia.org/wiki/Template:Bots#Example_implementations will not support {{nobots}} with parameters.

This is the only thing that really matters here as Pywikibot is not the only one. People are using AWB, WPCleaner, Cat-a-lot, IABot, CommonsDelinker, Listeria, et cetera et cetera and also their own bots and scripts. Do we want to break the customs? If enwiki (or any other less or more major wiki) decides to change this behavior in an RfC or something, we should respect that and implement that, but until then, why should we even bother discussing this here, on a Phabricator, outside wikis. If someone really wants to change the behavior, they should start an RfC on wiki(s), not request us to make a decision.

Also if it's their fault they didn't read the docs/help pages about these customs and use incorrect syntax, why should we bother at all?

To make my point clear: if there is any doubt, we are not those who should decide how to behave. Either wiki community finds a solution about the issue @russblau suggested and tells us to adapt Pywikibot, or we will implement it just the way official docs/help pages say.

The current implementation matches the docs, specifically {{nobots}} does not allow named parameters.

The current implementation contains a logic error, in that it tries to split a parameter value on a ',' without first checking for the presence of an '='. This is a bug, regardless of whether we think the correct implementation should or should not allow for {{nobots}} to have parameters.

@russblau It's not a bug, have you read @Xqt's comment? Pywikibot supports Nobots unnamed parameter (defined by dewiki).

Change 599271 had a related patch set uploaded (by Xqt; owner: Xqt):
[pywikibot/core@master] [IMPR] Rewrite Page.botMayEdit method

https://gerrit.wikimedia.org/r/599271

Change 599271 merged by jenkins-bot:
[pywikibot/core@master] [IMPR] Rewrite Page.botMayEdit method

https://gerrit.wikimedia.org/r/599271