Page MenuHomePhabricator

New parserfunction that determines whether a string is an IP address
Closed, DeclinedPublic

Description

In templates used on userpages, it is sometimes important to determine whether it is a userpage of an unregistered user (IP address) or a registered one. Currently the following hack is used to do that:

{{#ifeq:{{lc:{{PAGENAME}}}}|{{uc:{{PAGENAME}}}}|IP|not IP}}

This tests if uppercase and lowercase of the pagename are equal. As IP-userpages have only digits and dots in them, they are detected correct. Most usernames contain letters, and those are detected correct as well. However, there are some false positive cases, if the username does not contain any letters (User:123 for example).

Therefor this is not a good solution of the problem, it'd be much nicer if ParserFunctions offered a test like {{#ip:<string>}} which returns true if it is an IP address and nothing if it is not.


Version: unspecified
Severity: enhancement

Details

Reference
bz15352

Event Timeline

bzimport raised the priority of this task from to Medium.Nov 21 2014, 10:18 PM
bzimport set Reference to bz15352.

added {{#ip: test_ip }} to ParserFunctions. Calls IP::isIPAddress

After messing around a bit with regexps, I found that MediaWiki already has most of the code included... IP::isIPAddress is your friend. I added this to ParserFunctions, perhaps it could even be included in CoreParserFunctions.

Attached:

happy.melon.wiki wrote:

This doesn't strike me as semantically a ParserFunction, which are more concerned with logical and mathematical operations. It would probably make more sense as a string function, or (as suggested) a core function. In fact, the function could almost certainly be constructed from the string functions already present in [[mw:Extension:StringFunctions]]; something like

{{#iferror:

{{#ifexpr:
  {{#expr: 0+{{#explode: {{{ip|}}} |.|0}} >= 0
       and 0+{{#explode: {{{ip|}}} |.|0}} < 256 }} and 
  {{#expr: 0+{{#explode: {{{ip|}}} |.|1}} >= 0
       and 0+{{#explode: {{{ip|}}} |.|1}} < 256 }} and 
  {{#expr: 0+{{#explode: {{{ip|}}} |.|2}} >= 0
       and 0+{{#explode: {{{ip|}}} |.|2}} < 256 }} and 
  {{#expr: 0+{{#explode: {{{ip|}}} |.|3}} >= 0
       and 0+{{#explode: {{{ip|}}} |.|3}} < 256 }} and 
  {{#if: {{#explode: {{{ip|}}} |.|4}} | <span class="error"></span> | 1 }}
|yes
|no
no

}}

This (should) match strings that have exactly four nonempty substrings separated by periods; where each substring is an integer between 0 and 255. Extension to IPv6 would be possible, I expect. This is a very easy pattern to match.

Your patch does not implement the functionality I would expect from your initial comment: it returns "1" when a valid IP is given and "" otherwise. It would then be necessary to wrap this in an #ifexpr: to obtain usable output in the same format as you say is used currently. Much better to follow the same syntax as #ifexpr: or #iferror - that is, {{#ifip: <string> | <then> | <else> }}.

matthew.britton wrote:

(In reply to comment #3)

the function could almost certainly be constructed from the string functions
already present in [[mw:Extension:StringFunctions]]

That extension is not available on Wikimedia wikis, presumably for performance reasons. This particular function is needed there.

happy.melon.wiki wrote:

So is the rest of StringFunctions. The original performance issues are no longer an issue (see bug6455), all that is lacking is a sysadmin to do performance testing and deployment. Since this functionality is easily producible if we *did* have StringFunctions, the solution to the problem is to get that extension installed; there's no need to proliferate single-purpose parser functions if general functionality could be made to produce the same effect.

Created attachment 10477
{{#if_ip:}}

We have no precendents of boolean parser functions, so here's my patch that adds {{#if_ip}} function that works like the other {{#if...}} functions. I added it to string functions so it will still be unavailable on WMF - because we will all believe in Lua, yeah! :P I'll commit it shortly unless someone tells me why shouldn't I do this:)

Attached:

happy.melon.wiki wrote:

Not sure about the underscore; cleaner to just have it all as one word ("ifip" in the same way as "ifexpr" and "ifexist"). That avoids the awkward question of whether {{#if ip:...}} does and/or should also parse.

'ip' is too short and unlike 'expr' and 'exist' doesn't look recognisable if glued together with 'if'.

I think this problem is being approached in the wrong way. The request is to be able to determine whether you're on the user page of someone who has a registered account or not, isn't it? That doesn't really have to do with an "ifip:" parser function, which seems incredibly ambiguous to me. A general magic word to figure out if a particular user is registered (or globally registered, even) seems like a better approach here.

happy.melon.wiki wrote:

(In reply to comment #8)

'ip' is too short and unlike 'expr' and 'exist' doesn't look recognisable if
glued together with 'if'.

So make it longer ("ifipaddress" is more intuitive anyway). No other hashtag function I know of has space in it, and I don't think it's a good idea to start here.

Following on from MZMcBride's attempt to refocus this on the use rather than the cool boolean test, couldnt we add something like {{userid:<username>}} which returns null, or a value such as uid or guid. Return a useful value that only registered users have. e.g. A list of usergroups? Comparing against the indexed username field seems much more sensible than calculating whether a random string is able to be used as an IP.

(In reply to comment #11)

Following on from MZMcBride's attempt to refocus this on the use rather than
the cool boolean test, couldnt we add something like {{userid:<username>}}
which returns null, or a value such as uid or guid.

While this seems like a good idea at first, it does come with one problem: Nonexisting users and IPs are lumped together even though they are two distinct semantical classes.

So, let's suppose you want to modify MediaWiki:Newarticletext so that it displays a warning on IP's user talk pages that the person behind the IP address might have changed. Then this warning about an IP address will also be displayed on all pages not corresponding to a registered user, which is confusing.

On the other hand, I don't see a use case where you'd need a users userid.

Unassigning this, since I'm not going to work on it anytime soon.

(In reply to comment #12)

(In reply to comment #11)

Following on from MZMcBride's attempt to refocus this on the use rather than
the cool boolean test, couldnt we add something like {{userid:<username>}}
which returns null, or a value such as uid or guid.

While this seems like a good idea at first, it does come with one problem:
Nonexisting users and IPs are lumped together even though they are two distinct
semantical classes.

IPs have a user ID of 0. Non-existent users have a user ID of ''. ;-)

That's probably a bit nasty of an implementation, but the point is that there are ways to work around what you're talking about. The initial approach (#if_ip) really is too specialized and ambiguous to be useful, I think.

I think this bug may end up never being resolved if Lua comes to fruition. We'll see.

(In reply to comment #13)

I think this bug may end up never being resolved if Lua comes to fruition.
We'll see.

Is Lua going to be deployed before IPv6? :P

https://en.wikipedia.org/wiki/Wikipedia_talk:WikiProject_IPv6_Readiness

See also

https://en.wikipedia.org/wiki/User_talk:Jasper_Deng/IPv6

audreyt wrote:

Hi Church of emacs & Max, thank you for the patch!

As you may already know, MediaWiki is currently revamping its PHP-based parser
into a "Parsoid" prototype component, to support the rich-text Visual Editor
project:

https://www.mediawiki.org/wiki/Parsoid
https://www.mediawiki.org/wiki/Visual_editor

Folks interested in enhancing the parser's capabilities are very much welcome
to join the Parsoid project, and contribute patches as Git branches:

https://www.mediawiki.org/wiki/Git/Tutorial#How_to_submit_a_patch

Compared to .diff attachments in Bugzilla tickets, Git branches are much easier
for us to review, refine and merge features together.

Each change set has a distinct URL generated by the "git review" tool, which
can be referenced in Bugzilla by pasting its gerrit.wikimedia.org URL as a
comment.

If you run into any issues with the patch process, please feel free to ask on
irc.freenode.net #wikimedia-dev and the wikitext-l mailing list. Thank you!

  • Bug 38750 has been marked as a duplicate of this bug. ***

Can also be done with lua (Scribunto extension)
See http://en.wikipedia.org/wiki/Module:IPAddress for an example

(In reply to comment #17)

Can also be done with lua (Scribunto extension)
See http://en.wikipedia.org/wiki/Module:IPAddress for an example

Amen.