Feature request: add detection for disambiguation pages to Scribunto
Open, Needs Triage, Public

Description

It would be useful for things like Module:WikiProjectBanner if Scribunto could detect whether a given page is a disambiguation page or not. At present, the only way of doing this is to preprocess all of a page's text and search for the __DISAMBIG__ magic word, which is obviously not practical for performance reasons.

I envision this as an "isDisambig" property of the Scribunto title object, but the exact naming or location isn't so important.

Details

Reference
bz69441

Event Timeline

bzimport raised the priority of this task from to Needs Triage.
bzimport set Reference to bz69441.
bzimport added a subscriber: Unknown Object (MLST).
MrStradivarius added a comment. Edited Aug 12 2014, 5:34 PM

Jackmcbarn pointed out on IRC that we can't do this because disambiguation status is determined post-parse:

you could do {{#if:{{#invoke:IsThisADabPage|main|{{FULLPAGENAME}}}}||__DISAMBIG__}} or something and cause a paradox

So I'm closing this as WONTFIX.

kaldari reopened this task as Open. Oct 31 2016, 9:16 PM
kaldari added a subscriber: kaldari.

Reopening, as this is technically possible, and not even that difficult. For a somewhat hacky solution, in Scribunto_LuaTitleLibrary::getExpensiveData() you can add something like:

if ( class_exists( 'DisambiguatorHooks' ) ) {
    $ret[ 'isDisambiguation' ] = DisambiguatorHooks::isDisambiguationPage( $title, false );
}

... and then pick it up in mw.title.lua.

The more correct solution would probably be to add a hook in Scribunto_LuaTitleLibrary::getExpensiveData() and then add a hook handler in DisambiguatorHooks. But either way, it's doable.

Jackmcbarn awarded a token. Edited Nov 1 2016, 12:09 AM
Jackmcbarn added a subscriber: Jackmcbarn.

I still don't really like the idea. When is this even useful? (And my point wasn't that it's not possible at all, but that it will necessarily give the wrong answer sometimes.)

I still don't really like the idea.

In my opinion, almost all page metadata, including properties like redirect and disambiguation status, should be available to Scribunto/Lua. This also extends to information about which categories the page is currently in, page length, page protection status, and anything else. I think we should only exclude info if there's a specific reason to.

When is this even useful?

The task description mentions https://en.wikipedia.org/wiki/Module:WikiProjectBanner. Some WikiProject banner templates categorize pages based on whether or not they're disambiguation pages; e.g., https://en.wikipedia.org/wiki/Category:Disambig-Class_video_game_articles. I believe this categorization is currently manual, but could be made automatic.

MZMcBride updated the task description. Nov 1 2016, 12:34 AM
Legoktm added a subscriber: Legoktm. Nov 1 2016, 2:36 AM

Reopening, as this is technically possible, and not even that difficult. For a somewhat hacky solution, in Scribunto_LuaTitleLibrary::getExpensiveData() you can add something like:

if ( class_exists( 'DisambiguatorHooks' ) ) {
    $ret[ 'isDisambiguation' ] = DisambiguatorHooks::isDisambiguationPage( $title, false );
}

... and then pick it up in mw.title.lua.

The more correct solution would probably be to add a hook in Scribunto_LuaTitleLibrary::getExpensiveData() and then add a hook handler in DisambiguatorHooks. But either way, it's doable.

We probably want to have Disambiguator register a Lua library rather than having Scribunto be aware of Disambiguator.

Jackmcbarn pointed out on IRC that we can't do this because disambiguation status is determined post-parse:

you could do {{#if:{{#invoke:IsThisADabPage|main|{{FULLPAGENAME}}}}||__DISAMBIG__}} or something and cause a paradox

So I'm closing this as WONTFIX.

I think there are significantly worse ways that Lua can be used to shoot yourself in the foot like that. However, the fact that it is post-parse does make it a little trickier, as a naive implementation that uses DisambiguatorHooks::isDisambiguationPage would actually be checking if the previous revision was a disambiguation page or not. I'm not convinced that is a deal breaker by itself.
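
The interaction between the quoted paradox and the previous-revision lag can be made concrete with a toy model. This is a minimal sketch in standalone PHP, not MediaWiki code: parseOnce() and $stored are hypothetical names, and $stored merely stands in for whatever status the previous parse saved. It assumes a naive lookup that can only see that saved status.

```php
<?php
// Toy model only. Nothing here is a Scribunto or Disambiguator API.

/**
 * Simulates one parse of a page whose wikitext is roughly
 * {{#if: <lookup for this page> || __DISAMBIG__ }}.
 *
 * @param bool $stored Status saved by the previous parse (hypothetical).
 * @return bool Status saved by this parse.
 */
function parseOnce( bool $stored ): bool {
    // A naive lookup can only report what the previous parse stored.
    $lookupResult = $stored;
    // The page emits __DISAMBIG__ only when the lookup says "not a dab page".
    $emitsMagicWord = !$lookupResult;
    // The magic word decides the status stored for the next parse.
    return $emitsMagicWord;
}

$stored = false;
$history = [];
for ( $i = 0; $i < 4; $i++ ) {
    $stored = parseOnce( $stored );
    $history[] = $stored;
}

echo json_encode( $history ), "\n"; // prints [true,false,true,false]
```

Under these assumptions the stored status flips on every reparse, so a lookup on the page itself never settles on a stable answer.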

Anomie added a comment. Nov 1 2016, 2:21 PM

Reopening, as this is technically possible, and not even that difficult. For a somewhat hacky solution, in Scribunto_LuaTitleLibrary::getExpensiveData() you can add something like:

if ( class_exists( 'DisambiguatorHooks' ) ) {
    $ret[ 'isDisambiguation' ] = DisambiguatorHooks::isDisambiguationPage( $title, false );
}

... and then pick it up in mw.title.lua.

The more correct solution would probably be to add a hook in Scribunto_LuaTitleLibrary::getExpensiveData() and then add a hook handler in DisambiguatorHooks. But either way, it's doable.

We probably want to have Disambiguator register a Lua library rather than having Scribunto be aware of Disambiguator.

Yes, this is the correct way to do it. Another possibility would be to allow Scribunto to access page properties in general, since Disambiguator adds a 'disambiguation' page property.

I'll also note that making this check would likely need to register a "transclusion" of the checked page, since we don't have any other mechanism to specify that the checking page needs reparsing if the checked page changes.

However, the fact that it is post-parse does make it a little trickier, as a naive implementation that uses DisambiguatorHooks::isDisambiguationPage would actually be checking if the previous revision was a disambiguation page or not. I'm not convinced that is a deal breaker by itself.

Any sort of feature that is checking the results of a parse should probably raise an error when it's being used on the page being parsed. No point in making this foot-shooting easier than it has to be.
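
The two safeguards described above (recording a dependency on the checked page, and refusing to check the page being parsed) could be sketched roughly as follows. This is standalone PHP under stated assumptions, not an implementation against Scribunto or Disambiguator: the DabLookup class, its methods, and the in-memory status array are all hypothetical stand-ins for the real parser state and page properties.

```php
<?php
// Hypothetical sketch. None of these names are real MediaWiki APIs.

final class DabLookup {
    /** @var string Title of the page currently being parsed */
    private $currentTitle;
    /** @var array<string,bool> Title => status saved by that page's last parse */
    private $storedStatus;
    /** @var array<string,bool> Titles read during this parse */
    private $dependencies = [];

    public function __construct( string $currentTitle, array $storedStatus ) {
        $this->currentTitle = $currentTitle;
        $this->storedStatus = $storedStatus;
    }

    public function isDisambiguation( string $title ): bool {
        // Checking the page being parsed would read a stale answer, so fail loudly.
        if ( $title === $this->currentTitle ) {
            throw new LogicException( "Cannot check the page being parsed: $title" );
        }
        // Remember the lookup so the checking page can be reparsed
        // when the checked page changes (the "transclusion" idea).
        $this->dependencies[$title] = true;
        return $this->storedStatus[$title] ?? false;
    }

    /** @return string[] Titles this parse depends on */
    public function getDependencies(): array {
        return array_keys( $this->dependencies );
    }
}

// Usage: a talk page checks its subject page, which is allowed.
$lookup = new DabLookup( 'Talk:Mercury', [ 'Mercury' => true ] );
$isDab = $lookup->isDisambiguation( 'Mercury' );
$deps = $lookup->getDependencies();

// Checking the page being parsed is rejected.
$selfCheckRejected = false;
try {
    $lookup->isDisambiguation( 'Talk:Mercury' );
} catch ( LogicException $e ) {
    $selfCheckRejected = true;
}
```

In this model a rejected self-check records no dependency, and an unknown title is treated as not being a disambiguation page; both are choices made for the sketch rather than behaviour taken from the thread.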

I still don't really like the idea. When is this even useful?

I discovered this ticket from a feature request at Module talk:Redirect after being referred there due to my similar request at Template talk:Pagetype.

(And my point wasn't that it's not possible at all, but that it will necessarily give the wrong answer sometimes.)

As Anomie mentioned, we could just have it throw an error if it's being used to check the page that is being parsed. In the case of Template:WPBannerMeta (which seems to be the main use case currently) it should always be checking from the Talk page, not the page itself, so that wouldn't be an issue.

Change 474108 had a related patch set uploaded (by Legoktm; owner: Legoktm):
[mediawiki/extensions/Disambiguator@master] Allow determining disambiguation page status through Scribunto

https://gerrit.wikimedia.org/r/474108