
parser function to detect if the current page is in a given category
Closed, Resolved (Public)

Description

This could be used to refine template appearance and for sorting purposes (especially in maintenance). It would also allow per-category editnotices, more efficiently than using editintro via common.js (which doesn't work for sections, can't be hidden, etc.).


Version: unspecified
Severity: enhancement
See Also: T50175: Scribunto/Lua should have a built-in method for retrieving categories used on a page

Details

Reference
bz18596

Event Timeline

bzimport raised the priority of this task from to Low. Nov 21 2014, 10:32 PM
bzimport set Reference to bz18596.
bzimport added a subscriber: Unknown Object (MLST).

(In reply to comment #0)

What would be the result of {{ifincategory: Foo | Bar | [[Category:Foo]] }} ?

Good point, it could create cycles at evaluations.

We could remove the else parameter and return nothing if the page is not in [[Category:Foo]].

We wouldn't lose much essential functionality. Alternatively, to keep as much functionality as possible, the else argument could be kept, but its content would not be allowed to categorize at all.

(In reply to comment #1)

> What would be the result of {{ifincategory: Foo | Bar | [[Category:Foo]] }} ?

While interesting academically, it's obvious that such a conditional shouldn't be used for real. Simply telling users not to use it because it can lead to unexpected and/or unpredictable results is good enough IMO. That way, the parser can simply ignore this use case and handle it in the most convenient way.

The problem, of course, is that it would require an additional parsing stage to effectively handle (and the conflicts above are only the tip of the iceberg).

Seems certainly not worth the implementation cost.

(In reply to comment #5)

> If the else argument doesn't transfer categories to the current page, it shouldn't create conflicts.

And what would be the result of [[Category{{ifincategory: Foo | Bar |:}}Foo]] ?

Helder

Right; since this is primarily aimed to be used in the interface, via mediawiki pages, there's no need to worry too much about how we handle categories - only if we want to use it in actual pages.

So ifincategory should not work outside of the interface: there it would not be parsed by the software and would be left as plain text, but it would work in the interface, where categories are not read. Actually, the editnotice interface shouldn't read categories at all; currently it does: categories (when not noincluded) are shown at the bottom of the edit window.

The problem is that the parser function will be evaluated in the same scope as categories are generated, so there are several options, none of them very good:
*The parser function returns the category membership of the previous revision (it would reflect the current revision only if the page is purged afterwards).
*The parser function returns the category membership using only the categories defined before the parser function. Since the convention is to place categories at the bottom of the page, this is not very useful.
*Add a new pass to the parser to determine this. Parser functions are evaluated before wikitext, so this would require pre-evaluating the wikitext. We shouldn't pollute the parser with those loops.
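
To make the second option concrete, here is a minimal sketch in Python. It is a toy model, not MediaWiki's parser: the token tuples and the function name render_single_pass are hypothetical, invented only for illustration. It assumes a single left-to-right pass in which the conditional can see only the categories that appeared earlier in the text.

```python
def render_single_pass(tokens):
    """Toy model of option 2: the conditional sees only categories seen so far."""
    seen = set()   # categories encountered earlier in this same pass
    out = []
    for tok in tokens:
        kind = tok[0]
        if kind == "cat":
            seen.add(tok[1])
        elif kind == "ifincat":
            _, name, then_text, else_text = tok
            out.append(then_text if name in seen else else_text)
        else:  # ("text", value)
            out.append(tok[1])
    return "".join(out), seen


# Same page content, with the category placed last (the usual convention) or first.
page_cats_last = [("ifincat", "Foo", "in Foo", "not in Foo"),
                  ("text", " body "),
                  ("cat", "Foo")]
page_cats_first = [("cat", "Foo"),
                   ("ifincat", "Foo", "in Foo", "not in Foo"),
                   ("text", " body ")]

print(render_single_pass(page_cats_last)[0])   # not in Foo body
print(render_single_pass(page_cats_first)[0])  # in Foo body
```

In this model both pages end up in Category:Foo, yet only the page that declares the category before the conditional gets the "in Foo" branch, which is why the option is of little use when categories sit at the bottom.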

This is a problem only if the parser function affects the category membership. It won't be a problem if this parser function is used only in the interface - and is designed to work only there. There would be ample use of this, for example in editnotices and descriptions of actions.

happy.melon.wiki wrote:

Writing a parser function that only works in the interface is a *horrible* idea; there are enough problematic inconsistencies between the parser behaviour in the interface vs the article body already.

Implementationally, however, this information (in the simplest case) can either be drawn from the categorylinks table (corresponding to Platonides' option 1), or from the parser's own category array (option 3). AFAICT, option 2 would require adding another parser pass *anyway*. The first option, which is already what we do with {{REVISIONID}}, {{REVISIONUSER}}, etc, seems by far the sanest option.

Wouldn't checking the parser's own category array result in option 2?

happy.melon.wiki wrote:

As you said yourself, the category array is not populated until *after* the parser functions are expanded, so the result would always be false.

(In reply to comment #12)

> Implementationally, however, this information (in the simplest case) can either be drawn from the categorylinks table (corresponding to Platonides' option 1), or from the parser's own category array (option 3). AFAICT, option 2 would require adding another parser pass *anyway*. The first option, which is already what we do with {{REVISIONID}}, {{REVISIONUSER}}, etc, seems by far the sanest option.

The difference is that the revision info is not set by the parser, so it's possible to know the current revision info before parsing. Categories are set by the parser, so when saving a page, the function would be evaluated based on the categories of the previous revision, which means you could potentially get different behavior when saving vs. a null edit/cache expiry.
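
The save versus null-edit difference can be sketched with a toy model in Python. Everything here is an assumption made for illustration: parse, the token tuples, and the categorylinks variable (a plain set standing in for the real table) are hypothetical, and the conditional is assumed to answer only from the stored category set.

```python
def parse(tokens, stored_cats):
    """Toy parse: the conditional answers from stored_cats, not from this text."""
    out, cats = [], set()
    for tok in tokens:
        kind = tok[0]
        if kind == "cat":
            cats.add(tok[1])          # categories this revision declares
        elif kind == "ifincat":
            _, name, then_text, else_text = tok
            out.append(then_text if name in stored_cats else else_text)
        else:  # ("text", value)
            out.append(tok[1])
    return "".join(out), cats


categorylinks = set()                 # the previous revision had no categories
new_revision = [("ifincat", "Foo", "yes", "no"), ("cat", "Foo")]

# Parse on save: the stored set still describes the previous revision.
html_on_save, cats = parse(new_revision, categorylinks)
categorylinks = cats                  # links update that follows the save

# Later reparse (null edit or cache expiry): the stored set is now current.
html_after_reparse, _ = parse(new_revision, categorylinks)

print(html_on_save, html_after_reparse)   # no yes
```

The wikitext is identical in both parses; only the stored category set changed in between, which is the inconsistency described above.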

happy.melon.wiki wrote:

In a sane implementation of the save process, the links tables would be updated as part of the save, and then the page would be parsed based on that information for the subsequent display. Of course, very little about the parser is sane... :-D

But you need to parse the page to determine what to update, so you would need (at least) a 2-pass parser.

happy.melon.wiki wrote:

The page is fully parsed on save to know what to update, as you say. Then the editor is redirected back to the article, so they make a new GET request that *could* prompt a whole new parse, but I expect the on-save parse is cached to prevent that duplication of effort. All we'd have to do is stop that, either overall or just for pages with weird things like this.

It all falls apart, of course, if the second parse doesn't get the latest categorylinks values, which could be tricky on replicated dbs. Might have to do the second parse from the master? Not sure what Domas would make of that :-D

You wouldn't rely on a browser redirect, you would just directly add the pass yourself. The user GET request is not relevant here.

happy.melon.wiki wrote:

Why? The tables are correct, we just don't know what the resulting HTML looks like. Who cares until someone asks to view the page? We can't avoid the fact that people can build loops and statements that only work through after N parses; no one has any right to complain if they add

{{#incategory: Foo

|{{#incategory: Bar |                  | [[Category:Bar]] }}
|{{#incategory: Bar | [[Category:Foo]] |                  }}

}}

to a page and expect it to do something sensible. One pass is all we need to update the tables for any sane usage, and thereafter we expect that correct HTML can be generated whenever it's needed by a second parse, which can happen whenever the HTML is actually asked for, like any other cache miss.
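
As a rough illustration of why the nested example above never settles, here is a toy model in Python. It assumes the conditional is answered from the category set stored by the previous parse; the node tuples and the names collect_categories and trace are hypothetical, not part of MediaWiki.

```python
def collect_categories(node, stored):
    """Return the categories a node emits, answering the conditional from `stored`."""
    kind = node[0]
    if kind == "cat":
        return {node[1]}
    if kind == "if":
        _, name, then_node, else_node = node
        branch = then_node if name in stored else else_node
        return collect_categories(branch, stored)
    return set()   # ("empty",) emits nothing


EMPTY = ("empty",)

# The nested #incategory example from the comment above, as a tree.
page = ("if", "Foo",
        ("if", "Bar", EMPTY, ("cat", "Bar")),
        ("if", "Bar", ("cat", "Foo"), EMPTY))


def trace(start, parses):
    """Category set after each successive parse, starting from `start`."""
    states = [set(start)]
    for _ in range(parses):
        states.append(collect_categories(page, states[-1]))
    return states


print(trace({"Foo"}, 3))   # [{'Foo'}, {'Bar'}, {'Foo'}, {'Bar'}]
print(trace(set(), 2))     # [set(), set(), set()]
```

Under these assumptions, a page that starts in Category:Foo flips between Foo and Bar on every parse, while a page that starts with no categories stays put. No number of extra passes helps in the first case, which supports treating such constructs as the author's problem.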

(In reply to comment #18)

> The page is fully parsed on save to know what to update, as you say. Then the editor is redirected back to the article, so they make a new GET request that *could* prompt a whole new parse, but I expect the on-save parse is cached to prevent that duplication of effort. All we'd have to do is stop that, either overall or just for pages with weird things like this.

There is no on-save parse. The article's cache is just purged by the save, and the first GET request to the page (which is typically the editor being redirected back to the article view) causes a parse, which is then cached.

(In reply to comment #21)

> There is no on-save parse. The article's cache is just purged by the save, and the first GET request to the page (which is typically the editor being redirected back to the article view) causes a parse, which is then cached.

Ignore that, I was wrong.

happy.melon.wiki wrote:

*** Bug 27004 has been marked as a duplicate of this bug. ***

*** Bug 44331 has been marked as a duplicate of this bug. ***

(In reply to comment #24)

> *** Bug 44331 has been marked as a duplicate of this bug. ***

That's wrong. This bug requests a *current-page* category detection. Bug 44331 is about a few (grammatical) properties of one page to be obtained *from another page*.

Note: I wrote an extension that does this. See [[extension:pageInCat]] (I think; the extension is named something like that). It only works for the current page, though (likely to change).

The only downside is that it makes previews take a bit longer than they otherwise would. (It needs to do the preview once to determine the current cats of the page, and then again to determine whether this affects instances of the parser function. It assumes users are not pathological, but even if they are, you can already do pathological things with cats using current markup.)
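
The two-pass preview described here might look roughly like the following Python sketch. This is not the extension's actual code: parse, preview, and the node tuples are hypothetical, and the consistency check is only one possible way to notice a pathological page.

```python
def parse(nodes, known):
    """Toy parse: returns (text, categories), answering conditionals from `known`."""
    text, cats = [], set()
    for node in nodes:
        kind = node[0]
        if kind == "text":
            text.append(node[1])
        elif kind == "cat":
            cats.add(node[1])
        elif kind == "if":
            _, name, then_nodes, else_nodes = node
            branch = then_nodes if name in known else else_nodes
            sub_text, sub_cats = parse(branch, known)
            text.append(sub_text)
            cats |= sub_cats
    return "".join(text), cats


def preview(nodes, stored):
    """Pass 1 finds the categories of the new text; pass 2 renders using them."""
    _, cats_first = parse(nodes, stored)
    html, cats_second = parse(nodes, cats_first)
    # If the second pass changes the categories, the page is self-referential.
    return html, cats_first == cats_second


ordinary = [("if", "Foo", [("text", "notice")], []), ("cat", "Foo")]
pathological = [("if", "Foo", [("text", "Bar")], [("cat", "Foo")])]

print(preview(ordinary, set()))       # ('notice', True)
print(preview(pathological, set()))   # ('Bar', False)
```

The second parse is what makes the preview slower. For ordinary pages the two passes agree on the category set, so the rendered branch matches what a later view would show.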

[[mw:Extension:PageInCat]] exists, so this bug is fixed. BTW, you might want to update the extension page to link to git instead of svn.

Requesting that this be enabled on WMF wikis would be a different bug.