{{int:X}} respects user-defined interface language, breaking link tables etc. (aka {{USERIFCODE}} strikes back)
Closed, Resolved. Public

Description

The int: core parser function (e.g. {{int:Talkpage}}) retrieves the text using _the user interface language_ (_not_ the content language).

But that means all reasons why Bug 2085 is marked WONTFIX (caching, link table corruption), are already here! (See the linked URL.)

I guess the proper fix would be to change CoreParserFunctions::intFunction to add one “true” argument to the wfMsgReal call.


Version: unspecified
Severity: normal
URL: http://test.wikipedia.org/wiki/Bug_2085

bzimport added a project: MediaWiki-Parser. Via Conduit, Nov 21 2014, 10:15 PM
bzimport added a subscriber: wikibugs-l.
bzimport set Reference to bz14404.
Mormegil created this task. Via Legacy, Jun 4 2008, 12:59 PM

IAlex added a comment. Via Conduit, Jun 9 2008, 5:58 PM

Fixed in r36093.

brion added a comment. Via Conduit, Jun 11 2008, 2:52 AM

Reverted in r36185 -- caused regression to parser cache consistency.

User-specific options such as stub threshold were still applying in the parser, but not taken into account in the parser hash key. As a result, the caches were corrupt, saving different options into the anonymous-default options cache.
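The failure mode described above can be illustrated with a small sketch. It is not MediaWiki code: `ParserOptions`, `render`, and both key functions are hypothetical names, and the only assumption taken from the comment is that an option (here the stub threshold) influenced the output while being absent from the cache key.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ParserOptions:
    """Hypothetical subset of per-user options that affect rendering."""
    user_lang: str = "en"
    stub_threshold: int = 0


def render(title, opts):
    # Stand-in for the parser: the output depends on every option.
    return f"{title} [lang={opts.user_lang}, stub={opts.stub_threshold}]"


def incomplete_key(title, opts):
    # Problem case: stub_threshold affects render() but is not in the key.
    return (title, opts.user_lang)


def complete_key(title, opts):
    # Every option that affects render() is part of the key.
    return (title, opts.user_lang, opts.stub_threshold)


def cached_render(cache, key_fn, title, opts):
    """Return the cached rendering for this key, rendering on a miss."""
    key = key_fn(title, opts)
    if key not in cache:
        cache[key] = render(title, opts)
    return cache[key]
```

With `incomplete_key`, a rendering produced for a user with a non-default stub threshold is later served to anonymous readers, because both requests map to the same key. With `complete_key` the two renderings are stored separately.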

bzimport added a comment. Via Conduit, Aug 6 2008, 4:32 AM

brad9626 wrote:

No! Don't fix this! I always assumed this was the desired behavior. We have multilingual templates on Commons that make great use of this. See [[commons:Template:Edit-int]] and [[commons:Template:See also]]. We also utilize translated messages from our upload form, such as [[commons:MediaWiki:UploadFormSourceLabel]]. I was planning to do this for our {{Information}} template: [[commons:Template:Information (Internationalised)]]. Now I learn this great feature is a bug! I thought {{MediaWiki:... was for when you wanted the actual contents regardless of language.

...if only we had a {{USERLANGUAGE}} variable. And it's entirely possible if this is.

Mormegil added a comment. Via Conduit, Aug 6 2008, 8:52 AM

(In reply to comment #3)

No! Don't fix this! I always assumed this was the desired behavior.
[…]
...if only we had a {{USERLANGUAGE}} variable. And it's entirely possible if
this is.

Did you read Bug 2085? And that it is marked WONTFIX and why? In the current MediaWiki architecture, it would be (IMHO) terribly difficult to make parser’s behavior dependent on the current viewer’s preferred language.

You could implement simple preference-dependent translation using JavaScript, nothing too difficult in that.

(Personal sidenote: do you realize that voting for this bug means you want it to be _fixed_, i.e. the {{int:X}} hack to be removed?)

bzimport added a comment. Via Conduit, Aug 6 2008, 9:10 AM

brad9626 wrote:

Unvoted. :) I originally misunderstood what this was... I was thinking it was more of a sequel to Bug 2085. It's suggesting that a magic word is possible because "all reasons why Bug 2085 is marked WONTFIX (caching, link table corruption), are already here!" I'm not sure what would be terribly difficult about it. I mean, we already can achieve the same with int:, it's just that it might get "fixed" now. :/

bzimport added a comment. Via Conduit, Aug 6 2008, 2:19 PM

bugs wrote:

(In reply to comment #3)

No! Don't fix this! I always assumed this was the desired behavior. We have
multilingual templates on Commons that make great use of this. See
[[commons:Template:Edit-int]] and [[commons:Template:See also]]. We also
utilize translated messages from our upload form, such as
[[commons:MediaWiki:UploadFormSourceLabel]]. I was planning to do this for our
{{Information}} template: [[commons:Template:Information (Internationalised)]].
Now I learn this great feature is a bug! I thought {{MediaWiki:... was for when
you wanted the actual contents regardless of language.

...if only we had a {{USERLANGUAGE}} variable. And it's entirely possible if
this is.

I agree with Rocket here, this is used already in a bunch of places. Try requesting a different function for the content language.

Mormegil added a comment. Via Conduit, Aug 6 2008, 6:39 PM

(In reply to comment #5)

I'm not sure what would be terribly difficult
about it. I mean, we already can achieve the same with int:, it's just that
it might get "fixed" now. :/

The fact that {{int:}} seems to work, is just a bug, causing database integrity violation. The requested behavior cannot be easily implemented into MediaWiki _in a correct way_. That is the reason Bug 2085 has been closed as WONTFIX, i.e. the feature request has been rejected, such functionality is _not_ going to get into MediaWiki. Comments by Brion Vibber at Bug 2085 explain why.

In short (see also the attached URL for an example): If you write [[{{int:History short}}]], you are creating a page that links to either [[History]], or [[Historie]], or [[Historique]], or … etc., according to the user’s preferred language. OK, you might think, what’s the problem?

The problem is twofold, but the most difficult part to solve is link tables: MediaWiki needs to know which page links to which (e.g. because of Special:Whatlinkshere etc.). But it is unable to decide in this case – the linked page depends on the user who views the page! And, if you check the linked example, you can see the broken behavior – only one of the linked Whatlinkshere pages lists the example page (and which one is “random”, it depends on the language of the user who saved the last edit).
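A toy model of the link-table problem, as a sketch only: the message table and all function names below are invented for illustration, and the link table is reduced to a dictionary that records whatever the saving user's parse produced.

```python
# Hypothetical message table; real interface messages are stored elsewhere.
MESSAGES = {
    "en": {"History short": "History"},
    "cs": {"History short": "Historie"},
    "fr": {"History short": "Historique"},
}


def resolve_int_link(message_key, ui_lang):
    """Target of [[{{int:KEY}}]] when rendered for one interface language."""
    return MESSAGES[ui_lang][message_key]


def save_page(link_table, page, message_key, saver_lang):
    """Record only the link target produced by the saving user's parse."""
    link_table[page] = {resolve_int_link(message_key, saver_lang)}


def what_links_here(link_table, target):
    """Pages whose recorded links include the given target."""
    return sorted(p for p, targets in link_table.items() if target in targets)
```

If a user with the Czech interface saves the page, `what_links_here(table, "Historie")` lists it, while `what_links_here(table, "History")` does not, even though an English-interface reader sees a link to [[History]].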

I am not saying this problem is completely impossible to solve, just that it would be IMHO really difficult to do (implementation-wise, and probably even performance-wise after that).

(In reply to comment #6)

I agree with Rocket here, this is used already in a bunch of places. Try
requesting a different function for the content language.

You misunderstand the situation. It’s not like I’m asking for a new feature. I am just reporting this behavior, which is a bug, not a feature (see above and Bug 2085 for explanation why).

siebrand added a comment. Via Conduit, Aug 16 2008, 11:08 PM

Assigning to Nikerabbit.

Tgr added a comment. Via Conduit, Sep 11 2008, 7:26 PM

(In reply to comment #7)

Would using the content language for the link table, and the interface language for the actual display be acceptable?

IAlex added a comment. Via Conduit, Sep 11 2008, 7:32 PM

(In reply to comment #9)

Would using the content language for the link table, and the interface
language for the actual display be acceptable?

Already tried, see comment 1 and comment 2 :)

brion added a comment. Via Conduit, Nov 25 2008, 11:47 PM

Created attachment 5541
Change to use the parser's function language instead of UI language

Replaces the wfMsgGetKey() and wfMsgReplaceArgs() calls with a call to wfMsgExt() using $parser->getFunctionLang().

This _ought_ to cause no changes in behavior in UI message usage, but would render page content material using the site content language. This does what was originally planned on this bug, but... people *do* like to put little UI thingies in their pages, and it is useful, so I don't want to break it just yet.

This change, or something like it, is also needed in order for {{int:}} to do what's expected in UI messages being pulled for something that's not the *general* UI language (eg, not $wgLang). To work around this in CentralNotice I'm currently temporarily overriding $wgLang while doing message renders, and this kind of sucks.

Maybe what we want is some way to mark a page or a part of a page as being in a different language, either a specific one or the selected UI language, so that parser functions (including PLURAL and GRAMMAR as well as int) can use the appropriate language for their individual bits of content.

Attached: test-int-behavior-change-14404.diff

bzimport added a comment. Via Conduit, Nov 29 2008, 11:52 PM

alno wrote:

(In reply to comment #7)

(In reply to comment #5)

The fact that {{int:}} seems to work, is just a bug, causing database integrity
violation.
(...)
In short (see also the attached URL for an example): If you write [[{{int:History
short}}]], you are creating a page that links to either [[History]], or
[[Historie]], or [[Historique]], or … etc., according to the user’s
preferred language. OK, you might think, what’s the problem? The problem is
twofold, but the most difficult part to solve is link tables: MediaWiki needs
to know which page links to which (e.g. because of Special:Whatlinkshere etc.).
But it is unable to decide in this case – the linked page depends on the user
who views the page! And, if you check the linked example, you can see the
broken behavior – only one of the linked Whatlinkshere pages lists the
example page (and which one is “random”, it depends on the language of the
user who saved the last edit).

I'd suggest that in such case, MediaWiki should always store every existing page that would actually be seen by users.

For instance, when encountering [[{{int:History (short)}}]], and having at the same time actual pages at [[History]], [[Historique]] and [[Historie]] and nothing for it in any other language, we'd store as backlinks to the current page the three said ones (as if there were three real links). Actually, these *are* virtually real links: any user could really get to one of these pages, so making a link to all of them makes sense.

This would also be independent of the user's actual settings, and wouldn't store too many backlinks for the vast majority of the pages.

I hope I didn't write something completely stupid! :)

brion added a comment. Via Conduit, Nov 30 2008, 12:03 AM

(In reply to comment #12)

I'd suggest that in such case, MediaWiki should always store every existing
page that would actually be seen by users.

For instance, when encountering [[{{int:History (short)}}]], and having at the
same time actual pages at [[History]], [[Historique]] and [[Historie]] and
nothing for it in any other language, we'd store as backlinks to the current
page the three said ones (as if there were three real links). Actually, these
*are* virtually real links: any user could really get to one of these pages, so
making a link to all of them makes sense.

Well, this could require parsing every page that used {{int:}} several hundred times every time it's saved. Yeouch! :)

bzimport added a comment. Via Conduit, Nov 30 2008, 4:33 PM

alno wrote:

(In reply to comment #13)

(In reply to comment #12)
> I'd suggest that in such case, MediaWiki should always store every existing
> page that would actually be seen by users.
>
> For instance, when encountering [[{{int:History (short)}}]], and having at the
> same time actual pages at [[History]], [[Historique]] and [[Historie]] and
> nothing for it in any other language, we'd store as backlinks to the current
> page the three said ones (as if there were three real links). Actually, these
> *are* virtually real links: any user could really get to one of these pages, so
> making a link to all of them makes sense.

Well, this could require parsing every page that used {{int:}} several hundred
times every time it's saved. Yeouch! :)

I see... You mean something like:

FOREACH language l DO
  set int: to l
  parse to get links
  FOREACH link li DO
    IF exists(li) THEN
      cache li
    ENDIF
  ENDFOR
ENDFOR

=> this would lead to parsing the page N(int:) times, then checking whether N(int:)*N(links) pages exist

I'd see something like:

parse to get links, without resolving their 'int:' part
FOREACH link l DO
  FOREACH language lang DO
    set int: to lang
    set possible_link to the resolution of l with current value of int:
    IF exists(possible_link) THEN
      cache possible_link
    ENDIF
  ENDFOR
ENDFOR

=> then you would parse the page only once, leaving the hundreds of existence tests to be made separately.

I understand this would still take quite long... :(
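For readers who prefer something executable, the second variant above (parse once, then expand each int: link per language) might look roughly like this in Python. It is only a sketch: the `messages` dictionary and the `existing_pages` set are stand-ins for the message cache and the page table, and the function name is made up.

```python
def candidate_backlinks(int_link_keys, languages, messages, existing_pages):
    """Expand each [[{{int:KEY}}]] link once per language.

    int_link_keys  -- message keys collected by a single parse of the page
    languages      -- language codes to try
    messages       -- hypothetical dict: language -> {key: translated title}
    existing_pages -- set of page titles that exist on the wiki
    """
    found = set()
    for key in int_link_keys:
        for lang in languages:
            title = messages.get(lang, {}).get(key)
            # Keep only targets that exist, as in the pseudocode's exists() test.
            if title is not None and title in existing_pages:
                found.add(title)
    return found
```

The page is parsed once; the cost is then one lookup per (link, language) pair. As noted in the follow-up comment, this does not cover nested cases such as {{ {{int:foo}} }}.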

bzimport added a comment. Via Conduit, Nov 30 2008, 11:26 PM

herd wrote:

(In reply to comment #14)

=> this would lead to parsing the page N(int:) times, then checking whether
N(int:)*N(links) pages exist

Ahh, but this would recurse too. Imagine if {{ {{int:foo}} }} called potentially up to N templates, each of which had another {{int:}}. Two deep would be N^2.

Ilmari_Karonen added a comment. Via Conduit, Dec 10 2008, 7:57 AM

Note that {{int:}} isn't the only case where page content can end up depending on the user's interface language, although it's probably the most visible one. A number of parser error and warning messages, for example, are also embedded in page content using the user's interface language.

I suspect the only practical resolution, if we want to retain the current behavior of {{int:}}, would be to only update the link tables when the interface language matches the content language. If the page is saved or purged by a user with a different interface language, we'd have to reparse the page in the background using the content language in order to correctly refresh the links.
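A rough sketch of that rule, with everything simplified: `links_for` stands in for a parse in a given language, the link table is a dictionary, and the "background" reparse is modelled as a plain queue. None of these names come from MediaWiki.

```python
def links_for(page_links, lang):
    """Hypothetical parse result: page_links maps language -> link targets."""
    return set(page_links[lang])


def on_save(link_table, reparse_queue, page, page_links, ui_lang, content_lang):
    """Update link tables only from a content-language parse."""
    if ui_lang == content_lang:
        link_table[page] = links_for(page_links, ui_lang)
    else:
        # This parse reflects the saver's language, so defer the update.
        reparse_queue.append(page)


def run_reparse_queue(link_table, reparse_queue, pages, content_lang):
    """Background step: reparse queued pages in the content language."""
    while reparse_queue:
        page = reparse_queue.pop(0)
        link_table[page] = links_for(pages[page], content_lang)
```

The link table then never depends on who saved the page, at the price of a second parse whenever the saver's interface language differs from the content language.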

Yes, this would leave the possibility of having links visible in different interface languages that are not recorded in the database, but that may be acceptable: after all, there are plenty of other cases where links are not recorded either. The biggest problem is that updates to templates that are only used on "localized" versions of pages may not propagate fully -- but that can be worked around either by manual purging or simply by not doing that in the first place.

(Incidentally, {{int:}} suffers from a similar problem anyway: I don't believe transcluding an interface message using {{int:}} creates a templatelinks entry, so changes to the transcluded interface message won't propagate automatically. In practice, we just live with that limitation.)

Mormegil added a comment. Via Conduit, Dec 10 2008, 10:20 AM

(In reply to comment #16)

(Incidentally, {{int:}} suffers from a similar problem anyway: I don't believe
transcluding an interface message using {{int:}} creates a templatelinks entry,
so changes to the transcluded interface message won't propagate automatically.
In practice, we just live with that limitation.)

Note that not only does {{int:}} not create a templatelinks entry; even if it did, that would not be enough. Most of the interface messages are not pages, but come directly from the PHP files. And when you update those files (e.g. during a MediaWiki upgrade), you don’t know whether, or what, you have changed (you would need to rerender all pages using any message).

Ilmari_Karonen added a comment. Via Conduit, Apr 15 2009, 5:57 PM

There's a similar link table inconsistency issue with time-based parser functions or magic words, which I've filed separately as bug 18478.

Jidanni added a comment. Via Conduit, Apr 23 2009, 3:04 AM

*** Bug 17629 has been marked as a duplicate of this bug. ***

demon added a comment. Via Conduit, Jul 15 2009, 7:28 PM

*** Bug 19638 has been marked as a duplicate of this bug. ***

Verdy_p added a comment. Via Conduit, Jul 17 2010, 7:16 AM

I don't know why {{int:}} would corrupt the cache. In fact the cache just has to remember the language for which it generated the page (i.e. the value of the "uselang=" query parameter, or by default the language code inferred from the "Accept-Language:" header in the HTTP query).

Yes this means that pages may be cached multiple times, but only if they are visited by different users using different preferences for their language. All users will see a coherent page in their own language, the cache of prerendered pages will remain a FIFO and, instead of indexing just on "{{FULLPAGENAME}}", it will index on "{{USERLANGUAGE}}:{{FULLPAGENAME}}".

Note that pages in the cache should also have a short lifetime, if they use any one of the builtin magics that access the current time:

  • if {{CURRENTMONTH}} or {{CURRENTYEAR}} is used, the lifetime should not exceed the current month or year on the server (but anyway, any page in the cache would probably be flushed before, to make room for other cached pages, or just because the server was upgraded);
  • if {{CURRENTDAY}} is used, the lifetime in cache should not exceed the current day on the server;
  • if {{CURRENTHOUR}} is used, the lifetime in cache should not exceed the current hour on the server;
  • if {{CURRENTMINUTE}} is used, the lifetime in cache should not exceed the current minute on the server;
  • if {{CURRENTSECOND}} is used, the lifetime in cache theoretically should not exceed the current second on the server, but a minimum lifetime of pages in the cache may still be increased, to avoid too much work on frequently accessed pages;
  • if {{#time:}} is used and takes as default parameters the current time on the server (instead of being specified as constants in additional parameters), the same logic should be applied by detecting which date element was used in the format string (take the shortest element to reduce the lifetime while scanning the format string);
  • if some other magic keywords that return server statistics (such as number of pages, number of edits...) are used in a page, these statistics should have a reasonable lifetime.

This means that builtin functions and magic keywords must be able to decrease the default lifetime of pages (but not be able to increase it), according to their semantics, and only after they have evaluated their parameters: their return value is not just a string, but a structure containing the parsed text and the maximum lifetime.

All builtin functions (as well as the template expander) decide what to do with their parameters: if a parameter is not used (for example because a #if: or #switch: skips it), the builtin function will not reduce the lifetime of the output string it is creating. In other words, the lifetimes of each template parameter, of each builtin function parameter, and of the output of any of them are independent.

For example, the #ifeq parser function evaluating:

{{#ifeq:a|b| The current second is: {{CURRENTSECOND}}. |}}

will still return the maximum lifetime, even if one of its parameters has a short lifetime (because its actual value is not used in the output of #ifeq).

Conversely, with:

{{#ifeq:{{CURRENTSECOND}}|00| I'm up to the minute! |}}

the result will always be dependent on the value of {{CURRENTSECOND}}, because the 1st and 2nd parameters of #ifeq: always need to be evaluated. This means that the lifetime of #ifeq: is first initialized as the minimum of the lifetimes of its first 2 parameters being compared (because they are always evaluated), before determining whether the conditional 3rd or 4th parameter will be evaluated and returned: the #ifeq: builtin will then reduce (but not increase) the initialized value according to the lifetime computed and returned separately by either the 3rd or 4th parameter.

The same logic should be applied to the lifetime of parameters of conditional builtin functions like #ifexpr:, #ifeq:, #switch:, when computing the lifetime of their returned value: unused parameters should have their lifetime simply ignored, and the returned value will have the minimum lifetime of ONLY the parameters they effectively use in their returned texts.

In all cases, once you have computed the maximum lifetime by taking the minimum of all these values above, check that this lifetime is not below a tuning parameter for the minimum lifetime of validity of ALL pages on the server cache (you may tune it to one minute for example, or less if the server can support it: this may depend on the server or project on which MediaWiki is installed, and on the policy needed for the global page caches used directly in the server, or within front proxy servers). And let the MediaWiki rendering server instruct the page cache about this lifetime of prerendered pages that will be stored there.
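The proposal can be sketched as follows. This is an illustration of the idea, not of how the MediaWiki parser is built: `Result`, `ifeq`, `page_ttl`, and the two constants are hypothetical, and both branches are passed in already evaluated, with the lifetime of the unused one simply ignored.

```python
from dataclasses import dataclass

MAX_TTL = 365 * 24 * 3600  # assumed default lifetime, in seconds
MIN_TTL = 60               # assumed project-wide floor, in seconds


@dataclass(frozen=True)
class Result:
    """Expanded text plus the longest time it may stay cached."""
    text: str
    ttl: int = MAX_TTL


def current_second(now_second):
    # {{CURRENTSECOND}}: valid for at most one second.
    return Result(f"{now_second:02d}", ttl=1)


def ifeq(a, b, then, otherwise):
    """{{#ifeq:a|b|then|otherwise}} with lifetime tracking."""
    # The two compared parameters are always used.
    ttl = min(a.ttl, b.ttl)
    branch = then if a.text.strip() == b.text.strip() else otherwise
    # Only the branch that is returned can reduce the lifetime further.
    return Result(branch.text, min(ttl, branch.ttl))


def page_ttl(result):
    """Clamp the computed lifetime to the project-wide minimum."""
    return max(MIN_TTL, min(result.ttl, MAX_TTL))
```

For the first example above, `ifeq(Result("a"), Result("b"), Result("The current second is: 07.", ttl=1), Result(""))` keeps `MAX_TTL` because the short-lived branch is not returned. For the second, `ifeq(current_second(0), Result("00"), Result("I'm up to the minute!"), Result(""))` has a lifetime of 1 second, which `page_ttl` raises to the 60-second floor.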

Verdy_p added a comment. Via Conduit, Jul 17 2010, 7:34 AM

Ahh, but this would recurse too. Imagine if {{ {{int:foo}} }} called
potentially up to N templates, each of which had another {{int:}}. Two deep
would be N^2

Actually no! The UI language is always the same while rendering a page. So all dependencies are computed within the restricted set of the constant UI language. This will never multiply the number of rendered pages to maintain in the cache, but will only store different versions (and different lists of backlinks for each UI language) ONLY when that UI language is used.

What does this mean? Backlinks are all dropped from a page when it is edited, but as it is saved, it is always within the context of a specific UI language. If later a visitor comes that wants another UI language, it won't be present in the cache and the page will be regenerated.

You can still minimize the impact of {{int:}} and of {{USERLANGUAGE}}: when evaluating and rendering pages, detect if one of them is used (use the same algorithm used for computing lifetimes of pages): first start evaluating as if the page was generated within a locale-neutral root language. Then if one of these ({{int:}} or {{USERLANGUAGE}}) is being evaluated, set the language code in the result structure (that contains the generated text, the lifetime, and the UI language code).

After evaluating the whole page, you immediately see if the result is dependent on a UI language, and if so, you'll index the generated page in the cache as

{{UILANGUAGE}}:{{FULLPAGENAME}}

and you'll drop:

:{{FULLPAGENAME}}

from the cache.

Otherwise you'll index it as

:{{FULLPAGENAME}}

and you'll drop all pages in the cache that match:

*:{{FULLPAGENAME}}

With such an algorithm, you can significantly reduce the workload because a lot of pages or templates do not depend on the UI language.
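The indexing scheme just described could be modelled like this. It is a sketch under stated assumptions: the cache is a plain dictionary keyed by `(language, page)`, with `None` standing for the language-neutral ":{{FULLPAGENAME}}" entry, and both function names are invented.

```python
def store_rendering(cache, page, ui_lang, html, depends_on_ui_lang):
    """Store a rendering under a language-specific or neutral key."""
    if depends_on_ui_lang:
        # Drop ":PAGE", then index as "LANG:PAGE".
        cache.pop((None, page), None)
        cache[(ui_lang, page)] = html
    else:
        # Drop every "*:PAGE" entry, then index as ":PAGE".
        for key in [k for k in cache if k[1] == page and k[0] is not None]:
            del cache[key]
        cache[(None, page)] = html


def lookup(cache, page, ui_lang):
    """Prefer the neutral entry; fall back to the viewer's language."""
    if (None, page) in cache:
        return cache[(None, page)]
    return cache.get((ui_lang, page))
```

A page that never evaluates {{int:}} occupies a single cache slot regardless of how many interface languages its readers use. A language-dependent page gets one slot per language actually requested.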

And you absolutely don't need to regenerate at the same time all the pages for all supported UI languages: generate them only on the fly, as they are effectively demanded by users (the first version that will be indexed will be the version built for the UI language used by the page editor saving it, but only when it is requested through a standard GET request after saving it and being redirected to it).

Additional backlinks (to templates or page names that depend on the UI language) can be added as well on the fly very long after the page has been saved, but of course, you could also opt for rendering the page immediately and saving the backlinks for the page as rendered in the default project language (my opinion is that it would complicate things for no benefit, and would increase the response time for the user saving a page while their UI language is not the default UI language of the project).

Nikerabbit added a comment. Via Conduit, Jul 17 2010, 10:45 AM

It's not about caching, it works already pretty well (at least if you exclude cache fragmentation), but it's about the link tables in the database which should not change depending on the user language.

Catrope added a comment. Via Conduit, Jul 17 2010, 1:55 PM

(In reply to comment #21)

I don't know why {{int:}} would corrupt the cache. In fact the cache just has
to remember the language for which it generated the page (i.e. the value of the
"uselang=" query parameter, or by default the language code inferred from the
"Accept-Language:" header in the HTTP query).

Yes this means that pages may be cached multiple times, but only if they are
visited by different users using different preferences for their language. All
users will see a coherent page in their own language, the cache of prerendered
pages will remain a FIFO and, instead of indexing just on "{{FULLPAGENAME}}",
it will index on "{{USERLANGUAGE}}:{{FULLPAGENAME}}".

We already do this for various parameters.

Note that pages in the cache should also have a short lifetime, if they use any
one of the builtin magics that access to the current time:

  • if {{CURRENTMONTH}} or {{CURRENTYEAR}} is used, the lifetime should not exceed the current month or year on the server (but anyway, any page in the cache would probably be flushed before, to make room for other cached pages, or just because the server was upgraded);
  • if {{CURRENTDAY}} is used, the lifetime in cache should not exceed the current day on the server;
  • if {{CURRENTHOUR}} is used, the lifetime in cache should not exceed the current hour on the server;
  • if {{CURRENTMINUTE}} is used, the lifetime in cache should not exceed the current minute on the server;
  • if {{CURRENTSECOND}} is used, the lifetime in cache theoretically should not exceed the current second on the server, but a minimum lifetime of pages in the cache may still be increased, to avoid too much work on frequently accessed pages

When any of these time-dependent magic words is used, the page is only cached for one hour. This was implemented ages ago.

This means that builtin functions and magic keywords must be able to decrease
the default lifetime of pages (but not be able to increase it), according to
their semantics, and only after they have evaluated their parameters: their
return value is not just a string, but a structure containing the parsed text
and the maximum lifetime.

They already are able to do so.

Platonides added a comment. Via Conduit, Jul 17 2010, 9:39 PM

(In reply to comment #23)

It's not about caching, (...) but it's about the link tables in the database
which should not change depending on the user language.

In fact, I think we fixed it by doing a second parse in the content language. So... RESOLVED FIXED?

Nikerabbit added a comment. Via Conduit, Jul 17 2010, 9:43 PM

(In reply to comment #25)

In fact, I think we fixed it by doing a second parse in the content language.
So... RESOLVED FIXED?

In which commit?

Platonides added a comment. Via Conduit, Jul 18 2010, 1:35 PM

Sorry, it's not fixed.

Verdy_p added a comment. Via Conduit, Jul 19 2010, 5:03 PM

(In reply to comment #23)

It's not about caching, it works already pretty well (at least if you exclude
cache fragmentation), but it's about the link tables in the database which
should not change depending on the user language.

Yes but I addressed this already in the last paragraph of comment #23 (speaking about "backlinks").

And forcing all pages that use magic time-based keywords to use only a 1-hour lifetime is not the best option, when ONLY the day (or week, month, year) precision is used.

In addition, you still don't consider when a function or template parameter that depends on time (or server statistics like number of pages in categories or namespaces) will actually be used to generate the result.

Reread what I wrote about the parameters of #if/#ifexpr/#ifeq/#switch, where only the first parameters and the effective conditional result are important for the cacheability of the result, which could be made much longer if a conditional output parameter is not used. And this may be applied as well within the evaluation of #expr/#ifexpr expressions containing the "a ? b : c" ternary operator (only the lifetime of "a", and of EITHER "b" OR "c" should restrict the cache lifetime of the result).

It is best to effectively track the lifetime of builtin functions and templates in order to get consistent results, but still a maximal cacheability of pages because it can save lots of resources on the servers (one hour is not enough in most cases when it could be even a full year or month, for heavily visited pages that are NOT modified, such as project and portal pages).

The choice of one hour seems quite arbitrary (even if it's good only as a PROJECT-SPECIFIC policy for the minimum lifetime to consider for the final rendered page).

  • some projects will still want to accept 1-second lifetime for a few very active pages such as a few pages of discussions (or within very specific namespaces with restricted modification policies such as "mediawiki:", indirectly referenced by "{{int:}}" and that may also include server-wide notices), or pages giving status information about the server,
  • and some projects will even consider that some pages should never be cached and should be rendered again each time they are requested, when they contain time-dependent or statistics-dependent information (the "mediawiki:" namespace is such a candidate namespace whose cacheability should be tracked as precisely as possible, but there are a few other "special:" pages that may benefit from a more precise cacheability).

Catrope added a comment. Via Conduit, Jul 19 2010, 7:40 PM

(In reply to comment #28)

(In reply to comment #23)
> It's not about caching, it works already pretty well (at least if you exclude cache
> fragmentation), but it's about the link tables in the database which should not
> change depending on the user language.

Yes but I addressed this already in the last paragraph of comment #23 (speaking
about "backlinks").

And forcing all pages that use magic time-based keywords to use only a 1-hour
lifetime is not the best option, when ONLY the day (or week, month, year)
precision is used.

In addition, you still don't consider when a function or template parameter
that depends on time (or server statistics like number of pages in categories
or namespaces) will actually be used to generate the result.

Reread what I wrote about the parameters of #if/#ifexpr/#ifeq/#switch, where
only the first parameters and the effective conditional result is important for
the cacheability of the result which could be made much longer if a conditional
output parameter is not used. And this may be applied as well within the
evaluation of #expr/#ifexpr expressions containing the "a ? b : c" ternary
operator (only the lifetime of "a", and of EITHER "b" OR "c" should restrict
the cache lifetime of the result):

It is best to effectively track the lifetime of builtin functions and templates
in order to get consistent results, but still a maximal cacheability of pages
because it can save lots of resources on the servers (one hour is not enough in
most cases when it could be even a full year or month, for heavily visited
pages that are NOT modified, such as project and portal pages).

I'm not exactly sure how smart the mechanisms we currently have are, that is, whether they recognize a {{CURRENTMONTH}} in a branch that isn't taken.

  • some projects will still want to accept 1-second lifetime for a few very active pages such as a few pages of discussions (or within very specific namespaces with restricted modification policies such as "mediawiki:", indirectly referenced by "{{int:}}" and that may also include server-wide notices), or pages giving status information about the server,
  • and some projects will even consider that some pages should never be cached and should be rendered again each time they are requested, when they contain time-dependent or statistics-dependent information (the "mediawiki:" namespace is such a candidate namespace whose cacheability should be tracked as precisely as possible, but there are a few other "special:" pages that may benefit from a more precise cacheability).

While some projects may indeed want uncacheable or 1-second lifetime (there's hardly a difference between these two) pages, I'm pretty sure the servers wouldn't like that very much. At Wikimedia, we err on the side of caching over correctness in quite a few situations.

Verdy_p added a comment. Via Conduit, Jul 19 2010, 8:02 PM

While some projects may indeed want uncacheable or 1-second lifetime (there's
hardly a difference between these two) pages, I'm pretty sure the servers
wouldn't like that very much. At Wikimedia, we err on the side of caching over
correctness in quite a few situations.

You're right. But it's a matter of project-specific policy about their local use of caches for prerendered pages.

The policy will just have the effect of raising the computed final lifetime to a sustainable level for the most frequently requested pages, while some limited subsets of pages (most probably within specific namespaces with stronger modification policies) will need to be tracked with a finer precision.

Note that heavily used and modified discussion pages could have a high precision as long as they are modifiable, but later they will be archived, or they may be edited in separate subpages, for example one for each day, so that a container page will still avoid transcluding older pages or archived pages that will have a long lifetime (because they will no longer depend on the use of magic keywords like {{CURRENTSECOND}} or {{PAGESINCAT:1}}).

Those page archives, even if they remain modifiable, would be moved to a namespace where they have a longer cacheability, or where they will be frozen (by administratively blocking all later modifications), so that they won't be impacted by their smaller lifetime in caches.

Special statistics pages on the server, for example, can perfectly well have a very short lifetime, as their layout is easily fixed and these pages are not directly modifiable (so there would be no risk that an included template would cause the page to be rendered again each time the template is modified). Their HTML or wiki code will be built from stable PHP server scripts (which can't be modified without administrative access to the server, using special admin tools that will specifically flush their cached rendering, if it is stored).

If ever these special pages (with a low lifetime and constantly updated dynamically) are transcluded within user-modifiable pages, the policy applicable to these user pages will still raise the lifetime back to the minimum acceptable lifetime. So there will be no problem at all, even if those user pages do not reflect the most instantaneous state of these special pages.

But when the renderer rebuilds the wiki source of these user pages, it will get access to the instant state of these special pages, and will cache it for a longer time than what you would get by visiting these special pages directly. In fact, direct access to these short-lived special pages could even be denied to normal unprivileged users, even if these pages may be transcluded, or the server may choose to expose only their cached version to these users.
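
As a rough illustration of the lifetime rule described above, here is a minimal Python sketch (not MediaWiki code). It assumes a page's requested lifetime is the smallest lifetime among the page itself and whatever it transcludes, and that the project policy then acts as a floor. The function and parameter names are hypothetical.

```python
def effective_lifetime(own_lifetime, transcluded_lifetimes, policy_minimum):
    """Return the cache lifetime, in seconds, for a rendered page.

    All three parameters are hypothetical names used for illustration.
    """
    # A page can only be as fresh as its most volatile dependency...
    requested = min([own_lifetime, *transcluded_lifetimes])
    # ...but the project policy never lets the lifetime drop below its floor.
    return max(requested, policy_minimum)
```

For example, a user page with a one-day lifetime that transcludes a 1-second statistics page, under a policy minimum of 300 seconds, would be cached for 300 seconds rather than 1 second.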

The same can be applied to stable (patrolled) versions of pages, which could benefit a lot from longer lifetimes in the cache, as they live within a stable and unmodifiable update id which does not need to reflect any current state of the server.

Some wiki projects only work with stable versioned pages, and edits are only visible to selected patrolling users (who can validate a version), or only to the users who performed these updates (so these newer updates don't even have to be cached).

Verdy_p added a comment.Via ConduitJul 19 2010, 8:15 PM

Another note about patrolled versioned pages: as these pages were committed at a very precisely known time, the dynamic values of magic keywords used in them should also be stored:

  • the timestamp of the version is already stored, just use it as the source of time instead of the current time on the server: the view should NEVER change depending on the time when the page is rebuilt (if it was flushed out from the cache, or if the server is restarted because of cache corruption).
  • the magic values of other server statistics could be stored as well with the version, in a list of properties attached to the stored and timestamped version, for example the values of each used {{PAGESINCAT:...}} when the page was first submitted.
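
The two points above could be sketched roughly as follows, in Python rather than MediaWiki's PHP. This is only an illustration under stated assumptions: `StableRevision`, `stored_values` and `resolve_magic_word` are hypothetical names, and the zero-padded output for {{CURRENTSECOND}} is assumed for the example.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class StableRevision:
    """Hypothetical record for a patrolled revision."""
    timestamp: datetime                                 # when the revision was committed
    stored_values: dict = field(default_factory=dict)   # statistics captured at submit time


def resolve_magic_word(revision, word):
    """Resolve a magic word from the revision itself, never from the live server."""
    if word == "CURRENTSECOND":
        # Time comes from the stored revision timestamp, not from the clock.
        return f"{revision.timestamp.second:02d}"
    # Other values (e.g. PAGESINCAT) come from the properties saved with the revision.
    return revision.stored_values[word]


rev = StableRevision(
    timestamp=datetime(2010, 7, 19, 20, 15, 7, tzinfo=timezone.utc),
    stored_values={"PAGESINCAT:Example": "42"},
)
print(resolve_magic_word(rev, "CURRENTSECOND"))       # 07
print(resolve_magic_word(rev, "PAGESINCAT:Example"))  # 42
```

With this shape, re-rendering the stable revision after a cache flush gives the same output whenever it happens, because nothing is read from the current server state.
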
Nikerabbit added a comment.Via ConduitSep 11 2010, 4:33 PM

Taking myself off as an assignee. We can't break existing functionality without making lots of people unhappy. There are some other ways of achieving multilingualism (the Translate extension has some solutions), but more is needed.

bzimport added a comment.Via ConduitNov 3 2010, 3:12 AM

mathsmart9 wrote:

If we break existing functionality but create another solution that creates the same results, we could have a steward replace templates on wikis (or any user for templates that are not admin-protected) that use the outdated {{int:}} code.

Platonides added a comment.Via ConduitDec 28 2010, 9:50 PM

Fixed in r79122

tstarling added a comment.Via ConduitMar 14 2011, 4:32 AM

Reopening. Fix reverted in r83868. The proposed fix causes bug 27891, which is more severe than this one.

Peachey88 added a comment.Via ConduitApr 30 2011, 12:10 AM

*Bulk BZ Change: +Patch to open bugs with patches attached that are missing the keyword*

Platonides added a comment.Via ConduitJun 24 2011, 4:36 PM

Recommitted in r89706
