
Section transclusions are limited to 4 sections: 5th breaks the references (due to reaching the Post‐expand include size limit)
Closed, InvalidPublic

Description

Section transclusion works for only four sections: when transcluding five sections, the references section of the article breaks and displays only "Template:Reflist" instead of the rendered references.

Examples:
2 sections
4 sections
5 sections

Why is this? Such a limitation is not explained in the Help pages about transclusion on Wikipedia and MediaWiki.
Could this please be resolved so that more than 4 sections can be transcluded?
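For context, this is roughly what the attempted section transclusion looks like with the Labeled Section Transclusion parser functions (the page and section names below are made-up placeholders, not the actual articles involved):

```wikitext
<!-- In the source article, mark the section that should be reusable: -->
<section begin="findings" />
...section text...
<section end="findings" />

<!-- In the target article, transclude just that marked section: -->
{{#lst:Source article|findings}}

<!-- Or transclude a section by its heading, without markers: -->
{{#lsth:Source article|Findings}}
```

As far as I understand, the fully expanded wikitext of every transcluded section (including all templates it uses) counts toward the page's post-expand include size, which is why adding a fifth section can tip the page over the limit.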

Event Timeline

Restricted Application added a subscriber: Aklapper. · View Herald Transcript · Sep 14 2020, 6:42 PM
Pppery closed this task as Invalid. Sep 14 2020, 7:17 PM
Pppery added a subscriber: Pppery.

This is the post-expand include size limit, and is not a bug.

Prototyperspective added a comment (edited). Sep 14 2020, 8:46 PM

@Pppery So this limit is there because of this?

Very long or complicated pages are slow to parse. Not only is this an inconvenience for users, but it can also be used to mount a denial of service (DoS) attack on the servers, in which a page request forces the MediaWiki software to parse an unreasonably large quantity of data. The limits help to prevent this kind of attack, and ensure that pages are rendered in a reasonable time. (Nevertheless, sometimes a complex page within the limits gives a time-out error; this depends on how busy the servers are.)

The HTML is only 2 MB, which is not much in 2020, and according to this comment the images are supposed to lazy-load.
If the Wikimedia servers can't handle requests for 2 MB of data, which is about the size of images commonly shared over the Web, I think there's something really wrong with the hardware/software.
There are websites with a lot more text and image content. Furthermore, I'd be interested in what needs to be "parsed" here by the Wikimedia servers: don't they mostly just need to serve the prerendered/preparsed plain (or almost plain) HTML? Shouldn't the templates only render when editing and previewing?

If the quoted rationale for the limit is currently warranted, I think there should probably be tasks about improving/modernizing Wikimedia software/performance so that it can handle requests for such effectively minuscule amounts of data. Please link them here if they already exist.

I read that this has also been / is a substantial problem for editors of COVID-19-related articles.

Is there any near-term solution for including more text via transclusions (up to 12 section-transclusions) ?

Pppery removed a subscriber: Pppery. Sep 14 2020, 9:19 PM

The HTML size itself is irrelevant. See the HTML source of your link: Post‐expand include size: 2097151/2097152 bytes.
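For reference, these numbers come from the "NewPP limit report" that MediaWiki emits as an HTML comment near the end of each rendered page's source; it looks roughly like this (the values other than the include-size line quoted above are illustrative, not taken from the page in question):

```html
<!--
NewPP limit report
Preprocessor visited node count: 40000/1000000
Post‐expand include size: 2097151/2097152 bytes
Template argument size: 500000/2097152 bytes
Expensive parser function count: 20/500
-->
```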
(Previous examples: T158242: One pl.wikisource page including the text of 700 other pages hits parser limits, T222578: Module imported from en.wp triggers parser expansion size limit (MediaWiki:Post-expand-template-inclusion-category) on te.wp.)

I think there should probably be tasks about improving / modernizing Wikimedia software / performance

Anyone is encouraged and very welcome to write more performant template code (that's my interpretation of T21262#238156).

Prototyperspective added a comment (edited). Sep 15 2020, 10:44 AM

The HTML size itself is irrelevant. See the HTML source of your link: Post‐expand include size: 2097151/2097152 bytes.

Sorry: ~600 kB was the current size; 2 MB is the limit. Why is the "Post‐expand include size" relevant/established? It seems to be an artificial limitation which doesn't make sense to me, and if it's necessary, it indicates very outdated hardware/software that can't serve/handle ~2 MB of data in 2020.

(Previous examples: T158242: One pl.wikisource page including the text of 700 other pages hits parser limits, T222578: Module imported from en.wp triggers parser expansion size limit (MediaWiki:Post-expand-template-inclusion-category) on te.wp.)

No explanation for this artificial limit has been provided there either.

Anyone is encouraged and very welcome to write more performant template code (that's my interpretation of T21262#238156).

I think it's not about the performance of template code but about the performance of the MediaWiki software in general / its template-related design: are templates not prerendered/preparsed after a change to the template or the target article, or does including templates and transclusions always require the servers to redundantly process every single request? If templates need to be parsed every time for every request, a larger server load for small amounts of data may be unavoidable. But even then the load should be negligible, especially considering the millions of dollars Wikimedia collects annually from donors and the loads other servers handle easily in 2020.

A Post‐expand include size of 2097151 bytes means 2 MB of data, and has nothing to do with the resulting HTML.

When you do a section transclusion, the software reparses the entire wikitext of the page being transcluded to delimit each of its sections, and then returns the desired section. Parsing a page like this doesn't take a negligible amount of time: doing a preview of the page April–June 2020 in science gives: Real time usage: 4.238 seconds.

That page can contain <noinclude>, <includeonly> and other tags or parser functions that modify the resulting output, and that's why it needs to be reparsed and can't use the cached page.
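A minimal sketch of why the source page has to be reparsed: the same wikitext produces different output depending on whether the page is viewed directly or transcluded (hypothetical page content):

```wikitext
This line appears both on the page itself and when transcluded.
<noinclude>This line appears only when viewing the page directly.</noinclude>
<includeonly>This line appears only where the page is transcluded.</includeonly>
```

Since the transcluded output differs from what the cached rendering of the source page contains, the parser has to expand the wikitext again for the transclusion.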

Prototyperspective added a comment (edited). Sep 15 2020, 11:16 AM

@Ciencia_Al_Poder Thanks for the explanations.

A Post‐expand include size of 2097151 bytes means 2 MB of data, and has nothing to do with the resulting HTML

Could you please add a short explanation for why it has nothing to do with the resulting HTML (which is about 2 MB)?

When you do a section transclusion, the software reparses the entire wikitext of the page being transcluded to delimit each of its sections, and then returns the desired section. Parsing a page like this doesn't take a negligible amount of time: doing a preview of the page April–June 2020 in science gives: Real time usage: 4.238 seconds.

Okay, but that only answers one of the two questions. The second issue is that this is far too long, indicating outdated hardware/software.

Is this problem limited to preview times/load, and not about page-load times/load?

That page can contain <noinclude>, <includeonly> and other tags or parser functions that modify the resulting output, and that's why it needs to be reparsed and can't use the cached page.

Yes, but I don't follow this conclusion: shouldn't these output modifications be rendered only once (or only a few times, on different servers) after a saved change to the transcluded target article or an included template?

4 seconds very likely is too long for the current design, and if things get parsed at every page load, or the design can't handle data of such small sizes, then the design seems almost certainly outdated/highly suboptimal.

Aklapper renamed this task from Section transclusions are limited to 4 sections: 5th breaks the references to Section transclusions are limited to 4 sections: 5th breaks the references (due to reaching the Post‐expand include size limit). Sep 15 2020, 12:45 PM
Prototyperspective added a comment (edited). Sep 22 2020, 9:43 PM

Afaik this has already been a substantial problem for quite a few articles and will probably become an issue for many more in the near and mid-term future, so I really think this is something that should be worked on, especially as this seems to be a major factor in server load. I think the solution would be to rerender articles on the server side once an included template changes, and then serve the prerendered page. Are there any tasks related to this?

The issue seems to be with transcluding the reference templates like Template:Reflist and Template:Cite_journal. Is it about the templates themselves or about transcluding templates? Because if it's not about the templates, it seems to be a bug with transclusion: shouldn't it just transclude the full section fairly quickly, as if one requested the section's content in addition to the article one has opened, instead of doing something that load-heavy? It's not a problem of the "post-expand size" but of the "Transclusion expansion time", isn't it? Why does it take so long due to the references of the section?:

<!--
Transclusion expansion time report (%,ms,calls,template)
100.00% 7733.218      1 -total
61.75% 4774.901      1 Template:Reflist
23.74% 1836.005    293 Template:Cite_journal
22.62% 1749.421    502 Template:Cite_news
5.67%  438.425      4 Template:Fix
5.50%  425.630      1 Template:Overly_detailed_inline
5.47%  422.810     12 Template:Category_handler
4.52%  349.686      1 Template:CVE
3.32%  257.085     80 Template:Cite_web
1.89%  145.880      7 Template:Convert
-->

Asked about it here.