TemplateData: page_props limits value length to 65535 bytes (MySQL 'blob' field)
Closed, Resolved, Public

Description

TemplateData for [[pl:Template:Związek chemiczny infobox]] is not shown in VE. No idea why. The prop is visible on [[pl:Special:PagesWithProp]], so this must be a VE bug.


Version: master
Severity: major
See Also:
https://bugzilla.wikimedia.org/show_bug.cgi?id=52582

bzimport added a project: TemplateData. Via Conduit, Nov 22 2014, 1:54 AM
bzimport set Reference to bz51740.
matmarex created this task. Via Legacy, Jul 20 2013, 10:42 AM
matmarex added a comment. Via Conduit, Jul 20 2013, 2:38 PM
*** Bug 51671 has been marked as a duplicate of this bug. ***
matmarex added a comment. Via Conduit, Jul 20 2013, 2:39 PM

Whoops, wrong bug, please disregard the comment above.

jayvdb added a comment. Via Conduit, Jul 23 2013, 6:27 AM

Behind the scenes, the server returns the following response:

{"servedby":"mw1205","error":{"code":"templatedata-corrupt","info":"Page #837180 templatedata contains invalid data:Syntax error in JSON."}}

https://pl.wikipedia.org/w/api.php?format=jsonfm&action=templatedata&titles=Template:Zwi%C4%85zek_chemiczny_infobox
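
A minimal Python sketch of how a client might spot this condition in the API response. The response string is copied from the comment above; the helper name `api_error_code` is made up for illustration and is not part of any MediaWiki client library.

```python
import json

# Response body copied verbatim from the server reply quoted above.
response_text = (
    '{"servedby":"mw1205","error":{"code":"templatedata-corrupt",'
    '"info":"Page #837180 templatedata contains invalid data:'
    'Syntax error in JSON."}}'
)

def api_error_code(text):
    """Return the API error code if the response carries one, else None."""
    data = json.loads(text)
    error = data.get("error")
    return error["code"] if error else None

print(api_error_code(response_text))
```

A response without an "error" key makes the helper return None, so callers can tell a corrupt property apart from a successful lookup.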

The property is registered:

https://pl.wikipedia.org/w/index.php?title=Specjalna:PagesWithProp/templatedata&limit=500&offset=0

This looks like the syntax error:

https://pl.wikipedia.org/w/index.php?title=Szablon:Zwi%C4%85zek_chemiczny_infobox/opis&diff=37123204&oldid=37115636

That diff needs to be approved; then check that the update appears at:

https://pl.wikipedia.org/w/index.php?title=Specjalna:PagesWithProp/templatedata&limit=500&offset=0

(If that doesn't work, maybe '?' is not allowed? *shrug*)

matmarex added a comment. Via Conduit, Jul 23 2013, 9:51 AM

Thanks John; I approved that revision, but it doesn't seem to have solved the issue :(

jayvdb added a comment. Via Conduit, Jul 23 2013, 10:16 AM

I notice that the PagesWithProp listing ends at:

je\u015bli podano warto\u015b\u0107 w parametrze \u201e3. napi\u0119cie powierzchniowe\u201d)."},"type"

Maybe the property has a buffer limit.

I have removed many parameters, and that appears to have fixed the problem.

matmarex added a comment. Via Conduit, Jul 23 2013, 11:53 AM

http://www.sadtrombone.com/ …nice catch.

TemplateData is using the page_props table, whose pp_value field is a
'blob', which is limited to 65535 bytes [1]. Of course MySQL, being
MySQL, doesn't complain and just silently truncates the data on
insert.
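
To illustrate the failure mode, here is a rough Python sketch that simulates a 65535-byte truncation and shows why the API then reports a JSON syntax error. It does not touch MySQL or TemplateData; `simulate_blob_store`, `is_valid_json` and the generated parameter names are all invented for the example.

```python
import json

BLOB_MAX_BYTES = 65535  # limit of a 'blob' field, as cited above

def simulate_blob_store(value: bytes) -> bytes:
    """Mimic a silently truncating insert: keep only the first 65535 bytes."""
    return value[:BLOB_MAX_BYTES]

def is_valid_json(data: bytes) -> bool:
    """True if the bytes decode as UTF-8 and parse as JSON."""
    try:
        json.loads(data.decode("utf-8"))
        return True
    except ValueError:  # covers JSONDecodeError and UnicodeDecodeError
        return False

# Hypothetical oversized TemplateData-like document (made-up parameters).
params = {
    f"param{i}": {"description": "x" * 200, "type": "string"}
    for i in range(400)
}
raw = json.dumps({"params": params}).encode("utf-8")
stored = simulate_blob_store(raw)

print(len(raw), len(stored))
print(is_valid_json(raw), is_valid_json(stored))
```

The stored value is cut off in the middle of the top-level object, so it can never parse, which matches the listing that ends abruptly after `"type"`.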

I see three ways to do something about this:

  • If over 65535 bytes, show a big fat red error to the user instead of silently failing.
  • Store compressed data. JSON compresses well due to the redundancy in object keys and due to being mostly text in this case. This would probably raise the effective maximum length several-fold and should be easy to implement.
  • If over 65535 bytes, split the data into multiple properties 'templatedata1', 'templatedata2' etc., with some marker stored in regular 'templatedata'. Handle this transparently in the API. Nasty, but would work.
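
The three options above can be sketched roughly as follows in Python. This is only an illustration under stated assumptions, not the code from any patch: zlib stands in for whatever compression would actually be used, the `chunked:N` marker format is invented, and every function name here is hypothetical.

```python
import json
import zlib

BLOB_MAX_BYTES = 65535  # limit of a 'blob' field

class ValueTooLargeError(ValueError):
    """Raised instead of letting the database truncate the value."""

def check_size(value: bytes) -> bytes:
    """Option 1: fail loudly when the value would not fit."""
    if len(value) > BLOB_MAX_BYTES:
        raise ValueTooLargeError(
            f"{len(value)} bytes exceeds the {BLOB_MAX_BYTES}-byte limit"
        )
    return value

def pack_compressed(doc: dict) -> bytes:
    """Option 2: compress the JSON, then still enforce the limit."""
    raw = json.dumps(doc, separators=(",", ":")).encode("utf-8")
    return check_size(zlib.compress(raw))

def unpack_compressed(value: bytes) -> dict:
    return json.loads(zlib.decompress(value).decode("utf-8"))

def split_into_props(value: bytes, base: str = "templatedata") -> dict:
    """Option 3: marker in the base property, pieces in base1, base2, ..."""
    if len(value) <= BLOB_MAX_BYTES:
        return {base: value}
    chunks = [
        value[i:i + BLOB_MAX_BYTES]
        for i in range(0, len(value), BLOB_MAX_BYTES)
    ]
    # Invented marker format; real JSON starts with '{', so no collision.
    props = {base: f"chunked:{len(chunks)}".encode("ascii")}
    for n, chunk in enumerate(chunks, start=1):
        props[f"{base}{n}"] = chunk
    return props

def join_props(props: dict, base: str = "templatedata") -> bytes:
    """Reassemble a value written by split_into_props."""
    head = props[base]
    if not head.startswith(b"chunked:"):
        return head
    count = int(head[len(b"chunked:"):])
    return b"".join(props[f"{base}{n}"] for n in range(1, count + 1))
```

Options 1 and 2 combine naturally, since compression only raises the ceiling and an explicit size check is still needed for whatever does not fit afterwards.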

Adjusting bug summary and component accordingly.

[1] https://www.mediawiki.org/wiki/Manual:Page_props_table

jayvdb added a comment. Via Conduit, Jul 23 2013, 12:05 PM

Option 4: store the JSON in a dedicated wikipage, and fiddle with TemplateData so it supports it.

matmarex added a comment. Via Conduit, Jul 23 2013, 12:08 PM

I chatted briefly with Krinkle and I'll try to implement solutions #1 and #2.

I have no preference on #4, but it would be a large-ish architectural change now and would probably require migrating all of the templatedatas already added on wikis to the new format (or supporting both ways).

He7d3r added a comment. Via Conduit, Jul 23 2013, 12:12 PM

(In reply to comment #7)

> Option 4: store the JSON in a dedicated wikipage, and fiddle with
> TemplateData so it supports it.

This is what I suggested on bug 50512 (to solve another problem).

gerritbot added a comment. Via Conduit, Jul 23 2013, 12:31 PM

Change 75324 had a related patch set uploaded by Matmarex:
Bail when JSON length exceeds database limits

https://gerrit.wikimedia.org/r/75324

gerritbot added a comment. Via Conduit, Jul 23 2013, 1:33 PM

Change 75330 had a related patch set uploaded by Matmarex:
Store compressed JSON since size is limited

https://gerrit.wikimedia.org/r/75330

matmarex added a comment. Via Conduit, Jul 23 2013, 3:42 PM

Core patch to avoid showing weird stuff: https://gerrit.wikimedia.org/r/75346

matmarex added a comment. Via Conduit, Jul 23 2013, 4:13 PM

I tested the patches above with [[pl:Template:Związek chemiczny infobox]]: size of raw processed JSON is ~112 kB, size of compressed JSON is ~7.5 kB. Looks good enough for practical use.

gerritbot added a comment. Via Conduit, Aug 1 2013, 9:41 PM

Change 75324 merged by jenkins-bot:
Bail when JSON length exceeds database limits

https://gerrit.wikimedia.org/r/75324

SalixAlba added a comment. Via Conduit, Aug 28 2013, 11:25 AM

This problem can also occur with templates that have a large number of numbered fields.
Just a raw skeleton without descriptions for http://en.wikipedia.org/wiki/Template:Infobox_officeholder
takes 174,407 bytes, as it has very many numbered fields (vicepresident2 ... vicepresident14, etc.). Fixing bug 52582 to allow autonumbered parameters might solve the problem when this occurs.
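
As a rough illustration of how numbered fields inflate the JSON, the Python sketch below builds a description-free skeleton and measures its encoded size. The per-parameter stub and the base names other than `vicepresident` are invented for the example; it makes no attempt to reproduce the 174,407-byte figure for the real template.

```python
import json

# Illustrative empty stub per parameter; the real skeleton may differ.
EMPTY_PARAM = {"label": "", "description": "", "type": "string"}

def numbered_skeleton(base_names, copies):
    """One entry per base name plus its numbered variants name2..nameN."""
    params = {}
    for name in base_names:
        params[name] = dict(EMPTY_PARAM)
        for n in range(2, copies + 1):
            params[f"{name}{n}"] = dict(EMPTY_PARAM)
    return {"params": params}

def encoded_size(doc):
    """Size in bytes of the UTF-8 encoded JSON."""
    return len(json.dumps(doc).encode("utf-8"))

# Hypothetical base names; only 'vicepresident' comes from the comment above.
bases = ["vicepresident", "president", "deputy", "predecessor"]
for copies in (1, 2, 14):
    doc = numbered_skeleton(bases, copies)
    print(copies, len(doc["params"]), encoded_size(doc))
```

The size grows linearly with the number of numbered copies even though every entry is identical apart from its key, which is why a single autonumbered definition would help.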

gerritbot added a comment. Via Conduit, Sep 5 2013, 9:07 AM

Change 75330 merged by jenkins-bot:
Store compressed JSON since size is limited

https://gerrit.wikimedia.org/r/75330
