Related Objects
- Mentioned Here
- T108255: Enable MariaDB/MySQL's Strict Mode
Event Timeline
Change 512984 had a related patch set uploaded (by Alaa Sarhan; owner: Lucas Werkmeister (WMDE)):
[mediawiki/extensions/Wikibase@master] Wire up DatabasePropertyTermStore in WikibaseRepo
While working on this, I tested the rebuild script locally on some properties I had previously imported from Wikidata (using the Importer extension). The following text appeared in a term of P2:
વ્યક્તિનું મુખ્ય કાર્ય ક્ષેત્ર (ભૌતિકવિજ્ઞાન, ઈતિહાસ), વ્યવસાય નહિ (ભૌતિકવિજ્ઞાની, ઈતિહાસવિદ્...તેથી જુઓ ગુણધર્મ:P૧૦૬)
That text is 308 bytes according to strlen (118 characters according to mb_strlen). The insert failed, with the database complaining that the value is too long to store in the wbx_text VARBINARY(255) column of the wbt_text table.
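To illustrate the gap between byte length and character count described above, here is a minimal sketch in Python (not the PHP that Wikibase uses). The helper names `utf8_byte_length` and `fits_varbinary` are hypothetical; only the 255-byte limit comes from the column definition quoted above. Gujarati code points take 3 bytes each in UTF-8, which is why a text of roughly 118 characters can exceed 255 bytes.

```python
# Hypothetical helpers for illustration only; not part of Wikibase.
WBX_TEXT_LIMIT = 255  # byte limit taken from wbx_text VARBINARY(255)


def utf8_byte_length(text: str) -> int:
    """Return the number of bytes the text occupies when encoded as UTF-8."""
    return len(text.encode("utf-8"))


def fits_varbinary(text: str, limit: int = WBX_TEXT_LIMIT) -> bool:
    """Return True if the UTF-8 encoding of text fits in `limit` bytes."""
    return utf8_byte_length(text) <= limit


# A short Gujarati word written with escapes so the code points are explicit.
sample = "\u0a95\u0acd\u0ab7\u0ac7\u0aa4\u0acd\u0ab0"

char_count = len(sample)               # code points, comparable to mb_strlen
byte_count = utf8_byte_length(sample)  # bytes, comparable to strlen
print(char_count, byte_count, fits_varbinary(sample))
```

A check like this would have to run on the byte length, since that is what a VARBINARY column limits.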
I wonder how those are currently stored in the wb_terms table. We have to fix this (that is, decide what to do with those cases) before the migration can happen in production anyway.
Change 513110 had a related patch set uploaded (by Alaa Sarhan; owner: Alaa Sarhan):
[mediawiki/extensions/Wikibase@master] Wire up PropertyTermStore in WikiebaseRepo
It’s truncated:
MariaDB [wikidatawiki_p]> SELECT term_text FROM wb_terms WHERE term_full_entity_id = 'P101' AND term_language = 'gu' AND term_type = 'description';
+------------------------------------------------------------+
| term_text |
+------------------------------------------------------------+
| વ્યક્તિનું મુખ્ય કાર્ય ક્ષેત્ર (ભૌતિકવિજ્ઞાન, ઈતિહાસ), વ્યવસાય નહિ (ભૌતિકવિજ્ઞાની, ઈતિહાસવિદ્...ત |
+------------------------------------------------------------+
1 row in set (0.04 sec)
We might not even explicitly truncate in Wikibase – we don’t run MariaDB in “strict mode” (see T108255), so any overlong values are just silently truncated. Or perhaps we do truncate in Wikibase, I didn’t check yet.
Change 513158 had a related patch set uploaded (by Lucas Werkmeister (WMDE); owner: Lucas Werkmeister (WMDE)):
[mediawiki/extensions/Wikibase@master] Wire up DatabasePropertyTermStore in WikibaseRepo
Change 512984 merged by jenkins-bot:
[mediawiki/extensions/Wikibase@master] Fix for utf8 texts, using StringNormalizer.
> We might not even explicitly truncate in Wikibase – we don’t run MariaDB in “strict mode” (see T108255), so any overlong values are just silently truncated. Or perhaps we do truncate in Wikibase, I didn’t check yet.
It looks like we do not truncate, because I could reproduce the issue locally: the first attempt to insert the text fails with a 'value is too long' error.
Not quite sure then how this should be handled. I can think of two options here:
- Make the wbx_text column bigger.
- Truncate programmatically before looking up values, to avoid false negatives when the Acquirer searches for existing values before inserting them.
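As a rough sketch of the second option, the following Python snippet truncates a string to a byte limit without leaving a partial UTF-8 sequence at the end. It is an illustration under stated assumptions, not Wikibase code: the function name `truncate_utf8` is hypothetical, and the thread does not say how Wikibase would implement truncation. The same truncated value would then be used both for the lookup and for the insert, so the two agree.

```python
# Hypothetical helper for illustration only; not part of Wikibase.
def truncate_utf8(text: str, limit: int = 255) -> str:
    """Truncate text so that its UTF-8 encoding is at most `limit` bytes.

    Slicing the encoded bytes may cut a multi-byte character in half;
    decoding with errors="ignore" drops that incomplete trailing sequence.
    """
    encoded = text.encode("utf-8")
    if len(encoded) <= limit:
        return text
    return encoded[:limit].decode("utf-8", errors="ignore")


# 100 Gujarati letters at 3 bytes each: 300 bytes, cut cleanly at 255.
aligned = truncate_utf8("\u0a95" * 100)

# One ASCII byte first shifts the boundary: 255 bytes would end mid-character,
# so the two dangling bytes are dropped and 253 bytes remain.
shifted = truncate_utf8("a" + "\u0a95" * 100)

print(len(aligned.encode("utf-8")), len(shifted.encode("utf-8")))
```

This keeps every code point intact, but it can still separate a base letter from a following combining sign, so the truncated text may render differently at the cut.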
Change 513110 abandoned by Alaa Sarhan:
Wire up PropertyTermStore in WikiebaseRepo
Reason:
in favor of If5fb399c9dafb59dbd39669f7dcf360fcee15100
Change 513158 merged by jenkins-bot:
[mediawiki/extensions/Wikibase@master] Wire up DatabasePropertyTermStore in WikibaseRepo