Increase the size of wbt_text_in_lang.wbxl_language
Open, Needs Triage · Public

Description

Follow up on T232393: Find out why rebuilding some items in new term store failed
We can't store terms in the new term store for items that have terms in the 'zh-classical' language, because wbt_text_in_lang.wbxl_language is limited to 10 characters and 'zh-classical' is 12 characters long. This needs to be fixed ASAP because it blocks turning on reading from the new term store.

We agreed on 20 as the new limit.

Details

Related Gerrit Patches:
mediawiki/extensions/Wikibase : master · Increase the size of wbt_text_in_lang.wbxl_language

Event Timeline

Ladsgroup created this task. Fri, Nov 1, 1:40 PM
Restricted Application added a project: User-Ladsgroup. Fri, Nov 1, 1:40 PM
Restricted Application added a subscriber: Aklapper.

After this is done, we need to rebuild anything that has terms in 'zh-classical' or 'nl-informal'.
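For anyone wondering why exactly these two codes are affected, here is a minimal sketch that compares language-code lengths against the old limit (10) and the agreed new limit (20). The helper name `codes_exceeding` and the sample list of codes are only illustrative assumptions; this is not Wikibase code.

```python
# Limits taken from this task: the column currently allows 10,
# and 20 is the agreed new size.
OLD_LIMIT = 10
NEW_LIMIT = 20


def codes_exceeding(codes, limit):
    """Return the language codes whose UTF-8 encoded length exceeds `limit`."""
    return [code for code in codes if len(code.encode("utf-8")) > limit]


# Illustrative sample: the two problematic codes plus two short ones.
codes = ["zh-classical", "nl-informal", "lzh", "en"]

too_long_before = codes_exceeding(codes, OLD_LIMIT)
too_long_after = codes_exceeding(codes, NEW_LIMIT)

print(too_long_before)  # ['zh-classical', 'nl-informal']
print(too_long_after)   # []
```

'zh-classical' is 12 characters and 'nl-informal' is 11, so both overflow a 10-character column and both fit comfortably in 20.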

Change 547730 had a related patch set uploaded (by Ladsgroup; owner: Ladsgroup):
[mediawiki/extensions/Wikibase@master] Increase the size of wbt_text_in_lang.wbxl_language

https://gerrit.wikimedia.org/r/547730

Maintenance_bot moved this task from incoming to in progress on the Wikidata board. Fri, Nov 1, 2:15 PM
Maintenance_bot moved this task from Incoming to In progress on the User-Ladsgroup board.

Change 547730 merged by jenkins-bot:
[mediawiki/extensions/Wikibase@master] Increase the size of wbt_text_in_lang.wbxl_language

https://gerrit.wikimedia.org/r/547730

Mentioned in SAL (#wikimedia-releng) [2019-11-01T16:40:04Z] <Amir1> ladsgroup@deployment-deploy01:~$ mwscript sql.php --wiki=wikidatawiki /srv/mediawiki-staging/php-master/extensions/Wikibase/repo/sql/increase_wbxl_language.sql (T237102)

After talking to @Amire80, it seems we need to drop zh-classical in favor of lzh. It is not used much, which suggests the community is already cleaning it up:

mysql:research@s8-analytics-replica.eqiad.wmnet [wikidatawiki]> select * from wb_terms where term_language = 'zh-classical' limit 55;
+-------------+----------------+---------------------+------------------+---------------+-------------+--------------------------------------------------------------------------------------+-----------------+-------------+
| term_row_id | term_entity_id | term_full_entity_id | term_entity_type | term_language | term_type   | term_text                                                                            | term_search_key | term_weight |
+-------------+----------------+---------------------+------------------+---------------+-------------+--------------------------------------------------------------------------------------+-----------------+-------------+
|  2995932324 |              0 | Q1321               | item             | zh-classical  | description | 印歐一語,隸羅曼語族,本源西班牙,通於拉美列國,字以羅馬                             |                 |           0 |
|  2995934271 |              0 | Q334351             | item             | zh-classical  | description | 清帝,年號道光                                                                       |                 |           0 |
|  2995933714 |              0 | Q35                 | item             | zh-classical  | description | 北歐一國,都哥本哈根                                                                 |                 |           0 |
|  2995931300 |              0 | Q45                 | item             | zh-classical  | description | 南歐一國,都里斯本                                                                   |                 |           0 |
|  2995933519 |              0 | Q55                 | item             | zh-classical  | description | 西歐一國,都阿姆斯特丹,實都海牙                                                     |                 |           0 |
|  2987203583 |              0 | Q65924886           | item             | zh-classical  | label       | 關聖帝君戒淫經                                                                       |                 |           0 |
|  3050087841 |              0 | Q71582643           | item             | zh-classical  | label       | 分類:待選卓著                                                                        |                 |           0 |
|  3056413184 |              0 | Q72699512           | item             | zh-classical  | label       | 模板:IPA pulmonic consonants/table                                                   |                 |           0 |
|  3056419466 |              0 | Q72700587           | item             | zh-classical  | label       | 模板:IPA chart/core2                                                                 |                 |           0 |
|  3056452840 |              0 | Q72706479           | item             | zh-classical  | label       | 模板:IPA vowels/table                                                                |                 |           0 |
|  3056453724 |              0 | Q72706663           | item             | zh-classical  | label       | 模板:IPA vowels/styles.css                                                           |                 |           0 |
|  3057879044 |              0 | Q72942543           | item             | zh-classical  | label       | 維基大典:投票/汝同意刪除文言文維基否                                                 |                 |           0 |
|  3058451089 |              0 | Q73033088           | item             | zh-classical  | label       | 模板:~w                                                                              |                 |           0 |
+-------------+----------------+---------------------+------------------+---------------+-------------+--------------------------------------------------------------------------------------+-----------------+-------------+
13 rows in set (0.00 sec)

(If you change your language to zh-classical, it goes to lzh instead. I'm certain it's not a valid language code)

OTOH, we can't drop nl-informal because it's a valid language code, although it's not used "much":

mysql:research@s8-analytics-replica.eqiad.wmnet [wikidatawiki]> select * from wb_terms where term_language = 'nl-informal' limit 55;
+-------------+----------------+---------------------+------------------+---------------+-----------+----------------------------------------------------------------------------------------------------------------+-----------------+-------------+
| term_row_id | term_entity_id | term_full_entity_id | term_entity_type | term_language | term_type | term_text                                                                                                      | term_search_key | term_weight |
+-------------+----------------+---------------------+------------------+---------------+-----------+----------------------------------------------------------------------------------------------------------------+-----------------+-------------+
|  3053621138 |              0 | Q72165760           | item             | nl-informal   | alias     | mijn broer in Washington spreekt nederlands en chinees mijn ouders zijn wel eens in het plaatsje oving geweest |                 |           0 |
|  3053621137 |              0 | Q72165760           | item             | nl-informal   | label     | nicolaas herman oving washington                                                                               |                 |           0 |
+-------------+----------------+---------------------+------------------+---------------+-----------+----------------------------------------------------------------------------------------------------------------+-----------------+-------------+
2 rows in set (0.00 sec)

I have now cleaned up all usages of zh-classical. Please note that it's a redirect to lzh in MediaWiki, and people can't add a term in zh-classical in the frontend (I tried), but they can add one through the API. Given that nl-informal is in use but only in Q72165760, I don't think anything prevents us from turning on reading from the new store for items up to, let's say, Q1k.

Thoughts @alaa_wmde @Addshore?

Will leave this on the campsite verify column until the schema change is done?

Yeah, let's leave it here for a bit.