
The special pages 'LongPages' and 'ShortPages' did not list pages in the main namespace in some sites
Closed, Resolved | Public | BUG REPORT

Description

On French Wiktionary, Spécial:Pages longues and Spécial:Pages courtes list pages by namespace in this order: 106, 100 and 110, instead of starting with pages in the main namespace. For example, this article is not listed first even though it is clearly longer than the first page listed here.

Wikis that may be affected:

'wgContentNamespaces' => [
	'default' => [ NS_MAIN ],
	'+arwikisource' => [ 102 ],
	'+aswikisource' => [ 102 ], // T45129, T72464
	'+bgwikisource' => [ 100 ],
	'+bnwikisource' => [ 100 ],
	'+bnwikibooks' => [ 104 ], // T236840
	'+brwikisource' => [ 104 ],
	'+cawikisource' => [ 106 ],
	'+cswikiquote' => [ 100 ],
	'+cswikisource' => [ 100 ],
	'+dawikisource' => [ 102 ],
	'+dewikiversity' => [ 106 ], // T93071
	'+enwikibooks' => [ 102, 110 ],
	'+enwikisource' => [ 102, 114 ], // T52007
	'+eswiki' => [ 104 ], // T41866
	'+etwikisource' => [ 106 ],
	'+euwiki' => [ 104 ], // T191396
	'+fawikibooks' => [ 102, 110 ], // T76663
	'+fawikisource' => [ 102 ],
	'+frrwiki' => [ 106 ], // T40023
	'+frwikisource' => [ 102 ],
	'+frwikiversity' => [ 104 ], // T125948
	'+frwiktionary' => [
		106, // T97228
		100, 110, 114, 116, 118, // T270821
	],
	'+hewikisource' => [ 100, 106, 108, 110 ], // T98709
	'+hrwiki' => [ 102 ], // T42732
	'+hrwikisource' => [ 100 ],
	'+huwikisource' => [ 100 ],
	'+hywikisource' => [ 100 ],
	'+idwikibooks' => [ 100, 102 ], // T4282
	'+idwikisource' => [ 100 ],
	'+itwikisource' => [ 102 ],
	'+itwikivoyage' => [ 100, 104 ], // T57620
	'+kowikisource' => [ 100, 114 ], // T183836 for 114
	'+lawikisource' => [ 102 ],
	'+ltwiki' => [ 104 ], // T144118
	'+mediawikiwiki' => [ 100, 102, 104, 106 ], // Manuals, extensions, Api & skin - T86391
	'+metawiki' => [ NS_HELP ], // T45687
	'+mlwikisource' => [ 100 ],
	'+nlwikisource' => [ 102 ],
	'+nowikisource' => [ 102 ],
	'+nowikibooks' => [ 102 ], // T274265
	'+plwikisource' => [ 104, 124 ], // T154711
	'+ptwikisource' => [ 102 ],
	'+rowikisource' => [ 102 ], // Follow-up for T31190
	'+srwikibooks' => [ 102, ], // T17282
	'+srwikisource' => [ 100 ],
	'+svwikisource' => [ 106 ],
	'+tewikisource' => [ 102 ],
	'+thwikibooks' => [ 104, 106 ], // T308376
	'+thwikisource' => [ 102, 114 ], // T275282
	'+trwikibooks' => [ 100, 110, ],
	'+trwikisource' => [ 100 ],
	'+ukwikibooks' => [ 102 ], // T310940
	'+ukwikisource' => [ 102, 116 ], // T52561, T53684
	'+vecwikisource' => [ 100 ],
	'+viwikibooks' => [ 104, 106 ],
	'+viwikisource' => [ 102 ],
	'+wikitech' => [ NS_HELP, 116 ], // Tools - T122865
	'+zhwikisource' => [ 102, 114 ], // T66127
	'+zhwikiversity' => [ 100, 102, 104, 106, 108 ], // T201675, T212919 (106, 108)
	'+dewikivoyage' => [ 104 ],
	'+commonswiki' => [ 6 ], // T167077
	'+testwikidatawiki' => [
		120, // Property (T321282)
		146, // So that Lexeme is indexed in the content index (Cirrus)
	],
	'+wikidatawiki' => [
		120, // Property (T321282)
		146, // So that Lexeme is indexed in the content index (Cirrus)
	],
],

Event Timeline

Hi. In fact, it seems that instead of sorting all content pages by size, they are grouped by namespaces, then sorted by size.

XANA000 renamed this task from The special pages 'LongPages' and 'ShortPages' show pages in namespaces other than the main namespace to The special pages 'LongPages' and 'ShortPages lists pages by namespace and not by page length. Jun 29 2023, 6:17 PM
XANA000 updated the task description. (Show Details)
Aklapper renamed this task from The special pages 'LongPages' and 'ShortPages lists pages by namespace and not by page length to The special pages 'LongPages' and 'ShortPages' list pages by namespace instead of page length. Jun 29 2023, 6:27 PM

I suspect this is because the $wgContentNamespaces of French Wiktionary is [ 106, 100, 110, 114, 116, 118, 0 ] per configuration: the list is unsorted, and the main namespace comes last because it is merged in from the default value.
The results are then grouped by namespace in this order, and the number of pages from the first few namespaces already exceeds the query limit, so we cannot see any results from the main namespace.
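
To make the suspected symptom concrete, here is a minimal Python sketch. It is not MediaWiki code: the helper names, page titles and page lengths are all made up for illustration, and the merge order (wiki-specific values first, default appended) is simply assumed from the list quoted above.

```python
# Hypothetical illustration of the symptom; none of this is MediaWiki code.
DEFAULT_NAMESPACES = [0]  # NS_MAIN
FRWIKTIONARY_EXTRA = [106, 100, 110, 114, 116, 118]


def merge_content_namespaces(extra, default):
    """Assumed merge order: wiki-specific values first, default appended."""
    return list(extra) + list(default)


def grouped_then_truncated(pages, namespace_order, limit):
    """Take the `limit` longest pages of each namespace, concatenate the
    groups in namespace order, then cut the combined list at `limit`."""
    combined = []
    for ns in namespace_order:
        in_ns = [p for p in pages if p[0] == ns]
        in_ns.sort(key=lambda p: p[2], reverse=True)
        combined.extend(in_ns[:limit])
    return combined[:limit]


def globally_sorted(pages, namespace_order, limit):
    """What a reader expects: the longest content pages overall."""
    content = [p for p in pages if p[0] in namespace_order]
    content.sort(key=lambda p: p[2], reverse=True)
    return content[:limit]


# (namespace, title, length) with made-up values
pages = [
    (0, "Main A", 9000),
    (0, "Main B", 8000),
    (106, "NS106 page 1", 500),
    (106, "NS106 page 2", 400),
    (100, "NS100 page 1", 300),
]

order = merge_content_namespaces(FRWIKTIONARY_EXTRA, DEFAULT_NAMESPACES)
grouped = grouped_then_truncated(pages, order, limit=2)
expected = globally_sorted(pages, order, limit=2)

print(order)
print(grouped)
print(expected)
```

With a limit of 2, the grouped list is filled entirely by namespace 106, while the globally sorted list contains only main-namespace pages.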

Func renamed this task from The special pages 'LongPages' and 'ShortPages' list pages by namespace instead of page length to The special pages 'LongPages' and 'ShortPages' did not list pages in the main namespace in some sites. Jun 30 2023, 6:14 AM
Func updated the task description. (Show Details)

Change 934446 had a related patch set uploaded (by Func; author: Func):

[mediawiki/core@master] SpecialShortPages: Sort namespaces for query

https://gerrit.wikimedia.org/r/934446

@Func Hi. The problem is not limited to the French Wiktionary. The Esperanto Wikisource does the same. Has it always been the case that pages are ordered first by namespace, then by size? Or has the sorting function been changed?


The results have been grouped by namespace since 2017 for performance reasons (T168010), and I don't see the actual issue I described on the Esperanto Wikipedia or Wikisource.

@Func With your explanations, I don’t think there is an issue on Esperanto Wikipedia or Wikisource. Thanks!

Possibly a regression from T334661 / 2a501f65b825c6565620152a57c7f23aafd8684c

The new query has no ORDER BY on the union result set. It also no longer applies the LIMIT to the whole result set, so the effective limit is multiplied by the number of content namespaces.

Now:
(SELECT page_namespace AS `namespace`,page_title AS `title`,page_len AS `value`
  FROM `page` FORCE INDEX (page_redirect_namespace_len)
  WHERE page_is_redirect = 0 AND page_namespace = 0
  ORDER BY page_len LIMIT 51 )
UNION
(SELECT page_namespace AS `namespace`,page_title AS `title`,page_len AS `value`
  FROM `page` FORCE INDEX (page_redirect_namespace_len)
  WHERE page_is_redirect = 0 AND page_namespace = 6
  ORDER BY page_len LIMIT 51 )

Old:
(SELECT page_namespace AS `namespace`,page_title AS `title`,page_len AS `value`
  FROM `page` FORCE INDEX (page_redirect_namespace_len)
  WHERE page_namespace = 0 AND page_is_redirect = 0
  ORDER BY page_len LIMIT 51 )
UNION ALL
(SELECT page_namespace AS `namespace`,page_title AS `title`,page_len AS `value`
  FROM `page` FORCE INDEX (page_redirect_namespace_len)
  WHERE page_namespace = 6 AND page_is_redirect = 0
  ORDER BY page_len LIMIT 51 )
ORDER BY value LIMIT 51
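
The difference between the two query shapes can be reproduced on a toy table. The following is only a sketch under stated assumptions: it uses SQLite through Python's `sqlite3` instead of the production database, so FORCE INDEX is dropped and each per-namespace SELECT is wrapped in a subquery (SQLite does not accept ORDER BY/LIMIT inside a member of a compound SELECT). The table contents and the limit of 3 are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE page (page_namespace INTEGER, page_title TEXT, "
    "page_len INTEGER, page_is_redirect INTEGER)"
)
# Made-up rows: four non-redirect pages in each of namespaces 0 and 6.
conn.executemany(
    "INSERT INTO page VALUES (?, ?, ?, 0)",
    [
        (0, "A", 10), (0, "B", 40), (0, "C", 70), (0, "G", 100),
        (6, "D", 20), (6, "E", 30), (6, "F", 90), (6, "H", 110),
    ],
)

LIMIT = 3
NAMESPACES = [0, 6]


def per_namespace(ns):
    """One member of the union: the LIMIT shortest pages of a namespace."""
    return (
        "SELECT * FROM ("
        "SELECT page_namespace AS namespace, page_title AS title, "
        "page_len AS value FROM page "
        f"WHERE page_is_redirect = 0 AND page_namespace = {int(ns)} "
        f"ORDER BY page_len LIMIT {LIMIT})"
    )


# Shape of the "now" query: UNION with no outer ORDER BY or LIMIT.
regressed_sql = " UNION ".join(per_namespace(ns) for ns in NAMESPACES)

# Shape of the "old" query: UNION ALL, then ORDER BY and LIMIT on the whole set.
fixed_sql = (
    " UNION ALL ".join(per_namespace(ns) for ns in NAMESPACES)
    + f" ORDER BY value LIMIT {LIMIT}"
)

regressed_rows = conn.execute(regressed_sql).fetchall()
fixed_rows = conn.execute(fixed_sql).fetchall()

print(len(regressed_rows))  # LIMIT times the number of namespaces
print(fixed_rows)           # the LIMIT shortest pages overall, by length
```

Without the outer clauses, the row count grows with the number of namespaces and the row order is left to the database. With them, the result is the three shortest pages across both namespaces.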

Change 934446 abandoned by Func:

[mediawiki/core@master] SpecialShortPages: Sort namespaces for query

Reason:

Seems I got it wrong

https://gerrit.wikimedia.org/r/934446

Change 935155 had a related patch set uploaded (by Ladsgroup; author: Amir Sarabadani):

[mediawiki/core@master] rdbms: Add support for limit, offset and order by in UnionQueryBuilder

https://gerrit.wikimedia.org/r/935155

This fixes the issue (^). I tested it locally and it works fine. Now someone needs to review it.

Ladsgroup added a project: DBA.
Ladsgroup moved this task from Triage to In progress on the DBA board.

Change 935155 merged by jenkins-bot:

[mediawiki/core@master] rdbms: Add support for limit, offset and order by in UnionQueryBuilder

https://gerrit.wikimedia.org/r/935155