Page MenuHomePhabricator

DynamicPageList with notcategory producing duplicates
Closed, ResolvedPublicBUG REPORT

Description

Steps to replicate the issue (include links if applicable):

What happens?:
DPL shows a list of newest articles with only one title, repeating through the whole list.

What should have happened instead?:
The list should have distinct articles on it.

Software version (on Special:Version page; skip for WMF-hosted wikis like Wikipedia):

Other information (browser name/version, screenshots, etc.):

Event Timeline

You're using DPL in a weird way there - you're not passing a valid category name but instead an undefined parameter. While it shouldn't GIGO like this that's hardly an important issue.

You're using DPL in a weird way there - you're not passing a valid category name but instead an undefined parameter. While it shouldn't GIGO like this that's hardly an important issue.

Oh, okay. On this very page the category parameter is empty. However, the behavior is the same when this template is used anywhere (with category specified), eg. https://pl.wikinews.org/wiki/Szablon:Polityka

Minimum reproducer:

<dynamicpagelist>
notcategory=Tworzone
category=Polityka
</dynamicpagelist>

The notcategory needs to be present and non-empty but can be garbage.. And each page is listed between 6 and 9 times.

I suspect categorylinks normalization ...

Pppery renamed this task from DynamicPageList outputs many links to the newest page to DynamicPageList with notcategory producing duplicates.Jul 8 2025, 7:34 PM
Pppery added a subscriber: Zabe.

Yeah is caused by the migration.

wikiadmin2023@10.64.32.196(plwikinews)> SET STATEMENT max_statement_time=10 FOR SELECT  page_namespace,page_title  FROM `page` LEFT JOIN `categorylinks` `c1` ON ((page_id = c1.cl_from)) LEFT JOIN `linktarget` `linktarget1` ON ((c1.cl_target_id = linktarget1.lt_id) AND linktarget1.lt_title = 'Tworzone' AND linktarget1.lt_namespace = 14) LEFT JOIN `categorylinks` `c2` ON ((page_id = c2.cl_from)) LEFT JOIN `linktarget` `linktarget2` ON ((c2.cl_target_id = linktarget2.lt_id) AND linktarget2.lt_title = 'Archiwalne' AND linktarget2.lt_namespace = 14)   WHERE page_namespace = 0 AND page_is_redirect = 0 AND linktarget1.lt_title IS NULL AND linktarget2.lt_title IS NULL  ORDER BY page_id DESC LIMIT 8  ;
+----------------+---------------------------------------------------------------------------------+
| page_namespace | page_title                                                                      |
+----------------+---------------------------------------------------------------------------------+
|              0 | Grok_zaczął_nadużywać_wulgaryzmów_i_wprost_obrażać_niektóre_znane_osoby         |
|              0 | Grok_zaczął_nadużywać_wulgaryzmów_i_wprost_obrażać_niektóre_znane_osoby         |
|              0 | Grok_zaczął_nadużywać_wulgaryzmów_i_wprost_obrażać_niektóre_znane_osoby         |
|              0 | Grok_zaczął_nadużywać_wulgaryzmów_i_wprost_obrażać_niektóre_znane_osoby         |
|              0 | Grok_zaczął_nadużywać_wulgaryzmów_i_wprost_obrażać_niektóre_znane_osoby         |
|              0 | Grok_zaczął_nadużywać_wulgaryzmów_i_wprost_obrażać_niektóre_znane_osoby         |
|              0 | Grok_zaczął_nadużywać_wulgaryzmów_i_wprost_obrażać_niektóre_znane_osoby         |
|              0 | Grok_zaczął_nadużywać_wulgaryzmów_i_wprost_obrażać_niektóre_znane_osoby         |
+----------------+---------------------------------------------------------------------------------+
8 rows in set (0.001 sec)

wikiadmin2023@10.64.32.196(plwikinews)> SET STATEMENT max_statement_time=10 FOR SELECT  page_namespace,page_title  FROM `page` LEFT JOIN `categorylinks` `c1` ON ((page_id = c1.cl_from) AND c1.cl_to = 'Tworzone') LEFT JOIN `categorylinks` `c2` ON ((page_id = c2.cl_from) AND c2.cl_to = 'Archiwalne')   WHERE page_namespace = 0 AND page_is_redirect = 0 AND c1.cl_to IS NULL AND c2.cl_to IS NULL ORDER BY page_id DESC LIMIT 8  ;
+----------------+------------------------------------------------------------------------------------------------------------------------------+
| page_namespace | page_title                                                                                                                   |
+----------------+------------------------------------------------------------------------------------------------------------------------------+
|              0 | Grok_zaczął_nadużywać_wulgaryzmów_i_wprost_obrażać_niektóre_znane_osoby                                                      |
|              0 | Zmarli_7_lipca_2025                                                                                                          |
|              0 | Duńskie_ministerstwo_digitalizacji_przeprowadza_pilotaż_LibreOffice’a_na_lato                                                |
|              0 | Awaria_systemów_technicznych_Wikinews                                                                                        |
|              0 | Do_Senatu_wpłynęła_petycja_postulująca_wprowadzenie_podatku_kościelnego                                                      |
|              0 | Świnoujście:_koń_padł_podczas_prowadzenia_bryczki                                                                            |
|              0 | Centrum_Lokalne_Żoliborz_-_28_tydzień_2025                                                                                   |
|              0 | Donald_Trump_zapowiedział_rozmowy_z_prezydentem_Chin,_Xi_Jinpingiem,_na_temat_wykupienia_TikToka_przez_amerykańską_firmę     |
+----------------+------------------------------------------------------------------------------------------------------------------------------+
8 rows in set (0.001 sec)

wikiadmin2023@10.64.32.196(plwikinews)>

Ok. So the issue here is the double left join.

A fixed query would look like the following, but it is quite complex. It might be simpler to just use DISTINCT.

wikiadmin2023@10.64.32.196(plwikinews)> SELECT p.page_namespace, p.page_title
    -> FROM page p
    -> LEFT JOIN (
    ->   SELECT cl.cl_from
    ->   FROM categorylinks cl
    ->   JOIN linktarget lt ON cl.cl_target_id = lt.lt_id
    ->   WHERE lt.lt_title = 'Tworzone' AND lt.lt_namespace = 14
    -> ) AS excluded_pages ON p.page_id = excluded_pages.cl_from
    -> WHERE p.page_namespace = 0 AND p.page_is_redirect = 0
    -> AND excluded_pages.cl_from IS NULL
    -> ORDER BY p.page_id DESC
    -> LIMIT 8;
+----------------+------------------------------------------------------------------------------------------------------------------------------+
| page_namespace | page_title                                                                                                                   |
+----------------+------------------------------------------------------------------------------------------------------------------------------+
|              0 | Grok_zaczął_nadużywać_wulgaryzmów_i_wprost_obrażać_niektóre_znane_osoby                                                      |
|              0 | Zmarli_7_lipca_2025                                                                                                          |
|              0 | Duńskie_ministerstwo_digitalizacji_przeprowadza_pilotaż_LibreOffice’a_na_lato                                                |
|              0 | Awaria_systemów_technicznych_Wikinews                                                                                        |
|              0 | Do_Senatu_wpłynęła_petycja_postulująca_wprowadzenie_podatku_kościelnego                                                      |
|              0 | Świnoujście:_koń_padł_podczas_prowadzenia_bryczki                                                                            |
|              0 | Centrum_Lokalne_Żoliborz_-_28_tydzień_2025                                                                                   |
|              0 | Donald_Trump_zapowiedział_rozmowy_z_prezydentem_Chin,_Xi_Jinpingiem,_na_temat_wykupienia_TikToka_przez_amerykańską_firmę     |
+----------------+------------------------------------------------------------------------------------------------------------------------------+
8 rows in set (0.001 sec)

wikiadmin2023@10.64.32.196(plwikinews)>

Change #1167311 had a related patch set uploaded (by Zabe; author: Zabe):

[mediawiki/extensions/intersection@master] Fix categorylinks read new code for excluding categories

https://gerrit.wikimedia.org/r/1167311

Change #1167311 had a related patch set uploaded (by Zabe; author: Zabe):

[mediawiki/extensions/intersection@master] Fix categorylinks read new code for excluding categories

https://gerrit.wikimedia.org/r/1167311

Applying this to mw-experimental resulted in the following when visiting plwikinews through it.

Bildschirmfoto_20250709_020445.png (399×312 px, 41 KB)

Change #1167311 merged by jenkins-bot:

[mediawiki/extensions/intersection@master] Fix categorylinks read new code for excluding categories

https://gerrit.wikimedia.org/r/1167311

Change #1167569 had a related patch set uploaded (by Zabe; author: Zabe):

[mediawiki/extensions/intersection@wmf/1.45.0-wmf.8] Fix categorylinks read new code for excluding categories

https://gerrit.wikimedia.org/r/1167569

Change #1167570 had a related patch set uploaded (by Zabe; author: Zabe):

[mediawiki/extensions/intersection@wmf/1.45.0-wmf.9] Fix categorylinks read new code for excluding categories

https://gerrit.wikimedia.org/r/1167570

Change #1167570 merged by jenkins-bot:

[mediawiki/extensions/intersection@wmf/1.45.0-wmf.9] Fix categorylinks read new code for excluding categories

https://gerrit.wikimedia.org/r/1167570

Change #1167569 merged by jenkins-bot:

[mediawiki/extensions/intersection@wmf/1.45.0-wmf.8] Fix categorylinks read new code for excluding categories

https://gerrit.wikimedia.org/r/1167569

Mentioned in SAL (#wikimedia-operations) [2025-07-09T13:52:37Z] <zabe@deploy1003> Started scap sync-world: Backport for [[gerrit:1167574|ApiQueryCategoryMembers: Try stop forcing index in read new code (T399037)]], [[gerrit:1167573|ApiQueryCategoryMembers: Try stop forcing index in read new code (T399037)]], [[gerrit:1167570|Fix categorylinks read new code for excluding categories (T398861 T398939)]], [[gerrit:1167569|Fix categorylinks read new code for excluding categories (T39886

Mentioned in SAL (#wikimedia-operations) [2025-07-09T13:54:47Z] <zabe@deploy1003> zabe: Backport for [[gerrit:1167574|ApiQueryCategoryMembers: Try stop forcing index in read new code (T399037)]], [[gerrit:1167573|ApiQueryCategoryMembers: Try stop forcing index in read new code (T399037)]], [[gerrit:1167570|Fix categorylinks read new code for excluding categories (T398861 T398939)]], [[gerrit:1167569|Fix categorylinks read new code for excluding categories (T398861 T398939)]] synced

Mentioned in SAL (#wikimedia-operations) [2025-07-09T14:01:20Z] <zabe@deploy1003> Finished scap sync-world: Backport for [[gerrit:1167574|ApiQueryCategoryMembers: Try stop forcing index in read new code (T399037)]], [[gerrit:1167573|ApiQueryCategoryMembers: Try stop forcing index in read new code (T399037)]], [[gerrit:1167570|Fix categorylinks read new code for excluding categories (T398861 T398939)]], [[gerrit:1167569|Fix categorylinks read new code for excluding categories (T3988

Zabe claimed this task.
Zabe moved this task from Triage to Done on the DBA board.