Re-enable categorization of Ukraine images
Closed, Resolved (Public)

Description

Per commons:User_talk:ErfgoedBot#Categorisation_is_not_performed

Categorisation is not performed

Hi {{ping|Jean-Frédéric}}! Unfortunately for some reason your bot did not categorise WLM Ukraine images this year, while it did in all previous years. As a result, [[:Category:Cultural heritage monuments in Ukraine]] is pretty crowded and is being cleaned up mostly manually. At the same time, respective lists do have a Commons category. Can you please launch the categorisation job for Ukraine or are there any blocking problems? Thanks — [[User:NickK|NickK]] ([[User talk:NickK|<span class="signature-talk">{{int:Talkpagelinktext}}</span>]]) 19:55, 18 January 2020 (UTC)
:Hey {{ping|NickK}} can you give an example of such categorisation in the past? It turns out I could not find any reference to Ukraine dataset in the history of [[Commons:Monuments database/Categorization/Statistics]]. [[User:Jean-Frédéric|Jean-Fred]] ([[User talk:Jean-Frédéric|<span class="signature-talk">{{int:Talkpagelinktext}}</span>]]) 15:53, 23 January 2020 (UTC)
::{{ping|Jean-Frédéric}} For example, [https://commons.wikimedia.org/w/index.php?diff=322347517&oldid=322317512 this edit in 2018] — [[User:NickK|NickK]] ([[User talk:NickK|<span class="signature-talk">{{int:Talkpagelinktext}}</span>]]) 18:32, 23 January 2020 (UTC)
:::{{ping|Jean-Frédéric}} Maybe it is a general problem that none of the ErfgoedBot tasks are executed for a few weeks (since January 4 2020). [[User:HenkvD|HenkvD]] ([[User talk:HenkvD|<span class="signature-talk">{{int:Talkpagelinktext}}</span>]]) 22:25, 24 January 2020 (UTC)
::::I don't know why other tasks were not executed, but the Ukrainian was already not executed during WLM 2019, creating a huge backlog — [[User:NickK|NickK]] ([[User talk:NickK|<span class="signature-talk">{{int:Talkpagelinktext}}</span>]]) 15:33, 26 January 2020 (UTC)
:::::Looking at [https://commons.wikimedia.org/w/index.php?title=Commons:Monuments_database/Categorization/Statistics&diff=next&oldid=364288670 this diff], Ukraine seems to have been dropped from the categorization job on September 4th (together with a bunch of other countries). This might correspond to [https://github.com/wikimedia/labs-tools-heritage/commit/39aec685c2823fd9b0aa7219d936d3cd285e1e14 Kenya being added to the Categorization job] but is more likely caused by another country overcrowding its base category with uncategorizable images. Historically these have been blacklisted but there is sadly no automated process in place for detecting when this happens. /[[User:Lokal Profil|Lokal]][[Special:Contributions/Lokal Profil|_]][[:User talk:Lokal Profil|Profil]] 07:50, 5 February 2020 (UTC)

Event Timeline

A stop-gap fix is to run the categorization job manually for each of the categories which dropped out of the job on September 4th. It doesn't address the underlying issue but at least unblocks manual work by volunteers.
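
To work out which datasets would need such a manual run, one option is to compare the dataset codes listed in two revisions of the statistics page. The following is only a minimal sketch, assuming the codes have already been extracted into plain lists. The function name and the placeholder codes `aa_bb` and `cc_dd` are hypothetical and not part of the heritage tool; only `ua_uk` appears in this task.

```python
def dropped_datasets(before, after):
    """Return dataset codes present in `before` but missing from `after`, sorted."""
    return sorted(set(before) - set(after))


# Hypothetical sample data: codes as they might appear in two statistics revisions.
previous_run = ["ua_uk", "aa_bb", "cc_dd"]
latest_run = ["aa_bb", "cc_dd"]

for code in dropped_datasets(previous_run, latest_run):
    print(f"needs manual categorisation run: {code}")
```

A set difference is enough here because only membership matters, not the order in which the datasets were processed.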

I was really confused when not seeing Ukraine in [[Commons:Monuments database/Categorization/Statistics]], as I was under the impression that all country_config entries would be displayed 🤔

It's meant to; compare pt-wd in the harvest statistics, which shows up even though it failed. Looking at T244213: ErfgoedBot doesn't work since 5 January 2020, there seem to be some issues around the statistics output as well, but I'd be surprised if that was the case already back in September.

Looks like the categorization job didn't get that treatment when the others did

Looks like even more countries got dropped with the latest update

Broken this out as T244445: Include failed datasets in categorization statistics
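
The idea behind that follow-up task can be illustrated roughly as follows: emit one statistics row per configured dataset, so that a dataset with no result shows up as failed instead of silently disappearing. This is a sketch under assumptions, not the tool's actual code. The function name, the row layout, and the placeholder code `aa_bb` are hypothetical, and the real `country_config` structure may differ.

```python
def build_statistics_rows(configured, results):
    """Return one (code, status, count) row per configured dataset.

    `configured` is an iterable of dataset codes; `results` maps the codes
    that were actually processed to their categorised-image counts.
    Datasets without a result are reported as "failed" rather than omitted.
    """
    rows = []
    for code in sorted(configured):
        if code in results:
            rows.append((code, "ok", results[code]))
        else:
            rows.append((code, "failed", 0))
    return rows


# Hypothetical sample data: ua_uk is configured but produced no result.
for row in build_statistics_rows(["ua_uk", "aa_bb"], {"aa_bb": 5}):
    print(row)
```

Iterating over the configured datasets instead of over the results is what makes a dropped dataset visible in the output.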

Mentioned in SAL (#wikimedia-cloud) [2020-02-11T20:37:49Z] <wm-bot157> <lokal-profil> Triggered a manual categorisation job for ua_uk (T244333)

Seems to have run fine based on the user contributions. Ran until completion and nothing weird in the logs

Lokal_Profil claimed this task.

Per user comment on commons:user_talk:ErfgoedBot