Process findings of Commons:Monuments database/Images without id
Open, Needs Triage, Public

Description

As summarized at Commons:Monuments database/Images without id, ErfgoedBot has found ca. 31,000 Commons file pages that are used in Wikipedia monument lists but lack a monument template with a monument ID. The largest group (8,500) relates to Czech monuments, as listed here. The list has been updated periodically since October 2018; however, no tool and no bot processes the findings.

As discussed here, a tool exists, but it had some problems (T206398). However, in my opinion the Czech list is not affected by that problem. Following the discussion, I'm filing this task in Phabricator so that it is not forgotten.

P.S.: Commons category pages of listed monuments, lacking monument ID templates, should be processed in a similar way.

Event Timeline

SJu created this task. Jan 6 2020, 10:22 PM

So I spotted that on the 10th of February the list got emptied. @SJu did you do a massive tagging drive or did the job just fail?

If it was just a failing job, then the following is what we should run to try to add the templates:
jsub -once -j y -o /data/project/heritage/logs/cz_cs_image_templates.log -N cz_cs_image_templates /data/project/heritage/bin/run_erfgoedbot_script.sh erfgoedbot/images_of_monuments_without_id.py -countrycode:cz -langcode:cs -add_template >> /data/project/heritage/logs/cz_cs_image_templates2.log

Spotted that loads of these jobs crashed on the 10th so went ahead and ran the command. Not sure why @Stashbot didn't pick up on the SAL entry.

Stashbot seems to have hit a few problems caused by an issue in the eqiad datacentre; ops are aware.

Templates seem to be adding fine. I'm going to leave it running in the background for now, it should be done before the regular daily update job kicks in.

Lokal_Profil added a subscriber: JeanFred.
SJu added a comment. Feb 12 2020, 1:05 PM

File:Štramberk, Horní Bašta 293.jpg seems to be a false positive in the list Images_of_cultural_heritage_monuments_in_Czech_Republic_without_id, version 2020-02-12, 05:15 UTC. The list proposes adding {{Cultural Heritage Czech Republic|13170/8-3390}} to this photo, although the file page has contained this tag since it was uploaded on 2014-09-03.

SJu added a comment. Feb 16 2020, 7:01 AM

The list Images of cultural heritage monuments in Czech Republic without id from 2020-02-15 19:36 contains many false positives again.

SJu added a comment. Feb 16 2020, 2:40 PM

ErfgoedBot added a duplicate monument-ID template to the file page (2020-02-11, 23:31).
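
Both the false positives and the duplicate template point to the same kind of guard: checking the file page's wikitext for an existing monument-ID template before listing or tagging it. The following is only a minimal sketch of such a check, not ErfgoedBot's actual code; the function name has_monument_template is hypothetical, and it assumes the page text is already available as a plain string and that the template is not transcluded via a redirect or a wrapper template.

```python
import re


def has_monument_template(wikitext: str, template_name: str) -> bool:
    """Return True if wikitext already transcludes template_name.

    Hypothetical helper for illustration only. It matches direct
    transclusions and does not resolve template redirects.
    """
    words = [w for w in re.split(r"[ _]+", template_name.strip()) if w]
    if not words:
        raise ValueError("template_name must not be empty")

    escaped = [re.escape(w) for w in words]
    # MediaWiki ignores the case of the first letter of a page title.
    escaped[0] = "(?i:" + re.escape(words[0][0]) + ")" + re.escape(words[0][1:])
    # Spaces and underscores are interchangeable in page titles.
    name = r"[ _]+".join(escaped)

    # The name must be followed by a parameter separator or the closing
    # braces, so that longer template names are not matched by prefix.
    pattern = r"\{\{\s*" + name + r"\s*(?:\||\}\})"
    return re.search(pattern, wikitext) is not None


if __name__ == "__main__":
    page = "== Summary ==\n{{Cultural Heritage Czech Republic|13170/8-3390}}\n"
    if has_monument_template(page, "Cultural Heritage Czech Republic"):
        print("template already present, skip")
    else:
        print("template missing, safe to add")
```

A check like this would run just before the template is added, so that a page listed from stale data is skipped instead of being tagged twice.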