
Develop process to extract data from WLM photos (Sweden) and create SDC statements input
Closed, Resolved (Public)

Description

  • Include all photos in the "Images from Wiki Loves Monuments XXXX in Sweden" categories
  • Extract the edition that they participated in (year XXXX) -- P1344
  • Extract what is depicted, using one of the templates
    • BBR for buildings -- via P1260, with the raa/bbr/ prefix
      • Note that we can have templates like {{BBR|2=a|1=21300000012942}}: params can be explicitly numbered and don't have to be in order :)
    • Arbetslivsmuseum for museums -- resolve via P3426
    • Fornminne for archaeology -- via P1260, with the raa/fmi/ prefix
    • K-Fartyg for ships -- resolve via P2317, but do all of them have it?
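
A minimal sketch of the edition-year step, assuming the category titles follow the "Images from Wiki Loves Monuments XXXX in Sweden" pattern exactly. The function name `extract_edition_year` is made up for illustration; mapping the year to the item used with P1344 is not covered here.

```python
import re

# Assumed naming pattern, taken from the task description above.
CATEGORY_RE = re.compile(r"Images from Wiki Loves Monuments (\d{4}) in Sweden")


def extract_edition_year(category_title):
    """Return the four-digit edition year, or None if the title does not match."""
    match = CATEGORY_RE.search(category_title)
    return int(match.group(1)) if match else None


if __name__ == "__main__":
    print(extract_edition_year("Category:Images from Wiki Loves Monuments 2016 in Sweden"))
```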

Note that a photo can depict multiple monuments, of the same or of different types, e.g. https://commons.wikimedia.org/wiki/File:Hissa_Toppsegel.JPG
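
The template extraction could be sketched roughly as below. This is a plain-regex illustration, not the actual harvesting script. It assumes the identifier is always the first positional parameter (written either bare or as `1=...`), which the task only confirms for BBR, and that parameter values contain no nested templates or piped links. The helper names are hypothetical. It returns every match, so a photo with several monuments yields several entries.

```python
import re

# Template names from the task description; matching is case-sensitive here.
TEMPLATE_NAMES = ("BBR", "Arbetslivsmuseum", "Fornminne", "K-Fartyg")

TEMPLATE_RE = re.compile(
    r"\{\{\s*(" + "|".join(re.escape(n) for n in TEMPLATE_NAMES) + r")\s*\|([^{}]*)\}\}"
)


def first_positional(param_text):
    """Return parameter 1, whether written as 'value' or as '1=value'."""
    positional = {}
    next_index = 1
    for raw in param_text.split("|"):
        key, sep, value = raw.partition("=")
        if sep:
            # Explicitly keyed parameter: keep it only if the key is a number.
            if key.strip().isdigit():
                positional[int(key.strip())] = value.strip()
        else:
            # Unkeyed parameter: takes the next implicit position.
            positional[next_index] = raw.strip()
            next_index += 1
    return positional.get(1)


def extract_monuments(wikitext):
    """Return a list of (template_name, identifier) for every supported template."""
    found = []
    for match in TEMPLATE_RE.finditer(wikitext):
        identifier = first_positional(match.group(2))
        if identifier:
            found.append((match.group(1), identifier))
    return found


if __name__ == "__main__":
    print(extract_monuments("{{BBR|2=a|1=21300000012942}}"))
```

A real implementation would probably be better off with a proper wikitext parser, since regexes break on nested templates.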

Then match the extracted code to a Wikidata item, e.g. for https://commons.wikimedia.org/wiki/File:Bildsten_p%C3%A5_Klinteberget_a.jpg -> Q29339659
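
One possible way to do the matching is a SPARQL lookup per identifier. The sketch below only builds the query string. It assumes that the P1260 value stored on Wikidata is exactly the prefix plus the identifier from the template, and that P3426 and P2317 hold the identifier unchanged. `LOOKUP` and `build_lookup_query` are hypothetical names, and sending the query to a SPARQL endpoint is left out.

```python
# Property and value prefix per template, as listed in the task description.
LOOKUP = {
    "BBR": ("P1260", "raa/bbr/"),
    "Fornminne": ("P1260", "raa/fmi/"),
    "Arbetslivsmuseum": ("P3426", ""),
    "K-Fartyg": ("P2317", ""),
}


def build_lookup_query(template_name, identifier):
    """Build a SPARQL query that finds items with the given identifier."""
    prop, prefix = LOOKUP[template_name]
    value = prefix + identifier
    # Escape backslashes and double quotes so the literal stays well-formed.
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'SELECT ?item WHERE {{ ?item wdt:{prop} "{escaped}" . }}'


if __name__ == "__main__":
    print(build_lookup_query("BBR", "21300000012942"))
```

Identifiers that return zero items or more than one are probably worth logging separately rather than guessing.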

The final result will be a CSV file compatible with the https://github.com/Vesihiisi/batch-SDC tool, including the two statements (competition edition and depicted object).
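
A rough sketch of writing such a file. The column headers here (`filename`, `P1344`, `P180`) are placeholders; the layout batch-SDC actually expects should be checked against the tool itself. P180 is the Wikidata "depicts" property. Since a photo can depict several monuments, the depicts column is repeated as many times as needed, which is why `csv.writer` is used rather than `csv.DictWriter` (the latter cannot hold duplicate headers).

```python
import csv
import io


def rows_to_csv(rows):
    """Serialize rows of (filename, edition_qid, [depicted_qids]) to CSV text."""
    max_depicts = max((len(depicted) for _, _, depicted in rows), default=0)
    # Placeholder headers; one "P180" column per depicted item.
    header = ["filename", "P1344"] + ["P180"] * max_depicts
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    for filename, edition, depicted in rows:
        # Pad shorter rows so every row has the same number of columns.
        padding = [""] * (max_depicts - len(depicted))
        writer.writerow([filename, edition] + list(depicted) + padding)
    return buffer.getvalue()


if __name__ == "__main__":
    # The edition value is a dummy; Q29339659 is the item from the example above.
    print(rows_to_csv([("Bildsten på Klinteberget a.jpg", "Q_EDITION", ["Q29339659"])]))
```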

Modeling notes for reference: https://commons.wikimedia.org/wiki/Commons:Wiki_Loves_Monuments/Structured_data

Event Timeline

Don't remember if batch-SDC is actually able to interpret two columns with the same header :))

Now it does.

Things to add

  • include both "Wiki Loves Monuments XXXX" and "Wiki Loves Monuments XXXX in country" as participant in (P1344)
  • location of creation (P1071)

Add an empty save of the Commons page to the adding script, so that any SDC-based templates and categories get refreshed.

Harvesting script that processes both DEPICTS and PARTICIPANT IN (with both local and main competition).