Import UAE monuments into monuments database
Closed, ResolvedPublic

Description

Monument list is available here: https://en.wikipedia.org/wiki/List_of_cultural_property_of_national_significance_in_the_United_Arab_Emirates

(maybe direct import into Wikidata makes more sense?)

Event Timeline

Effeietsanders updated the task description.
JeanFred claimed this task.Aug 16 2018, 7:35 AM

Adding to monuments database is easy. Will do.

Change 453134 had a related patch set uploaded (by Jean-Frédéric; owner: Jean-Frédéric):
[labs/tools/heritage@master] Add United Arab Emirates in English (uae_en) to monuments_config

https://gerrit.wikimedia.org/r/453134

I have put a patch up, tested locally. However, some remarks:

  • The address field seems to be in many instances just the municipality name − we may want to split it out.
  • There is a district field, however it is never set, and the column for it is named « Emirate ». This should be clarified.
  • The CommonsCat field is set for all but one entry, however there is not necessarily an actual Commons category created.
  • When there is no image, then the placeholder ImageNA.svg is used. Is that supposed to mean that:
    • there is no picture yet? in which case it should be removed − lack of a picture is precisely the indicator of that :)
    • no picture can be taken (e.g. monument destroyed)?
Effeietsanders added a comment.EditedAug 16 2018, 2:44 PM

Thanks @JeanFred. I hope that @Saqib can help clarify a few choices. I agree that commonscat should only be set to a value if the category actually exists, and that ImageNA can be used as the default in the template itself, with no need to specify it in the list page. Even if the monument is destroyed, old pictures may exist.

I also quickly created https://commons.wikimedia.org/wiki/Template:Monument_United_Arab_Emirates for tagging Commons images. The upload campaign should be made to use it; I can help with that if needed.

  • When there is no image, then the placeholder ImageNA.svg is used.

The placeholder image is a bit worrying. The image should be moved to the template prior to us harvesting, or we need a converter for the image parameter which filters out the placeholder.
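A minimal sketch of what such a converter could look like, assuming the harvester hands over the raw value of the image parameter as a string. The function name and the placeholder set are hypothetical and not part of the heritage tool's actual API:

```python
# Hypothetical converter; names are illustrative, not the harvester's API.
PLACEHOLDER_IMAGES = {"imagena.svg"}


def clean_image_param(value):
    """Return the image filename, or an empty string for a placeholder."""
    name = value.strip()
    # Strip an optional "File:" or "Image:" namespace prefix.
    for prefix in ("File:", "Image:"):
        if name.lower().startswith(prefix.lower()):
            name = name[len(prefix):].strip()
    # Simplification: compare case-insensitively and treat underscores
    # and spaces as equivalent, as they are in page titles.
    normalized = name.replace("_", " ").lower()
    if normalized in PLACEHOLDER_IMAGES:
        return ""
    return name
```

With this in place, a row using ImageNA.svg would be harvested as having no image, which is the same result as moving the placeholder into the template.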

All of the prov_iso codes look wrong as well. They are all set to UAE-FA but they should be one of the allowed ISO_3166-2:AE codes. This is the level where the "Emirate" should be encoded, and coincides with the top level headings on the page.
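For reference, a small sketch of how the codes could be checked, assuming each list row is available as a dict with `id` and `prov_iso` keys (that row shape is an assumption for illustration). The seven codes are the ISO 3166-2:AE entries, with English names for the emirates:

```python
# ISO 3166-2:AE codes, one per emirate (English names).
AE_ISO_CODES = {
    "AE-AJ": "Ajman",
    "AE-AZ": "Abu Dhabi",
    "AE-DU": "Dubai",
    "AE-FU": "Fujairah",
    "AE-RK": "Ras al-Khaimah",
    "AE-SH": "Sharjah",
    "AE-UQ": "Umm al-Quwain",
}


def invalid_prov_iso(rows):
    """Return (id, prov_iso) pairs whose code is not a valid AE code.

    The row shape (dicts with "id" and "prov_iso") is assumed here.
    """
    return [
        (row.get("id"), row.get("prov_iso"))
        for row in rows
        if row.get("prov_iso") not in AE_ISO_CODES
    ]
```

Running this over the current rows would flag every entry set to UAE-FA, since that value is not in the list.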

  • The address field seems to be in many instances just the municipality name − we may want to split it out.
  • There is a district field, however it is never set, and the column for it is named « Emirate ». This should be clarified.

Since the tables are grouped by Emirate the next subdivision down would be something akin to municipality/village depending on the Emirate. As such that would be the right field for the address fields that only contain municipality name.

Happy to help replace the prov_iso codes and move the NA image to the template if @Saqib wants a hand. The commonscat and address cleanup likely requires more hands-on work though which I can't commit to.

I took a look at the row template and saw that uploads go to the wlm-pk campaign (the Pakistan campaign). I assume that this is incorrect but I cannot find a wlm-uae campaign.

The list collapses some sub-division tables but not others, which is probably confusing for anyone wanting to contribute.
It would be great to have a source statement on the page clarifying whether it is an official list or a homebrew one.

I took a look at the row template and saw that uploads go to the wlm-pk campaign (the Pakistan campaign). I assume that this is incorrect but I cannot find a wlm-uae campaign.

This is supposed to be wlm-ae per Commons:Wiki_Loves_Monuments_2018/Participating_countries. I'll change it but the campaign itself still needs to be created.

Lokal_Profil added a comment.EditedAug 29 2018, 7:37 AM

I was bold and:

  • Fixed the ISO codes,
  • Removed the NA images,
  • Removed all of the nonexistent (or incorrect) commonscat entries,
  • Relabeled the header of the district column from "Emirate" to "District" since the tables are already split by district.

@Saqib let me know if any of these were done incorrectly.
Splitting the existing values in Locality between address and district is still left to be done, but that requires knowledge I don't possess.

The above changes should, however, ensure that we can harvest the tables and that uploaded images will not be miscategorised.

Saqib added a comment.EditedAug 29 2018, 8:00 AM

I wasn't aware of this discussion, nor did I receive any notification. Thank you Andre for letting me know about this. I filled in the categories (even though they're not yet created on Commons) to help understand which sites receive entries during WLM. If possible, I would prefer to retain them.

This comment was removed by Saqib.
Lokal_Profil removed a subscriber: Andre.EditedAug 29 2018, 4:03 PM

@Saqib: Answering your emailed question here to keep things together. The ISO codes in prov_iso are used by the Monuments Database to group the Monuments. They don't show up on the page since you already use the full name of each Emirate as a section header.

The commonscat field is used to automatically categorise images of that monument. Uploaded images get added to this category instead of the standard category Category:Cultural heritage monuments in the United Arab Emirates. That is why the commonscat parameter should only be set to existing categories: otherwise the image ends up completely uncategorised.
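Roughly, the logic can be sketched as follows. This is an illustration of the behaviour described above and not the actual template or campaign code; the function names and the row shape are made up for the example:

```python
DEFAULT_CATEGORY = "Cultural heritage monuments in the United Arab Emirates"


def upload_category(commonscat):
    """Category an upload lands in: commonscat if set, else the default."""
    cat = (commonscat or "").strip()
    return cat if cat else DEFAULT_CATEGORY


def missing_commonscats(rows, existing_categories):
    """List commonscat values that point at categories that do not exist.

    `rows` are assumed to be dicts with an optional "commonscat" key and
    `existing_categories` a set of category names known to exist.
    """
    return sorted(
        {
            row["commonscat"]
            for row in rows
            if row.get("commonscat")
            and row["commonscat"] not in existing_categories
        }
    )
```

Note that `upload_category` does not check whether the category exists, which is exactly why a commonscat pointing at a missing category leaves the image uncategorised.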

There is a bot which makes sure images in the standard category get moved if a commonscat is added to the list afterwards. The same bot will also highlight new images of monuments that don't have an image in the list today, to make it easy to add these to the list.

I'm also André 😉

Lokal_Profil added a comment.EditedAug 30 2018, 6:28 AM

I updated the campaign on Commons to make use of the id template @JeanFred created, and I updated the row template on en.wiki to pass commonscat on to the upload campaign.

Change 453134 merged by jenkins-bot:
[labs/tools/heritage@master] Add United Arab Emirates in English (ae_en) to monuments_config

https://gerrit.wikimedia.org/r/453134

This change has been deployed. (Don't know why the task wasn't parsed from the commit message)

@Saqib
I'm going to remove the invalid commonscat entries again, as they will cause problems on Commons when all uploaded images are uncategorised. I'll happily help you out at the end of the competition (if you remind me ;) ) to create a list of which images were uploaded of which monument as part of WLM.

Done. I also handled categorisation of the images uploaded before the change went into place.

The data is imported.

Cleanup of the address field (splitting it between district and address) needs to be handled by volunteers familiar with the UAE.

Effeietsanders closed this task as Resolved.Nov 18 2018, 2:00 AM