Add Brazilian monuments in Portuguese to Monuments database
Closed, Declined (Public)

Description

Hello, all. I am co-managing Wiki Loves Monuments in Brazil this year, and I am following the steps outlined in Commons:Monuments_database/Harvesting to add the Brazilian lists to the tracking of the Monuments database. We will use only Wikidata as the source of our lists of monuments (all of them are displayed here), and MonumentID on Commons.

I made a copy of one of the .json files at erfgoedbot/monuments_config, but since I have doubts about some of the blank fields, I came to ask for your help implementing this. I'm attaching the modified JSON file I made. Thank you in advance!

Event Timeline

The first thing we need to know is which criteria you use for considering a Wikidata item to be a Brazilian monument. I.e. does it need to have a certain property or property/value combination?

Further questions:

  1. Have the old lists been replaced or do they exist in parallel?

Hi, @Lokal_Profil, thanks for answering so quickly! The criteria for our lists are as follows. An item for a Brazilian monument in Wikidata has:

  • Brazilian monument identifier(s) (e.g. P4360 or P4372) or heritage designation (P1435);

and

  • is located (P131) in a Brazilian city or state (district or city in the case of municipal lists). For example, in a list of monuments of the State of Minas Gerais, a monument in our list would need to have P131 = Minas Gerais (Q39109), or P131 = Qxxx where Qxxx in turn has P131 = Q39109.
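
The two criteria above could be expressed as a SPARQL query along the following lines. This is only a minimal sketch: `build_monument_query` is a hypothetical helper, the default identifier properties are just the two examples named above, and `wdt:P131+` (one or more "located in" hops) is an assumption that is slightly broader than the one-or-two-hop rule described. It is not the query behind any link in this task.

```python
def build_monument_query(region_qid, identifier_props=("P4360", "P4372")):
    """Build a SPARQL query string for the list criteria (hypothetical helper).

    An item matches if it has a heritage designation (P1435) or one of the
    given identifier properties, and is located (P131) in the region,
    directly or through intermediate administrative units.
    """
    # First criterion: heritage designation OR any of the identifier properties.
    alternatives = ["{ ?item wdt:P1435 ?designation . }"]
    alternatives += ["{ ?item wdt:" + prop + " ?identifier . }" for prop in identifier_props]
    union = "\n    UNION\n    ".join(alternatives)

    # Second criterion: wdt:P131+ follows one or more "located in" links,
    # which covers both "P131 = region" and "P131 = Qxxx, Qxxx P131 = region".
    return (
        "SELECT DISTINCT ?item WHERE {\n"
        "    " + union + "\n"
        "    ?item wdt:P131+ wd:" + region_qid + " .\n"
        "}"
    )


if __name__ == "__main__":
    # Example: the State of Minas Gerais (Q39109), as in the description above.
    print(build_monument_query("Q39109"))
```

Passing more property IDs in `identifier_props` extends the UNION, which matters here because more state and municipal identifiers are expected to be created.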

I think it is worth mentioning (it seems to be a point made in your email to WLM announce this morning) that we do not depend on importing directly from Wikidata into the Monuments database. We are still building lists on Portuguese Wikipedia (Wikipédia:Wiki Loves Monuments 2019/Brasil, i.e. namespace 4), generated by Listeria, with header and row templates and everything else the Monuments database requires, just like the static lists on other Wikipedias.

We are also using a different template (MonumentID, which is not Brazil-specific) from the one used in previous years (Cultural Heritage Brazil), so the images on Commons from previous Brazilian WLM editions are not affected by this new process. All the old static lists from previous editions are still on their pages; they use a different ID from the QID and still have the button to upload to Commons. (Is your point that the button should be removed so that the IDs are not misinterpreted?) I don't know if that answers your question, so please let me know, and once again, thank you for your quick response!

[...] (Is your point that the button should be removed so that the IDs are not misinterpreted?) I don't know if that answers your question, so please let me know, and once again, thank you for your quick response!

The upload button should be removed from the old lists, yes. Since it points to the same upload campaign, the IDs that it passes on will otherwise be misinterpreted.

  • Brazilian monument identifier(s) (e.g. P4360 or P4372) or heritage designation (P1435); and
  • is located (P131) in a Brazilian city or state (district or city in the case of municipal lists). For example, in a list of monuments of the State of Minas Gerais, a monument in our list would need to have P131 = Minas Gerais (Q39109), or P131 = Qxxx where Qxxx in turn has P131 = Q39109.

[...]
We are also using a different template (MonumentID, which is not Brazil-specific) from the one used in previous years (Cultural Heritage Brazil), so the images on Commons from previous Brazilian WLM editions are not affected by this new process.

While I think the SPARQL bit of this will be straightforward, there are a few other issues that I think might complicate things (@JeanFred, what do you think?). They shouldn't be blockers though, I believe.

  1. The template on Commons is shared by multiple countries (I think it has always been unique to the dataset before).
  2. The base category will contain all the images from previous years, which will have the old template.
  3. By the nature of point 1, there will be no tracking category for images with IDs for the dataset.

I think it is worth mentioning (it seems to be a point made in your email to WLM announce this morning) that we do not depend on importing directly from Wikidata into the Monuments database. We are still building lists on Portuguese Wikipedia (Wikipédia:Wiki Loves Monuments 2019/Brasil, i.e. namespace 4), generated by Listeria, with header and row templates and everything else the Monuments database requires, just like the static lists on other Wikipedias.

While we could probably harvest those templates, it's probably best to change things so that we harvest directly from Wikidata. If nothing else, some of the instructions on the reports become different, since any updates need to be done on Wikidata rather than in the lists.

  • Brazilian monument identifier(s) (e.g. P4360 or P4372) or heritage designation (P1435);

and

  • is located (P131) in a Brazilian city or state (district or city in case of municipal lists).

While I think the SPARQL bit of this will be straightforward

Something like this should work https://w.wiki/7go as the primary filtering (can be optimised of course) @Ederporto does the number of results look right?

We are also using a different template (MonumentID, which is not Brazil-specific) from the one used in previous years (Cultural Heritage Brazil), so the images on Commons from previous Brazilian WLM editions are not affected by this new process.

While I think the SPARQL bit of this will be straightforward, there are a few other issues that I think might complicate things (@JeanFred, what do you think?). They shouldn't be blockers though, I believe.

Oh, cool concept, that generic template. However, I’m afraid this might ruin some assumptions made by ErfgoedBot. But those may not be blockers; the worst case is limited functionality :)

  1. The template on Commons is shared by multiple countries (I think it has always been unique to the dataset before).

That will almost certainly defeat populate_image_table and unused_monument_images.

  2. The base category will contain all the images from previous years, which will have the old template.

That may not necessarily be an issue: these should be categorized anyway, and if there are too many of them, then the country should be on the skip-list.

  3. By the nature of point 1, there will be no tracking category for images with IDs for the dataset.

The MonumentID template could figure it out based on the Wikidata invoke and add the relevant country-based category.

I think it is worth mentioning (it seems to be a point made in your email to WLM announce this morning) that we do not depend on importing directly from Wikidata into the Monuments database. We are still building lists on Portuguese Wikipedia (Wikipédia:Wiki Loves Monuments 2019/Brasil, i.e. namespace 4), generated by Listeria, with header and row templates and everything else the Monuments database requires, just like the static lists on other Wikipedias.

While we could probably harvest those templates, it's probably best to change things so that we harvest directly from Wikidata. If nothing else, some of the instructions on the reports become different, since any updates need to be done on Wikidata rather than in the lists.

I thought about that before too; it is an option, but one that would be quite... funky :)

Something like this should work https://w.wiki/7go as the primary filtering (can be optimised of course) @Ederporto does the number of results look right?

Hi, @Lokal_Profil and @JeanFred. The identifiers I gave you are just two examples (of many existing ones, with more to come) of Brazilian cultural heritage identifiers. We have 5570 municipalities and 27 states; in theory, and in a lot of cases in practice, each one can designate a monument as cultural heritage. Currently, our work on the lists is precisely to improve them, first by state and then by municipality, creating new identifiers where possible and where there is information. That's why I don't think harvesting directly from Wikidata is going to be ideal, at least for now. Through the lists we have better control.

  3. By the nature of point 1, there will be no tracking category for images with IDs for the dataset.

The MonumentID could figure it out based on the Wikidata invoke and add the relevant country-based category.

I'll ping @Mike_Peel here, as he is the creator of the MonumentID template on Commons. What do you think, Mike?

Sure, what country tracking category would you like it to add? I've done a demo at https://commons.wikimedia.org/wiki/File:At_Paraty,_Brazil_2017_107.jpg that currently puts it into "Category:Monuments by ID in Brazil", but any other format of "Category:prefix <country> postfix" is straightforward to implement. (But different prefixes/postfixes for different countries would be messy.)

Thanks! The convention follows https://commons.wikimedia.org/wiki/Category:Cultural_heritage_monuments_with_known_IDs, in this case https://commons.wikimedia.org/wiki/Category:Cultural_heritage_monuments_in_Brazil_with_known_IDs

OK, the template now uses that convention.
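
For reference, the agreed naming convention can be sketched as follows. This is a minimal illustration only: the helper names are hypothetical, and it does not show how the MonumentID template itself is implemented on Commons.

```python
def tracking_category(country):
    """Return the per-country tracking category name (hypothetical helper).

    Follows the "Category:prefix <country> postfix" pattern with the prefix
    and postfix taken from the existing known-IDs categories.
    """
    return "Category:Cultural heritage monuments in " + country + " with known IDs"


def commons_url(page_title):
    """Return the Commons URL for a page title; spaces become underscores."""
    return "https://commons.wikimedia.org/wiki/" + page_title.replace(" ", "_")


if __name__ == "__main__":
    print(commons_url(tracking_category("Brazil")))
```

Keeping a single prefix and postfix for every country avoids the per-country special cases that were described as messy.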

We are also using a different template (MonumentID, which is not Brazil-specific) from the one used in previous years (Cultural Heritage Brazil), so the images on Commons from previous Brazilian WLM editions are not affected by this new process. All the old static lists from previous editions are still on their pages; they use a different ID from the QID and still have the button to upload to Commons. (Is your point that the button should be removed so that the IDs are not misinterpreted?) I don't know if that answers your question, so please let me know, and once again, thank you for your quick response!

Something like this should work https://w.wiki/7go as the primary filtering (can be optimised of course) @Ederporto does the number of results look right?

Hi, @Lokal_Profil and @JeanFred. The identifiers I gave you are just two examples (of many existing ones, with more to come) of Brazilian cultural heritage identifiers. We have 5570 municipalities and 27 states; in theory, and in a lot of cases in practice, each one can designate a monument as cultural heritage. Currently, our work on the lists is precisely to improve them, first by state and then by municipality, creating new identifiers where possible and where there is information. That's why I don't think harvesting directly from Wikidata is going to be ideal, at least for now. Through the lists we have better control.

The basic SPARQL would then end up being something like https://w.wiki/85U, where we could add on properties as needed. The better way of doing it, however, would be if you also ensured that every monument imported into Wikidata had a P1435 statement whose value in turn had P17:Q155. Since each of the properties indicates that an item has been designated a heritage monument, it should by definition also be possible to add a P1435 statement (although you might have to create the appropriate value first).
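
The P1435-based selection suggested here could look roughly like the query built below. This is a sketch under assumptions: `build_designation_query` is a hypothetical helper, and it is not the query behind https://w.wiki/85U.

```python
def build_designation_query(country_qid="Q155"):
    """Build a SPARQL query selecting items by heritage designation.

    Hypothetical helper: an item matches if it has a P1435 (heritage
    designation) statement whose value has P17 (country) set to the
    given country, Q155 (Brazil) by default.
    """
    return (
        "SELECT DISTINCT ?item WHERE {\n"
        "    ?item wdt:P1435 ?designation .\n"
        "    ?designation wdt:P17 wd:" + country_qid + " .\n"
        "}"
    )


if __name__ == "__main__":
    print(build_designation_query())
```

With this shape, new identifier properties would not need to be added to the query one by one, as long as each imported monument also carries a P1435 statement.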

Change 534972 had a related patch set uploaded (by Lokal Profil; owner: Lokal Profil):
[labs/tools/heritage@master] [WIP]Add br_pt as a sparql harvest

https://gerrit.wikimedia.org/r/534972

The blocker for getting this set up with the above is that I don't know enough SPARQL to select the ?item QID as ?id (i.e. as a string/literal).

See T170788 - the work-around is to use STRAFTER( STR( ?item ), STR( wd: ) )
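
To illustrate what that expression computes, here is a small Python equivalent. It is a sketch only: the helper name is hypothetical, and it assumes the standard Wikidata entity prefix that `wd:` expands to.

```python
# The IRI that the wd: prefix expands to on the Wikidata Query Service.
WD_ENTITY_PREFIX = "http://www.wikidata.org/entity/"

# In a query, the work-around would typically appear as:
#   BIND(STRAFTER(STR(?item), STR(wd:)) AS ?id)


def qid_from_entity_uri(uri):
    """Mimic STRAFTER(STR(?item), STR(wd:)) for a single entity URI.

    Returns the part of the URI after the entity prefix, or an empty
    string if the prefix does not occur (as STRAFTER does).
    """
    _head, separator, tail = uri.partition(WD_ENTITY_PREFIX)
    return tail if separator else ""


if __name__ == "__main__":
    print(qid_from_entity_uri("http://www.wikidata.org/entity/Q39109"))
```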

Thanks! I was, like a lot of other people it seems, expecting there to be a cleaner way of getting to it.

Managed to get a local harvest to work.

The basic SPARQL would then end up being something like https://w.wiki/85U, where we could add on properties as needed. The better way of doing it, however, would be if you also ensured that every monument imported into Wikidata had a P1435 statement whose value in turn had P17:Q155. Since each of the properties indicates that an item has been designated a heritage monument, it should by definition also be possible to add a P1435 statement (although you might have to create the appropriate value first).

I spotted that some of the "monuments", such as Q1455932, would not technically qualify for WLM, as they are intangible rather than built cultural heritage. Again, that would be a reason to use P1435 instead.

@Ederporto are there other Properties you would like to add on in the meantime?

I've set this up to produce reports at

These can be changed to another page on pt.wikipedia if you want.

Lokal_Profil changed the task status from Open to Stalled. Sep 23 2019, 7:26 AM

This is blocked/stalled by the following:

@Ederporto are there other Properties you would like to add on in the meantime?

This is likely part of a larger issue which should be addressed after the competition

I spotted that some of the "monuments", such as Q1455932, would not technically qualify for WLM, as they are intangible rather than built cultural heritage. Again, that would be a reason to use P1435 instead.

This open task is tagged with Wiki-Loves-Monuments 2019 which was a year ago. If this task was/is resolved, then please update the task status. If this task was not resolved but is still valid, then please update the project tags to either general Wiki-Loves-Monuments, or to the Wiki-Loves-Monuments (2020) project tag if you plan to actively work on this task in the year 2020. Thanks a lot!

Boldly removing WLM 2019 and setting Wiki-Loves-Monuments (2020). (Feel free to remove or correct.)

@Ederporto: Could you please answer T231621#5476802? Otherwise this task might get declined. Thanks in advance!

I could swear this task was closed a long time ago. I don't know if this is still needed, as I don't remember what the issue was. Can you close it?

@Ederporto: Thanks for the quick reply! You can close it too via the Add Action...Change Status dropdown. :)