On-board Malta
Closed, Resolved (Public)

Description

  • Introductory call
  • Getting the list of monuments
  • Creating templates for the list of monuments
  • ...

Event Timeline

leila created this task. Aug 5 2016, 2:22 PM
LilyOfTheWest updated the task description. Aug 6 2016, 6:25 AM
LilyOfTheWest moved this task from Backlog to In-progress on the Wiki-Loves-Monuments (2016) board.

Update: The Government's list of monuments can be found here (http://www.culturalheritage.gov.mt/page.asp?p=21573&l=1) and on the pages under the "National Inventory" sub-menu on the left-hand side of that page. Neville has asked the government for this information as a spreadsheet. While he waits to hear back, he has started entering the data into an Excel sheet himself.

LilyOfTheWest edited subscribers, added: LilyOfTheWest; removed: leila. Aug 11 2016, 11:58 PM

@Romaine, could you have a quick look at the list of monuments from Malta and, if it looks fine, let us (Neville, the POC for Malta) know which header/row template we can use to start creating the lists on Wikipedia?

@Romaine, I just pinged Neville and asked him to let us know if the long/lat information will be available soon or we should go ahead and create the tables with his help. Let's wait to hear back from him. I will update the task as soon as that happens.

Questions for Neville:

  • Which of the monuments in the list have an item in Wikidata? If you can add a column to the list, call it for example monument_wd_item, and copy the Q... ID from Wikidata into it, that would help us create the missing monuments in Wikidata.
  • Let's plan to create the list of monuments in English Wikipedia as well, given that you already have the information about the monuments in English. This will make sure that more people can participate in your contest. :)
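
The extra column suggested above could be added to a CSV export of the list with a few lines of Python. This is only a minimal sketch: the column names `id` and `name`, and the mapping from monument ID to Q-ID, are assumptions for illustration and are not taken from the actual spreadsheet.

```python
import csv
import io


def add_wikidata_column(csv_text, qid_by_id, id_field="id",
                        new_field="monument_wd_item"):
    """Return the CSV text with one extra column holding the Q-ID, if known.

    qid_by_id maps a monument ID (as a string) to a Wikidata Q-ID.
    Monuments without a known item get an empty cell.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    fieldnames = list(reader.fieldnames or []) + [new_field]
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        row[new_field] = qid_by_id.get(row[id_field], "")
        writer.writerow(row)
    return out.getvalue()


if __name__ == "__main__":
    # Hypothetical rows; "Q0" is a placeholder, not a real item.
    sample = "id,name\n1,Example monument A\n2,Example monument B\n"
    print(add_wikidata_column(sample, {"1": "Q0"}))
```

Rows with an empty monument_wd_item cell are then the candidates for new Wikidata items.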

I haven't yet received the latitude/longitude data from the Government, so I think it might be best to proceed with the list I compiled at this stage.

To reply to your questions:

  1. I don't really know which of the monuments have an item in Wikidata. I suspect that it is very few of them - is there any easier way to check this, other than by manually searching for them? I will add the column and add any info I find.
  2. English will be our main language, so that would make sense.

@Magnus, the automatic way for checking whether monuments already exist in Wikidata becomes relevant in the case of Malta with more than 2K monuments. Can you help @Nevborg with figuring out how to do this automatically, per your comment in T140992#2508958? (We will have more countries like this, for example, South Korea, Greece, Peru, etc. so if you teach me how to fish, I can help save some of your time. :)
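
One possible way to automate such a check is an exact-label lookup on the Wikidata Query Service. The sketch below only builds the query text; it is not necessarily how any existing tool does it. It assumes that matching on the exact label is good enough and that the items carry a country statement (P17) pointing to Malta (Q233). The function name is hypothetical.

```python
def build_label_query(names, lang="en", country_qid="Q233"):
    """Build a SPARQL query that finds items by exact label within one country.

    names: monument names to look up.
    lang: language code of the labels (for example "en" or "mt").
    country_qid: Q-ID of the country; Q233 is Malta.
    """
    def literal(name):
        # Escape backslashes and double quotes for a SPARQL string literal.
        escaped = name.replace("\\", "\\\\").replace('"', '\\"')
        return f'"{escaped}"@{lang}'

    values = " ".join(literal(name) for name in names)
    return (
        "SELECT ?item ?label WHERE {\n"
        f"  VALUES ?label {{ {values} }}\n"
        "  ?item rdfs:label ?label ;\n"
        f"        wdt:P17 wd:{country_qid} .\n"
        "}"
    )


if __name__ == "__main__":
    print(build_label_query(["Niche of St Paul"]))
```

For a list of more than 2K names, the names would probably need to be sent in batches rather than in a single query.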

@Magnus, do you think it makes sense to set up a Catalogue in Mix'n'Match for Malta's monuments?

OK, time for an update:
There have been various occasions lately where I had a job that called for a "mix'n'match light", so I started developing a new tool called "tabletop". You can upload, edit, process, and download tables (CSV files).
It's in the early stages, but I managed to implement enough hoodoo to upload the Malta file, run some SPARQL matching on it, and allow downloads. Here goes:
https://tools.wmflabs.org/tabletop/#mode=dataset&dataset=15
You can download this via the top-right button. It is the original data set, plus three columns for name-matching:

  • English name, limited to administrative unit
  • Maltese name, limited to administrative unit
  • Maltese name, no limit

The administrative unit was determined in a similar fashion.
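
To make the three matching columns concrete, here is a rough offline sketch of the idea. It is not the tool's actual implementation: the field names (`name_en`, `name_mt`, `admin_unit`, `label_en`, `label_mt`, `qid`) and the rule that only an unambiguous hit counts as a match are assumptions.

```python
def match_columns(row, candidates):
    """Compute three name-matching columns for one monument row.

    row: dict with keys name_en, name_mt, admin_unit.
    candidates: list of dicts with keys qid, label_en, label_mt, admin_unit,
        for example the result of an earlier Wikidata query.
    A column is filled only when exactly one candidate matches.
    """
    def unique_qid(label_key, name, restrict_to_unit):
        hits = [
            c["qid"]
            for c in candidates
            if name
            and c.get(label_key) == name
            and (not restrict_to_unit
                 or c.get("admin_unit") == row.get("admin_unit"))
        ]
        return hits[0] if len(hits) == 1 else ""

    return {
        "match_en_in_unit": unique_qid("label_en", row.get("name_en"), True),
        "match_mt_in_unit": unique_qid("label_mt", row.get("name_mt"), True),
        "match_mt_anywhere": unique_qid("label_mt", row.get("name_mt"), False),
    }
```

Restricting a match to the administrative unit helps when the same name occurs in several localities, which is why the unrestricted column can stay empty while the restricted ones are filled.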
Note that the tool is pre-alpha, and even if you log in, the functionality is quite limited. Here, I only present it as a way to download stuff. But I hope it will become the generic solution @LilyOfTheWest is looking for.

@Magnus this is indeed fantastic! Thank you! :)

@Romaine, can you help with creating the templates for @Nevborg? We have only a few days left before the contest, and it would be good to have Malta set up completely as soon as possible. :) The list of monuments with QIDs:

.

LilyOfTheWest updated the task description.

@Nevborg There are some monuments with the exact same name but two different IDs. Why does this happen? For example, check the monuments with IDs 1849 and 1850. We need to fix this before creating the tables.

@LilyOfTheWest This is because they are different monuments, despite having the same names. I have double checked the cases you mentioned (1849 and 1850) with the official Government list and they are indeed listed as two different statues. I know that this happens on many other occasions, as there are many monuments which share the same name (e.g. "Niche of St Paul") but have different IDs - in all these cases they are different monuments, so we have to work around that somehow.
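
To see how many names are affected, the shared names can be listed automatically. A minimal sketch, assuming each row of the list has been read into a dict with the hypothetical keys `id` and `name`:

```python
from collections import defaultdict


def duplicate_names(rows):
    """Map each name used by more than one monument to the sorted list of its IDs."""
    ids_by_name = defaultdict(list)
    for row in rows:
        # Strip surrounding whitespace so "Name " and "Name" count as the same name.
        ids_by_name[row["name"].strip()].append(row["id"])
    return {
        name: sorted(ids)
        for name, ids in ids_by_name.items()
        if len(ids) > 1
    }


if __name__ == "__main__":
    # Hypothetical rows for illustration only.
    rows = [
        {"id": "101", "name": "Example niche"},
        {"id": "102", "name": "Example niche"},
        {"id": "103", "name": "Example chapel"},
    ]
    print(duplicate_names(rows))
```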

@Nevborg in that case, something in the table should indicate the difference. You can, for example, provide addresses for these monuments or add some extra description to the name of the monument to differentiate the two. Basically, think about it this way: How do you expect the participants to know which of the two rows in the table they should upload a photo for when the two names are exactly the same? Whatever the participants need to make that decision should be provided in the list/table. :)

@LilyOfTheWest I don't have the addresses for these monuments, nor do I have any information that could really help distinguish them from one another. The best I could do is number them (e.g. "Niche of St Paul 1", "Niche of St Paul 2" etc.), but that doesn't really help the participants identify which monument each number is referring to, so I don't think that would be of any great help.

I can go through the list to see if there are any exceptions where addresses are available, but for the majority they will not be. However, I can't seem to access the list from the link you sent me (I am being told that I do not have permission to edit the file) - could you perhaps email me the latest version of the list?

@Nevborg, what you can do to resolve the issue of monuments with the same name but different IDs, is to extract more information from the pdf files (or ask the government directly for more information for the spreadsheet they already gave you). For example, if you look at http://www.culturalheritage.gov.mt/filebank/inventory/Chapels%20and%20Niches/1849.pdf and http://www.culturalheritage.gov.mt/filebank/inventory/Chapels%20and%20Niches/1850.pdf, you see that although the names of these monuments are the same, their geo-coordinates are not. How about adding geo-coordinates to the tables to help people differentiate? You can also link to the pdf files.
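
Once coordinates are available, the shared names could be told apart by appending the coordinates to the displayed name only where a name occurs more than once. A minimal sketch; the keys `id`, `name`, `lat` and `lon` are assumptions, and the coordinates in the usage example are made-up placeholders, not values from the pdf files.

```python
from collections import Counter


def display_names(rows):
    """Return {id: label}, adding coordinates to labels whose name is not unique."""
    counts = Counter(row["name"] for row in rows)
    labels = {}
    for row in rows:
        label = row["name"]
        has_coords = row.get("lat") is not None and row.get("lon") is not None
        if counts[label] > 1 and has_coords:
            label = f'{label} ({row["lat"]:.5f}, {row["lon"]:.5f})'
        labels[row["id"]] = label
    return labels


if __name__ == "__main__":
    # Placeholder coordinates for illustration only.
    rows = [
        {"id": "101", "name": "Example niche", "lat": 35.9, "lon": 14.5},
        {"id": "102", "name": "Example niche", "lat": 35.8, "lon": 14.4},
        {"id": "103", "name": "Example chapel", "lat": 35.7, "lon": 14.3},
    ]
    for monument_id, label in display_names(rows).items():
        print(monument_id, label)
```

Unique names are left untouched, so the table stays readable for the majority of rows.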

Given the little time left before Sep. 1, let's prioritize. Let's first make tables on Wikipedia and make sure you have a functioning list/table for Sep. 1 (for this, we can keep Wikidata Q IDs out). Then once you hear back from the government and/or when you have more information extracted from the pdf files, you can update the tables on Wikipedia.

It seems that the pdf files could be linked semi-automatically, which would be a great help because they contain a lot of info, including the address. Maybe someone could add those links to the file and scrape the pdf files later? Just thinking out loud :)

@Effeietsanders If it is possible for the pdf files to be linked semi-automatically that would be fantastic - does anybody know how I could go about getting this done?

@LilyOfTheWest, having the functioning tables set up first sounds like a good idea, then we can add more information (such as geo-coordinates) in the coming week or so. I haven't obtained any list from the government (the list I sent was compiled by ourselves), but perhaps I can convince them to send me a list including the geo-coordinates, if nothing else.

Also, apologies for missing the Hangouts chat earlier - I am currently in Berlin for work (and with limited access to my personal email), so I am unfortunately unable to dedicate quite as much time to WLM in these final run-up days as I would like!

@Nevborg As far as I can see, the URLs of the pdf's are pretty consistent. You could use that to your advantage :) Alternatively, someone could scrape them from the overview pages on the government site. If you find someone who can scrape the pdf's, that should probably give you the coordinates, btw.
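
Going by the two example links earlier in this thread, the URL of a pdf appears to be the inventory base path, the category folder and the monument ID. A minimal sketch that builds the link from those parts; that every category follows the same pattern is an assumption, and only the "Chapels and Niches" folder name is known from this thread.

```python
from urllib.parse import quote

# Base path taken from the example links in this thread.
BASE = "http://www.culturalheritage.gov.mt/filebank/inventory"


def pdf_url(category, monument_id):
    """Build the pdf link for a monument from its category folder and ID."""
    # quote() percent-encodes the spaces in the folder name as %20.
    return f"{BASE}/{quote(category)}/{monument_id}.pdf"


if __name__ == "__main__":
    print(pdf_url("Chapels and Niches", 1849))
```

The generated links could then go into an extra column of the spreadsheet, after spot-checking a few of them by hand.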

Tables now have been made on Wikipedia:

https://en.wikipedia.org/wiki/List_of_monuments_in_Malta

All monuments are sorted by local council (locality) and listed. Now links need to be added to the descriptions, more location details to the location field, coordinates to the coordinate fields and where possible images too.

A next step could be adding these monuments to Wikidata. For that, I would advise using the lists on Wikipedia, as I had to correct the database for a lot of monuments. Alternatively, I can add the monuments to Wikidata myself, based on the modified data I now have.

LilyOfTheWest closed this task as Resolved. Aug 31 2016, 1:55 AM

@Romaine Thank you so much for the fantastic work, it is much appreciated! I have linked to the list on the Malta project landing page. Please go ahead and add the monuments to Wikidata using the modified data, but you might want to wait until we have the geo-coordinates in place first.

I'm still slightly lost as to how to link the PDFs and scrape the data, so any guidance would be appreciated :)

@Nevborg given that Romaine & Lily will be swamped for the coming while, they are probably not your best shot at resolving that right now. Maybe you could ask around in your group whether someone knows how to scrape the webpage for the URLs and edit the page by bot, and/or scrape the individual pdf's?
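
As a starting point for the first of those steps, collecting the pdf links from a saved overview page needs only the Python standard library. This is a sketch under assumptions: it presumes the overview pages link to the pdf files with ordinary `<a href="...">` tags, which has not been verified here, and it does not download anything or read the pdf contents.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class PdfLinkParser(HTMLParser):
    """Collect absolute URLs of all links that end in .pdf."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href and href.lower().endswith(".pdf"):
            # Resolve relative links against the page they were found on.
            self.links.append(urljoin(self.base_url, href))


def pdf_links(html, base_url):
    """Return the pdf links found in the HTML text of one overview page."""
    parser = PdfLinkParser(base_url)
    parser.feed(html)
    return parser.links
```

The monument ID can then be read from the file name at the end of each link and used to join the links back onto the spreadsheet.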