First we need to work out which property to use. Let's take the good old Rijksmonumenten. We use the template at https://nl.wikipedia.org/wiki/Sjabloon:Tabelrij_rijksmonument (with Wikidata support) and the property is https://www.wikidata.org/wiki/Property:P359 .
On https://nl.wikipedia.org/wiki/Lijst_van_rijksmonumenten_in_Haarlem_Centrum the first entry is "Welgelegen", id 15966. So we want the Wikidata item whose P359 value is "15966". SPARQL can give us that: https://query.wikidata.org/#SELECT%20%3Fitem%20WHERE%7B%20%3Fitem%20wdt%3AP359%20%2215966%22%20%7D%20%20 -> http://www.wikidata.org/entity/Q17186772
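For reference, the single-ID lookup above could be scripted roughly like this. It is only a sketch: the function names and the User-Agent string are made up for illustration, and it assumes the standard JSON output of the query service at https://query.wikidata.org/sparql.

```python
import json
import urllib.parse
import urllib.request

# Public Wikidata SPARQL endpoint (the query UI linked above sits in front of it).
ENDPOINT = "https://query.wikidata.org/sparql"


def build_query(prop, value):
    """Build the same query as in the link above for a property/value pair."""
    return 'SELECT ?item WHERE { ?item wdt:%s "%s" }' % (prop, value)


def parse_items(result):
    """Pull the item URIs out of a SPARQL JSON result document."""
    return [b["item"]["value"] for b in result["results"]["bindings"]]


def find_items(prop, value):
    """Run the query against the live endpoint (needs network access)."""
    params = urllib.parse.urlencode(
        {"query": build_query(prop, value), "format": "json"}
    )
    # Hypothetical User-Agent; use something that identifies your own tool.
    request = urllib.request.Request(
        ENDPOINT + "?" + params,
        headers={"User-Agent": "monuments-lookup-example/0.1"},
    )
    with urllib.request.urlopen(request) as response:
        return parse_items(json.load(response))
```

Calling find_items("P359", "15966") should then return the entity URI given above, as long as the data on Wikidata has not changed.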
Doing this one by one would be a pain. I've been cross-referencing quite a few sources on Wikidata, and generating a lookup table is a very inexpensive operation. See for example https://github.com/multichill/toollabs/blob/master/bot/wikidata/biografisch_finder.py#L58
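A minimal sketch of what such a lookup table could look like, assuming one query that returns every item carrying the property and the standard SPARQL JSON result format. This is not the code from the linked script, and the names build_lookup_query and make_lookup are hypothetical.

```python
def build_lookup_query(prop):
    """One query for all (item, id) pairs of a property, e.g. P359."""
    return "SELECT ?item ?id WHERE { ?item wdt:%s ?id }" % prop


def make_lookup(result):
    """Turn a SPARQL JSON result into a dict of monument id -> Q-id.

    If the same id sits on several items, the last one seen wins;
    a real harvester would probably want to log those cases instead.
    """
    lookup = {}
    for binding in result["results"]["bindings"]:
        # Entity URIs look like http://www.wikidata.org/entity/Q17186772
        qid = binding["item"]["value"].rsplit("/", 1)[-1]
        lookup[binding["id"]["value"]] = qid
    return lookup
```

With the table in memory, each list row only needs a dictionary lookup such as lookup.get("15966") instead of its own query.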
Shouldn't be too hard to add this to the harvester.
What I really meant here was:
- Parse https://fr.wikipedia.org/wiki/Liste_des_monuments_historiques_de_Lille
- Identify https://fr.wikipedia.org/wiki/Refuge_de_l%27Abbaye_de_Loos as a monument article
- Store https://www.wikidata.org/wiki/Q17347192 as the wd_item, since it is the item linked from the article
It might be OK for the lists that have a separate article parameter, but for the ones where the links are extracted from the name field we risk pulling in all kinds of things.
That said, I know that for e.g. the Swedish lists there is not a perfect overlap between the article value and Wikidata, since the list deals with complexes of buildings while the articles are normally about an individual building in that complex (example: the list entry is for the church and the separate bell tower, but the article is just about the church).