Add India in English (on Commons) to database
Closed, Resolved, Public

Description

These lists supersede the older en.wp lists. The identifier to use is the Wikidata ID (not the ID field), as the latter is allowed to be empty for non-designated monuments.

project : commons
lang : commons
headerTemplate : India heritage list header
rowTemplate : India heritage list row
commonsTemplate : WLM India Wikidata ID
commonsTrackerCategory : Indian monuments with Wikidata QIDs
commonsCategoryBase : Cultural heritage monuments in India
unusedImagesPage : Commons:Wiki Loves Monuments in India/Monuments/Unused images
imagesWithoutIdPage : Commons:Wiki Loves Monuments in India/Monuments/Images without ID
missingCommonscatPage : Commons:Wiki Loves Monuments in India/Monuments/Missing commonscat links
namespaces : Commons
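
For reference, the settings above could be held as a plain mapping and sanity-checked before a harvest. This is only a sketch: the key names and values are copied from the description, but the choice of which keys count as required, the `missing_keys` helper, and the list form of `namespaces` are assumptions, not the heritage tool's actual config format.

```python
# Hypothetical in-memory form of the settings listed in the description.
# REQUIRED_KEYS is an assumption made for illustration only.
REQUIRED_KEYS = {"project", "lang", "headerTemplate", "rowTemplate"}

india_commons_config = {
    "project": "commons",
    "lang": "commons",
    "headerTemplate": "India heritage list header",
    "rowTemplate": "India heritage list row",
    "commonsTemplate": "WLM India Wikidata ID",
    "commonsTrackerCategory": "Indian monuments with Wikidata QIDs",
    "commonsCategoryBase": "Cultural heritage monuments in India",
    "unusedImagesPage": "Commons:Wiki Loves Monuments in India/Monuments/Unused images",
    "imagesWithoutIdPage": "Commons:Wiki Loves Monuments in India/Monuments/Images without ID",
    "missingCommonscatPage": "Commons:Wiki Loves Monuments in India/Monuments/Missing commonscat links",
    # The description gives the namespace by name; a real config may want an ID.
    "namespaces": ["Commons"],
}


def missing_keys(config, required=REQUIRED_KEYS):
    """Return the required keys that are absent from config, sorted by name."""
    return sorted(required - config.keys())
```

Calling `missing_keys(india_commons_config)` returns an empty list when all of the assumed required keys are present.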

Related Objects

Mentioned In
T290670: Coordinates are missing for in-com monuments in heritage database
T289930: html view of API fails to link names for non-wikipedia sites
T289929: Statistics page falsely reports no monuments for some datasets
T278918: Switch heritage over to new replicas design
Mentioned Here
T278918: Switch heritage over to new replicas design
rTHERb4eed979441c: Make CI happy
rTHERd577a8451a6e: Move to new replicas
rTHER7f50cb3cd002: Add Indian lists on Commons

Event Timeline

@Bodhisattwa This is the basic info I could glean. Please fill in the rest.

@JeanFred For info: all of these are on Wikidata, but there doesn't seem to be a good select to get them all, so I'm falling back on the Commons lists.

> @Bodhisattwa This is the basic info I could glean. Please fill in the rest.

Done, please check if that is OK.

Change 711018 had a related patch set uploaded (by Lokal Profil; author: Lokal Profil):

[labs/tools/heritage@master] [WIP]Add Indian lists on Commons

https://gerrit.wikimedia.org/r/711018

Hi @Bodhisattwa, sorry that the delay I flagged ended up being even longer than expected.

I made a first stab at a mapping and spotted that the documentation of Template:India_heritage_list_row is fairly outdated. It seems that many of the parameters in actual use are undocumented and that some of the documented ones are no longer supported by the template.

Aliases for parameters seem to be used quite frequently, and I'm not sure the harvester supports them (will dig into it unless @JeanFred remembers?).

> Made a first stab at a mapping and I spotted that the documentation of Template:India_heritage_list_row is fairly outdated. Seems like many of the parameters in actual use are undocumented and that some of the documented ones are no longer supported by the template.

I have updated the documentation page.

> Aliases for parameters seem to be quite frequently used which I'm not sure the harvester supports (will dig into it unless @JeanFred remembers?)

I have removed the aliases parameter.

> Aliases for parameters seem to be quite frequently used which I'm not sure the harvester supports (will dig into it unless @JeanFred remembers?)

I’m not 100% sure, but I think that aliases are not supported.

>> Made a first stab at a mapping and I spotted that the documentation of Template:India_heritage_list_row is fairly outdated. Seems like many of the parameters in actual use are undocumented and that some of the documented ones are no longer supported by the template.
>
> I have updated the documentation page.

Thanks!

I think I could get a harvester up and running; it has the following issues, though:

  1. Since Listeria returns the coordinates as a single {{Inline coordinates}} template, the harvester cannot separate latitude from longitude and therefore cannot interpret the coordinate. A probable workaround is to edit every SPARQL call to (also?) return latitude and longitude explicitly (they would not have to be displayed). There might also be an SQL clean-up step we could do, but it's unclear to me which functions are available to us (CHARINDEX isn't available in the Docker setup, at least).
  2. It's unclear to me how the p131 and district parameters are meant to interact. One of these should be mapped to the adm2 property.
  3. State ISO is currently not included. This could probably be fixed by adding it to the header template.
  4. Two lists seem to have some coding issues (see name); I think that is a Listeria bug, though, which might have been fixed in V2.

I don't think any of the above are blockers though (@JeanFred give a shout if you disagree)
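
To illustrate the kind of splitting a clean-up step for issue 1 would have to do, here is a minimal sketch in Python. It assumes the Listeria output looks like `{{Inline coordinates|<lat>|<lon>|...}}` with decimal degrees as the first two positional parameters; that shape, the regular expression, and the function name `split_inline_coordinates` are assumptions for illustration, not part of the harvester.

```python
import re

# Assumed template shape: {{Inline coordinates|<lat>|<lon>|...}} in decimal degrees.
COORD_RE = re.compile(
    r"\{\{\s*Inline coordinates\s*\|\s*(-?\d+(?:\.\d+)?)\s*\|\s*(-?\d+(?:\.\d+)?)",
    re.IGNORECASE,
)


def split_inline_coordinates(wikitext):
    """Return (lat, lon) as floats, or None if no usable template is found."""
    match = COORD_RE.search(wikitext)
    if match is None:
        return None
    lat, lon = float(match.group(1)), float(match.group(2))
    # Reject values outside the valid ranges rather than storing bad data.
    if not (-90 <= lat <= 90 and -180 <= lon <= 180):
        return None
    return lat, lon
```

Returning `None` for anything unparsable keeps rows with odd or missing coordinates harvestable, just without a position.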

> It's unclear to me how the p131 and district parameters are meant to interact. One of these should be mapped to the adm2 property.

For context, p131 is expected to be the lowest tier of administrative unit, such as municipal wards for city-based monuments (e.g. this one) or community development blocks, mandals, etc. for rural monuments (e.g. this one). Unfortunately, we have not been able to reach that level of detail for all Indian monuments on Wikidata. We do have fairly consistent data on districts, though, which are the second-level administrative divisions in India, below the level of states or union territories.

Thanks.

I'll leave out p131 for now then.

I've marked the patch as ready for review (@JeanFred).

This is the quickest implementation I could think of. A more in-depth implementation would be to investigate the SPARQL queries underlying each of the lists, see whether they can be boiled down to a handful of distinct queries, and set up a config for each of these.
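
As a rough idea of how that investigation could start, the sketch below groups list pages by the "shape" of their query, treating queries that differ only in whitespace or in the Wikidata item IDs they mention as the same. The input format (a mapping from list page title to query text) and both function names are hypothetical; how the queries would be extracted from the list pages is not covered.

```python
import re
from collections import defaultdict

# Matches Wikidata item IDs such as Q1356, but not property IDs such as P131.
QID_RE = re.compile(r"\bQ\d+\b")


def query_shape(sparql):
    """Normalise a query so that item IDs and whitespace differences are ignored."""
    collapsed = " ".join(sparql.split())
    return QID_RE.sub("Q?", collapsed)


def group_lists_by_query(list_queries):
    """Map each distinct query shape to the sorted list pages that use it."""
    groups = defaultdict(list)
    for page, sparql in list_queries.items():
        groups[query_shape(sparql)].append(page)
    return {shape: sorted(pages) for shape, pages in groups.items()}
```

If the result has only a handful of keys, one config per key would be feasible.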

A further note: the en.wp lists will still go on being harvested in parallel. I don't believe these should clash, and images claimed by more than one dataset aren't new.

Change 711018 merged by jenkins-bot:

[labs/tools/heritage@master] Add Indian lists on Commons

https://gerrit.wikimedia.org/r/711018

Looks like the harvest went OK (I get hits in the API), but the statistics report fails to pick these up.

I'll open up a task for that, and also one for the way some of the links in the API are misformatted to go to commons.wikipedia.org.

But unless there is anything else, @Bodhisattwa, I believe this task can be closed.
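
For the misformatted links, the sketch below shows one way the host could be chosen per project instead of always assuming a Wikipedia domain. This is an illustration only: where the commons.wikipedia.org host actually comes from in the API code is not established here, and `SPECIAL_HOSTS`, `site_host` and `page_url` are hypothetical names.

```python
from urllib.parse import quote

# Projects that do not follow the <lang>.<project>.org pattern.
SPECIAL_HOSTS = {
    "commons": "commons.wikimedia.org",
    "wikidata": "www.wikidata.org",
}


def site_host(project, lang):
    """Return the host name for a (project, lang) pair."""
    return SPECIAL_HOSTS.get(project, f"{lang}.{project}.org")


def page_url(project, lang, title):
    """Build a link to a wiki page, using underscores for spaces in the title."""
    path = quote(title.replace(" ", "_"), safe=":/")
    return f"https://{site_host(project, lang)}/wiki/{path}"
```

With project and lang both set to `commons`, as in this dataset, links would point at commons.wikimedia.org.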

> I'll open up a task for that.

That would be helpful.

> But unless there is anything else @Bodhisattwa I believe this task can be closed

Sure. Thanks for working on it.

Ciell subscribed.

Per last comment in thread.
Please feel free to re-open if I am wrong.