When adding claims to Wikidata items, ideally every one of them should be sourced. As I see it, there are two options:
- Use Wikipedia as a source. This is less than ideal in terms of reliability, but it's straightforward to do (and while it's not recommended, it's common practice, for better or worse). Every item in the WLM database contains information about the Wikipedia page it was fetched from, for example //sv.wikipedia.org/w/index.php?title=Lista_%C3%B6ver_arbetslivsmuseer_i_Blekinge_l%C3%A4n&oldid=30834404. As it includes the page revision id, it's easy to link to the correct version.
- Use the registrant_url value. For example, in the Norwegian building data, each item has a URL pointing to the Kulturminnesøk service. The advantage is that it's a reliable, official data source. There are two disadvantages:
  - The WLM database comprises data downloaded from Wikipedia pages, which have been edited by the community. There's no way of knowing which information is supported by the registrant_url and which was added manually by someone.
  - Many of the data sets don't even have a registrant_url.
In the end, pointing back to the Wikipedia page certainly seems better than nothing. In some cases we do know where the data on Wikipedia came from _originally_ -- such as the Swedish museum dataset -- but again, there's always a possibility that the Wikipedia page contains info added by someone manually. The WLM database is updated continuously, so it contains the freshest dump of whatever is included in the Wikipedia page. This makes it tricky to guess which statements are supported by the "official" sources.
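
As a rough sketch of the Wikipedia option, the snippet below pulls the revision id out of a WLM source URL and builds a reference pointing at that exact page version. The row layout and the `source` key are assumptions for illustration, as are the helper names; P4656 ("Wikimedia import URL") and P813 ("retrieved") are existing Wikidata properties, but whether they are the right ones to use here is a judgment call rather than something settled above.

```python
from datetime import date
from urllib.parse import urlsplit, parse_qs

# Wikidata property ids (P4656 = "Wikimedia import URL", P813 = "retrieved").
WIKIMEDIA_IMPORT_URL = "P4656"
RETRIEVED = "P813"


def revision_id(source_url):
    """Return the oldid of a WLM source URL as an int, or None if it has none."""
    query = parse_qs(urlsplit(source_url).query)
    values = query.get("oldid")
    return int(values[0]) if values else None


def absolute_url(source_url, scheme="https"):
    """Turn a protocol-relative URL (//host/...) into an absolute one."""
    if source_url.startswith("//"):
        return f"{scheme}:{source_url}"
    return source_url


def wikipedia_reference(row, retrieved=None):
    """Build a reference dict for a claim taken from a WLM row.

    `row` is a hypothetical dict with a `source` key holding the Wikipedia
    URL the item was fetched from. Returns None when the URL carries no
    revision id, since the reference could then not pin a page version.
    """
    source = row.get("source")
    if not source or revision_id(source) is None:
        return None
    return {
        WIKIMEDIA_IMPORT_URL: absolute_url(source),
        RETRIEVED: (retrieved or date.today()).isoformat(),
    }
```

Called with `{"source": "//sv.wikipedia.org/w/index.php?title=...&oldid=30834404"}`, this yields a reference whose URL still contains `oldid=30834404`, so it keeps pointing at the revision the data was taken from even after the list page is edited again.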