Document WLE data preparations, downloaded files and plan of attack.
Outcome
Original data
The original data (for national parks and nature reserves) is available at http://gpt.vic-metria.nu/data/land/NR.zip and http://gpt.vic-metria.nu/data/land/NP.zip.
The metadata for each of these files exists in Miljödataportalen.
These files contain shapefiles. Using http://mapshaper.org/ these files can be converted to other formats and, relevant for us, the embedded data can be extracted as a CSV.
The extracted CSVs (and the metadata from Miljödataportalen) are available here (as a zip file).
Note that there are duplicates in the Nature Reserve data (see below). A spot check suggests that the duplicated entries differ in BESLSTATUS, so we probably only want to include entries with BESLSTATUS=Gällande.
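As a rough sketch of that filtering step (not a tested import script): the snippet below keeps only rows with BESLSTATUS=Gällande and reports any NVID that is still duplicated afterwards. NVID and BESLSTATUS are the CSV columns mentioned in this task; the NAMN column, the sample rows and the "(other status)" value are made up for illustration.

```python
import csv
import io
from collections import Counter


def filter_current(rows, wanted="Gällande"):
    """Keep rows whose BESLSTATUS equals `wanted`.

    Returns (kept_rows, still_duplicated_nvids) so that any NVID which
    remains duplicated after filtering can be reviewed by hand.
    """
    kept = [row for row in rows if row["BESLSTATUS"] == wanted]
    counts = Counter(row["NVID"] for row in kept)
    still_duplicated = sorted(nvid for nvid, n in counts.items() if n > 1)
    return kept, still_duplicated


# Made-up sample data; the real CSV has more columns.
SAMPLE = """NVID,NAMN,BESLSTATUS
2000001,Exempelreservatet,Gällande
2000001,Exempelreservatet,(other status)
2000002,Annat reservat,Gällande
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
kept, still_duplicated = filter_current(rows)
print(len(kept), still_duplicated)
```

If still_duplicated is non-empty on the real data, the assumption that BESLSTATUS alone explains the duplicates does not hold and those entries need a closer look.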
Wikidata
A demo item exists at Q19978573
- The Property for the unique id (NVID) is P3613. Note that this is the same for both types of objects.
- The property for the IUCNKAT field is P814. The allowed values are those with P31=Q3679744. Note that "0, Områden som ej kan klassificeras enligt IUCN: s system." should be mapped as the special "no_value".
- The property for FORVALTARE is P137 (despite the counterintuitive name).
- URSBESLDAT is most likely the P580 qualifier for the status. Note that you probably want to drop anything more specific than the date.
- For water/forest/land area you might want to review the discussion at Property_talk:P2046 to see which qualifiers to use.
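The IUCNKAT and URSBESLDAT handling above could look roughly like the following. This is a sketch under assumptions: it assumes every IUCNKAT value has the shape "code, description" (as the "0, ..." example does), that URSBESLDAT starts with an ISO date (YYYY-MM-DD), and that a code-to-item lookup has been built separately from the items with P31=Q3679744. The lookup contents and the NO_VALUE sentinel are hypothetical placeholders.

```python
from datetime import datetime

# Sentinel telling the import code to add a "no value" claim for P814.
NO_VALUE = "no_value"


def iucn_claim_value(iucnkat, code_to_qid):
    """Map an IUCNKAT field to a Wikidata item id or the no-value sentinel.

    Assumes the field looks like "<code>, <description>".
    `code_to_qid` is a hypothetical lookup from code to item id.
    """
    code = iucnkat.split(",", 1)[0].strip()
    if code == "0":
        return NO_VALUE
    return code_to_qid[code]


def start_date(ursbesldat):
    """Drop anything more specific than the date.

    Assumes the value starts with an ISO date (YYYY-MM-DD); raises
    ValueError otherwise so that odd values are noticed early.
    """
    return datetime.strptime(ursbesldat.strip()[:10], "%Y-%m-%d").date()


# Placeholder lookup; real item ids must be taken from Wikidata.
lookup = {"II": "Q_PLACEHOLDER_II"}

print(iucn_claim_value("0, Områden som ej kan klassificeras enligt IUCN: s system.", lookup))
print(iucn_claim_value("II, (description)", lookup))
print(start_date("2001-05-17 00:00:00"))
```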
Wikipedia
The national parks are so few that these are probably most easily mapped by going from Kategori:Nationalparker_i_Sverige to each article and getting the Wikidata item from there.
The nature reserves are more numerous (~4000). The entry point is Kategori:Naturreservat_i_Sverige, which contains subcategories by county, which in turn contain:
- List articles (which in turn are subdivided by municipality). E.g. this for Dalarnas län.
- Subcategories by municipality. E.g. this for Dalarnas län.
- Non-list articles. E.g. this for Dalarnas län.
Since the CSV also contains county/municipality information, the easiest approach is probably to use the list articles to find the existing articles per municipality and then match these by name to NVID. Any existing articles which remain unmatched after this can likely be matched manually. It might be easiest to do this as a run separate from the import, producing a local NVID to wikipedia_article mapping.
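A minimal sketch of the name matching, assuming the list articles have already been parsed into a mapping of municipality to linked article titles. The column names NAMN and KOMMUN, the normalisation rules and all sample values are assumptions made for illustration; only NVID is taken from this task.

```python
import re
import unicodedata
from collections import defaultdict


def normalise(name):
    """Lower-case, drop a trailing "(...)" disambiguator, squeeze spaces."""
    name = unicodedata.normalize("NFC", name).casefold()
    name = re.sub(r"\s*\([^)]*\)\s*$", "", name)
    return " ".join(name.split())


def match_articles(csv_rows, list_links):
    """Match article titles to NVIDs by (municipality, normalised name).

    csv_rows:   dicts with NVID, NAMN and KOMMUN (assumed column names).
    list_links: {municipality: [article titles found in the list article]}.
    Returns ({nvid: title}, [(municipality, title), ...] left unmatched).
    """
    index = defaultdict(list)
    for row in csv_rows:
        index[(row["KOMMUN"], normalise(row["NAMN"]))].append(row["NVID"])

    matched, unmatched = {}, []
    for kommun, titles in list_links.items():
        for title in titles:
            nvids = index.get((kommun, normalise(title)), [])
            if len(nvids) == 1:
                matched[nvids[0]] = title
            else:
                # No candidate or an ambiguous one: leave for manual review.
                unmatched.append((kommun, title))
    return matched, unmatched


# Made-up sample data.
csv_rows = [
    {"NVID": "2000001", "NAMN": "Exempelberget", "KOMMUN": "Exempelkommun"},
    {"NVID": "2000002", "NAMN": "Exempelsjön", "KOMMUN": "Exempelkommun"},
]
list_links = {"Exempelkommun": ["Exempelberget (naturreservat)", "Okänt reservat"]}

matched, unmatched = match_articles(csv_rows, list_links)
print(matched)
print(unmatched)
```

Keying on municipality as well as name avoids matching two reserves that share a name in different municipalities, at the cost of sending more entries to manual review.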
Things to consider:
- Many links are red links, so we need to check whether the article exists during matching.
- Some links are redirects.
- Some articles may not yet be connected to Wikidata items. If one of these is encountered you should first create the connected Wikidata item and then use that.
- Some reserves span multiple municipalities (and counties?) and may therefore occur in multiple lists.
- I would be surprised if there isn't at least one list article which is formatted differently from the rest.
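The first three points can be checked in one pass if the titles are sent to the MediaWiki API (action=query with redirects=1, prop=pageprops, ppprop=wikibase_item and formatversion=2). The sketch below only interprets such a response; it makes no network call, the sample response is hand-written rather than fetched, and the status labels and the item id "Q00000" are placeholders.

```python
def classify_titles(titles, response):
    """Classify linked titles as red link, connected, or lacking an item.

    `response` is assumed to be the decoded JSON of a MediaWiki
    action=query call made with redirects=1, prop=pageprops,
    ppprop=wikibase_item and formatversion=2.
    Returns {title: (status, resolved_title, qid_or_None)}.
    """
    query = response.get("query", {})
    normalized = {n["from"]: n["to"] for n in query.get("normalized", [])}
    redirects = {r["from"]: r["to"] for r in query.get("redirects", [])}
    pages = {p["title"]: p for p in query.get("pages", [])}

    result = {}
    for title in titles:
        target = normalized.get(title, title)
        target = redirects.get(target, target)  # follows a single hop only
        page = pages.get(target)
        if page is None or page.get("missing"):
            result[title] = ("red_link", target, None)
            continue
        qid = page.get("pageprops", {}).get("wikibase_item")
        result[title] = ("connected" if qid else "needs_item", target, qid)
    return result


# Hand-written sample in the assumed response shape.
sample_response = {
    "query": {
        "redirects": [{"from": "Gamla namnet", "to": "Exempelberget"}],
        "pages": [
            {"title": "Exempelberget", "pageprops": {"wikibase_item": "Q00000"}},
            {"title": "Exempelsjön"},
            {"title": "Okänt reservat", "missing": True},
        ],
    }
}

statuses = classify_titles(
    ["Gamla namnet", "Exempelsjön", "Okänt reservat"], sample_response
)
for title, status in statuses.items():
    print(title, status)
```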
Other
- All geometries and NVIDs have already been imported to OSM, so it should be possible to run (or ask someone to run) an update on OSM that ties all of these to the newly created Wikidata items. That would give us maps that can be used in <maplink> or <mapframe>.
- There is a possibility to dig deeper into the data for each item using the open data. E.g. the entry for Q19978573 contains the same data as the CSV but also links to e.g. data on contained nature types.