
Document WLE data preparations
Closed, Resolved | Public | 2 Estimated Story Points

Description

Document WLE data preparations, downloaded files and plan of attack.

Outcome

Original data

The original data (for national parks and nature reserves) is available at http://gpt.vic-metria.nu/data/land/NR.zip and http://gpt.vic-metria.nu/data/land/NP.zip.
The metadata for each of these files exists in Miljödataportalen.
These files contain shapefiles. Using http://mapshaper.org/ these files can be converted to other formats and, relevant for us, the embedded data can be extracted as a CSV.
The extracted CSVs (and the metadata from Miljödataportalen) are available here (as a zip file).

Note that there are duplicates in the Nature Reserve data (see below). A spot check seems to indicate that the duplicated entries differ in BESLSTATUS, so we probably only want to include entries which have BESLSTATUS=Gällande.
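The duplicate check and the status filter could be sketched as below. This is a minimal sketch, assuming the extracted CSV is comma-separated and uses the column names NVRID and BESLSTATUS; the sample rows and the status value "Annan status" are made up for illustration.

```python
import csv
import io
from collections import defaultdict

# Made-up sample rows. Only the column names NVRID and BESLSTATUS and the
# value "Gällande" come from the task; "Annan status" is a placeholder.
SAMPLE_CSV = """NVRID,BESLSTATUS
X1,Gällande
X1,Annan status
X2,Gällande
"""


def read_rows(text):
    """Parse CSV text into a list of dicts keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


def split_by_status(rows, keep="Gällande"):
    """Return (rows with the wanted status, {duplicated id: its statuses})."""
    statuses_by_id = defaultdict(list)
    for row in rows:
        statuses_by_id[row["NVRID"]].append(row["BESLSTATUS"])
    duplicates = {
        nvrid: statuses
        for nvrid, statuses in statuses_by_id.items()
        if len(statuses) > 1
    }
    kept = [row for row in rows if row["BESLSTATUS"] == keep]
    return kept, duplicates


rows = read_rows(SAMPLE_CSV)
kept, duplicates = split_by_status(rows)
print(duplicates)                      # ids that occur more than once
print([row["NVRID"] for row in kept])  # ids remaining after the filter
```

If an id still occurs more than once in `kept`, the spot-check assumption does not hold for that entry and it needs a manual look.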

Wikidata

A demo item exists at Q19978573

  • The Property for the unique id (NVID) is P3613. Note that this is the same for both types of objects.
  • The property for the IUCNKAT field is P814. The allowed values are those with P31=Q3679744. Note that "0, Områden som ej kan klassificeras enligt IUCN: s system." should be mapped as the special "no_value".
  • The property for FORVALTARE is P137 (despite the counterintuitive name).
  • URSBESLDAT is most likely the P580 qualifier for the status. Note that you probably want to drop anything more specific than the date.
  • For water/forest/land area you might want to review the discussion at Property_talk:P2046 to see which qualifiers to use.
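The field mappings above could be sketched as follows. This is only an illustration under several assumptions: IUCNKAT values are taken to look like "code, description" (as in the quoted "0, ..." value), URSBESLDAT is taken to be an ISO-style date or timestamp, and the item ids in `IUCN_ITEMS` are placeholders, not real Wikidata ids. FORVALTARE (P137) is left out since it needs a name-to-item lookup that is not described here.

```python
from datetime import datetime

NO_VALUE = "no_value"

# Placeholder mapping from IUCNKAT code to Wikidata item. The real targets
# are the items with P31=Q3679744; the ids below are NOT real item ids.
IUCN_ITEMS = {"Ia": "Q_PLACEHOLDER_IA", "II": "Q_PLACEHOLDER_II"}


def iucn_value(iucnkat, items=IUCN_ITEMS):
    """Map an IUCNKAT string to an item id, or to the special no_value."""
    code = iucnkat.split(",", 1)[0].strip()  # assumed "code, description"
    if code == "0":
        return NO_VALUE
    return items[code]


def start_date(ursbesldat):
    """Drop anything more specific than the date (assumes ISO-style input)."""
    return datetime.fromisoformat(ursbesldat.strip()).date().isoformat()


def build_claims(row, items=IUCN_ITEMS):
    """Collect the values for one CSV row, keyed by property id."""
    return {
        "P3613": row["NVRID"],                        # unique id
        "P814": iucn_value(row["IUCNKAT"], items),    # IUCN category
        "P580": start_date(row["URSBESLDAT"]),        # qualifier on the status
    }
```

A KeyError from `iucn_value` signals a category code that is missing from the mapping, which is preferable to silently importing a wrong value.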

Wikipedia

The national parks are so few that they are probably most easily mapped by going from Kategori:Nationalparker_i_Sverige to each article and getting the Wikidata item from there.
The nature reserves are more numerous (~4000). The entry point is Kategori:Naturreservat_i_Sverige, which contains subcategories by county, which in turn contain:

Since the CSV also contains county/municipality information, the easiest approach is probably to use the list articles to find the existing articles per municipality and then match these by name to NVID. Any existing articles which are unmatched after this can likely be matched manually. It might be easiest to do this run separately from the import and create a local NVID to wikipedia_article mapping.
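The per-municipality name matching could look roughly like this. It is a sketch, not a tested importer: the column names NAMN and KOMMUN are assumptions (only NVRID appears in this task), and `list_entries` is a hypothetical list of (municipality, article title) pairs already scraped from the list articles.

```python
import re
from collections import defaultdict


def normalise(name):
    """Case-fold and collapse whitespace so trivial differences do not matter."""
    return " ".join(name.casefold().split())


def strip_disambiguator(title):
    """Remove a trailing parenthetical such as ' (naturreservat)' from a title."""
    return re.sub(r"\s*\([^)]*\)\s*$", "", title)


def match_by_name(csv_rows, list_entries, name_field="NAMN", muni_field="KOMMUN"):
    """Return ({NVRID: article title}, [unmatched (municipality, title) pairs])."""
    index = defaultdict(set)
    for row in csv_rows:
        key = (normalise(row[muni_field]), normalise(row[name_field]))
        index[key].add(row["NVRID"])

    mapping, unmatched = {}, []
    for municipality, title in list_entries:
        key = (normalise(municipality), normalise(strip_disambiguator(title)))
        ids = index.get(key, set())
        if len(ids) == 1:  # only accept unambiguous hits
            mapping[next(iter(ids))] = title
        else:
            unmatched.append((municipality, title))
    return mapping, unmatched
```

Entries that end up in `unmatched` are the candidates for manual matching; the returned mapping can be saved locally and reused by the import run.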

Things to consider:

  • Many links are red links, so the matching needs to check whether the article exists.
  • Some links are redirects.
  • Some articles may not yet be connected to Wikidata items. If one of these is encountered, you should first create the connected Wikidata item and then use that.
  • Some reserves span multiple municipalities (and counties?) and may therefore occur in multiple lists.
  • I would be surprised if there isn't at least one list article which is formatted differently from the rest.
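The first three points can be checked in one MediaWiki Action API query per batch of titles. The sketch below only builds the request parameters and interprets an already fetched JSON response (`formatversion=2`); sending the request to the wiki's api.php endpoint, and batching (at most 50 titles per request for normal users), is left out. The function names are hypothetical.

```python
def query_params(titles):
    """Parameters for action=query that follow redirects and fetch the item id."""
    return {
        "action": "query",
        "format": "json",
        "formatversion": "2",
        "redirects": "1",            # resolve redirects server-side
        "prop": "pageprops",
        "ppprop": "wikibase_item",   # connected Wikidata item, if any
        "titles": "|".join(titles),
    }


def resolve_titles(requested, response):
    """Map each requested title to (final title, item id or None).

    A title maps to None when the page does not exist (a red link).
    """
    query = response["query"]
    normalized = {e["from"]: e["to"] for e in query.get("normalized", [])}
    redirects = {e["from"]: e["to"] for e in query.get("redirects", [])}
    pages = {page["title"]: page for page in query.get("pages", [])}

    result = {}
    for title in requested:
        final = normalized.get(title, title)
        final = redirects.get(final, final)
        page = pages.get(final)
        if page is None or page.get("missing") or page.get("invalid"):
            result[title] = None
        else:
            item = page.get("pageprops", {}).get("wikibase_item")
            result[title] = (final, item)
    return result
```

A result of `(title, None)` marks an existing article without a connected item, which is the case where the item has to be created first.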

Other

  • All geometries and NVID have already been imported to OSM so it should be possible to run (or ask someone to run) an update on OSM to tie all of these to the newly created Wikidata items and that way get maps one could use in <maplink> or <mapframe>.
  • There is a possibility to dig deeper into the data for each item using the open data. E.g. the entry for Q19978573 contains the same data as the CSV but also links to e.g. data on contained nature types.

NVRID | NAME_1 | NAME_2
2001560 | Humlarödshus fälad | Humlarödshus fälad
2002273 | Humlenäs | Humlenäs
2002443 | Drevfjällen | Drevfjällen
2000617 | Västra Rossö | Västra Rossö
2000198 | Norra Vätterns skärgård | Norra Vätterns skärgård
2001510 | Hörby fälad | Hörby fälad
2000219 | Kalkberget | Kalkberget
2002636 | Munkö | Munkö
2001110 | Tyresta | Tyresta
2001623 | Alsberget | Alsberget
2002048 | Hjälstaviken | Hjälstaviken
2002409 | Hållnäskusten | Hållnäskusten
2015000 | Tallkullarna | Tallkullarna
2000455 | Eriksberg | Eriksberg
2001688 | Ombergsliden | Ombergsliden
2001564 | Ramnakullabackarna | Ramnakullabackarna
2001437 | Kärna mosse | Kärna mosse
2010879 | Månsberget | Månsberget
2001525 | Sniberups fälad | Sniberups fälad
2001522 | Rövarekulan | Rövarekulan
2014977 | Silverån | Silverån
2044377 | Kronoskogen | Kronoskogen

Event Timeline

Lokal_Profil moved this task from 📆 This week to ☑️ Done on the User-Lokal_Profil board.
Lokal_Profil added a subscriber: Alicia_Fagerving_WMSE.

This should be shuffled into the WLE milestone once that has been set up.

@Alicia_Fagerving_WMSE Let me know if anything needs clarification (or links are broken etc).

for reference, notes on area from the discussion:

Land area: Use P2046 with P518:land (we have to agree about which item for land though)
Water area: same as above
Sea water area: same as above
River/lake water area: same as above
Special cases, like the Swedish example above: Use specially designed items in P518.
Total area: Use P2046 without any P518

@Alicia_Fagerving_WMSE Is there a subtask for the 'area' implementation? I am wondering whether this task can be closed.