To better understand which kinds of entity data dump subsets our users are interested in, we want to take a closer look at how WDumper is being used. Under "recent dumps", WDumper lists previously generated subsets, including a JSON representation of the filters that were used to generate each dump.
We want to scrape these dumps and turn the filter data into a human-readable form. The outcome should be a CSV file with one row per dump. Columns:
- dump name
- URL
- filter (in human-readable form including labels for any items and properties used)
- statements included in the dump (in human-readable form)
- labels (yes/no)
- descriptions (yes/no)
- aliases (yes/no)
- sitelinks (yes/no)
- languages
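
As a starting point, the conversion from one dump's JSON spec to one CSV row could be sketched as below. This is only a sketch under stated assumptions: the JSON field names (`entities`, `statements`, `properties`, `languages`, and the boolean flags) are a guess at the spec layout and need to be checked against real entries under "recent dumps". The helper names, the example URL, and the `known_labels` lookup are hypothetical; in practice the labels would be fetched from Wikidata, and the scraping step itself is not shown.

```python
import csv
import io
import json

# One CSV column per item in the list above.
COLUMNS = ["dump name", "URL", "filter", "statements", "labels",
           "descriptions", "aliases", "sitelinks", "languages"]


def with_label(entity_id, labels):
    """Render an item or property ID together with its label, if known."""
    label = labels.get(entity_id)
    return f"{label} ({entity_id})" if label else entity_id


def yes_no(flag):
    return "yes" if flag else "no"


def describe_filter(spec, labels):
    """Summarise the entity filters (field names are assumed, not confirmed)."""
    parts = []
    for entity in spec.get("entities", []):
        conditions = []
        for cond in entity.get("properties", []):
            prop = with_label(cond.get("property", "?"), labels)
            value = cond.get("value")
            shown = with_label(value, labels) if value else "any value"
            conditions.append(f"{prop} = {shown}")
        kind = entity.get("type", "entity")
        parts.append(f"{kind}: " + (", ".join(conditions) or "all"))
    return "; ".join(parts)


def describe_statements(spec, labels):
    """Summarise which statements are exported (field names are assumed)."""
    parts = []
    for stmt in spec.get("statements", []):
        props = stmt.get("properties")
        if props:
            parts.append(", ".join(with_label(p, labels) for p in props))
        else:
            parts.append("all properties")
    return "; ".join(parts)


def spec_to_row(name, url, spec, labels):
    """Build one CSV row (as a dict) for a single dump."""
    return {
        "dump name": name,
        "URL": url,
        "filter": describe_filter(spec, labels),
        "statements": describe_statements(spec, labels),
        "labels": yes_no(spec.get("labels")),
        "descriptions": yes_no(spec.get("descriptions")),
        "aliases": yes_no(spec.get("aliases")),
        "sitelinks": yes_no(spec.get("sitelinks")),
        "languages": ", ".join(spec.get("languages", [])),
    }


def rows_to_csv(rows):
    """Serialise the rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical example spec; the real layout may differ.
example_spec = json.loads("""
{
  "entities": [{"type": "item",
                "properties": [{"property": "P31", "value": "Q5"}]}],
  "statements": [{"properties": ["P31"]}],
  "languages": ["en", "de"],
  "labels": true, "descriptions": true, "aliases": false, "sitelinks": false
}
""")
known_labels = {"P31": "instance of", "Q5": "human"}  # stand-in for a label lookup
row = spec_to_row("humans", "https://example.org/dump/1", example_spec, known_labels)
print(rows_to_csv([row]))
```

Passing the labels in as a plain dictionary keeps the formatting logic independent of how the labels are retrieved, so the lookup can be replaced without touching the row-building code.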