
Purge Europeana entries
Closed, Resolved, Public

Assigned To
Authored By
Salgo60
May 1 2020, 5:28 PM

Description

Hmangas made some changes (see link).

Because of the change to the formatter URI for RDF resource, it looks like we need to purge the affected items; see T112081 and the comment in the Wikidata Telegram group.

A small test is running; see also the script in T112081#6100884.

Result: it takes about 10 seconds per purge, so waiting feels like the better option.

SPARQL used in curl

select ?item ?edited where { ?item wdt:P7704 ?id. ?item schema:dateModified ?edited } order by desc(?edited)

Editing an object has the same effect as a purge, and during 2020 about 90% of the 160 000 Europeana objects have already been edited.

  • WDQS search
  • Entries not edited since I added the Europeana property (edit date before 1 Jan 2020), shown as a bubble chart: 10% have not been edited during 2020
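
As a rough check on the figures above, the remaining purge time can be estimated with shell arithmetic. This is a back-of-the-envelope sketch, assuming the 10 seconds per purge seen in the small test holds for every item and that purges run one at a time; the variable names are illustrative only.

```shell
#!/bin/sh
# Figures taken from this task: 160 000 Europeana items, 10% not edited
# during 2020, roughly 10 seconds per purge.
total_items=160000
unedited_pct=10
secs_per_purge=10

# Items that still need a purge because no edit has refreshed them.
unedited=$(( total_items * unedited_pct / 100 ))

# Sequential purging, integer hours (rounded down).
est_hours=$(( unedited * secs_per_purge / 3600 ))

echo "unedited items: $unedited"
echo "estimated purge time: about $est_hours hours"
```

With these inputs the script reports 16000 items and about 44 hours, compared with roughly 18 days if all 160 000 items were purged.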

curl --header "Accept: text/tab-separated-values" "https://query.wikidata.org/sparql?query=select%20%3Fitem%20%3Fedited%20where%20%7B%20%3Fitem%20wdt%3AP7704%20%3Fid.%20%3Fitem%20schema%3AdateModified%20%3Fedited%20%7D%20order%20by%20desc%28%3Fedited%29" > ~/scripts/working/purgescript-test.tsv

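# The TSV rows hold full entity URIs plus a timestamp, so pull out only the Q-ids.
for i in $(grep -oE 'Q[0-9]+' ~/scripts/working/purgescript-test.tsv)
do
  echo "$i to be purged"
  # Purge the item without editing it.
  python3 ~/pywikibot/pwb.py touch -page:"$i" -purge
done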

OUTPUT
Q169566 to be purged
Retrieving 1 pages from wikidata:wikidata.
Sleeping for 9.1 seconds, 2020-05-01 19:16:47
Page [[Q169566]] purged

1 pages read
0 pages written
0 pages skipped
Execution time: 9 seconds
Read operation time: 9 seconds
Script terminated successfully.
Q3593648 to be purged
Retrieving 1 pages from wikidata:wikidata.
Sleeping for 9.3 seconds, 2020-05-01 19:16:57
Page [[Q3593648]] purged

1 pages read
0 pages written
0 pages skipped
Execution time: 10 seconds
Read operation time: 10 seconds
Script terminated successfully.
Q310924 to be purged
Retrieving 1 pages from wikidata:wikidata.
Sleeping for 9.3 seconds, 2020-05-01 19:17:08
Page [[Q310924]] purged

1 pages read
0 pages written
0 pages skipped
Execution time: 10 seconds
Read operation time: 10 seconds
Script terminated successfully.
Q1610094 to be purged
Retrieving 1 pages from wikidata:wikidata.
Sleeping for 9.3 seconds, 2020-05-01 19:17:19
Page [[Q1610094]] purged

1 pages read
0 pages written
0 pages skipped
Execution time: 10 seconds
Read operation time: 10 seconds
Script terminated successfully
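
Since an edit has the same effect as a purge, the loop above only needs to visit items whose last edit is older than the property change. The following is a minimal sketch of such a filter, not the script that was actually run. It assumes the TSV has a header row, the entity URI in column 1 and the dateModified literal in column 2; `unedited_qids` is a hypothetical helper name, and the sample timestamps are made up for illustration.

```shell
#!/bin/sh
# Sketch: print only the Q-ids whose last edit predates a cutoff date.
# unedited_qids is a hypothetical helper, not part of pywikibot.
unedited_qids() {
  awk -F'\t' -v cutoff="$1" '
    NR > 1 && match($1, /Q[0-9]+/) {
      qid = substr($1, RSTART, RLENGTH)
      # Pull the YYYY-MM-DD part out of the dateModified literal.
      if (match($2, /[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]/)) {
        day = substr($2, RSTART, RLENGTH)
        # ISO dates sort correctly as plain strings.
        if ((day "") < (cutoff "")) print qid
      }
    }'
}

# Made-up sample rows in the assumed TSV layout (timestamps are illustrative).
sample=$(printf '%s\t%s\n' \
  '?item' '?edited' \
  '<http://www.wikidata.org/entity/Q169566>' '"2020-05-01T17:16:47Z"^^<http://www.w3.org/2001/XMLSchema#dateTime>' \
  '<http://www.wikidata.org/entity/Q310924>' '"2019-11-30T08:00:00Z"^^<http://www.w3.org/2001/XMLSchema#dateTime>')

# Only the row dated before the cutoff is listed.
printf '%s\n' "$sample" | unedited_qids 2020-01-01
```

Feeding the real TSV file through the same function instead of the sample would give the list of items to pass to the purge loop.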

Event Timeline

Salgo60 updated the task description.
Salgo60 renamed this task from Change the Wikidata Property for Europeana to Purge Europeana entries. May 1 2020, 5:44 PM
Salgo60 updated the task description.
Aklapper moved this task from Backlog to Done on the User-Salgo60 board.