
Upload photos from Elsinga collection at the Alkmaar archives (+/- 10,000 images)
Open, Medium, Public

Description

The CC0 license for the Elsinga collection is stated in the disclaimer on the archive's website: https://www.regionaalarchiefalkmaar.nl/disclaimer?fbclid=IwAR248LwdG9Ecq3micqEqcJwJj3i4AlzmsVVR0b6Plur5tpC4CUu1EKvhNq4

Please use this URL in the source field of your Pattypan upload to refer to the CC0 license.
You can also add the banner {{Elsinga Collection}} to the source field of your uploads.

Example record from the set: https://commons.wikimedia.org/wiki/File:Gezicht_op_Grote-_of_Sint_Laurenskerk,_Alkmaar,_Regionaal_Archief_Alkmaar_RAA011002951.jpg

Event Timeline

Restricted Application added a subscriber: Aklapper. · Oct 20 2019, 10:40 PM
Ecritures triaged this task as Medium priority. · Oct 20 2019, 10:40 PM
Ecritures renamed this task from Upload photos from collection Alkmaar archives to Upload photos from Elsinga collection at the Alkmaar archives (+/- 10.000 images). · Oct 21 2019, 9:37 AM
Ecritures reassigned this task from Ecritures to RonnieV. · Nov 4 2019, 1:35 PM
Ecritures updated the task description.
Ecritures removed a subscriber: Aklapper.
RonnieV added a comment. · Edited · Nov 5 2019, 2:31 AM

Uploading the whole lot is not a task for this moment anyway, so it does not belong on my plate.
As I understood it, Ecritures would take care of the downloading. Nice that Ecritures suddenly assigned this to me, but that is not how it works.

RonnieV reassigned this task from RonnieV to Ecritures. · Nov 5 2019, 2:31 AM
RonnieV added a subscriber: RonnieV.

There is now a set of 200 records at https://maior.memorix.nl/api/oai/raa/key/Elsinga/?verb=ListRecords&metadataPrefix=ese . I can split that into four, but it does not come close to the 10,000 images.

Incidentally, these images date from the 1980s and 1990s. Are they freely available? I have my doubts about that.

Ecritures updated the task description. · Nov 5 2019, 10:38 PM

@RonnieV, the CC0 license is stated in their disclaimer on the website. I stated this URL in the main task (=this one) so the URL can be added/used during the Pattypan upload.

@RonnieV Batches of 500 images for the practical sessions seem reasonable. It would indeed be nice if you added the batches of 500 images + their metadata to each of the subtasks of this parent task. There is another subtask (one you created yourself, if I am correct) for smaller batches (50?) to be created for use in a Commons workshop.

@Ecritures , the file stated above which you call 'metadata' (it's just the data belonging to the specific picture), only contains 200 pictures, not the 10.000+ you said it would contain.
I can make four sets of 50 items each, but that is it. Please give me a file containing all 10k+ records.

@RonnieV : each response is limited to 200 records and ends with a resumptionToken. If you issue subsequent requests using the resumptionToken value, you will get the next batch each time. This API does in fact contain all the records (it states completeListSize="12089").
And you are correct: I use the word metadata for the data that are 'attached to' the picture and that provide the info on the creator, source, licence, id number, etc.
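
As an illustration of the resumptionToken mechanism described above, here is a minimal Python sketch. It is not the script that was later attached to this task. The function names `harvest` and `fetch_url` are made up for this example. The sketch assumes the endpoint follows the standard OAI-PMH 2.0 behaviour: follow-up requests carry only `verb` and `resumptionToken`, and a missing or empty token marks the last batch.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

# Namespace of the OAI-PMH 2.0 response envelope.
OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"
# Endpoint quoted earlier in this task.
BASE_URL = "https://maior.memorix.nl/api/oai/raa/key/Elsinga/"


def fetch_url(url):
    """Default fetcher (hypothetical helper): return the response body as bytes."""
    with urllib.request.urlopen(url) as response:
        return response.read()


def harvest(base_url=BASE_URL, metadata_prefix="ese", fetch=fetch_url):
    """Yield every <record> element, following resumptionTokens until none is left."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    while True:
        url = base_url + "?" + urllib.parse.urlencode(params)
        root = ET.fromstring(fetch(url))
        yield from root.iter(OAI_NS + "record")
        token = root.find(f".//{OAI_NS}resumptionToken")
        # A missing or empty token means this was the final batch.
        if token is None or not (token.text or "").strip():
            break
        # Per OAI-PMH, resumptionToken is an exclusive argument:
        # metadataPrefix must not be repeated on follow-up requests.
        params = {"verb": "ListRecords", "resumptionToken": token.text.strip()}
```

Calling `sum(1 for _ in harvest())` against the live endpoint should, under these assumptions, count all records rather than only the first 200.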

Restricted Application added a subscriber: Liuxinyu970226. · Nov 21 2019, 8:02 PM
MichellevL_WMNL updated the task description.
Reedy renamed this task from Upload photos from Elsinga collection at the Alkmaar archives (+/- 10.000 images) to Upload photos from Elsinga collection at the Alkmaar archives (+/- 10,000 images). · Nov 22 2019, 9:59 AM
Ecritures updated the task description. · Nov 22 2019, 10:59 AM
siebrand added a subscriber: siebrand. · Edited · Nov 22 2019, 2:05 PM

Working on this. First step is to download the data. Attached is a script to scrape a generic OAI-PMH endpoint and save every batch in a uniquely named file. The basis of the script comes from https://wiki.lyrasis.org/display/DSPACE/OAI+XML+cache+warmup by Ivan Masar, with improvements by me.

siebrand added a comment. · Edited · Nov 22 2019, 5:54 PM

Script to convert the data downloaded by download_T235995.sh to CSV files with 500 items each, including some cleanup of descriptions.
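
The attached conversion script is not reproduced here. As a rough sketch of the batching step only, the Python below splits a list of already-parsed records into CSV files of 500 rows each. The names `write_batches` and `clean_description`, the file name prefix, and the whitespace-collapsing cleanup are assumptions made for this example, not the behaviour of the actual script.

```python
import csv
from pathlib import Path


def clean_description(text):
    """Collapse runs of whitespace and newlines (an assumed form of 'cleanup')."""
    return " ".join(text.split())


def write_batches(rows, out_dir, batch_size=500, prefix="elsinga_batch"):
    """Write rows (a list of dicts with identical keys) to numbered CSV files.

    Returns the list of paths written, one file per batch of `batch_size` rows.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for start in range(0, len(rows), batch_size):
        batch = rows[start:start + batch_size]
        path = out_dir / f"{prefix}_{start // batch_size + 1:03d}.csv"
        with path.open("w", newline="", encoding="utf-8") as handle:
            writer = csv.DictWriter(handle, fieldnames=list(batch[0].keys()))
            writer.writeheader()
            writer.writerows(batch)
        paths.append(path)
    return paths
```

With the 12,089 records reported by the endpoint, this would produce 25 files, the last one holding 89 rows.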

MichellevL_WMNL added a subscriber: Multichill.
MichellevL_WMNL added a comment. · Edited · Nov 24 2019, 7:55 AM

Update: We experienced a problem uploading with Pattypan: the upload process won't start, or gets stuck at 1/500. I have tried to upload from home with Pattypan (with a whitelisted user account), with the same result. I will try again from home once more this afternoon, and otherwise ask Yarl. Thanks all! To be continued!
Example .csv:

@Yarl: see above, could you please help out? We have been trying to upload batches of 500 records from CSV with a URL for the image instead of a local file, but the upload won't start (see screenshot, in Dutch). I have been using my GWT-whitelisted account (User:SIryn, see also this list). The domain is also whitelisted, @Reedy checked this.

Can you help us figure out what the problem is?

Thanks so much for helping out!