
Scrape all the pictures of the digital image repository of the National Archive of Curaçao with the Copyright status 'negative'
Open, Needs Triage · Public

Description

The digital image repository of the National Archives of Curaçao contains a lot of images without copyright; this is indicated with the copyright status 'negative'. Is it possible to scrape all these images without any copyright and put them into an OpenRefine file or a CSV file? (I am at the Wikimedia Hackathon as user Ecritures)

Event Timeline

Restricted Application added a subscriber: Aklapper. May 12 2019, 9:33 PM
DanielleJWiki renamed this task from "Scrape all the pictures of the digital image repository of the National Archive of Curaçao with the Copyrght status 'negative'" to "Scrape all the pictures of the digital image repository of the National Archive of Curaçao with the Copyright status 'negative'". May 12 2019, 9:36 PM
DanielleJWiki updated the task description.
Reedy updated the task description. May 13 2019, 12:12 AM
DanielleJWiki added a subscriber: RonnieV.

I am interested in this project and am at the Hackathon. Where can we meet?

Hey, I am in the big room next to the coffee and tea area on the first floor (close to the windows > hugging a plug for charging my laptop)

SIryn added a comment. Edited May 17 2019, 10:06 AM
This comment has been deleted.
Yupik added a subscriber: Yupik. May 18 2019, 12:07 PM

Are you looking to save all the actual image files too, or just their URLs? And is this task still available?

Hi Yupik,

As far as I can see, the task is still open.
Ecritures specifically asked for an OpenRefine file or a CSV file. It looks like she would be happy with the URLs and other information about the images (maker, date, ...). Adding the copyright-free images to Commons (and adding the images to the Wikidata items) would be a nice extra. If you manage to do both, it would be great.

Right now I'm cleaning up the metadata for the images; it will be a CSV file. I'll double-check the URLs later this evening or tomorrow when I have some free time again.

I'm not sure about batch uploading the image files, but I can look into it. If someone else wants to use the CSV file to do that, that'd be fine too.
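For anyone following along, the filter-and-export step discussed above could be sketched roughly as follows. This is only an illustration under stated assumptions: the field names (`url`, `maker`, `date`, `copyright_status`), the sample records, and the helper functions `filter_copyright_free` and `to_csv` are all hypothetical and do not reflect the archive's real schema or how the CSV in this task was actually produced.

```python
import csv
import io

# Hypothetical sample records. The field names and values are assumptions
# made for illustration only; they are not the archive's actual metadata.
records = [
    {"url": "https://example.org/img/1.jpg", "maker": "Unknown",
     "date": "1950", "copyright_status": "negative"},
    {"url": "https://example.org/img/2.jpg", "maker": "A. Photographer",
     "date": "1985", "copyright_status": "positive"},
]

FIELDS = ["url", "maker", "date", "copyright_status"]


def filter_copyright_free(rows, status="negative"):
    """Keep only rows whose copyright status matches `status`."""
    return [
        r for r in rows
        if r.get("copyright_status", "").strip().lower() == status
    ]


def to_csv(rows, fields=FIELDS):
    """Serialise rows to CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf, fieldnames=fields, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


free_rows = filter_copyright_free(records)
csv_text = to_csv(free_rows)
print(csv_text)
```

A CSV produced this way can be opened directly in OpenRefine for further cleanup before any batch upload.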

Hi Yupik. I would love to have the CSV file and see what I can do to batch upload the pictures to Wikimedia Commons (cleaning the metadata and/or URLs if necessary).

(Sorry, I am also user DanielleJWiki, by the way, the author of the task.)

Good, good, I'll leave the batch uploading to you :)

Hi Yupik, how will you get the CSV file to me?

Yupik added a comment. Sun, May 19, 9:09 AM

@Ecritures : I've sent you an e-mail.

Thanks @Yupik, I received the mail. Thank you so very much!

Yupik added a comment. Sun, May 26, 6:07 PM

You're welcome @DanielleJWiki! It was quite fun to do, so thanks for putting it up.