
GSoC 2017 Proposal: glam2commons (previously Single Image Batch Upload)
Closed, ResolvedPublic

Description

Profile

Name : Siddhartha Sarkar
Time zone : UTC +5:30
Email : sid.nitdurgapur@gmail.com
IRC nick : infobliss
GitHub : https://github.com/infobliss
Location : India
Working hours : 1:00 pm to 9:00 pm UTC +5:30

Synopsis

The project aims to create a tool that uploads images from a GLAM to Wikimedia Commons with the following objectives:

  • To minimize the number of steps.
  • To make uploading easier for non-technical users by automatically mapping the GLAM's metadata to the Commons metadata.
  • To let the user choose a set of relevant images from a GLAM site and upload only those, which avoids uploading unnecessary images and minimizes processing time.

We will create a Flask app that uses OAuth to authenticate the Commons user. After login, the app allows the user to choose a set of images from a desired GLAM and upload them to Commons in a batch. In addition, we would like to provide metadata mappings for some of the GLAMs so that the end user doesn't have to worry about entering image details such as creator and license. In summary, we intend to give users a tool to upload images from a GLAM to Wikimedia Commons as efficiently as possible.
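To illustrate the metadata mapping idea, here is a minimal sketch. It assumes a GLAM record arrives as a flat dictionary; the GLAM field names (`title`, `maker`, `dating`, `url`, `rights`) and the function names are hypothetical, chosen only for this example. The target fields are those of the Commons {{Information}} template.

```python
# Hypothetical mapping for one GLAM: GLAM field name -> {{Information}} field.
EXAMPLE_GLAM_MAPPING = {
    "title": "description",
    "maker": "author",
    "dating": "date",
    "url": "source",
    "rights": "permission",
}

# Order in which the fields are written into the template.
INFORMATION_FIELDS = ["description", "date", "source", "author", "permission"]


def map_metadata(glam_record, mapping):
    """Translate a GLAM record into Commons field names, skipping empty values."""
    commons_fields = {}
    for glam_field, commons_field in mapping.items():
        value = glam_record.get(glam_field)
        if value is None:
            continue
        text = str(value).strip()
        if text:
            commons_fields[commons_field] = text
    return commons_fields


def to_information_template(commons_fields):
    """Render the mapped fields as {{Information}} wikitext."""
    lines = ["{{Information"]
    for field in INFORMATION_FIELDS:
        lines.append("|{}={}".format(field, commons_fields.get(field, "")))
    lines.append("}}")
    return "\n".join(lines)
```

Keeping each mapping as plain data would make it possible to add or correct a GLAM without touching the upload code, which is one possible way to support community-suggested mappings later.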

Mentors: @Basvb (Python, batch uploading experience, Commons)
Co-mentors: @zhuyifei1999 (code review, Python and tools), and @tom29739 (Labs, Commons, Python)

Timeline

Period: Task
May 4 to May 29: Community bonding period. Studying existing tools for uploading media to Commons. Studying the Commons metadata fields in depth. Planning the design of the tool, including both the frontend and the backend. Compiling a list of GLAMs. Adding and structuring the corresponding tasks in Phabricator. Requesting access to Tool Labs.
May 30 to June 5: Design the basic UI templates for the Flask app. Study the APIs of a number of GLAMs to decide which metadata is best suited for a generic file title generator module based on the metadata extracted from the GLAM API. Write the relevant code.
June 6 to June 18: Design the core elements of the Flask app, including the modules for license checking, metadata mapping and batch upload. Learn how to do OAuth authentication using Wikimedia Commons login credentials and integrate that into the tool.
June 19 to June 21: Contact a number of GLAMs to assess the viability of having an "Upload to Wikimedia Commons" button on their image collection sites. Design an action plan based on the input received from this communication.
June 22 to June 26: Testing Round 1: sanity testing, exploratory testing, automated testing and documentation.
June 26 to June 30: Phase I evaluation
June 30 to July 5: Give the tool the flexibility to choose a number of desired files (perhaps via checkboxes next to the image thumbnails) and quickly upload them to Commons (perhaps with a single-click button).
July 6 to July 10: Design a template for the metadata mappings and an easy way to update/edit them.
July 11 to July 16: Take input from all the stakeholders, including the GLAMs, regarding the UI and work on the UI extensively.
July 16 to July 20: Receive feedback and make changes if necessary.
July 21 to July 27: Testing Round 2: exploratory testing, automated tests, bug fixes and documentation.
July 28: Phase II evaluation
July 29 to August 7: Add the option to search for images by a search string across the selected GLAMs and upload the desired ones.
August 8 to August 16: Write metadata mappings for a number of GLAMs.
August 17 to August 21: Testing Round 3: testing the robustness of the app with the newly created metadata mappings, and bug fixes.
August 22 to August 28: Freeze the code. Create a page on Wikimedia Commons about our tool where people can suggest new GLAMs for metadata mapping and possible changes to the metadata mappings of existing GLAMs. Announce the tool to the Wikimedia community somewhere on Commons.
August 29 to September 5: Mentors submit final student evaluations.
September 6: Final results of Google Summer of Code 2017 announced
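The generic file title generator planned for May 30 to June 5 could look roughly like the sketch below. It is only an illustration under stated assumptions: the metadata keys (`title`, `creator`, `id`), the function names and the `max_length` default are hypothetical, and the set of characters replaced is a conservative guess at what is unsafe in a file title rather than a complete list of the rules Commons enforces. Appending the GLAM name and the object identifier is one possible way to keep titles unique.

```python
import re

# Characters treated as unsafe in a file title for this sketch (assumption).
_UNSAFE = re.compile(r"[#<>\[\]|{}:/\\]")


def _clean(text):
    """Replace unsafe characters with spaces and collapse whitespace."""
    text = _UNSAFE.sub(" ", str(text))
    return re.sub(r"\s+", " ", text).strip()


def generate_title(metadata, glam_name, extension="jpg", max_length=200):
    """Build '<title> - <creator> - <GLAM> - <id>.<ext>' from GLAM metadata."""
    title = _clean(metadata.get("title") or "") or "Untitled"
    creator = _clean(metadata.get("creator") or "")
    base = " - ".join(part for part in (title, creator) if part)
    # The identifier suffix is never truncated, so titles stay distinguishable.
    suffix = " - {} - {}".format(_clean(glam_name), _clean(metadata["id"]))
    room = max_length - len(suffix) - len(extension) - 1
    base = base[:max(room, 0)].rstrip()
    return "{}{}.{}".format(base, suffix, extension)
```

For example, `generate_title({"title": "The Night Watch: detail [1]", "creator": "Rembrandt", "id": "SK-C-5"}, "Rijksmuseum")` returns `The Night Watch detail 1 - Rembrandt - Rijksmuseum - SK-C-5.jpg`.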

Deliverables

  • Early and robust design of the tool.
  • Creating Tasks in Phabricator.
  • Getting Tool Labs access.
  • Creation of the UI templates.
  • Writing a generic file title generator module.
  • Designing the Flask app backend components.
  • Providing OAuth authentication with Wikimedia Commons.

Phase I evaluation

  • Receiving input from the relevant GLAMs and possibly adding a button for single-click image upload.
  • Designing a template for the metadata mappings.
  • Making the final UI.
  • Providing a search option across selected GLAMs for uploading.
  • Improving the app based on feedback.

Phase II evaluation

  • Writing metadata mappings for a number of GLAMs.
  • Writing easy-to-read documentation and a user guide for the tool.
  • Creating a Commons page about the tool for receiving suggestions for future enhancements.

Final evaluation

Participation

  • I will create a new Git repo and maintain two branches. Code will be pushed to the dev branch periodically and merged into the master branch once review and testing are done.
  • I will be online on IRC during my working hours (1:00 pm to 9:00 pm UTC +5:30) to collaborate with the mentors.
  • I will use Phabricator for managing bugs and subtasks.
  • I will be reachable by email (Gmail) outside working hours when needed.

About me

Currently I am pursuing an MS in Computer Science at the Indian Institute of Technology, Delhi. In 2015 I graduated with a Bachelor of Technology in Information Technology from the National Institute of Technology, Durgapur, India. I am a patient learner and like to work in collaboration. To me, dedication to one's work is the primary ingredient of satisfaction. This is my first participation in GSoC. During the summer, GSoC will be my first priority since I won't have any other commitments during this period.

I have been contributing to Wikipedia as a translator for some time now. I am inspired and thrilled by Wikipedia's vision of making content available in every natural language. I think contributing to Wikimedia will impact the world in a very positive manner. At the same time, it will push my horizons even farther by letting me collaborate with the excellent Wikimedia community members. Most importantly, I will be making something that will make the world's largest free encyclopedia, among other Wikimedia sites, richer in content and organization.

Past Experience

I have experience working with C++, Java, Python, HTML, CSS and PHP, among others. Among databases, I have mostly worked with MySQL and Oracle DB, and among version control systems I have worked with TortoiseSVN. My main operating system is Ubuntu.

I maintained and enhanced a video server website at IIT Delhi (http://etsc.iitd.ac.in) for streaming recorded lecture videos to the IIT Delhi community.

At IIT Delhi I have made several contributions, such as configuring and incorporating a CAPTCHA into Moodle, an open source course management system.

Earlier, I also worked on improving the features of a proprietary software product.

Microtasks carried out

Other Info

My wikimedia global user page is here.

Event Timeline

@Basvb , @zhuyifei1999, @tom29739 this is the first draft of my GSoC proposal. Please comment.

August 21 to August 28 ... Host the tool on https://tools.wmflabs.org/.

I'm not sure why this should be so late (1 week before final evaluation). Besides, it is much easier to test if the tool is on Labs already, IMO. Perhaps hosting could be done earlier and this could be changed to "announce the tool"?

Among databases...

FWIW, if you're gonna work with databases on tool labs, the services most available to work with are MariaDB and Redis, although other kinds may be available on request.

Thank you very much for your proposal. It looks very good overall. I'll give some pointers as requested.

  1. Synopsis: looks good
  2. May 4 to May 29: This is a good moment to request access to Tool Labs and to structure the tasks that have to be done here on Phabricator.
  3. May 30 to June 10: I'm not sure whether starting with the UI is the best idea. I think we should aim to get a working MVP within month 1, then use months 2 and 3 for all the core extensions and optional extras. So the MVP would contain a minimal UI; a very good UI can be made in month 2.
  4. June 11 to June 20: Designing the back end seems a bit minimal; I would propose trying to implement the most essential elements of the back end by this point. You already got quite far with this even before we started, so with a full month, getting a working MVP on Tool Labs should be within reach.
  5. About the weeks with testing: please make testing and documentation a continuous element of your project. If you do testing or documentation at the end of the project, it will be harder to do (as you will be less familiar with those parts of the code), and if something else takes more time you won't even get around to it.
  6. July 29 to August 8: I think it is valuable to do this a bit earlier and, for example, switch this with the uploading of multiple images or the search-string functionality (as that is something that can be GLAM-dependent). Combining the work on those elements and using the lessons from one task in the other is also a good idea.
  7. August 16 to August 28: Documentation and testing should be a continuous effort, not done at the end of the project.
  8. Microtasks: I know you went a bit further than just the Flask app; maybe a short summary would do more justice to your efforts. Please also link to your Commons account to show your uploads.
  9. I could try to set up a meeting (Skype?) with somebody from a GLAM to discuss whether they would be interested in having an "upload to Commons" button with their image collection. Would you be interested in that? Somewhere halfway through the process seems the best time for this.
  10. Maybe including something on structure in Commons is a good idea. With that I mean: Making categories, a landing page for the tool where people can read about it and propose new GLAMs for metadata-mappings and changes to existing metadata-mappings.

Thank you for your proposal, and hopefully the points above can help you in creating an even better proposal.

Infobliss updated the task description.

@zhuyifei1999 and @Basvb thank you so much for your reviews. They were insightful and helped me to refine my proposal. I understand the need to host the app on Tool Labs as soon as possible.

@zhuyifei1999 I can assure you that I will familiarize myself with MariaDB and Redis as and when needed.

@Basvb I agree with you regarding the need to begin with a basic UI. I have modified the timeline accordingly. The Flask app components will be finished with priority. I am very excited to communicate with some GLAMs so that we can work on a single "Upload to Commons" button on their sites. Testing and documentation are now planned as continuous activities. Finally, having a dedicated page on Wikimedia Commons is definitely a great idea to keep the work going in the future. I have incorporated that too.

@zhuyifei1999, @Basvb I would be very grateful if you could review this proposal once more and share any other suggestions you have in mind.

Announce the tool on https://tools.wmflabs.org/.

You mean announce on Commons somewhere (eg. Village Pump)? Tool labs is for hosting and I'm unaware of announcing a tool.

Otherwise, LGTM. (I don't have many opinions on the timeline unless it's very unfeasible)

@zhuyifei1999 thanks for your review once again. Yes, I meant announcing the existence of the app to the Wikimedia community. We can do it on the Village Pump. I will modify that part.

@Infobliss Is your proposal now complete? If so, feel free to move it to the "Proposals Submitted" section.

Thanks @srishakatux. I moved the task to the Proposals Submitted column on the GSoC 2017 Workboard.

Do not hesitate to contact me if I can help you with Python in general, and more specifically with Flask and pywikibot.
Good luck!

@Framawiki thanks a lot for offering help. I will not hesitate to seek your help regarding pywikibot once I start coding for the project. Until then, I am looking at the code of the existing apps for uploading to Commons.

Thanks for accepting my proposal. I am eager to start my project activities as early as possible.

@Basvb , @zhuyifei1999
Please let me know when we can plan an IRC session to get started.

Infobliss renamed this task from Proposal : Single Image Batch Upload (GSoC 2017) to Proposal : glam2commons (previously Single Image Batch Upload). May 23 2017, 10:14 PM

Hi! Is there anything remaining in this task before it can be resolved? Thank you!

Aklapper renamed this task from Proposal : glam2commons (previously Single Image Batch Upload) to GSoC 2017 Proposal: glam2commons (previously Single Image Batch Upload). Jul 18 2018, 9:21 AM
Aklapper closed this task as Resolved.

This was a successful GSoC 2017 project: https://www.youtube.com/watch?v=jjHEB5p8xG8

Closing this task as resolved.
Any further work can be tracked in separate tasks under the glam2commons project tag.