
Documenting process of writing Zotero translators through translation-servers
Closed, Declined · Public


Outreachy Link:


Name: Soumya Atul Gupta
IRC or IM networks/handle(s): soum on freenode
Web Page / Blog / Microblog / Portfolio: My Github profile and my personal blog
Location: New Delhi/Mumbai
Typical working hours: 1000-1330 hours and 2030-0230 hours, GMT +5.30

The project aims to consolidate the documentation for Zotero's translation servers, which in turn will help Citoid (a service that lets people easily add references on Wikipedias). Currently, up-to-date documentation exists only for the browser plug-ins, not for translation-server, and it is hosted on Zotero rather than on Wikipedia. This documentation will be present on and and will also be translated into Hindi and French.
While writing the documentation, I will also fix site-specific bugs in order to become thoroughly familiar with the code base.

Mentors: @Mvolz and @czar


  1. Write documentation for Zotero plug-in and translation server
  2. Write translators for Zotero
  3. Clean up Citoid-related issues and site-specific Zotero bugs

Proposed Timeline:

Nov 8 – Dec 6: Community Bonding Period; familiarizing myself with Citoid and Zotero
Dec 6 – Jan 6: Writing translators for Zotero; fixing bugs related to Zotero/Citoid to understand both completely. Site-specific issues: T137019, T121295, T113262, T106892, T99091, T98675, T105647, T87331
Jan 6 – Feb 15: Documentation for Zotero plug-ins and translation-server
Feb 15 – Feb 28: T106200 and T137440
March 1 – March 6: Final revisions; Pencils Down

Mid-term review: basic documentation begun, and 6 of the 8 site-specific bugs solved

Given below are the screenshots of the Zotero and Scaffold plug-ins in use to translate webpages

[Screenshot: zotero]

[Screenshot: scaffold1.png]


I've been a regular user of Git for quite a while now, so I have the basics of code review firmly established.
While working on the project, I'll push my code to Git every day and submit it to Gerrit for review. This regularity will help other developers keep track of my progress and point out corrections at every stage.

In addition to this, other forums where I will be available are:
IRC: I'll stay online on freenode in MediaWiki-General and #wikimedia-services as soum during my working hours.
I'll post my progress (weekly summary) on my personal blog.

About you

Education and Work Experience

I am a final (fourth) year student pursuing a Bachelor of Engineering in Electronics and Communication at Netaji Subhas Institute of Technology, New Delhi, India. Over the years, I've gained knowledge not only of Circuits and Systems, Electronics, Digital Signal Processing, Computer System Architecture and Microprocessors, but also of Programming, Data Structures, Algorithms and Machine Learning. I want to pursue Natural Language Processing in the future, with a side of Linguistics. My final year project is based on Image Captioning using Neural Networks. I am also an open-source contributor to CLTK (Classical Language Toolkit), which I describe under Past experience.

Over the summer (2016), I interned with Royal Bank of Scotland as a Software Development Intern. My work was mainly centered around RESTful HTTP APIs, using the language Groovy (a mixture of Java + Python) as a base. I also did a Data Analysis project in which I had to organize the messy data streaming in through my department; I used SQL and QlikView to streamline it.
I am also comfortable coding in C++, Python and Java.

I was introduced to HTML in 2007, when web design became a part of our curriculum.
I am the senior editor of the college newspaper, so I like to think that I would be able to get the documentation done easily. I am no stranger to open-source contribution culture, having contributed to a FOSS organization before.
These three skills together make me an ideal candidate for the project at hand, which mainly requires documentation and web scraping.

How did you hear about this program?
I heard about this program from colleagues who have been participants in the previous rounds (Mozilla, Round 12 and The Perl Foundation, Round 9). The learning curve exhibited by both my colleagues is primarily what inspired me to apply here.

Will you have any other time commitments, such as school work, another job, planned vacation, etc., during the duration of the program?
I will have college work from mid-Feb to March, but the work pressure will be minimal as it is mainly a research-based semester. I will easily be able to give 40+ hours a week to this project. Other than that, I will have a classical dance exam (Odissi, that's an Indian Classical Dance Form) and anthropology related NGO work (concentrating on the olden traditions of West Bengal).

We don't just care about your project -- you are a person, and that matters to us! What drives you? What makes you want to make this the most awesomest wiki enhancement ever?
I've been using Wikipedia ever since I started using the Internet itself. That's 14 years out of a 21-year-old existence. Needless to say, it's Google and Wikipedia that define the Internet today. They form the fundamental pillars of the information age. To be a part of something so dynamic and far-reaching would be unbelievable for me; I would see it as my way of changing the world (one wiki enhancement at a time).

Moreover, I can observe any contributions I make in real time. Any change that I make makes Wikimedia easier to use for millions of people around the globe. The sheer scale of this effect motivates me.

Past experience

Please describe your experience with any other FOSS projects as a user and as a contributor
As a user, I've been using Firefox, Linux, MediaWiki, Python and OpenJDK. They seem to be miles ahead in the kind of development they bring about, just by sharing all the knowledge that they possess. I think it's amazing how people all over the world can cohesively make such a difference in real time (for instance, Mozilla nightly builds and Google Canary are improved consistently, every day, by the people who use them). This goes for EVERY FOSS organization; I think they grow exponentially not only in terms of the software, but also in terms of the camaraderie and the culture they cultivate.

As an open-source contributor, I've been contributing to the CLTK Project (Classical Language Toolkit; this project concerns itself with the documentation of the ancient languages of the world, and it employs extensive Natural Language Processing with Python as its base). CLTK was featured in GSoC 2016 for the first time, and that is how I got to know about its existence. I have been importing code from third-party libraries into its source code while converting it from Python 2.7 to Python 3. My contributions can be accessed here.

Event Timeline

Soum213 updated the task description.

I don't think the citoid thing is really in scope. I think it's better to stick to just Zotero related tasks only for this. We also don't need documentation for the plug-in specifically, Zotero already has that documentation, this is more targeted at making translators from the perspective of making them for translation-server.

If you want to add in more detail, maybe add in something about screenshots (for the documentation), where the documentation will go, and maybe translations of the documentation? If you run out of things to do, having the documentation in both English and Hindi would definitely be a nice cherry on top :D.

Soum213 updated the task description.

  1. Translations for the documentation will not be a problem. I think I can pull off translations to French as well (and Sanskrit, but that wouldn't be very practical).
  2. Wouldn't the documentation be present as a Wikipedia page?
  3. What about screenshots? Sorry, I didn't really understand that point.
  4. The plug-in documentation exists, but not on a Wikipedia page, right? When I mentioned the plug-in documentation, I meant documenting it on a Wikipedia page.

On, I'd imagine.

You might want screenshots if you're using browser tools like the plug-in docs do.

Linking back to the Zotero plug-in docs is sensible; I misinterpreted that.

@Mvolz Can we broaden the scope of this project by including T106200 and T137440 too?

That is fine, but as I said previously, I really don't think T93579 is in scope at all. Zotero and citoid already run separately, that task is really just a citoid code thing. I think the scope should remain Zotero-only related tasks.

Also I should point out, in general, that it is better to overshoot your goals than undershoot them. :D

Thanks, I've updated the application accordingly.

I wanted to solve the IO and site-specific bugs before the documentation so that I have a panoramic view of Zotero before documenting it. Once I've finished with the bugs and the documentation, I'd love to try my hand at solving the bugs mentioned next.

Do you think this is a good enough timeline? If not, what changes do you suggest I make?

T115326 (T93561), T121982, T132308 are all citoid tasks, I think it is best to stick to the Zotero code base for this project.

And just a general comment about writing documentation, I find it's easiest to write documentation AS I'm doing a task. And particularly as this is a documentation project, I would expect to see some amount of documentation by the mid term date. Not complete necessarily, but maybe some things like how to install translation-server or the plug-in, etc, stuff that you would have already had to have done to fix the bugs anyway.

Soum213 updated the task description.

Thank you for your proposal. Please add a link to your Outreachy proposal at the top of your project proposal. This can be something like Outreachy Link: Good Luck!

@Soum213 do you think that this criterion might be a problem for you?

Being enrolled in school during a semester when you are taking more than half of the typical number of credits a full-time student takes or having an exams session is considered to be a full-time school commitment.

If not, please explain. Thank you!

Hi @01tonythomas ,
I don't think that this criterion should be a problem for me because:

  1. We have winter vacations through all of December and January, this implies that I will be able to dedicate my time wholly and completely for 8/13 weeks.
  2. As for the remaining 5 weeks, an average semester holds 28 credits, while the eighth (last) semester holds 24 credits, out of which 8 credits are dedicated to a research project. Therefore, a major chunk of the semester will be flexible to my timings and needs. This means that I have rigid subjects weighing 20 credits (theory + practical subjects). Out of this, I have only 3 theory subjects weighing 4 credits (as opposed to 5 theory subjects in every other semester); the rest is a mix of a mock-paper presentation/seminar and practicals. I can handle this work pressure quite smoothly.
  3. The general pattern of the college syllabus is such that the main weight of the syllabus kicks in after the mid-semester mark has passed (after the passage of the mid-semester exams). This point occurs in the third week of March, well after the completion of the Outreachy internship.

Please do let me know in case of any other doubts, thanks

Hi @Soum213, I understand the frustration but rules are rules and Outreachy binds us to follow them. Vacations or no vacations, if you're taking more than half the number of credits a full-time student takes, then you're ineligible to apply. I'm sorry. We've had to turn down students like you in the past too but it's in the best interests of both you and the org.
Thank you for your interest and I hope to see you apply during the summers. :)

Not selected for Outreachy this time. We hope you will stick around and apply for the next round. Thanks for your interest.

Although I would have loved to work under Outreachy, it's alright if that's not to be. @Mvolz Would it be alright if I took up this task regardless and completed at least two-thirds of it?

Of course you can do any part of it you like! Although you can also wait until the next round, as @Niharika suggested, if you are free in the summer.