Proposal: Write a Zotero translator and document process for creating new Zotero translator and getting it live in production - Outreachy-14
Closed, ResolvedPublic


Proposal for
Note: Feedback is welcomed. I have collected some resources that I think will be helpful during this project

Name: Sonali Gupta
IRC handle: mine0901
Web Page:
User Page:
Location: Jaipur, India
Time Zone: UTC+05:30
Typical working hours: 10 AM to 2 AM (UTC+05:30)
On college days, occupied between 1 PM and 4 PM (UTC+05:30)

The Citoid service relies heavily on Zotero's translation-server, which uses translators to extract citation metadata from specific pages. Translators are written to work across several clients, namely Firefox, Chrome, Safari, Internet Explorer and the bookmarklet. The existing detailed documentation for writing a translator in the browser plugin is almost 10 years old, and the documentation on the wiki is also outdated. For Citoid, the translators must work in translation-server. This project will focus on preparing documentation on how to write a Zotero translator that works in translation-server, keeping Citoid in mind and reflecting the current Zotero code.

The documentation process will include the creation of a sample translator. Translating this documentation into Hindi will be a part of the project. Solving bugs in Zotero upstream and writing the requested translators are also in the scope of this project.

Mentors: @Mvolz, @czar

Experience + Contribution made for the project
WMF has been providing the essential content online that helps students like me to explore and learn new stuff. It is the most reliable and informative source on the internet. I would say I have been a WMF product user since the time I started using the internet.

I have been active on Phabricator, and I started working and interacting with WMF developers on 23rd February 2017. For this project, it was important for me to first understand what Zotero translators are, so we were given some minor tasks to familiarize ourselves with the code structure of the Zotero project. While working on my initial contributions, I communicated with mentors at WMF and with Zotero developers, and asked for help and suggestions. Their positive responses helped me decide that I want to be a part of this team and contribute to the best of my potential.

After getting a basic idea of how these translators are used, I took a step further: I wrote a translator from scratch and solved code-related bugs.

Past Experience with Open Source Software
I have been a full-time Linux user for 3 years and I use open source software all the time. I started contributing to an open source organization named Zulip during the application period of Outreachy 13. Zulip is a powerful open source group chat platform, with great features that make it preferable to other chat platforms. I solved some code-based issues in Python; the links to those are -

I also did some documentation tasks at Zulip:

Working with the Zulip community has always been an enriching experience for me. There are live coding sessions now and then for new contributors, a great initiative by the core developers. I wanted to give back to the community and volunteered to be a Zulip mentor for Google Code-in 2016. The task I handled was the creation of user documentation for Zulip’s features and the improvement of existing user pages. With other mentors, I helped design tasks before and during GCI, guided students through their introductory tasks, and helped them use Git and GitHub. I recently moderated one of the online English-tutorial sessions organized by Zulip.

About the project
I have tried to work out an outline for the documentation that I will prepare. I am planning to include a section that explains translators like Embedded Metadata, BibTeX, RDF and MARC, and how to call another translator inside a translator.

Table of contents

  1. Brief about Citoid and Zotero translators - A synopsis of both tools and information about their features.
  2. Relationship between Citoid and Zotero - This section will focus on how Citoid makes use of Zotero. It will also explain what Zotero translators need in order to be compatible with Citoid, which is the main motivation for the documentation.
  3. Installation and setup of required software (procedure explained through screenshots)
  4. Explanation of concepts required to build a translator - The general things to know before writing a translator are HTML, DOM, XPath and JavaScript. This section will cover what purpose each serves and how it is used.
  5. Common code blocks in Zotero translators - Certain code blocks appear in almost all new translators and make it easier for both new and experienced developers to write one. For each block, the section will give a code snippet and the result it produces.
  6. Information on the Zotero Utilities, with examples
  7. Build a sample translator (screenshots of tools and code snippets) - Following all the above-mentioned points, a sample translator will be built on the translation-server side.
  8. Testing a translator - Before a translator is submitted to go live in production, the developer must test it. This section will cover automated testing with Scaffold.
  9. Getting a translator live
  10. Troubleshooting - Common errors faced during installation of tools, general errors thrown by the code, and errors during testing.
  11. Useful links - Links to courses (HTML, XPath, etc.), videos and developer forums.
  12. See Also
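
To illustrate the kind of common code block that item 5 refers to, here is a minimal sketch of the detectWeb and getSearchResults functions named later in the timeline. It is not taken from any existing translator: the selector 'h2.entry-title a', the '/blog/' URL check, the example.org URLs and the fakeDoc helper are all hypothetical, chosen only so the sketch can run outside a browser. The return values 'blogPost' and 'multiple' are the strings a translator would typically return for a single blog post and for a page listing several items.

```javascript
// Collect candidate items from a listing page.
// With checkOnly = true, return as soon as one valid link is found.
// The selector below is a hypothetical example for a blog-like site.
function getSearchResults(doc, checkOnly) {
  var items = {};
  var found = false;
  var rows = doc.querySelectorAll('h2.entry-title a');
  for (var i = 0; i < rows.length; i++) {
    var href = rows[i].href;
    var title = (rows[i].textContent || '').trim();
    if (!href || !title) continue;
    if (checkOnly) return true;
    found = true;
    items[href] = title;
  }
  return found ? items : false;
}

// Decide what kind of page this is: a single post, a list, or nothing usable.
// The '/blog/' test and the 'article' element are assumptions about the site.
function detectWeb(doc, url) {
  if (url.indexOf('/blog/') !== -1 && doc.querySelector('article')) {
    return 'blogPost';
  }
  if (getSearchResults(doc, true)) {
    return 'multiple';
  }
  return false;
}

// Hypothetical stand-in for a DOM document, so the sketch runs without a browser.
// It ignores the selector and simply returns what it was given.
function fakeDoc(links, hasArticle) {
  return {
    querySelectorAll: function (selector) { return links; },
    querySelector: function (selector) { return hasArticle ? {} : null; }
  };
}

var listing = fakeDoc([{ href: 'https://example.org/blog/a', textContent: ' Post A ' }], false);
var single = fakeDoc([], true);

console.log(detectWeb(listing, 'https://example.org/archive'));
console.log(detectWeb(single, 'https://example.org/blog/a'));
```

In a real translator the doc argument would be the page's DOM, and a doWeb function would follow to build and save the item; that part is left out here because it depends on Zotero's own objects.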

Elements of the documentation will remain as follows:

  • Sections
  • Subsections
  • Captions
  • Images
  • Code snippets
  • References

30th March - 28th April

29th April

  • Celebrations.

30th April - 29th May

  • Community bonding period.
  • Search and finalize the sample translator to be used for documentation.
  • Study the existing translators for blogs, catalogs, magazines and newspapers, and learn the similarities and differences between them.
  • Solve Zotero bugs and write translators. Mark bugs to solve during the internship period.
  • Study the different types of translators (COinS, BibTeX, RDF, etc.)

Week 1(30th May - 5th June)

Week 2(6th June - 12th June)

Week 3(13th June - 19th June)

  • Complete section 4(synopsis of concepts needed).
  • Cover detectWeb and getSearchResults functions(section 5).

Week 4(20th June - 26th June)

Week 5(27th June - 3rd July)

  • Document how to write a translator for the sample site (section 6). This section will have sub-sections holding code snippets specific to the sample site and their outputs. It should go smoothly, because it can reuse the common code blocks already documented in the previous sections.

Week 6(4th July - 10th July) + Week 7(11th July - 17th July)

  • Sections 7, 8, 9, 10, 11

The documentation will be shared with developers at Zotero and WMF for review, and their ideas and suggestions will be incorporated. Once the content is developed, it will be translated into Hindi. The crucial (and time-consuming) part of the Hindi translation is typing in Hindi and researching words that are commonly used in other pages on MediaWiki.

Week 8(18th July - 24th July)

  • Complete sections 1, 2, 3

Week 9(25th July - 31st July)

  • Complete sections 4, 5

Week 10(1st August - 7th August)

Week 11(8th August - 14th August)

  • Complete section 7

Week 12(15th August - 21st August)

  • Document sections 8, 9, 10, 11
  • Share the documentation for review + improve it.

Week 13(22nd August - 30th August)

  • Give the final touches to the documentation and get approval from the mentor for the English and Hindi versions.
  • Write a blog post about my complete journey.

30th August and later

  • Celebrations
  • Start code-based contributions in WMF. Explore Citoid and other projects on WMF to contribute to.
  • Be an active member of the community.
  • Participate in future programs as a volunteer and a mentor.

Every week, I will post blogs about my progress and get my work reviewed by mentors. This will help in cleaning each section before moving to the next in the following weeks.

Other deliverables during the internship

  • Blog posts on my progress every week.
  • Blog posts on my experience with WMF and/or FOSS-related topics at least once every two weeks.
  • Regular communication with my mentors and other community members at WMF.

About me
I am an undergraduate student pursuing a Bachelor of Technology in Computer Science and Engineering at The LNM Institute of Information Technology, Jaipur, India. I am in the 6th semester of my eight-semester program and will graduate in May 2018. I am an active member of my institute’s Open Source Society and Mozilla society. As a member, I have attended and helped organize various workshops on Git, open source, how to contribute to Mozilla, WoMoz, etc. Our FOSS team organized a KDE conference in 2016, which was my first open-source conference. I also met WMF contributors at the WikiToLearn conference 2017. Steve Jobs said, "Everybody should learn how to program a computer because it teaches you how to think", and I feel open source software and communities make it easy for people to join and be a part of software development. The concept of outreach programs like Summer of Code motivates and helps so many people. The best part is that all the communities welcome new contributors.

How did I learn about Outreachy?
I read about Outreachy in a magazine (OpenSource For You) and had a discussion with a senior in college who explained a lot about it to me. Later I came into contact with many open source enthusiasts and also interacted with a few past Outreachy interns. The program is a wonderful initiative. I would be honoured to be a part of this program through Wikimedia.

Other Projects

  • Android App Development for Inter-App Communication: The aim of this project was to develop all possible combinations of Android apps with the source and sink varying between activity, service and broadcast receiver. The first app read call logs and sent the data to the second app, which saved them to the SD card (external memory). The intents used were explicit, implicit and pending intents. The apps were hidden (malicious apps, with little or no GUI). I created 52 apps (26 pairs). The apps were created for a larger project studying the behavior of different Android versions.
  • Web application in Django for the FOSS community of my institute: Our team developed a web application in Django with PostgreSQL to provide a platform for blogs, directories of achievements, sessions and conferences, along with interesting material related to FOSS LNMIIT. This project is under development.
  • Blog in Django: I mentored a group of 3 students in a winter internship program under my college's student chapter of the Computer Society of India. I ran online sessions during the holidays to teach them Django (through the DjangoGirls tutorial) and build a blog. Later we included Facebook plugins to provide more features.
  • Two-Pass Assembler: With a team of 4, I prepared an Instruction Set Architecture and wrote mnemonic code for two C programs. We then created one program to convert the mnemonic code to binary and another to execute the binary code. In other words, we wrote a two-pass assembler in C for the ISA we defined.

Will you have any other time commitments, such as school work, exams, research, another job, planned vacation, etc., between May 30, 2017 and August 30, 2017? Please provide exact dates for these commitments and the number of hours a week these commitments take.
I will have summer vacation from 30th May to 30th July. That covers 8 weeks of the internship, during which I will be free from all other commitments. From 31st July to 30th August, I will have school work, which will take at most 3-4 hours a day, an average of 24 hours a week.

If a student, please list the courses you will be taking between May 30, 2017 and August 30, 2017, how many credits you will be taking, and how many credits a full-time student normally takes at your school. Please provide a link or upload your program's suggested curriculum by semester, which includes the suggested number of credits in each semester. Please provide a link or upload your school's academic calendar.
A full-time student takes an average of 23 credits per semester. My 7th semester allows me to choose the number of courses, as all the core courses will be over by the end of the current semester. I will have to take 3 credits for the BTech Project, which runs for a year. I will be taking open electives and program electives such that my total credits remain 12 or less. There will be no examinations during the internship period.

After preparing a few translators and experimenting with Scaffold before the internship, I started writing documentation from the first day. The initial draft covered how to write a translator using Scaffold, including the development environment, concepts, a working example and code details. The first major improvement was to use CSS selectors instead of XPath to identify HTML nodes. Around the midpoint I got a new task: to explore how to write a translator on the server side. Finding out how to write code and test it on the server was the main motive of the whole project, but around this time Zotero 5.0 was released, so the documentation was also updated to cover any resulting changes in the Scaffold development procedure.

Figuring out how tests can be carried out on the server was the task that took the most time. I learned about containerization and worked with Docker. A new section came into existence: Developing and testing on server, with MediaWiki as the working example.

For the last task, I documented how to write translators for blog-like sites, taking the Wikimedia blog as the example because it is the most common use case that people will come across. For someone who wants to skip the details and learn quickly through an example, it will be a good landing page. This page concentrates only on development on the server, not on Scaffold.

I submitted a few patches to the Zotero upstream during the internship period:

  • The Economic Times
  • The Open Library
  • The Globe and Mail
  • BBC Newsbeat
  • Oxford Reference
  • TV by the Numbers

Outside the internship, I will be refining the documentation further, hopefully with input from my mentors and the community at WMF. What’s left to do is to translate all this work into Hindi, as I proposed in my application. I will also be looking into other projects at WMF that I can contribute to. It was a great experience.

Event Timeline

Looks good; I notice you're going to be in college some part of the internship, are you sure you meet the eligibility requirements? How many weeks does the internship overlap with your enrolment?

Yes, I meet the eligibility criteria. I will be available full time for 8 weeks. College will overlap with the last 5 weeks (August). In that semester, I am free to choose the number of courses, since all my core courses will be over by the end of this semester, so I will take fewer than half the usual credits. No examinations will take place during that month. I will add this information to the proposal above too.

Mine0901 updated the task description.

As @Mine0901 requested in the IRC meeting some feedback from fellow candidates on the document she is working on... First, it is very impressive, and all pieces are coming together very well :)

Here are my thoughts and questions, feel free to take whatever makes the most sense to you:

  • Add a couple of sentences on the top that says what this page is about.
  • Add a section on the top for "What is Zotero?"
  • In Zotero translator section -- "it can be written for any site and then submitted to the repo..." I wonder if you could make this sentence more clear here what submitting means or rather it's contributing and if so how could one do so. Is there an external link that you could include to refer to the process of submitting/contributing scripts? Also, what the term "translation-server" means (might be it's just for a layman like me)?
  • Might be that Required Software & Required Concepts sections go under Creating Zotero Translators
  • Required Software -- is it more like setting up a development environment setup for creating translators?
  • Required Concepts -- Would one need to be familiar with these concepts before getting started? If so, should it go before the setup?
  • Might be you also consider including external links to useful resources to learn about these concepts in depth and add them at the bottom of the document.
  • In code blocks section, add a few sentences explaining what this is about, and might be use a simple word for the section title
  • A working example... of what?
  • I see there is a use of passive voice in some places of the document (for e.g. needs to be written), might be you change that. About grammatical errors -- sometimes I use that helps me fix a lot of grammatical errors in my writing. If you are interested, you could explore that.

Is there a reason to keep this proposal task open, now that GSoC 2017 / Outreachy 14 are done, and as parent task T115158 exists?