**This is a proposal for Outreachy (Round 11) to add ZIM support to OCG.**
= Proposal =
**Public URL: **T73660
== Name and contact information ==
**Name: **Adisha Porwal
**Email: **porwaladisha@gmail.com
**IRC Nick:** adisha
**Mediawiki User:** Adishaporwal
**Location: **India
**Time Zone: **UTC+5:30
**Typical working hours: **3:00 PM to 12:30 AM (Indian Standard Time)
==== Internet Presence ====
**Github Profile: ** [[ https://github.com/adishap/ | adishap ]]
**LinkedIn Profile: ** [[ https://in.linkedin.com/pub/adisha-porwal/8a/458/30a | Adisha Porwal ]]
**Twitter : ** [[ https://twitter.com/AdishaPorwal | @AdishaPorwal ]]
== Synopsis ==
MediaWiki is the wiki engine behind Wikipedia, all Wikimedia projects and thousands of other websites. MediaWiki-hosted content can be made available for offline use through the [[ https://www.mediawiki.org/wiki/Extension:Collection | Collection ]] extension (written in PHP), which makes it easy to create collections (selections) of articles, so-called books. Once created, a book can be exported in PDF format. The PDF export backend is not provided by the Collection extension itself; it is handled by a JavaScript-based solution called [[ https://www.mediawiki.org/wiki/Offline_content_generator | Offline Content Generator ]] (OCG). At present, OCG supports only the PDF format. This project will add support for the [[ http://www.openzim.org/wiki/ZIM_file_format | ZIM format ]] to OCG. The ZIM file format stores web pages (with images, videos, etc.) in a single, highly compressed file.
=== Skills ===
# Node.JS
# HTML
# PHP
# packaging
=== How it will benefit MediaWiki or Wikimedia projects? ===
# MediaWiki-hosted content can be made available offline in ZIM format and read anywhere with a reader such as Kiwix.
# The project will also help to integrate the functionality of OCG and [[ https://wikitech.wikimedia.org/wiki/Nova_Resource:Mwoffliner | MWOffliner ]].
=== Possible Mentors ===
# [[ https://phabricator.wikimedia.org/p/cscott/ | C. Scott Ananian ]]
# [[ https://phabricator.wikimedia.org/p/Kelson/ | Kelson ]]
== Milestones and Deliverables ==
| **Milestone ** | **Description** | **Duration** | **Deliverable**
| -------------------------- | -------------------------- | -------------------- | ------------------
| Milestone 1 | Envision Phase | Before 17 November 2015 | Development environment setup
| Milestone 2 | Community Bonding Period | 17 November - 7 December 2015 | Workflow of the project
| Milestone 3 | Downloading ResourceLoader css/javascript dependencies | 7 December 2015 - 9 December 2015 | Modules export functionality by OCG
| Milestone 4 | Generation of standalone HTML tree | 10 December 2015 - 27 December 2015 | Standalone HTML tree generation code
| Milestone 5 | Build a Custom Loader | 28 December 2015 - 15 January 2016 | Custom loader script that makes the HTML tree work correctly offline
| Milestone 6 | ZIM file generation | 16 January 2016 - 3 February 2016 | Ability to convert HTML tree to ZIM format
| Milestone 7 | Debian Package creation for zimwriterfs | 4 February 2016 - 25 February 2016 | Debian package of zimwriterfs
| Milestone 8 | Final Code Review and Documentation | 26 February 2016 - 7 March 2016 | Source code, Project Report
== Schedule ==
**Before 17 November - __Milestone 1__**
- Remain in constant touch with mentor(s) and community.
- Get familiar with the development environment.
- Get familiar with the workings of Node.js and packaging.
- Study the required docs.
- Fix some bugs along the way and get my hands dirty with code.
**17 November 2015 to 6 December 2015 - __Milestone 2__**
- Discuss, prepare and finalize workflow for development phase with mentors and community.
- Get myself familiar with architecture and implementation of OCG and MWOffliner.
**7 December 2015**
Actual coding period begins.
**7 December 2015 to 9 December 2015 - __Milestone 3__**
Currently, the ResourceLoader dependencies are not retrieved by the OCG crawler module. OCG and MWOffliner should download the correct modules necessary for each page. This milestone can be achieved in the following steps:
# Add a "do nothing" stage to mw-ocg-bundler, protected by a command-line flag, to hold the new code. Turn on this flag when running tests.
# Add code to the stage to download the set of modules required for each page in the collection and save it to 'modulesDb'. The metadata related to CSS/JS dependencies can be requested from the API (for example through the API sandbox): http://en.wikipedia.org/w/api.php?action=parse&format=json&page=MathML&prop=modules returns the modules and related metadata of the page MathML.
# Merge the per-page module lists to get the list of *unique* modules to download. Multiple pages may use the same module, so converting the combined list into a set avoids downloading the same module multiple times. Then add the "default modules" required on every page to this set.
# Download the modules in this set from ResourceLoader.
Steps 1 and 2 are attempted in [[ https://gerrit.wikimedia.org/r/247614 | this patch ]].
This is also a microtask for the project, so it is expected to be done before then. If it is completed, I will begin research on the implementation of Milestone 4.
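The module-collection steps of Milestone 3 can be sketched as follows. This is only an illustration in Python (OCG itself is written in Node.js), not the actual patch. The function names, the sample module names and the default modules are hypothetical, and the response keys `modules`, `modulescripts` and `modulestyles` are an assumption about what `action=parse&prop=modules` returns.

```python
from urllib.parse import urlencode

# Assumed response keys for action=parse&prop=modules; adjust if the API differs.
MODULE_KEYS = ("modules", "modulescripts", "modulestyles")


def modules_api_url(page, api_base="http://en.wikipedia.org/w/api.php"):
    """Build the action=parse request URL that lists the modules of a page."""
    query = urlencode(
        {"action": "parse", "format": "json", "page": page, "prop": "modules"}
    )
    return api_base + "?" + query


def modules_from_response(response):
    """Collect the module names from one decoded JSON response."""
    parse = response.get("parse", {})
    names = []
    for key in MODULE_KEYS:
        names.extend(parse.get(key, []))
    return names


def unique_modules(per_page_modules, default_modules=()):
    """Merge per-page module lists and the default modules into one sorted list."""
    unique = set(default_modules)
    for names in per_page_modules.values():
        unique.update(names)
    return sorted(unique)


# Hypothetical sample data standing in for the modules of two pages.
pages = {
    "MathML": modules_from_response(
        {"parse": {"modules": ["ext.math.scripts"], "modulestyles": ["ext.math.styles"]}}
    ),
    "TeX": ["ext.math.styles", "ext.cite.a11y"],
}
to_download = unique_modules(pages, default_modules=["site", "jquery"])
```

Because the merge goes through a set, a module shared by several pages (here `ext.math.styles`) appears only once in `to_download`, so it would be requested from ResourceLoader a single time.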
**10 December 2015 to 27 December 2015 - __Milestone 4__**
- Build a standalone HTML tree ("self-sufficient" HTML content with images, JavaScript, stylesheets) using the zip file (known as a bundle) that is generated by mw-ocg-bundler (the MediaWiki article spider tool).
**28 December 2015 to 15 January 2016 - __Milestone 5__**
- Develop a custom loader (a JavaScript tool) that will load modules when the HTML article is displayed.
- Rewrite/clean the HTML so that it works properly offline.
**16 January 2016 to 20 January 2016**
- Test the portion of project completed.
- Document the achieved milestones.
- Get familiar with working of MWOffliner.
**20 January 2016 to 26 January 2016**
- Invoke 'zimwriterfs' (console tool to create ZIM files from a locally stored directory containing an HTML tree) using OCG.
- Discuss with mentor(s) about the future work on related milestone.
**26 January 2016**
Mid Term Evaluation
**27 January 2016 to 3 February 2016 - __Milestone 6__**
- Convert HTML tree to ZIM format using 'zimwriterfs'.
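A minimal sketch of how the conversion step might call 'zimwriterfs' is shown below, in Python for illustration only (OCG would do the equivalent from Node.js). The flag names are assumed from the zimwriterfs usage text and should be checked against the installed version. The helper names, directory, file names and metadata values are hypothetical.

```python
import subprocess


def zimwriterfs_command(html_dir, zim_path, welcome, favicon, language,
                        title, description, creator, publisher):
    """Return the argv list for one zimwriterfs run; nothing is executed here."""
    return [
        "zimwriterfs",
        "--welcome=" + welcome,          # main page, relative to html_dir
        "--favicon=" + favicon,          # icon, relative to html_dir
        "--language=" + language,
        "--title=" + title,
        "--description=" + description,
        "--creator=" + creator,
        "--publisher=" + publisher,
        html_dir,                        # directory holding the standalone HTML tree
        zim_path,                        # ZIM file to create
    ]


def write_zim(**options):
    """Run zimwriterfs; raises CalledProcessError if it exits with an error."""
    subprocess.run(zimwriterfs_command(**options), check=True)


# Hypothetical values for one collection.
command = zimwriterfs_command(
    html_dir="./html_tree",
    zim_path="collection.zim",
    welcome="index.html",
    favicon="favicon.png",
    language="eng",
    title="My collection",
    description="Offline copy of a wiki collection",
    creator="Wikipedia",
    publisher="OCG",
)
```

Passing the arguments as a list rather than a single shell string keeps titles and descriptions that contain spaces or quotes from being split or interpreted by a shell.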
**4 February 2016 to 22 February 2016 - __Milestone 7__ **
- Get familiar with packaging.
- Create debian package of zimwriterfs as it is required for easy installation of zimwriterfs.
**23 February 2016 to 29 February 2016 - __Milestone 8__ **
- Code Review by me and mentors.
- Act on the feedback from the code review.
- Conduct several tests.
- Document the project.
**1 March 2016 to 7 March 2016**
- A buffer period required for final polishing of work.
**7 March 2016**
Firm Pen Down
== Participation ==
==== Communication of progress ====
- **IRC channel**: I'll stay online on IRC at #kiwix, #mediawiki-parsoid and #wikimedia-dev on freenode during my working hours.
- **Email**: I shall keep mentors updated on my regular work through direct emails.
- The project progress will be updated weekly on the [[ https://www.mediawiki.org/wiki/User:Adishaporwal | sub-namespace ]] of my user page.
- **Mailing List**: I will use it to communicate progress regularly.
- **Blog** : I shall keep my blog updated with regular updates of my work, ideas and helpful posts.
==== Where I would turn for help? ====
**Search by myself**
- Self-research (initially) through available documentation, articles, blog posts and forums.
**Seek help from community**
- Ask the community on the IRC channels for help.
- Post queries to the relevant mailing lists or through direct emails to the mentor and related developers.
==== Source Code ====
- Source code will be pushed to gerrit.
= About Me =
I am Adisha Porwal, a fifth-year student in an integrated master's program at [[ http://iips.edu.in/ | IIPS-DAVV ]], majoring in computer science.
I am an enthusiastic and active member of the [[ http://iips.edu.in/dc | Development Center ]] (DC), a part of my college. The Development Center aims at bringing people closer to open source technologies. As a DC member, I have conducted workshops on Python, HTML and CSS for college students and visited a village to teach girls computer basics.
Programming is something I enjoy a lot! I use Python, PHP, JavaScript, CSS, HTML and MySQL for my projects. My projects can be found on my [[ https://github.com/adishap/ | GitHub profile ]].
To contribute to the open source world, one should be familiar with version control systems. I am familiar with Git internals. For my projects, I use GitHub to version-control my code and share it with the world. While contributing to Wikimedia, I became familiar with the Gerrit code review system.
During the Outreachy internship, I promise to work at least 40 hours a week. I may have to take 5 days off in February, but this is not yet decided. If I do take a break, I will make up the lost time in other weeks.
== Current Experience with Mediawiki ==
I really enjoy contributing to MediaWiki. The support from the community made my contributions possible. Throughout the period of making my initial contributions, I have learnt something new every day, and I am sure I will keep learning in the future.
Till now I have:
- Set up the development environment of MediaWiki core, OCG, MWOffliner and zimwriterfs.
- Gained basic familiarity with the code and coding conventions.
- Understood the process of submitting a patch and getting it reviewed (Phabricator, Gerrit and Git).
==== Microtasks and Bugs====
- Currently working on T114788: OCG should download resourceLoader js/css dependencies.
- T98829: Search input cut off in noJS mode(merged).
- T103727: Empty message on watchlists is not center aligned(merged).
== Past Experience ==
=== FOSS Projects ===
My first encounter with FOSS was Linux.
As a FOSS user and a huge fan of FOSS, I use Ubuntu 14.04 as my operating system, Mozilla Firefox for browsing, VLC for media files, and PHP, Python and other open source languages for programming.
However, I am a newbie to the open source community. This is my first hands-on effort to contribute to a FOSS project, and I am really excited about it. I began with MediaWiki a few months ago and have submitted some patches to the MobileFrontend extension and mw-ocg-bundler.
=== Other Projects ===
- Alumni Portal for institute ([[ https://github.com/adishap/alumniportal | Github Link ]])
- Complaint Management System for an organisation ([[ https://github.com/adishap/complaint_box | Github Link ]])
- Team and Score Management System for a college event ([[ https://github.com/AkankshaRathore/Event-Result-View | Github Link ]])
More of my work can be found on my [[ https://github.com/adishap/ | GitHub profile ]].
== Other Information ==
**Do you meet the eligibility requirements outlined?:** Yes
**Preferred pronoun:** she
**Education:** Student at [[ http://iips.edu.in/ | International Institute of Professional Studies, DAVV ]], graduating in December 2016
**How did you hear about this program:** From a friend, [[ http://exploreshaifali.github.io | Shaifali Agarwal ]], who was a past intern of Outreachy (round 9) and GSoC 2015.