
Add ZIM format support to OCG
Open, Lowest, Public

Assigned To
Authored By
Oct 5 2014, 9:44 AM
"Like" token, awarded by geraki. "Love" token, awarded by Cxbrx. "Love" token, awarded by deltonio2. "Love" token, awarded by MichaelSchoenitzer. "Love" token, awarded by Pabouk. "Love" token, awarded by pyroshroom. "Heartbreak" token, awarded by Tadaa3x. "Love" token, awarded by Arjunaraoc. "Love" token, awarded by Gmehn. "Heartbreak" token, awarded by Oznogon.


MediaWiki is the wiki engine behind Wikipedia, all other Wikimedia projects, and thousands of other websites. It is cutting-edge free software for building feature-rich websites that anybody can edit. Content hosted on MediaWiki can be made available for offline use through the Collection extension (written in PHP). The Collection extension makes it easy to create collections (selections) of articles, so-called books; here is how it works on the English Wikipedia. Once created, books can be exported in PDF format. The PDF export backend is not part of the Collection extension; it is a JavaScript-based solution called OCG. OCG is a Node.js daemon that transforms a book definition into a PDF, and it should be able to do the same for the ZIM format. The ZIM format stores web pages (with images, videos, etc.) in a single highly compressed file, which can then be read anywhere with a reader such as Kiwix. A stub of a solution has already been written, and MWOffline is already functional. This task is mostly about merging these two pieces of code.

  • Primary mentor: @cscott
  • Co-mentor:
  • Other mentors: (optional, Phabricator username)
  • Skills: Node.js, HTML, PHP, packaging
  • Estimated project time for a senior contributor: 2-3 weeks
  • Microtasks: T113736




Event Timeline

rjlabs added a comment. Edited Apr 16 2015, 6:06 PM

CScott - If I were a good programmer, I'd dive right in at the coding level to help, because there's a HUGE gap here: no .zim output from Wikipedia, and no easy admin-level .zim output from MediaWiki. I'm happy to do the legwork of trying to find you some qualified help. What programming language are you working in (or what are the choices), and do you need help understanding the .zim file format? I've toured and am trying to find active participants. Emmanuel Engelhart kelson at & Tommi Maekitalo tommi at might be able to help get the ball rolling.

Personally, I'm looking to package Kiwix and .zim files as the primary offline help system for OsmAnd (the extremely popular GPS / mobile mapping system that uses OpenStreetMap data offline). The help files for all this will be authored on the OpenStreetMap MediaWiki and spun out to be the offline help system in the Android, iOS and desktop (Java) versions of the software. The global / any-language capabilities are really attractive. All projects are free/open source. Kiwix is an obvious solution, as long as zim files can easily be produced on the fly by MediaWiki admins. Since maps are huge, and cell coverage is spotty or completely nonexistent in many areas, offline use of maps (and the help system) is critical. So our need dovetails nicely with the whole zim "philosophy".

Some scribbled notes below to perhaps uncover what the next best steps might be. I can't imagine the Kiwix people are happy about having zero zim output from Wikipedia and every other MediaWiki system... PDF the only alternative? I cringe :)

This is the temporary solution suggested for the casual MediaWiki admin to create .zim files? has some tool suggestions:

I also value being able to create zim files directly from MediaWiki sites. I'm already spread too thin to be able to help practically unless my circumstances change materially. However, I am involved in Kiwix and willing to help test and debug whatever's involved. Emmanuel knows how to wake me up if/when the time is right :)


PS: I'm also helping create a wikibook for educational use (Computing, aimed at 11- to 14-year-old pupils in the UK schooling system), where it'd be great to make the book available as a zim file.

greg removed a subscriber: greg. Apr 24 2015, 3:36 AM
Gmehn awarded a token. May 10 2015, 5:49 PM
Arjunaraoc added a subscriber: Arjunaraoc.
pyroshroom added a subscriber: pyroshroom.
Kelson updated the task description. Sep 24 2015, 8:27 PM

Here's what we have:

IIRC, last time I looked at the code, some tweaks to might be needed as well. I already added one in 55aa6bea33e29053b76b2043d2c96bcb2f4f1964 since the zimwriter backend needs to rewrite redirects. I believe there were other minor issues involving stylesheets & etc -- for example, the Parsoid DOM includes a stylesheet URL, but we don't actually fetch it in the bundler. (And in this case a better solution would be to use the API to query the actual style modules necessary, instead of just stashing the result of ResourceLoader; see T69540: Produce/preserve the metadata about additional ResourceLoader modules required by extension tags). I'm happy to do the mw-ocg-bundler side of this work; just create phab tickets for specific items and link them here as blockers.
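
To make the stylesheet point concrete, here is a minimal sketch (JavaScript, since the bundler runs on Node.js) of how a bundling step could list the stylesheet URLs referenced by a page's HTML so that they can be fetched. `collectStylesheetUrls` and the sample markup are hypothetical and are not part of mw-ocg-bundler; a regex scan stands in for proper DOM parsing, and HTML entities in the URLs are not decoded.

```javascript
'use strict';

// Hypothetical helper: return the href of every <link> tag whose rel
// attribute contains "stylesheet". A regex scan keeps the sketch
// dependency-free; real code should walk a parsed DOM instead.
function collectStylesheetUrls(html) {
  const urls = [];
  const linkTag = /<link\b[^>]*>/gi;
  for (const [tag] of html.matchAll(linkTag)) {
    const rel = /\brel\s*=\s*["']([^"']*)["']/i.exec(tag);
    const href = /\bhref\s*=\s*["']([^"']*)["']/i.exec(tag);
    if (!rel || !href) continue;
    // rel is a space-separated token list, e.g. "alternate stylesheet".
    const tokens = rel[1].toLowerCase().split(/\s+/);
    if (tokens.includes('stylesheet')) urls.push(href[1]);
  }
  return urls;
}

// Made-up sample document head, for illustration only.
const sample =
  '<head>' +
  '<link rel="stylesheet" href="//example.org/w/load.php?modules=site.styles"/>' +
  '<link rel="dc:isVersionOf" href="//example.org/wiki/Foo"/>' +
  '</head>';

console.log(collectStylesheetUrls(sample));
```

This only covers the "fetch what the DOM points at" approach; querying the API for the required style modules, as suggested above, would replace the scan entirely.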

Qgil updated the task description. Sep 28 2015, 10:01 AM

If you want to feature this project idea at please edit the description adding the mentors, skills required, and microtasks for candidates. Thank you!

@cscott, would you be interested in mentoring this as an internship project for Outreachy?

Kelson updated the task description. Sep 28 2015, 6:32 PM
cscott updated the task description. Oct 6 2015, 6:24 PM
Adishaporwal updated the task description. Oct 6 2015, 6:25 PM
Adishaporwal updated the task description. Oct 6 2015, 6:31 PM
Kelson updated the task description. Oct 6 2015, 6:35 PM

@cscott, would you be interested in / would time allow mentoring this as an internship project for Outreachy?

01tonythomas added a subscriber: 01tonythomas.

I am shifting this to Outreachy-Round-11, as the project description has two mentors and micro-tasks, and looks ready for the 11th edition of Outreachy (Dec 2015 - Mar 2016). Potential candidates should start by submitting their proposals as a blocker for this task by November 02.

Feel free to revert this if the task has relevant issues that might block its completion in this term of Outreachy.

rjlabs removed a subscriber: rjlabs. Nov 23 2015, 1:40 PM
Sumit added a subscriber: Sumit. Feb 19 2016, 8:16 PM
NOTE: Outreachy round 12 applications are now open and GSoC 2016 is around the corner. This project was featured for Outreachy round 11 and has a well-defined scope. Are you ready to mentor the project this season? If yes, then we'll feature this for Outreachy round 12 and GSoC 2016 as well. Please reply in the comments.
Niharika removed a subscriber: Niharika. Feb 20 2016, 5:19 AM
Qgil removed a subscriber: Qgil. Feb 22 2016, 9:31 PM
Sumit added a comment. Edited Mar 2 2016, 1:55 PM

@cscott, @Kelson, you were listed as mentors for this project during Outreachy-11. Are you willing to do the same for this round of GSoC/Outreachy?

Kelson added a comment. Mar 2 2016, 4:43 PM

Thanks for proposing, but unfortunately I have no time to do that currently.

Sumit updated the task description. Mar 2 2016, 5:03 PM
Sumit added a comment. Mar 2 2016, 5:06 PM
IMPORTANT: Moving to missing mentors, as we do not have two mentors confirmed for this at the moment. Interested in co-mentoring? Do add your name in the task description. A Possible-Tech-Projects task requires a minimum of one primary mentor and a co-mentor to be featured for GSoC/Outreachy. Prospective students? Do take a look at the Wikimedia mentors pool at, and try connecting this project with a co-mentor, to get featured for this round. A co-mentor needn't necessarily be from a technical background, as per Feel free to change the status accordingly if both mentors have agreed.
Aashaka added a subscriber: Aashaka. Mar 7 2016, 5:32 PM

I would like to work on this project as a part of Outreachy round 12 / GSoC 2016. I am fairly good at PHP and know some Node.js. I have read about the ZIM format and OCG. I intend to look at the existing stub solution in the next couple of days and, in parallel, solve the microtasks. @cscott, would you be willing to act as a mentor for this project?

Sumit added a comment. Sep 10 2016, 5:10 PM

This looks like a much-discussed project. @cscott, would you be willing to mentor this project for Outreachy-13 (Dec 6 - March 6)?

Eugene233 added a subscriber: Eugene233. Edited Mar 22 2017, 4:38 AM

Hi all,
I am a software engineering student and I am quite new to Wikimedia.

While browsing the possible projects, I read through this project and it seems very interesting. I am willing to take on this project during GSoC '17. @cscott, if you agree, I can move ahead directly with looking deeply at the project. Thanks

Uhm, sorry for the late notice here: there are ideas to replace OCG with Electron, which might make this task something better not to spend time on. :-/

@Aklapper @Eugene233 On our side this is still pretty important, even if we have no focus on it due to lack of resources. I have posted a comment to that effect here. That said, contrary to OCG, the electron-renderer (effort) seems to be self-focused and to offer few opportunities for reuse with other formats.

Electron will never support ZIM, AFAIK. I think OCG is still the only option for actual *offline* collection creation.

@cscott Thanks for confirming my feeling.

"mwoffliner" is now available as an npm module, so it can be directly/easily used in OCG.

We are currently fixing the problem of mocking ResourceLoader for offline use in mwoffliner, and also switching to the mobile layout. This should be finished in a few weeks.

Then it would be smart to move away from calling the zimwriterfs binary and use node-libzim directly. Once that's done, it should be relatively easy to bring ZIM export to OCG.
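
For reference, a rough sketch of what the kind of binary call being replaced could look like from Node.js. The helper names are hypothetical and this is not mwoffliner's actual code. The flag names are assumed from the zimwriterfs usage text and may differ between releases, so check `zimwriterfs --help` for the installed version.

```javascript
'use strict';
const { execFile } = require('child_process');

// Metadata flags assumed to be required by zimwriterfs (assumption:
// names taken from its usage text; they may differ between releases).
const REQUIRED_FLAGS = [
  'welcome', 'favicon', 'language', 'title',
  'description', 'creator', 'publisher',
];

// Hypothetical helper: turn a metadata object into the argument list
// "--key=value ... <htmlDir> <zimPath>".
function buildZimwriterfsArgs(meta, htmlDir, zimPath) {
  const args = REQUIRED_FLAGS.map((key) => {
    if (!meta[key]) throw new Error(`missing metadata: ${key}`);
    return `--${key}=${meta[key]}`;
  });
  return args.concat([htmlDir, zimPath]);
}

// Hypothetical wrapper: spawn the binary without a shell, so metadata
// values containing spaces or quotes need no escaping.
function writeZim(meta, htmlDir, zimPath, callback) {
  execFile('zimwriterfs', buildZimwriterfsArgs(meta, htmlDir, zimPath), callback);
}
```

The drawback this illustrates is that the whole HTML tree has to be written to disk first and the binary has to be installed on the host, which is what an in-process binding such as node-libzim would avoid.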

"New Reader" and "global reach" teams are pretty supportive of that feature AFAIK, and it is important to the Kiwix project too. It looks like we just need to gather enough supporters to get the dev resources to "finish the job".

Nz-jon added a subscriber: Nz-jon. May 8 2017, 5:00 PM
Restricted Application added a subscriber: jeblad. Aug 25 2017, 1:49 AM
jeblad removed a subscriber: jeblad. Aug 25 2017, 11:35 PM

I'm a dev resource and willing to work on this task but I cannot work out whether it's currently parked. I am new to this community and would appreciate being steered in the right direction. @cscott can you help?

@Inveteratransmog: Hi and welcome! :) Wikimedia plans to replace OCG on its servers and OCG might get archived. Hence I would not recommend spending time on this task. I'm not sure about the exact state of ZIM plans - the Kiwix folks or the WMF Readers team might know best? :-/

As already announced in Tech News, OfflineContentGenerator (OCG) will not be used anymore after October 1st, 2017 on Wikimedia sites. OCG will be replaced by Electron. You can read more on

Is this still a valid task or be tagged as Possible-Tech-Projects as per @Aklapper's comment above?

Aklapper lowered the priority of this task from High to Low. Jan 18 2018, 10:49 AM

@srishakatux: OCG is dead, see T150871. This very task awaits a decision in T161312.

Aklapper lowered the priority of this task from Low to Lowest. Apr 22 2019, 6:04 PM
deltonio2 added a subscriber: deltonio2.
Cxbrx awarded a token. Jan 28 2020, 5:12 PM
Cxbrx rescinded a token.
Cxbrx awarded a token.
Cxbrx added a subscriber: Cxbrx.