Determine where to generate Zim content packs using Production WMF infrastructure
Closed, Invalid (Public)

Description

We would like to generate Zim files for use in WMF products. For this we are developing a Zim generation service that will:

  1. Create Zims
  2. Upload Zims to Swift
  3. Track the metadata/location of Zims in a database in order to share with other services.
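The three steps above can be sketched roughly as follows. This is a minimal illustration only, not the actual service: `run_pipeline`, `init_db`, `ZimRecord`, and the `zim_files` table are hypothetical names, the generation and upload steps are passed in as callables because the real MWOffliner invocation and Swift client are not specified in this task, and SQLite stands in for whatever database is eventually chosen.

```python
import sqlite3
from dataclasses import dataclass
from pathlib import Path
from typing import Callable


@dataclass
class ZimRecord:
    """Metadata kept for one generated Zim (hypothetical schema)."""
    name: str
    swift_url: str
    size_bytes: int


def init_db(conn: sqlite3.Connection) -> None:
    # SQLite is an assumption; the task does not name a database.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS zim_files ("
        "name TEXT PRIMARY KEY, "
        "swift_url TEXT NOT NULL, "
        "size_bytes INTEGER NOT NULL)"
    )
    conn.commit()


def run_pipeline(
    name: str,
    generate: Callable[[str], Path],  # step 1: create the Zim, return its path
    upload: Callable[[Path], str],    # step 2: upload to Swift, return its URL
    conn: sqlite3.Connection,
) -> ZimRecord:
    path = generate(name)
    url = upload(path)
    record = ZimRecord(name=name, swift_url=url, size_bytes=path.stat().st_size)
    # Step 3: record metadata/location so other services can look it up.
    conn.execute(
        "INSERT OR REPLACE INTO zim_files (name, swift_url, size_bytes) "
        "VALUES (?, ?, ?)",
        (record.name, record.swift_url, record.size_bytes),
    )
    conn.commit()
    return record
```

Keeping generation and upload behind plain callables means the same flow could run against the Beta Swift instance first and production Swift later without changing the tracking step.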

As a first step, we would like to create a few files to get a handle on the problem. This will be done in a Cloud VPS instance, and the files will then be uploaded to the Beta Swift instance.

Once we have done this and gathered some data, we want to move the generation service onto production hardware. That hardware must be able to:

  1. Run long-running jobs (on the order of hours) to generate the Zims
  2. Access the production Swift instance in order to upload and manage files
  3. Host a database or have access to a database that it can insert metadata into. This database MUST be accessible by the Mobile Content Service in SCB
  4. Make a request to an internal API in the Mobile Content Service

Event Timeline

@bd808 I think Kiwix is using our Cloud VPS to generate Zim files. I assume that means we can do the same to generate some new files for our Android app?

Using the apps cloud VPS project in the near term

Actually, we can't use Cloud VPS for this long term if we want to upload to production Swift (or if we want to update the database of Zim files in the MCS).

@Mholloway to provide stats on how long it takes to generate files and how long it takes to upload to Swift

Fjalapeno renamed this task from Determine how to generate Zim Compilations using WMF infrastructure to Determine where to generate Zim Compilations using WMF infrastructure. Aug 7 2017, 4:43 PM
Fjalapeno renamed this task from Determine where to generate Zim Compilations using WMF infrastructure to Determine where to generate Zim Compilations using Production WMF infrastructure. Aug 9 2017, 7:14 PM
Fjalapeno updated the task description.

@ArielGlenn @bd808 I talked with @GWicke, and he suggested that the hardware used to generate dumps might fit the bill here. Do you think this is a possibility?

(Note: this is separate from our previous conversation about where to host the files; they will be hosted in Swift, per T170843: Determine where to host zim files for the Android app. This task is only about generating the files.)

Here is the solution we are currently developing at openZIM/Kiwix: https://github.com/openzim/zimfarm. As far as I can see, it would fit your needs. We plan to release a first beta in a week (fingers crossed).

Mholloway renamed this task from Determine where to generate Zim Compilations using Production WMF infrastructure to Determine where to generate Zim content packs using Production WMF infrastructure. Aug 11 2017, 12:54 PM

@Kelson thanks, that looks interesting. However, this ticket is more about finding production WMF hardware on which to run MWOffliner to generate ZIM files.

We may want to investigate this for future use, but as you noted it is currently in beta, so we will probably wait a while and just use the tried and true MWOffliner in the meantime.

Thanks again

This is stalled, possibly indefinitely. Consider reopening if and when this work picks back up.