
Determine URL paths for Zim files
Closed, Invalid (Public)

Description

We do not need any special URLs, but this was mentioned in a meeting with @fgiunchedi and @BBlack, so I wanted to document it.

The only requirement that has come up recently which could impact this is:

We may want to be able to generate a definitive list of all compilations hosted in Swift. I'm not sure what, if any, facilities Swift has for finding/grouping files, or whether that is affected by URLs. @fgiunchedi do you have any insight here?

Event Timeline

Swift's basic grouping for files is the "container" (or "bucket"), and inside a container there can be any number of files and filenames. Containers and objects are separated by /, but other than that Swift is oblivious to the names themselves. What sort of query would you like to run? There are no "search" facilities, only listing facilities, though when listing containers and objects it is possible to specify a marker for prefix search. I've also added some details to T172123: Determine how to upload Zim files to Swift infrastructure before seeing this :)
I would advise against using Swift listings as the definitive source. If the database holds the full list, though, it is easy to audit what's in Swift and add/remove containers as needed, for example.

What's the sort of listing/query you had in mind?
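For reference, a minimal sketch of what a prefix listing with marker-based paging could look like. It relies on the `prefix`, `marker`, `limit` and `format` query parameters of Swift's container listing API; `fetch_json` is a hypothetical callable (not part of any library) that performs an authenticated GET and returns the decoded JSON list, and the storage URL and container name are whatever the deployment uses.

```python
from typing import Callable, Dict, Iterator, List
from urllib.parse import quote, urlencode


def listing_url(storage_url: str, container: str, prefix: str = "",
                marker: str = "", limit: int = 1000) -> str:
    """Build a container-listing URL from Swift's listing query parameters."""
    params = {"format": "json", "limit": str(limit)}
    if prefix:
        params["prefix"] = prefix  # only names starting with this string
    if marker:
        params["marker"] = marker  # only names sorting after this one
    return f"{storage_url.rstrip('/')}/{quote(container)}?{urlencode(params)}"


def list_objects(fetch_json: Callable[[str], List[Dict]], storage_url: str,
                 container: str, prefix: str = "",
                 limit: int = 1000) -> Iterator[str]:
    """Yield every object name under `prefix`, one page at a time.

    `fetch_json` is an assumed helper: it takes a URL and returns the
    decoded JSON listing (a list of dicts, each with a "name" key).
    """
    marker = ""
    while True:
        page = fetch_json(listing_url(storage_url, container, prefix, marker, limit))
        if not page:
            return  # an empty page means the listing is exhausted
        for entry in page:
            yield entry["name"]
        marker = page[-1]["name"]  # resume after the last name seen
```

This only lists; there is still no way to search on anything other than the leading part of the object name.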

@fgiunchedi We can keep a separate database / list of all the Zim files outside of Swift. If you think that using Swift as the source of truth is not a good idea, then we can definitely just maintain our own list.

My main concern there was such a database getting out of sync with what is actually stored there. But really that's probably me being a little paranoid and pre-optimizing.

Let me talk with @Mholloway about this and figure out a viable solution.

Chatted with @Fjalapeno about this. Sounds like we'll end up keeping a compilation info DB in RESTBase that'll be the ultimate source of truth about what compilation files exist in Swift and where to find them.

@fgiunchedi We can keep a separate database / list of all the Zim files outside of Swift. If you think that using Swift as the source of truth is not a good idea, then we can definitely just maintain our own list.

My main concern there was such a database getting out of sync with what is actually stored there. But really that's probably me being a little paranoid and pre-optimizing.

Yeah, I think having Swift as file storage only is simpler. If it gets out of sync for some reason, we could audit Swift against the database to see what needs removing and what's missing.
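A rough sketch of such an audit, assuming both sides can be reduced to plain collections of object names. `AuditResult` and `audit` are illustrative names only, not existing tooling.

```python
from typing import Iterable, NamedTuple, Set


class AuditResult(NamedTuple):
    missing_from_swift: Set[str]  # in the database but not stored in Swift
    orphaned_in_swift: Set[str]   # stored in Swift but unknown to the database


def audit(db_names: Iterable[str], swift_names: Iterable[str]) -> AuditResult:
    """Compare the database (source of truth) with a Swift listing."""
    expected = set(db_names)
    actual = set(swift_names)
    return AuditResult(
        missing_from_swift=expected - actual,
        orphaned_in_swift=actual - expected,
    )
```

Anything in `missing_from_swift` would need re-uploading, and anything in `orphaned_in_swift` would be a candidate for removal, with the database deciding in both cases.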

Re: RESTBase, sounds good. I'm assuming the collection info there could be reconstructed at any time anyway?

Why not use a standard like OPDS? The WMF is funding the development of OPDS support in Kiwix (both client and server side)...

ema triaged this task as Medium priority. Sep 28 2017, 2:49 PM

This is stalled, possibly indefinitely. Consider reopening if and when this work picks back up.