
Architect separation between Kartotherian, Tilerator, and Admin UI
Closed, Resolved · Public

Description

Maps has three highly-overlapping functionalities that should be partitioned/combined better:

  • Kartotherian serves tiles via a public endpoint. It simply gets the tiles from the predefined set of tile sources, without much processing.
  • Tilerator generates tiles and stores them. It takes jobs from the Redis queue, gets tiles from one tile source (e.g. the SQL tile generator), and puts them into a different tile source (e.g. Cassandra). Tilerator does not really need UI/endpoint access, especially because service runner runs multiple instances on the same port, which makes it impossible to control each instance individually.
  • Admin UI is a tool for job scheduling and queue manipulation, but it does not process jobs. Multiple instances of Admin UI should not run on the same port because an admin may change an instance's internal configuration at runtime, and subsequent requests might go to a different instance, causing confusion. Currently, Admin UI is implemented as a special setting in Tilerator that runs it in a non-tile-processing mode.

Additionally, we need Admin UI to generate and serve tiles, just like Tilerator+Kartotherian, so that admins can preview the results of the tile generation before running expensive or complicated jobs. Currently we do it by setting up a private Kartotherian instance in production, manually configuring it with the new settings, and using it via an SSH tunnel - an error-prone process.

Solutions:

  • Config-based: Most of the functionality in the 3 components is highly intertwined, so we could merge it all into one repository and control which functionality is enabled via config parameters. There will be 3 puppet modules, but they will all point to the same gerrit deploy repository. The repo will contain two different tile source files - one for production and one for tile generation/admin. The puppet-created config will specify which tile source file to use, and will also enable the admin interface and job processing functionality. On tin, we will have 3 deploy dirs, but all of them will point to the same git repository.
    • PROS: code consistency, no dups, easier to maintain and manage, fewer repositories.
    • CONS: have to be a bit more careful not to expose admin functionality publicly
  • Create 3 separate repositories, one for each "service", and rely on a common lib to do everything. The common lib in turn will rely on all the tile source libs (cassandra, generator, styling, layer mixers, etc), plus the core (utils) lib. This approach is almost the same as the first, except that it brings a lot of code duplication (duplicates of the service template and service initialization), and the services will differ only in the endpoint setup.
    • PROS: Easier to fork services, more hardcoded separation with the admin interface
    • CONS: Harder to maintain and deploy, significant code duplication, possible bugs due to dups, repository proliferation.

Event Timeline

Yurik raised the priority of this task from to Needs Triage.
Yurik updated the task description.
Yurik added a project: Maps-Sprint.
Yurik added subscribers: Yurik, MaxSem.
Yurik updated the task description.
Yurik added a subscriber: akosiaris.

If I understand the roles and interactions correctly, Tilerator generates and stashes the tiles in Cassandra, while Kartotherian retrieves, renders and serves them to the client. Additionally, the Admin UI serves as a sort of control room. Therefore, Kartotherian and Tilerator would need to share code for access to Cassandra, while Tilerator and the Admin UI should share the Redis-related code blocks.

In this case, I think it's better to have three different services / repositories. Since Tilerator and the UI should be kept private, it would be better to allow Kartotherian to push a tile-generation request to Tilerator directly. So, when a user of the UI wants to preview a tile, the Admin UI sends a request to a Kartotherian route (possibly giving it an extra preview=true query parameter), which prompts it to contact Tilerator and obtain that tile. On the security side, a white-list of IPs allowed to supply this extra parameter could be put in place and configured to Admin UI's IP in ops/puppet.
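To make the white-list idea concrete, here is a minimal sketch in Python (the services themselves are not written in Python). Everything in it is an assumption for illustration: the function name `preview_allowed`, the `PREVIEW_ALLOWED` list, and the address 192.0.2.10, which comes from a documentation-only range and is not Admin UI's real IP.

```python
import ipaddress

# Hypothetical white-list; in the proposal this would be configured in ops/puppet.
PREVIEW_ALLOWED = [ipaddress.ip_network("192.0.2.10/32")]


def preview_allowed(client_ip, query, allowed=PREVIEW_ALLOWED):
    """Return True only if the request asks for a preview AND comes from a white-listed IP."""
    if query.get("preview") != "true":
        return False  # ordinary tile request, no preview path involved
    addr = ipaddress.ip_address(client_ip)
    return any(addr in net for net in allowed)
```

A request that carries preview=true from any other address would simply be treated as not allowed to trigger generation; how the real route should respond in that case (ignore the parameter or reject the request) is left open here.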

@mobrovac, thanks for the comment. Kartotherian & Tilerator rely on a slightly different code model - there are "sources" -- a Mapbox standard tile source interface that allows getTile() and putTile() calls. This means the Cassandra source simply gets used by both. The core lib configures all those sources based on the configuration file (see readme). Some sources can act as caches - e.g. the autogen source checks whether the tile exists in another "storage" source, and if it doesn't, it gets the tile from the "generator" source and saves it into the storage source before returning it.
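The autogen behaviour described above can be sketched roughly as follows. This is an illustration in Python, not the actual source code: the class names and the snake_case `get_tile`/`put_tile` methods are stand-ins for the getTile()/putTile() interface, and the in-memory storage replaces a real backend such as Cassandra.

```python
class MemorySource:
    """Hypothetical in-memory stand-in for a "storage" source."""

    def __init__(self):
        self._tiles = {}

    def get_tile(self, z, x, y):
        return self._tiles.get((z, x, y))  # None when the tile is missing

    def put_tile(self, z, x, y, data):
        self._tiles[(z, x, y)] = data


class CountingGenerator:
    """Hypothetical "generator" source; counts calls so the caching is visible."""

    def __init__(self):
        self.calls = 0

    def get_tile(self, z, x, y):
        self.calls += 1
        return f"tile {z}/{x}/{y}".encode()


class AutogenSource:
    """Checks storage first; on a miss, generates the tile and saves it before returning."""

    def __init__(self, storage, generator):
        self.storage = storage
        self.generator = generator

    def get_tile(self, z, x, y):
        tile = self.storage.get_tile(z, x, y)
        if tile is None:
            tile = self.generator.get_tile(z, x, y)
            self.storage.put_tile(z, x, y, tile)
        return tile
```

Requesting the same tile twice through `AutogenSource` calls the generator only once; the second request is answered from storage.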

So the whole pipeline of sources can be arbitrary for both Kartotherian and Tilerator. What makes them really different is what causes the pipeline to be used - in Kartotherian, a web request causes the pipeline's initial getTile() to be called. In Tilerator, it is the job queue that causes that, plus it handles dynamic pipeline setup and batch tile processing.
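A rough sketch of that difference, again in Python and with hypothetical names throughout (`pipeline_get_tile`, `handle_web_request`, `process_jobs`, and the job dictionary layout are all assumptions; a `deque` stands in for the Redis job queue): the same pipeline entry point is driven once by a single web request and once by draining a queue of batch jobs.

```python
from collections import deque


def pipeline_get_tile(z, x, y):
    """Hypothetical pipeline entry point; stands in for the first source's getTile()."""
    return f"tile {z}/{x}/{y}".encode()


def handle_web_request(path, get_tile=pipeline_get_tile):
    """Kartotherian-style trigger: parse '/z/x/y' and return that one tile."""
    z, x, y = (int(part) for part in path.strip("/").split("/"))
    return get_tile(z, x, y)


def process_jobs(queue, store, get_tile=pipeline_get_tile):
    """Tilerator-style trigger: drain the job queue and store every requested tile."""
    done = 0
    while queue:
        job = queue.popleft()
        for x, y in job["tiles"]:
            store[(job["zoom"], x, y)] = get_tile(job["zoom"], x, y)
            done += 1
    return done  # number of tiles processed
```

Both functions end up calling the same `get_tile`; only the thing that triggers the call differs.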

The admin UI needs the ability to dynamically change the source pipeline, add jobs to Redis, and handle the request/response of individual tiles. If I go the 3 repos route, I would have to move all the job queue logic from Tilerator and all the web request code from Kartotherian into some shared lib, making them even thinner wrappers without any actual code or real benefits, yet with all the overhead of maintaining separate repos.

Thus, I'm beginning to lean towards a common repo scenario, something that has multiple routes files that will be enabled based on the config parameters.
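As a sketch of what "routes enabled by config" could look like, assuming hypothetical route groups and config keys (`tiles`, `admin`) that are not taken from any real configuration file:

```python
def tile_routes():
    """Hypothetical public tile-serving routes."""
    return ["GET /{z}/{x}/{y}.png"]


def admin_routes():
    """Hypothetical admin routes for job scheduling and queue manipulation."""
    return ["GET /admin", "POST /jobs"]


# Maps a (hypothetical) config key to the function that provides its routes.
ROUTE_MODULES = {"tiles": tile_routes, "admin": admin_routes}


def build_routes(config):
    """Collect routes only for the groups the config explicitly enables."""
    routes = []
    for name, factory in ROUTE_MODULES.items():
        if config.get(name, False):  # anything not mentioned stays disabled
            routes.extend(factory())
    return routes
```

Treating a missing key as disabled is one way to address the concern about accidentally exposing admin functionality publicly: a public config that never mentions `admin` gets no admin routes.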

Yurik moved this task from To-do to In progress on the Maps-Sprint board.

Today @akosiaris and I had a talk about this issue and the future implementation steps. We decided that @akosiaris will work on the puppet changes needed to separate tilerator into two services - the daemon part (non-service job queue processor), and the admin ui part.

Change 244436 had a related patch set uploaded (by Alexandros Kosiaris):
maps: Add mapsadminui service

https://gerrit.wikimedia.org/r/244436

Change 244437 had a related patch set uploaded (by Alexandros Kosiaris):
tilerator: omit the port argument

https://gerrit.wikimedia.org/r/244437

Change 244437 merged by Alexandros Kosiaris:
tilerator: comment about the port argument

https://gerrit.wikimedia.org/r/244437

Accidentally closed it - should be open until the puppet is updated - https://gerrit.wikimedia.org/r/#/c/244436/

Deployed, works. Thanks!

Change 249501 had a related patch set uploaded (by Yurik):
Allow same perms to tileratorui as tilerator

https://gerrit.wikimedia.org/r/249501

Restricted Application added a subscriber: StudiesWorld.

Change 249501 merged by Alexandros Kosiaris:
Allow same perms to tileratorui as tilerator

https://gerrit.wikimedia.org/r/249501