Change propagation service, phase 1
Closed, ResolvedPublic

Description

We are getting serious about implementing the first phase of the change propagation service, as generally discussed in T102476. In this first phase, the change propagation service should take care of tasks like re-rendering and purging content in response to Event-Platform events. In this first iteration, we are focusing on use cases currently covered by the RESTBaseUpdate extension. Specifically, we are leaving elaborate dependency tracking to a later phase.

Requirements

  • Manages subscriptions, with several end points possibly listening for the same kinds of events.
  • Listens to Kafka events in all topics with subscriptions.
  • For each matching event and handler, constructs and dispatches requests (typically HTTP). With services like RESTBase, Cache-Control headers are typically used to trigger a re-render.
  • Possibly emits events back to the Event-Platform to:
    • signal a resource having changed (possibly triggering other updates)
    • continue processing for large jobs (ex: invalidating millions of pages after a template edit)
    • retry a failed request a limited number of times
    • finally, keep permanently failed requests in a dead-letter queue for later inspection
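
The retry and dead-letter requirements above can be sketched roughly as follows. This is only an illustration under stated assumptions: the `RetryingDispatcher` class, the `retries` field on the event, and the limit of two retries are hypothetical and not part of the task. In a real deployment the in-memory lists would presumably be Kafka topics.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class RetryingDispatcher:
    """Hypothetical helper: retry a failed request a limited number of
    times, then keep the event in a dead-letter list for inspection."""

    send: Callable[[dict], bool]  # returns True if the request succeeded
    max_retries: int = 2          # assumed limit, not taken from the task
    retry_queue: list = field(default_factory=list)
    dead_letters: list = field(default_factory=list)

    def handle(self, event: dict) -> None:
        if self.send(event):
            return
        retries = event.get("retries", 0)
        if retries < self.max_retries:
            # Re-emit a copy of the event with an incremented retry counter.
            self.retry_queue.append({**event, "retries": retries + 1})
        else:
            # Retry budget exhausted: park the event for later inspection.
            self.dead_letters.append(event)

    def drain(self) -> None:
        """Process queued retries until the retry queue is empty."""
        while self.retry_queue:
            self.handle(self.retry_queue.pop(0))
```

Carrying the retry counter inside the event keeps the dispatcher itself stateless, so any node could pick up a retried event.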

Implementation

Phase 0: MVP

  • /sys/queue module
    • Wraps Kafka
    • Start with a static subscription config very similar to @mobrovac's draft:
      • topics
      • request template
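
A static subscription config of this shape (topics plus a request template) might look like the sketch below. Everything concrete here is an assumption made for illustration: the topic name, the URI, the `{name}` placeholder syntax and the `expand` / `requests_for` helpers are hypothetical. This is neither @mobrovac's draft nor RESTBase's actual templating syntax, and placeholder values are inserted without URL encoding.

```python
import re

# Hypothetical static config: topic name, URI and header are illustrative only.
SUBSCRIPTIONS = [
    {
        "topics": ["resource_change"],
        "request": {
            "method": "get",
            "uri": "https://rest.example.org/page/html/{title}",
            "headers": {"cache-control": "no-cache"},
        },
    },
]

_PLACEHOLDER = re.compile(r"\{(\w+)\}")


def expand(template, event):
    """Recursively replace {name} placeholders with values from the event."""
    if isinstance(template, str):
        return _PLACEHOLDER.sub(lambda m: str(event[m.group(1)]), template)
    if isinstance(template, dict):
        return {key: expand(value, event) for key, value in template.items()}
    return template


def requests_for(topic, event, subscriptions=SUBSCRIPTIONS):
    """Build one request per subscription that listens on the given topic."""
    return [
        expand(sub["request"], event)
        for sub in subscriptions
        if topic in sub["topics"]
    ]
```

Calling `requests_for("resource_change", {"title": "Foo"})` yields a single request description whose URI ends in `/page/html/Foo`. Actually sending the request is left out of the sketch.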

Phase 1: Simple local route subscriptions

  • POST /sys/queue/subscribe registration end point
  • Dynamic registration via x-setup-handler POST on startup
    • Event description (topic / other criteria)
    • Optionally, a request template. Default: Send full event to calling route.

Phase 2: Remote route subscriptions

  • Expose a public subscription API mapping to /sys/queue/subscribe.
    • Share these dynamic subscriptions with other change propagation nodes, for example by writing them to a table-storage table with a TTL retention policy and periodically reloading it.
    • Consider basing this API on https://en.wikipedia.org/wiki/PubSubHubbub (spec).
  • In the API RESTBase cluster, set up /sys/queue/subscribe to register subscriptions with the remote change propagation service using a lease system (ex: renew once per hour).
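
A lease system of the kind mentioned above could be sketched as follows. This is a minimal in-memory illustration only: the `LeaseRegistry` class and its method names are hypothetical, and the one-hour TTL simply mirrors the "renew once per hour" example. A shared deployment would presumably keep the leases in table storage with a TTL instead.

```python
import time


class LeaseRegistry:
    """Hypothetical registry of subscriptions that expire unless renewed."""

    def __init__(self, ttl_seconds=3600, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock    # injectable so expiry can be tested
        self._expires = {}    # subscription id -> expiry time

    def register(self, sub_id):
        """Register a subscription; registering again renews its lease."""
        self._expires[sub_id] = self.clock() + self.ttl

    def active(self):
        """Drop expired leases and return the ids that are still valid."""
        now = self.clock()
        self._expires = {
            sub_id: expiry
            for sub_id, expiry in self._expires.items()
            if expiry > now
        }
        return sorted(self._expires)
```

With leases, a subscriber that disappears without unsubscribing is cleaned up automatically once its lease runs out.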

Phase 3: Dependency storage, streaming support

  • Add dependency graph storage & a public API for dependency updates.
  • Possibly, add streaming subscription support.

Event Timeline

GWicke raised the priority of this task to Needs Triage.
GWicke updated the task description.
GWicke added a project: Services.
Restricted Application added subscribers: StudiesWorld, Aklapper. · Nov 6 2015, 12:12 AM
GWicke renamed this task from Change propagation MVP to Change propagation service, phase 1. · Nov 6 2015, 12:14 AM
GWicke set Security to None.
GWicke added a subscriber: Ottomata.
GWicke added a comment. Edited · Nov 6 2015, 7:05 PM

The topic / partition -> node mapping problem should actually be fairly easy to address by using the Kafka HighLevelConsumer. This consumer interface takes care of balancing topic / partition assignment across members of a consumer group.

Edit: Removed the 'cons' entries in the task description to reflect this.

GWicke updated the task description. · Nov 6 2015, 7:06 PM

Less simple to add dependency storage later.

I have looked a bit at hooking up restbase-mod-table-* modules directly into a separate service, and there are no show-stoppers for that: we can load the module just like we do in RESTBase and then use the exposed operations directly. So I'd say adding storage later is as complicated as adding it for the RESTBase-as-a-framework approach.

Declarative way to add subscriptions directly in RESTBase entry points.

If there are two RB instances (the other one being the change-propagation system), then how is this relevant, given that subscriptions would need to be declared in the change-prop config, not the REST API one?

Option of deploying as separate services (change propagation vs. REST API), or as a single service, depending on config.

This is a pro wrt. the current RESTBase deployment, but there is no difference between this and having a separate service for change propagation, as both amount to the same thing: two service instances for two different things.

mobrovac updated the task description. · Nov 11 2015, 4:06 PM
GWicke updated the task description. · Nov 11 2015, 4:10 PM

@mobrovac, I added easy & quick setup to either option, as we have puppet roles for both.

Correct, but I am not convinced that deploying a second RB instance on sc[ab] would be as easy as deploying a separate service based on the service template.

We discussed the deployment question on IRC. service::node should handle any service-runner based service, including services based on RESTBase.

T118401 tracks using service::node for the REST API service (commonly called 'RESTBase').

GWicke added a comment. Edited · Nov 12 2015, 7:27 PM

We discussed this further this morning. Summary:

  • We are leaning towards a consumer group per subscription model (see T118429 for background).
  • The first version should support a static config similar to @mobrovac's proposal, subscribe to Kafka & emit templated events. The implementation can use RESTBase's existing request templating infrastructure.
  • We mostly agreed on splitting out the framework portion of RESTBase into a separate package (T118404), and will likely use it for future REST API needs in the change propagation service. The name of the new package is still to be discussed and decided.

Action items:

  • Marko to prototype a fairly stand-alone node module similar to 'first version' above.
  • Group to decide on a name for the framework package (T118404), and Gabriel to implement.
Gilles added a subscriber: Gilles. · Nov 13 2015, 4:21 PM

A first-version prototype is available at https://github.com/d00rman/restbase-mod-queue-kafka . As discussed, this version creates a new consumer group per topic subscription. I've tested it and, as stated in the Kafka docs, only one consumer process gets each message. Unfortunately, the first subscribed client gets all of the messages for the consumer group. Moreover, that holds true for all of the consumer groups, so one process ends up receiving all of the messages while all of the other processes sit idle.

We will have to resort to a more complex round-robin message-assignment algorithm. One idea is to have a dispatcher process per topic, which would subscribe to the topic and then dispatch its messages to worker processes. This would, however, require a coordination mechanism among all of the service processes involved, which isn't trivial.
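
As a rough illustration of the per-topic dispatcher idea, the sketch below hands each incoming message to the next worker in turn. It is a single-process stand-in only: the `RoundRobinDispatcher` class is hypothetical, workers are plain callables rather than processes, and the cross-process coordination described above is deliberately left out.

```python
from itertools import cycle


class RoundRobinDispatcher:
    """Hypothetical per-topic dispatcher: hands each message to the next
    worker in turn so that no single worker receives everything."""

    def __init__(self, workers):
        if not workers:
            raise ValueError("at least one worker is required")
        self._next_worker = cycle(workers)

    def dispatch(self, message):
        # Pick the next worker in rotation and pass the message to it.
        worker = next(self._next_worker)
        worker(message)
```

With three workers and seven messages, the first worker receives messages 0, 3 and 6, the second 1 and 4, and the third 2 and 5.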

For phase one, we might even go with a simple one-process-per-topic approach. Replicating this behaviour over multiple nodes, coupled with Kafka's rebalancing, would protect against failures. However, it is clear that this approach is not scalable.

mobrovac claimed this task. · Feb 8 2016, 11:38 PM
mobrovac triaged this task as High priority.
mobrovac updated the task description.
GWicke closed this task as Resolved. · Oct 5 2016, 8:48 PM

Phase 1 is complete, and ChangeProp is now handling RESTBase updates, as well as those for other services like ORES.

We decided to not implement remote registration support, as this seemed complex and potentially brittle.