Replay/mirror update requests to RESTBase dev environment
Closed, Resolved (Public)

Description

An environment for RESTBase now exists for proving out large or disruptive changes that require performance characteristics comparable to those of production. One of the first use cases for this environment is testing proposed alternative data models. Seeing an appropriate subset of update requests, with the same data sizes and overwrite distribution as production, would help us determine whether compaction behaves as we have predicted, and would assist in establishing a baseline configuration.

The most straightforward way to accomplish this would be to run a separate instance of change-propagation, configured to propagate changes to the dev environment for an appropriate subset of titles.
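The task does not say how the subset of titles would be chosen. As one hypothetical approach, a stable hash of the title keeps the same titles in the subset across restarts, so their overwrite pattern is preserved. The function name `in_mirror_subset` and the `sample_percent` parameter are assumptions made for this sketch, not part of change-propagation.

```python
import hashlib


def in_mirror_subset(title: str, sample_percent: int = 5) -> bool:
    """Decide whether updates for `title` should be mirrored to dev.

    Hypothetical helper: the decision depends only on the title, so it is
    stable across restarts and across processes.
    """
    if not 0 <= sample_percent <= 100:
        raise ValueError("sample_percent must be between 0 and 100")
    digest = hashlib.sha256(title.encode("utf-8")).digest()
    # Map the title to one of 100 buckets and keep the lowest ones.
    bucket = int.from_bytes(digest[:8], "big") % 100
    return bucket < sample_percent
```

Because the hash is computed per title, every update to a selected title is mirrored, which is what matters when the goal is to reproduce the overwrite distribution rather than to sample individual events.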

When replaying update requests, a corresponding request load will be generated on the Parsoid and API clusters. We will need to establish whether the additional request load will be a problem for these systems. If the additional request load is a problem, one possibility would be the addition of a RESTBase endpoint to emulate Parsoid using cached responses.

See also: T129682: Look into solutions for replaying traffic to testing environment(s)

Event Timeline

Eevans triaged this task as Medium priority. Mar 2 2017, 8:04 PM

A couple of things are also required for the testing CP to work properly:

  1. We need to be able to switch off event production from change-prop completely, via a config stanza in the sys/kafka module.
  2. We need to be able to configure the prefix for the consumer groups. Currently it's static: change-prop-${rule.name}. The dev instance of change-prop should have a different prefix to avoid interfering with production consumer groups.
  3. Currently, when CP starts up, it resumes processing where it left off or, if nothing was committed, it falls back to the latest offset in each topic via the auto.offset.reset=latest property. We've discussed that going through the backlog of events wouldn't allow us to test properly, because of the "do not write if the HTML is the same" optimization in RESTBase. This means that, to test properly, we need to follow the live stream of events. But the test-CP won't run continuously, so when we stop and restart it, we don't want it to process the backlog. I propose switching off commits in the test-CP instance completely, via one more config stanza.
  4. Switch off Logstash logging and metrics.
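Items 1-3 could be modelled roughly as follows. This is only a sketch: `CpSettings`, `consumer_group`, `consumer_config` and `maybe_produce` are hypothetical names, not existing change-propagation code. The Kafka property names (`group.id`, `auto.offset.reset`, `enable.auto.commit`) are standard consumer settings, but a service that commits offsets manually would also have to skip its own commit calls.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CpSettings:
    """Hypothetical knobs for items 1-3; defaults mimic production."""
    group_prefix: str = "change-prop"
    produce_events: bool = True
    commit_offsets: bool = True


def consumer_group(settings: CpSettings, rule_name: str) -> str:
    # Item 2: the prefix is configurable instead of hard-coded.
    return f"{settings.group_prefix}-{rule_name}"


def consumer_config(settings: CpSettings, rule_name: str) -> dict:
    return {
        "group.id": consumer_group(settings, rule_name),
        # With no committed offset, start from the live end of the topic.
        "auto.offset.reset": "latest",
        # Item 3: without commits, every restart begins at the latest offset.
        "enable.auto.commit": settings.commit_offsets,
    }


def maybe_produce(settings: CpSettings, event: dict, outbox: list) -> bool:
    # Item 1: drop outgoing events entirely when production is switched off.
    if settings.produce_events:
        outbox.append(event)
        return True
    return False
```

A dev instance would then be described by something like `CpSettings(group_prefix="change-prop-dev", produce_events=False, commit_offsets=False)`, where the prefix value is again only an example.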

[ ... ]

  4. Switch off Logstash logging and metrics.

Do they need to be switched off? Couldn't we just do something to put them under a different namespace?

Yeah, sure. I just wanted to highlight that we must not forget to do something with them, to avoid messing with production logging.

I would be in favour of logging only locally for CP, if even that.

I'm vacillating on the configuration aspect of this a bit, so I'll open it up to some bikeshedding: do you have any opinions on how all of this should be represented?

I'm not very familiar with change-propagation, but having separate knobs for items 1-3 above seems a little weird. Would it make sense to encapsulate all of this behind a single mirror_mode or dev_mode boolean? Is there any value in having them separately configurable?
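To make the trade-off concrete, here is one possible shape for it: a single `mirror_mode` flag supplies defaults for the three behaviours, while explicit per-knob values still win if present. The function `resolve_settings`, the key names and the `change-prop-mirror` prefix are all assumptions made for illustration, not existing configuration.

```python
def resolve_settings(config: dict) -> dict:
    """Expand a hypothetical `mirror_mode` flag into individual knobs."""
    mirror = bool(config.get("mirror_mode", False))
    defaults = {
        # Mirror mode stops event production and offset commits.
        "produce_events": not mirror,
        "commit_offsets": not mirror,
        # A separate prefix keeps production consumer groups untouched.
        "group_prefix": "change-prop-mirror" if mirror else "change-prop",
    }
    # Explicit per-knob values override whatever the flag implies.
    overrides = {key: config[key] for key in defaults if key in config}
    return {**defaults, **overrides}
```

With this shape, `{"mirror_mode": True}` is enough for the common case, and the separate knobs only matter if someone needs an unusual combination.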

Seems like a good idea. No opinion on the switch name, anything will do fine.

@Eevans, is there anything left to do on this task?

GWicke edited projects, added Services (doing); removed Services.