
Determine if we need to communicate anything special about forward and backwards compatibility during train experiment week
Closed, Resolved, Public

Event Timeline

dduvall triaged this task as Medium priority. Mar 16 2022, 9:47 PM

This is still unknown. I've reached out to PET and we have @DAlangi_WMF joining us during deployment windows for the week, but it sounds like @daniel might be better placed to answer questions regarding forward/backward compatibility features for the week between branch cuts.

From the perspective of the content transform team:

  1. Independent of the /pace/ of the trains, I think the "number of different versions simultaneously live" is an issue. As long as there are only two versions in flight at any point, we won't have to make any changes. If there could be more than (N, N+1) active at a given time, then procedures for doing various upgrades involving (say) ParserCache changes, or dealing with dependency cycles between core and parsoid, might have to be re-examined.
  2. That said, we've got T298046: Provide a way to run Parsoid against "latest git HEAD of mediawiki-vendor" during round trip testing on scandium as a long-standing bug. Parsoid doesn't automatically fork from master on every train (although that might be an interesting future experiment); instead, we do some expensive/long-running regression tests against existing content on scandium before manually tagging a new version of the parsoid library and 'releasing it' (by patching composer.json in mediawiki-vendor to name the new version). Those long-running tests use a prod machine which is subject to the train deploy, so usually we do them Thursday/Friday/Monday after the train has fully run. With more frequent trains, our regression tests might actually end up running against multiple different versions of core, which could be an issue.
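For illustration only, the "at most (N, N+1) live" condition from point 1 can be written as a small check. This is a minimal sketch, not part of any deployment tooling: the function names are hypothetical, the version format is assumed to be `<release>-wmf.<n>` as in `1.39.0-wmf.n`, and two branches from different release series are treated as non-adjacent for simplicity.

```python
import re

# Assumed branch format, e.g. "1.39.0-wmf.4" (hypothetical helper, not real tooling).
WMF_VERSION = re.compile(r"^(\d+\.\d+\.\d+)-wmf\.(\d+)$")


def parse_wmf(version):
    """Split a branch name into (release series, wmf branch number)."""
    match = WMF_VERSION.match(version)
    if match is None:
        raise ValueError(f"not a wmf branch version: {version!r}")
    return match.group(1), int(match.group(2))


def within_two_version_window(live_versions):
    """Return True if the live versions are at most an (N, N+1) pair."""
    parsed = {parse_wmf(v) for v in live_versions}
    if len(parsed) <= 1:
        return True
    if len(parsed) > 2:
        return False
    (series_a, n_a), (series_b, n_b) = sorted(parsed)
    # Simplification: only branches within the same release series count as adjacent.
    return series_a == series_b and n_b - n_a == 1
```

A call such as `within_two_version_window(["1.39.0-wmf.1", "1.39.0-wmf.2"])` passes the check, while adding a third branch, or skipping one, does not.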

Thanks for that explanation, @cscott!

  1. Independent of the /pace/ of the trains, I think the "number of different versions simultaneously live" is an issue. As long as there are only two versions in flight at any point, we won't have to make any changes. If there could be more than (N, N+1) active at a given time, then procedures for doing various upgrades involving (say) ParserCache changes, or dealing with dependency cycles between core and parsoid, might have to be re-examined.

Just wanted to confirm that our experiment should not result in more than two versions (1.39.0-wmf.n, 1.39.0-wmf.n+1) active at any time.

  2. That said, we've got T298046: Provide a way to run Parsoid against "latest git HEAD of mediawiki-vendor" during round trip testing on scandium as a long-standing bug. Parsoid doesn't automatically fork from master on every train (although that might be an interesting future experiment); instead, we do some expensive/long-running regression tests against existing content on scandium before manually tagging a new version of the parsoid library and 'releasing it' (by patching composer.json in mediawiki-vendor to name the new version). Those long-running tests use a prod machine which is subject to the train deploy, so usually we do them Thursday/Friday/Monday after the train has fully run. With more frequent trains, our regression tests might actually end up running against multiple different versions of core, which could be an issue.

Can you point me to more information about these tests?

  1. Where exactly they run, on what schedule (automated or manual)?
  2. Where is the code for the test suites, repo, branch, etc.?
  3. How does the test suite get updated?

And @daniel if you've had a moment to think about the possible side-effects of next week's experiment, we'd love to hear from you or someone on your team as well. Thanks, all!

On 17.03.22 at 23:23, dduvall wrote:

And @daniel https://phabricator.wikimedia.org/p/daniel/ if you've had a
moment to think about the possible side-effects of next week's experiment,
we'd love to hear from you or someone on your team as well. Thanks, all!

I didn't have time to dig in properly, but offhand I'll just echo what Scott
said: as long as we don't have more than two versions of MW active at any given
time, and we don't end up rolling back by more than one version, I can't think
of anything that would be affected.

Great. Thank you both for the feedback.

and we don't end up rolling back by more than one version, I can't think
of anything that would be affected.

This is important to note! We had discussed whether this week's version would serve as a rollback target in case anything serious went wrong and wasn't noticed until more than one additional version had rolled. Can you briefly explain why this is? (cc @thcipriani)

On 17.03.2022 at 23:58, dduvall wrote:

View Task https://phabricator.wikimedia.org/T303759

Great. Thank you both for the feedback.

In T303759#7787458 <https://phabricator.wikimedia.org/T303759#7787458>,
@daniel <https://phabricator.wikimedia.org/p/daniel/> wrote:

and we don't end up rolling back by more than one version, I can't think
of anything that would be affected.

This is important to note! We had discussed whether this week's version would
serve as a rollback target in case anything serious went wrong and wasn't
noticed until more than one additional version had rolled. Can you briefly
explain why this is? (cc @thcipriani
https://phabricator.wikimedia.org/p/thcipriani/)

When performing changes that require forward-compatibility to protect against
breakage when rolling back, we have been introducing the forward compat code in
one train, then starting to write new data in the next train. If we can roll
back more than one train, then this practice has to be changed to take this into
account (e.g. introduce forward compat code, then start writing new data TWO
trains later).

In practice, it probably doesn't make much of a difference. I'm not aware of any
change of this kind being in flight right now, and I also think that "one train"
has come to mean "at least one week". But I thought it was something to be aware
of and communicate.
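As a rough sketch of the practice described above (forward-compat reader in one train, writer a train later), the timing rule can be expressed as follows. Trains are modeled here as consecutive integers, and all function and parameter names are hypothetical; nothing below reflects actual MediaWiki or deployment code.

```python
def earliest_safe_write_train(reader_train, max_rollback_depth):
    """First train in which writing new-format data is rollback-safe.

    reader_train: train that introduced the forward-compat (reader) code.
    max_rollback_depth: how many trains back a rollback may go (assumed >= 1).
    """
    if max_rollback_depth < 1:
        raise ValueError("max_rollback_depth must be at least 1")
    return reader_train + max_rollback_depth


def rollback_is_safe(reader_train, writer_train, current_train, rollback_depth):
    """Check whether rolling back from current_train leaves readable data."""
    target_train = current_train - rollback_depth
    # New-format data only exists once the writer train has gone live.
    new_data_exists = current_train >= writer_train
    # The rollback target copes with new data only if it has the reader code.
    target_can_read = target_train >= reader_train
    return (not new_data_exists) or target_can_read
```

With the reader in train 10 and the writer in train 11, a one-train rollback from train 11 is safe but a two-train rollback is not; moving the writer to train 12 (that is, `earliest_safe_write_train(10, 2)`) makes the two-train rollback safe as well.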

When performing changes that require forward-compatibility to protect against
breakage when rolling back, we have been introducing the forward compat code in
one train, then starting to write new data in the next train. If we can roll
back more than one train, then this practice has to be changed to take this into
account (e.g. introduce forward compat code, then start writing new data TWO
trains later).

I'll make sure this is captured in our retrospective.

In practice, it probably doesn't make much of a difference. I'm not aware of any
change of this kind being in flight right now, and I also think that "one train"
has come to mean "at least one week". But I thought it was something to be aware
of and communicate.

Absolutely. Thank you for clarifying that.