Move Termbox SSR for Beta Wikidata into deployment-prep project
Closed, ResolvedPublic

Description

Currently, the server-side rendering for the mobile termbox on Beta Wikidata happens on ssr-termbox.wmflabs.org (config), which is a proxy for wikidata-misc.wikidata-dev.eqiad1.wikimedia.cloud. This is a Debian Stretch VM, which should be removed by May 2022; it’s also in the wikidata-dev project, rather than deployment-prep like (most of?) the rest of the Beta infrastructure. This might have made sense while the new termbox was under heavy development at WMDE, but these days it’s fairly stable, and I think it should be in deployment-prep (to which WMDE developers can still be added, of course).

Event Timeline

Restricted Application added a subscriber: Aklapper.

Status Update:

  1. The deployment-prep instance was created successfully under the name deployment-termbox-ssr
  2. The Ansible playbook seemingly ran successfully
  3. A security group and temporary web proxy were created and attached to the instance
  4. Testing the service failed and revealed that the SSR service never really started, because published wikibase-termbox images are no longer tagged with the latest commit hash

Next steps:

  1. Update logic in service shell scripts to compare the local image hash of the "latest" tag with the hash of the remote image tagged with "latest" (instead of the commit hash) to determine when the service needs to be updated or restarted
  2. Rerun the Ansible playbook to ensure the service is properly instantiated
  3. Retry testing the SSR service
  4. If successful, remove the web proxy from old instance
  5. Add correct proxy to the new image and delete the temporary proxy
  6. Test again, and remove wikidata-misc instance, if successful
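
For illustration, the check described in step 1 could look roughly like the sketch below. It is not the actual termbox updater script (that one is a shell script); it swaps in a "pull, then compare" approach: record the local image ID behind the "latest" tag, pull the tag, and see whether the ID changed. The `docker image inspect --format` and `docker pull --quiet` calls are standard Docker CLI; the function names and the injectable `run` parameter are made up for this example.

```python
import subprocess
from typing import Callable, Optional, Sequence

# Type of a function that runs a docker subcommand and returns its stdout.
DockerRunner = Callable[[Sequence[str]], str]


def run_docker(args: Sequence[str]) -> str:
    """Run a docker CLI command and return its trimmed standard output."""
    result = subprocess.run(
        ["docker", *args], check=True, capture_output=True, text=True
    )
    return result.stdout.strip()


def local_image_id(image: str, run: DockerRunner = run_docker) -> Optional[str]:
    """Return the ID of the locally stored image, or None if it is absent."""
    try:
        return run(["image", "inspect", "--format", "{{.Id}}", image])
    except subprocess.CalledProcessError:
        # docker exits non-zero when the image does not exist locally.
        return None


def latest_image_changed(image: str, run: DockerRunner = run_docker) -> bool:
    """Pull the tag and report whether the local image ID changed."""
    before = local_image_id(image, run)
    run(["pull", "--quiet", image])
    after = local_image_id(image, run)
    return before != after
```

A caller would restart the service only when `latest_image_changed(...)` returns True for the image reference it runs with the "latest" tag.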

Change 799269 had a related patch set uploaded (by Itamar Givon; author: Itamar Givon):

[wikibase/termbox@master] Update docker package name

https://gerrit.wikimedia.org/r/799269

Change 799269 merged by jenkins-bot:

[wikibase/termbox@master] Update docker package name

https://gerrit.wikimedia.org/r/799269

Change 799288 had a related patch set uploaded (by Itamar Givon; author: Itamar Givon):

[wikibase/termbox@master] Compare current and latest image hash

https://gerrit.wikimedia.org/r/799288

Status Update:

  1. Logic in shell scripts has been updated to allow fetching of images by 'latest' tag rather than commit hash
  2. Ansible playbook re-ran successfully
  3. SSR and updater services are up and running successfully (See: https://test-termbox.wmflabs.org/_info)
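
A quick way to repeat the check from point 3 is sketched below. It only assumes that a running service answers the `/_info` URL with HTTP 200; the function name and the injectable `opener` parameter are made up for this example, and the response body is not inspected.

```python
import urllib.request
from typing import Callable


def service_responds(
    url: str,
    opener: Callable = urllib.request.urlopen,
    timeout: float = 10.0,
) -> bool:
    """Return True if a GET request to the URL answers with HTTP 200."""
    try:
        with opener(url, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # Covers URLError/HTTPError as well as timeouts and refused connections.
        return False
```

For example, `service_responds("https://test-termbox.wmflabs.org/_info")` would be expected to return True while the temporary proxy and the service are both up.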

Next steps:

  1. Merge script changes to code repository
  2. Remove the web proxy from old instance
  3. Add correct proxy to the new image and delete the temporary proxy
  4. Test again, and remove wikidata-misc instance, if successful

This new instance is failing to run Puppet:

taavi@deployment-termbox-ssr:~$ sudo run-puppet-agent
Warning: Unable to fetch my node definition, but the agent run will continue:
Warning: SSL_connect returned=1 errno=0 state=error: certificate verify failed (self signed certificate in certificate chain): [self signed certificate in certificate chain for /CN=Puppet CA: deployment-puppetmaster03.deployment-prep.eqiad.wmflabs]
Info: Retrieving pluginfacts
Error: /File[/var/lib/puppet/facts.d]: Failed to generate additional resources using 'eval_generate': SSL_connect returned=1 errno=0 state=error: certificate verify failed (self signed certificate in certificate chain): [self signed certificate in certificate chain for /CN=Puppet CA: deployment-puppetmaster03.deployment-prep.eqiad.wmflabs]
Error: /File[/var/lib/puppet/facts.d]: Could not evaluate: Could not retrieve file metadata for puppet:///pluginfacts: SSL_connect returned=1 errno=0 state=error: certificate verify failed (self signed certificate in certificate chain): [self signed certificate in certificate chain for /CN=Puppet CA: deployment-puppetmaster03.deployment-prep.eqiad.wmflabs]
Info: Retrieving plugin
Error: /File[/var/lib/puppet/lib]: Failed to generate additional resources using 'eval_generate': SSL_connect returned=1 errno=0 state=error: certificate verify failed (self signed certificate in certificate chain): [self signed certificate in certificate chain for /CN=Puppet CA: deployment-puppetmaster03.deployment-prep.eqiad.wmflabs]
Error: /File[/var/lib/puppet/lib]: Could not evaluate: Could not retrieve file metadata for puppet:///plugins: SSL_connect returned=1 errno=0 state=error: certificate verify failed (self signed certificate in certificate chain): [self signed certificate in certificate chain for /CN=Puppet CA: deployment-puppetmaster03.deployment-prep.eqiad.wmflabs]
Error: Could not retrieve catalog from remote server: SSL_connect returned=1 errno=0 state=error: certificate verify failed (self signed certificate in certificate chain): [self signed certificate in certificate chain for /CN=Puppet CA: deployment-puppetmaster03.deployment-prep.eqiad.wmflabs]
Warning: Not using cache on failed catalog
Error: Could not retrieve catalog; skipping run
Error: Could not send report: SSL_connect returned=1 errno=0 state=error: certificate verify failed (self signed certificate in certificate chain): [self signed certificate in certificate chain for /CN=Puppet CA: deployment-puppetmaster03.deployment-prep.eqiad.wmflabs]

Please fix. All deployment-prep instances must be fully configured via Puppet and not by hand / separate Ansible cookbooks.

Change 799380 had a related patch set uploaded (by Itamar Givon; author: Itamar Givon):

[wikibase/termbox@master] Update instance URL

https://gerrit.wikimedia.org/r/799380

Please fix. All deployment-prep instances must be fully configured via Puppet and not by hand / separate Ansible cookbooks.

@Majavah thanks for the heads-up, can you please provide some documentation on how to set this up? I've looked at the documentation on wikitech, and both pages I found weren't really forthcoming in how and where I should create these puppet configurations:

Change 799288 merged by jenkins-bot:

[wikibase/termbox@master] Compare current and latest image hash

https://gerrit.wikimedia.org/r/799288

Change 799380 merged by jenkins-bot:

[wikibase/termbox@master] Update instance URL

https://gerrit.wikimedia.org/r/799380

After a quick discussion with @Jakob_WMDE, I think we should consider if we even require termbox SSR to be enabled in beta for the time being. This system was only meant to be maintained for product preview purposes while termbox was under active development, which, as stated in the ticket description, it currently is not. Therefore, the easiest solution might be to turn off mobile termbox in beta for the time being, and invest resources in re-instantiating it when it is needed for active development once more (for which case we already have the Ansible cookbook).

Please fix. All deployment-prep instances must be fully configured via Puppet and not by hand / separate Ansible cookbooks.

@Majavah thanks for the heads-up, can you please provide some documentation on how to set this up? I've looked at the documentation on wikitech, and both pages I found weren't really forthcoming in how and where I should create these puppet configurations:

First of all, all instances in the deployment-prep project must be configured to use the project-local puppetmaster (currently deployment-puppetmaster04) following these instructions: https://wikitech.wikimedia.org/wiki/Help:Standalone_puppetmaster#Step_2:_Setup_a_puppet_client

Having the instance configured by Puppet can be a bit more complicated if there aren't existing puppet manifests for the service. If it runs in docker, you can use role::beta::docker_services (look at any of the deployment-docker-* instances for inspiration). For non-dockerized services you would need to write actual puppet manifests, which can be complicated if you aren't familiar with Puppet already. https://wikitech.wikimedia.org/wiki/Puppet_coding and existing examples in the puppet repo can be helpful here. If this is intended to run in production one day, you'll want to talk to SRE for any architectural considerations.

After a quick discussion with @Jakob_WMDE, I think we should consider if we even require termbox SSR to be enabled in beta for the time being … the easiest solution might be to turn off mobile termbox in beta for the time being

Do you mean turning off server-side rendering of the new termbox (so no-JS users don’t see anything), or turning off the new termbox completely and going back to the old one? (To me the former sounds better.)

In T304328#7967941, @Majavah wrote:

If this is intended to run in production one day

It is running in production, but using the deployment pipeline (mainly configured under helmfile.d in deployment-charts, I believe), which I don’t think has a direct equivalent in Beta?

In T304328#7967941, @Majavah wrote:

If it runs in docker, you can use role::beta::docker_services (look at any of the deployment-docker-* instances for inspiration)

Thank you @Majavah for pointing us in the right direction, but I think we will actually shut down and remove this instance after all.

@Lucas_Werkmeister_WMDE I mean shutting the current running instance - wikidata-misc - down (as it seems as though it was not updated automatically for some time now anyhow), and setting wmgWikibaseUseSSRTermbox to false for beta. I believe this means that mobile termbox will then be disabled in beta. I'm not sure there is another way that doesn't require us to spend more time on this, as fixing termbox on beta is not a high priority at the moment, from what I understand. Are there any other needs for the mobile termbox to be available on beta at the moment, apart from what the initial dev team intended it to be (a quick preview for product purposes)?

In any case, mobile termbox will still be available to test locally by using the provided docker images or on test.wikidata, in case we need to perform bugfixes.

In light of the decision above, the next steps for this are:

  • Shut down and remove the termbox-ssr instance from deployment-prep
  • Update documentation and instructions on starting a new instance (for future reference, when development on this feature restarts)
  • Update wmf-config to turn off mobile termbox on beta
  • Shut down and remove wikidata-misc instance from wikidata-dev

@Lucas_Werkmeister_WMDE I mean shutting the current running instance - wikidata-misc - down (as it seems as though it was not updated automatically for some time now anyhow), and setting wmgWikibaseUseSSRTermbox to false for beta. I believe this means that mobile termbox will then be disabled in beta.

I don’t think that’s what it actually means. If I understand correctly, that will only disable server-side rendering of the new / mobile termbox, but it will still remain enabled overall. Users with JavaScript will see the termbox after a brief break (client-side rendering); users without JavaScript won’t see a termbox at all. This was recently decided to be acceptable for the default Wikibase configuration (T292962), so I assume it’s acceptable for Beta Wikidata as well.
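
The outcomes described above can be summarised as a small decision table. The sketch below is only an illustrative model of what a user would see, not Wikibase code; the enum, the function and its parameter names are made up for this example and do not correspond to actual configuration variables.

```python
from enum import Enum


class TermboxView(Enum):
    OLD_TERMBOX = "old termbox"
    SERVER_RENDERED = "new termbox, rendered on the server"
    CLIENT_RENDERED = "new termbox, rendered in the browser after a brief delay"
    NOT_SHOWN = "no termbox"


def termbox_view(
    new_termbox_enabled: bool, ssr_available: bool, has_javascript: bool
) -> TermboxView:
    """Model which termbox a user sees under the given conditions."""
    if not new_termbox_enabled:
        # New termbox turned off completely: fall back to the old one.
        return TermboxView.OLD_TERMBOX
    if ssr_available:
        return TermboxView.SERVER_RENDERED
    if has_javascript:
        # Without SSR, the termbox only appears once client-side code runs.
        return TermboxView.CLIENT_RENDERED
    return TermboxView.NOT_SHOWN
```

With SSR switched off but the new termbox still enabled, `termbox_view(True, False, True)` gives the client-rendered case and `termbox_view(True, False, False)` the case where no termbox is shown.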

Perfect, in that case I will create a patch to change this configuration, thanks.

Change 802769 had a related patch set uploaded (by Itamar Givon; author: Itamar Givon):

[wikibase/termbox@master] Update SSR for beta deployment instructions

https://gerrit.wikimedia.org/r/802769

Change 802770 had a related patch set uploaded (by Itamar Givon; author: Itamar Givon):

[operations/mediawiki-config@master] Turn Wikbase termbox SSR off for beta wikidata

https://gerrit.wikimedia.org/r/802770

Change 802769 merged by jenkins-bot:

[wikibase/termbox@master] Update SSR for beta deployment instructions

https://gerrit.wikimedia.org/r/802769

Change 803494 had a related patch set uploaded (by Lucas Werkmeister (WMDE); author: Lucas Werkmeister (WMDE)):

[operations/mediawiki-config@master] Rename wmgWikibaseUseSSRTermbox to wmgWikibaseTermboxEnabled (1/3)

https://gerrit.wikimedia.org/r/803494

Change 803495 had a related patch set uploaded (by Lucas Werkmeister (WMDE); author: Lucas Werkmeister (WMDE)):

[operations/mediawiki-config@master] Rename wmgWikibaseUseSSRTermbox to wmgWikibaseTermboxEnabled (2/3)

https://gerrit.wikimedia.org/r/803495

Change 803496 had a related patch set uploaded (by Lucas Werkmeister (WMDE); author: Lucas Werkmeister (WMDE)):

[operations/mediawiki-config@master] Rename wmgWikibaseUseSSRTermbox to wmgWikibaseTermboxEnabled (3/3)

https://gerrit.wikimedia.org/r/803496

Change 803497 had a related patch set uploaded (by Lucas Werkmeister (WMDE); author: Lucas Werkmeister (WMDE)):

[operations/mediawiki-config@master] Separate wmgWikibaseTermboxEnabled and wmgWikibaseSSRTermboxServerUrl

https://gerrit.wikimedia.org/r/803497

Change 803498 had a related patch set uploaded (by Lucas Werkmeister (WMDE); author: Lucas Werkmeister (WMDE)):

[operations/mediawiki-config@master] Unconfigure wmgWikibaseSSRTermboxServerUrl on Beta

https://gerrit.wikimedia.org/r/803498

@Lucas_Werkmeister_WMDE Thank you for all the patches, but I am currently assigned to this task, and I'm considering other, additional, approaches.

Alright, sure. Though only the last of my changes actually creates a difference in effect, and I think the other four could still be useful as refactorings to make the config less confusing.

Change 803494 merged by jenkins-bot:

[operations/mediawiki-config@master] Rename wmgWikibaseUseSSRTermbox to wmgWikibaseTermboxEnabled (1/3)

https://gerrit.wikimedia.org/r/803494

Change 803495 merged by jenkins-bot:

[operations/mediawiki-config@master] Rename wmgWikibaseUseSSRTermbox to wmgWikibaseTermboxEnabled (2/3)

https://gerrit.wikimedia.org/r/803495

Change 807148 had a related patch set uploaded (by Urbanecm; author: Urbanecm):

[operations/mediawiki-config@master] Revert "Rename wmgWikibaseUseSSRTermbox to wmgWikibaseTermboxEnabled (2/3)"

https://gerrit.wikimedia.org/r/807148

Change 807149 had a related patch set uploaded (by Urbanecm; author: Urbanecm):

[operations/mediawiki-config@master] Revert "Rename wmgWikibaseUseSSRTermbox to wmgWikibaseTermboxEnabled (1/3)"

https://gerrit.wikimedia.org/r/807149

Change 807149 merged by Urbanecm:

[operations/mediawiki-config@master] Revert "Rename wmgWikibaseUseSSRTermbox to wmgWikibaseTermboxEnabled (1/3)"

https://gerrit.wikimedia.org/r/807149

Change 807148 merged by Urbanecm:

[operations/mediawiki-config@master] Revert "Rename wmgWikibaseUseSSRTermbox to wmgWikibaseTermboxEnabled (2/3)"

https://gerrit.wikimedia.org/r/807148

Change 807254 had a related patch set uploaded (by Lucas Werkmeister (WMDE); author: Lucas Werkmeister (WMDE)):

[operations/mediawiki-config@master] Rename wmgWikibaseUseSSRTermbox to wmgWikibaseTermboxEnabled (1/3)

https://gerrit.wikimedia.org/r/807254

Change 807255 had a related patch set uploaded (by Lucas Werkmeister (WMDE); author: Lucas Werkmeister (WMDE)):

[operations/mediawiki-config@master] Rename wmgWikibaseUseSSRTermbox to wmgWikibaseTermboxEnabled (2/3)

https://gerrit.wikimedia.org/r/807255

Change 807254 merged by jenkins-bot:

[operations/mediawiki-config@master] Rename wmgWikibaseUseSSRTermbox to wmgWikibaseTermboxEnabled (1/3)

https://gerrit.wikimedia.org/r/807254

Mentioned in SAL (#wikimedia-operations) [2022-06-22T13:14:18Z] <lucaswerkmeister-wmde@deploy1002> Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:807254|Rename wmgWikibaseUseSSRTermbox to wmgWikibaseTermboxEnabled (1/3) (T304328)]] (duration: 03m 35s)

Change 807255 merged by jenkins-bot:

[operations/mediawiki-config@master] Rename wmgWikibaseUseSSRTermbox to wmgWikibaseTermboxEnabled (2/3)

https://gerrit.wikimedia.org/r/807255

Mentioned in SAL (#wikimedia-operations) [2022-06-22T13:21:43Z] <lucaswerkmeister-wmde@deploy1002> Synchronized wmf-config/Wikibase.php: Config: [[gerrit:807255|Rename wmgWikibaseUseSSRTermbox to wmgWikibaseTermboxEnabled (2/3) (T304328)]] (duration: 03m 35s)

Change 803496 merged by jenkins-bot:

[operations/mediawiki-config@master] Rename wmgWikibaseUseSSRTermbox to wmgWikibaseTermboxEnabled (3/3)

https://gerrit.wikimedia.org/r/803496

Mentioned in SAL (#wikimedia-operations) [2022-06-22T13:29:59Z] <lucaswerkmeister-wmde@deploy1002> Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:803496|Rename wmgWikibaseUseSSRTermbox to wmgWikibaseTermboxEnabled (3/3) (T304328)]] (1/2) (duration: 03m 35s)

Mentioned in SAL (#wikimedia-operations) [2022-06-22T13:33:52Z] <lucaswerkmeister-wmde@deploy1002> Synchronized wmf-config/InitialiseSettings-labs.php: Config: [[gerrit:803496|Rename wmgWikibaseUseSSRTermbox to wmgWikibaseTermboxEnabled (3/3) (T304328)]] (2/2) (duration: 03m 39s)

Change 802770 abandoned by Itamar Givon:

[operations/mediawiki-config@master] Turn Wikbase termbox SSR off for beta wikidata

Reason:

Superseded by Icf777669db59079e47fa7a5de47fb7382a08edb7

https://gerrit.wikimedia.org/r/802770

Change 803497 merged by jenkins-bot:

[operations/mediawiki-config@master] Separate wmgWikibaseTermboxEnabled and wmgWikibaseSSRTermboxServerUrl

https://gerrit.wikimedia.org/r/803497

Mentioned in SAL (#wikimedia-operations) [2022-06-27T13:40:08Z] <lucaswerkmeister-wmde@deploy1002> Synchronized wmf-config/Wikibase.php: Config: [[gerrit:803497|Separate wmgWikibaseTermboxEnabled and wmgWikibaseSSRTermboxServerUrl (T304328)]] (duration: 03m 27s)

Change 803498 merged by jenkins-bot:

[operations/mediawiki-config@master] Unconfigure wmgWikibaseSSRTermboxServerUrl on Beta

https://gerrit.wikimedia.org/r/803498

Mentioned in SAL (#wikimedia-operations) [2022-06-27T13:51:18Z] <lucaswerkmeister-wmde@deploy1002> Synchronized wmf-config/InitialiseSettings-labs.php: Config: [[gerrit:803498|Unconfigure wmgWikibaseSSRTermboxServerUrl on Beta (T304328)]] (duration: 03m 20s)

Status update

  • Update wmf-config to turn off mobile termbox on beta
  • Shut down and remove wikidata-misc instance from wikidata-dev

As the machine has been shut down for almost 24 hours now, and there are no apparent errors in the beta-cluster logs, I will proceed to delete the wikidata-misc instance.

ItamarWMDE moved this task from Review to Done on the User-ItamarWMDE board.