
Set up some beta cluster wikis with different registrable domain
Open, Needs Triage, Public

Description

All beta cluster wikis have the same registrable domain (wmflabs.org). That's a problem for testing T348388: Use central login wiki for login (SUL3) as some browser limitations on cross-wiki requests only apply to requests to a different registrable domain. We should have, at the very least, two beta wikis under a different registrable domain that they share with each other.

TODO:

Event Timeline

@taavi @Tgr do we have any domains that we could use or do we need to purchase those first?

beta.wmcloud.org is already delegated to this project, and Cloud-VPS admins can delegate additional wmcloud.org subdomains (which is on the public suffix list) if necessary.
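To make the "registrable domain" distinction concrete, here is a minimal sketch of how a browser-style eTLD+1 lookup would treat the two domains. It assumes a tiny hand-picked suffix set in place of the real public suffix list, and the hostnames and the `registrable_domain` / `same_site` helpers are illustrative names only, not anything that exists in our tooling.

```python
# Hypothetical, hand-picked subset of the public suffix list (assumption:
# wmcloud.org is listed, wmflabs.org is not, as described in this task).
PUBLIC_SUFFIXES = {"org", "wmcloud.org"}


def registrable_domain(host: str) -> str:
    """Return the longest matching public suffix plus one more label (eTLD+1)."""
    labels = host.lower().rstrip(".").split(".")
    # Walk from the longest candidate suffix to the shortest.
    for i in range(len(labels)):
        suffix = ".".join(labels[i:])
        if suffix in PUBLIC_SUFFIXES:
            if i == 0:
                raise ValueError(f"{host} is itself a public suffix")
            return ".".join(labels[i - 1:])
    raise ValueError(f"no known public suffix for {host}")


def same_site(a: str, b: str) -> bool:
    """True when both hosts share a registrable domain."""
    return registrable_domain(a) == registrable_domain(b)


if __name__ == "__main__":
    for host in (
        "en.wikipedia.beta.wmflabs.org",
        "test2.wikipedia.beta.wmcloud.org",
        "pl.wikivoyage.beta.wmcloud.org",
    ):
        print(host, "->", registrable_domain(host))
```

Under these assumptions, two wikis under beta.wmcloud.org are same-site with each other but cross-site relative to anything under wmflabs.org, which is the setup the description asks for.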

As I try to help plan this out, I have accumulated some questions.

  • Can we add the new domain right to the unified cert for acme-chief for deployment-prep, or is it better to have a separate cert entry in the config?
  • Can we use the current cache-text08 to handle requests to wikis with the new domain, or should we spin up another cache-text instance?
  • Is it feasible to move an existing wiki family to the new domain? en.wikinews seems like a good candidate since it is the only one in the wikinews family, if it's not being used for end to end testing.
  • Is there anything we'll need to do for routing to make sure external packets to a host using the new domain get to the right place?
  • We'll need a second wiki (new) for full CentralAuth testing; are the instructions at https://wikitech.wikimedia.org/wiki/Nova_Resource:Deployment-prep/Add_a_wiki still good? If not, what should we be aware of?
  • I guess we need to add the new domain to the setting for $wmgUseCORS in CommonSettings-labs.php? Where else should we be checking?
  • Probably other things here but that's the stuff in my notes.

  • Can we use the current cache-text08 to handle requests to wikis with the new domain, or should we spin up another cache-text instance?

I don't think Varnish cares about the domain (beyond it being part of the URL). It shouldn't be any different from production where any Varnish box can cache for any wiki.

  • Is it feasible to move an existing wiki family to the new domain? en.wikinews seems like a good candidate since it is the only one in the wikinews family, if it's not being used for end to end testing.

Cached links will point to the wrong place so we'll probably have to flush the parser cache and the Varnish cache. Other than that I am not aware of complications (which doesn't say much).

  • I guess we need to add the new domain to the setting for $wmgUseCORS in CommonSettings-labs.php? Where else should we be checking?

Probably just grep puppet and mw-config for wmflabs.org and see if anything seems relevant.
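For anyone who wants the results as structured data rather than raw grep output, a rough equivalent could look like the sketch below. The `find_mentions` helper and the checkout paths in the usage note are hypothetical; plain `grep -rn` over the two repositories does the same job.

```python
import os


def find_mentions(root, needle="wmflabs.org"):
    """Return (path, line number, line) for every line under root containing needle."""
    hits = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Skip version-control metadata so packed objects are not scanned.
        dirnames[:] = [d for d in dirnames if d != ".git"]
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, encoding="utf-8", errors="replace") as fh:
                    for lineno, line in enumerate(fh, start=1):
                        if needle in line:
                            hits.append((path, lineno, line.rstrip("\n")))
            except OSError:
                # Unreadable files (broken symlinks, permissions) are ignored.
                continue
    return hits
```

Usage would be something like `find_mentions("/path/to/puppet")` and `find_mentions("/path/to/mediawiki-config")`, with the paths replaced by wherever the local checkouts live.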

I'm going to start adding items that need to be done. @taavi you mentioned that beta.wmcloud.org is already designated for this. But do you mean that we can create a wiki with a beta.wmcloud.org URL, or are we talking about creating a subdomain under beta.wmcloud.org?

  • Is it feasible to move an existing wiki family to the new domain? en.wikinews seems like a good candidate since it is the only one in the wikinews family, if it's not being used for end to end testing.

I gave it some thought, and because we're creating this just to test account creation and auth mechanisms, we don't need a wiki with content. I'm afraid that repurposing an existing wiki might cause us trouble in the future (breaking someone's workflow by removing a wiki they use).

I'm going to start adding items that need to be done. @taavi you mentioned that beta.wmcloud.org is already designated for this. But do you mean that we can create a wiki with a beta.wmcloud.org URL, or are we talking about creating a subdomain under beta.wmcloud.org?

Both are possible. The domain is delegated to the project, which means that DNS records for it and any subdomains can be modified via Horizon (DNSZones), acme-chief can issue certificates, etc.

I would stick to beta.wmcloud.org as it nicely follows the {LANG}.{PROJECT}.org schema. Now I wonder, do we need an additional domain too?

@Ariel you mentioned we would need "a second wiki". Did you mean that we need beta.wmcloud.org and one more, or is beta.wmcloud.org alone enough as the second wiki?

My current understanding is that we want to make sure that autologin works between https://beta.wmflabs.org and https://beta.wmcloud.org, e.g. you log in on one and you're automatically logged in on the other. @Tgr can you confirm?

We may want to test the behaviour when going from logged in on a wiki on beta.wmflabs.org (let's say en.wikipedia) and then visiting some-language.some-wiki.beta.wmcloud.org which is not designated as the "representative wiki" for that wiki family, and see if the behaviour is different from visiting the "representative wiki" immediately after login on en.wp.beta.wmflabs.org. These scenarios behave differently for me in production.

I spoke with @Urbanecm_WMF about creating new wikis and I learned that we want to keep the beta cluster wikis similar to production wikis. In other words, if we want to create a wiki then it should be something that is not available on the beta cluster yet but has a production equivalent. It is not desirable to have beta-wikis that do not have a production equivalent.

Additionally, we might want to migrate more wikis to wmcloud in the future, and creating a wiki that already exists on beta cluster is not recommended - as it is tricky to delete a wiki. Therefore I cannot create a second English Wikipedia on wmcloud.org. Maybe we could merge them, but it would still be tricky and it's better to avoid it. T

We agreed that the best route for this ticket would be to create two wikis:

@taavi - quick question - do you think we should keep the .beta part in the URL? E.g. test2.wikipedia.beta.wmcloud.org, or can we skip the beta part and use only test2.wikipedia.wmcloud.org?

Commands I'm going to execute on beta cluster env:

mwscript extensions/WikimediaMaintenance/addWiki.php --wiki=aawiki en wikipedia test2wiki test2.wikipedia.beta.wmcloud.org
mwscript extensions/WikimediaMaintenance/addWiki.php --wiki=aawiki pl wikivoyage plwikivoyage pl.wikivoyage.beta.wmcloud.org
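As a reading aid for the two commands above, here is a small sketch that assembles the same command line from named parts. The parameter names (`lang`, `site`, `dbname`, `domain`) are my reading of the positional arguments in these commands, not something taken from the script's documentation, and `add_wiki_command` is a hypothetical helper.

```python
import shlex


def add_wiki_command(lang, site, dbname, domain, skip_clusters=()):
    """Build the mwscript invocation for addWiki.php as a single shell string."""
    argv = [
        "mwscript",
        "extensions/WikimediaMaintenance/addWiki.php",
        "--wiki=aawiki",
    ]
    # Optional flag; only emitted when at least one cluster is listed.
    if skip_clusters:
        argv.append("--skipclusters=" + ",".join(skip_clusters))
    # Positional arguments, in the order used in the commands above.
    argv += [lang, site, dbname, domain]
    return shlex.join(argv)


if __name__ == "__main__":
    print(add_wiki_command("en", "wikipedia", "test2wiki",
                           "test2.wikipedia.beta.wmcloud.org"))
    print(add_wiki_command("pl", "wikivoyage", "plwikivoyage",
                           "pl.wikivoyage.beta.wmcloud.org"))
```

Keeping the pieces named makes it harder to swap the database name and the domain by accident when preparing a run.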

@Tgr @ArielGlenn - do you have any other thoughts on this?

I'd prefer the beta be kept in the name, making it clear that these are wikis on the deployment cluster.

I'm not sure of the viability of having some wikis in the same family on one domain and some on the other. Maybe @Tgr has more to say on that.

@taavi - quick question - do you think we should keep the .beta part in the URL? E.g. test2.wikipedia.beta.wmcloud.org, or can we skip the beta part and use only test2.wikipedia.wmcloud.org?

Quick answer: yes, let's keep it.

Cannot move forward because the addWiki script is not working on the Beta Cluster, due to missing SQL files in the Math extension: https://phabricator.wikimedia.org/T358236. I'm looking into that.

Mentioned in SAL (#wikimedia-releng) [2024-03-14T14:39:39Z] <pmiazga> on deployment-deploy03 execution “mwscript extensions/WikimediaMaintenance/addWiki.php --wiki=aawiki --skipclusters=main,extstore,echo,growth,mediamoderation en wikipedia test2wiki test2.wikipedia.beta.wmcloud.org” failed with `Unknown database test2wiki’ T355281

Mentioned in SAL (#wikimedia-releng) [2024-03-14T14:58:28Z] <pmiazga> on deployment-deploy03 execution “mwscript extensions/WikimediaMaintenance/addWiki.php --wiki=aawiki en wikipedia test2wiki test2.wikipedia.beta.wmcloud.org” failed with `Query::isWriteQuery called with incorrect flags parameter’ T355281

This task has been blocked for some time due to issues with the addWiki.php maintenance script. We do not create new wikis on a daily basis; it happens on a per-request basis. addWiki.php runs against the Beta Cluster/production infrastructure, so it is tricky to fully test its behaviour locally. We can check parts of it locally, but it can be fully tested only on deployment servers.
On top of that - another challenge is that the BetaCluster has no owner, no product person, no one who makes sure it's in a healthy state. When such issues occur, we're on our own to get those fixed.

So far we have tried to create a new wiki more than 10 times, and each time it failed. Some of those failures had no side effects and just required code fixes before moving forward. However, one of the issues caused an outage on the Beta Cluster database, with replication failing for around five days (Thursday-Tuesday). While replication was down, the beta cluster wikis were unusable and Jenkins jobs were failing.

We're at the state where the database is finally properly initialised and the majority of the addWiki.php script is working.

All details on the broken addWiki.php behaviour and the recent work are tracked in T358236.

Mentioned in SAL (#wikimedia-sre) [2024-04-04T16:58:11Z] <pmiazga> T355281 executed “mwscript extensions/WikimediaMaintenance/addWiki.php --wiki=aawiki --skipclusters=main,echo,growth,mediamoderation,extstore en wikipedia test2wiki test2.wikipedia.beta.wmcloud.org” on deployment-deploy03.deployment-prep

Mentioned in SAL (#wikimedia-releng) [2024-04-04T17:02:45Z] <pmiazga> deployment-prep T355281 executed “mwscript extensions/WikimediaMaintenance/addWiki.php --wiki=aawiki --skipclusters=main,echo,growth,mediamoderation,extstore en wikipedia test2wiki test2.wikipedia.beta.wmcloud.org” on deployment-deploy03.