Create WCQS UI microsite deployment
Open, High, Public, 5 Estimated Story Points

Description

As a WCQS maintainer I want to have a deployment process for WCQS UI that requires the fewest manual steps.

WDQS UI is deployed as a microsite (see T266702 for details). We need to replicate a similar deployment for WCQS, possibly reusing most of the work in the process. Since WCQS only requires config changes, it should be possible to lift most of the original process. The idea is to bundle both the WDQS and WCQS implementations in the same deployment repo and add custom location routing for WCQS (see the mail thread in the comments).

AC:

  • WCQS UI is updated along with changes to the general WDQS UI

Event Timeline

On Tue, 13 Apr 2021 at 11:52, Zbyszko Papierski <zpapierski@wikimedia.org> wrote:
Hi,

We're about to start to push WCQS to production and I need to have some info on how to work with the current WDQS GUI.
It was decided that the code will remain the same for both WCQS and WDQS - we handle everything by changing the configuration and we don't plan on any WCQS specific features. What remains is to create a deployment of W*QS GUI (honestly, those names are confusing;) for WCQS, which I know very little about. We will do the work on this, but any guidance would be appreciated.

Zbyszko Papierski (He/Him)

Senior Software Engineer

Wikimedia Foundation

On Wed, 14 Apr 2021 at 01:03, Adam Shorland <adam.shorland@wikimedia.de> wrote:
Hey Zbyszko!

So firstly I'll lead with https://phabricator.wikimedia.org/T266702 being the place to look for all of the steps around the current wdqs ui deployment.
And secondly https://phabricator.wikimedia.org/T210286 which is the implementation of a job to do the build step for us.
Everything in these tasks can likely be reused for the WCQS.

The job for the build step can be seen at https://github.com/wikimedia/integration-config/blob/master/jjb/wikidata.yaml#L71-L140
This defines what is being built (gui master branch) https://github.com/wikimedia/integration-config/blob/master/jjb/wikidata.yaml#L77-L91
And where the artifact is being pushed (gui-deploy production branch) https://github.com/wikimedia/integration-config/blob/master/jjb/wikidata.yaml#L125-L130
Using the WDQSGuidBuilder user https://github.com/wikimedia/integration-config/blob/master/jjb/wikidata.yaml#L105-L117

The build itself is created using a grunt command https://github.com/wikimedia/integration-config/blob/master/jjb/wikidata.yaml#L123
Which is defined here https://github.com/wikimedia/wikidata-query-gui/blob/master/Gruntfile.js#L328-L330
This in turn calls only_build which in turn calls copy which seems to be where the "magic" happens right now https://github.com/wikimedia/wikidata-query-gui/blob/master/Gruntfile.js#L331-L333

This copy command includes putting the logo, robots and favicon into the build https://github.com/wikimedia/wikidata-query-gui/blob/master/Gruntfile.js#L150-L151
As well as the config https://github.com/wikimedia/wikidata-query-gui/blob/master/Gruntfile.js#L177-L180
And it seems this is the part that will need modification / an alternative for WCQS

With that changed everything else should be able to operate in the same way, with a new target branch or repo for the built artifact, and a new jenkins job to do the building.
Then a copy of the microsites setup (all the puppet changes are in the ticket mentioned above).

Anything else just let me know!

Adam

On Wed, 14 Apr 2021 at 07:56, Guillaume Lederrey <glederrey@wikimedia.org> wrote:
Note that we already have a second micro site deployed with the same UI code and a different configuration. This is used to expose wdqs1009 as a test server and will be removed at some point.

The approach taken was:

We could use the same approach, but add a location entry in the httpd config to direct calls to /custom-config.json and redirect them to another config file deployed with puppet.

Pro:

  • no duplication of the gui-deploy repo, build pipelines, etc.
  • minimal amount of code / logic to add over the current implementation
  • code for WDQS and WCQS is kept in sync (this is a single deployment)

Cons:

  • config changes require a puppet change, under supervision of SRE (Ryan and Guillaume can help with this)
  • code for WDQS and WCQS is kept in sync (this is a single deployment)

Guillaume Lederrey (he/him)
Engineering Manager
Wikimedia Foundation

On Wed, 14 Apr 2021 at 11:04, Adam Shorland <adam.shorland@wikimedia.de> wrote:

config changes require a puppet change, under supervision of SRE (Ryan and Guillaume can help with this)

This shouldn't need to be the case (I think).
Both a wdqs-config.json and wcqs-config.json could be provided in the repo and included in the build.
Then one initial puppet setup change for a custom location would be needed, but config changes could just be done in the main code repo?
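
To make the idea concrete, here is a minimal Python sketch of what such a custom location would do, assuming the two files are named as suggested above and the sites are served as query.wikidata.org and commons-query.wikimedia.org. The host-to-file mapping and the function name `rewrite_config_path` are illustrative assumptions; the real routing would live in the httpd config, not in Python.

```python
# Assumed mapping from the requesting host to the config file bundled in
# the build. Both the mapping and the file names are illustrative only.
SITE_CONFIGS = {
    "query.wikidata.org": "wdqs-config.json",
    "commons-query.wikimedia.org": "wcqs-config.json",
}


def rewrite_config_path(host: str, path: str) -> str:
    """Map /custom-config.json to the per-site config; leave other paths alone."""
    if path != "/custom-config.json":
        return path
    config = SITE_CONFIGS.get(host.lower())
    if config is None:
        # Unknown host: keep serving whatever default the build ships.
        return path
    return f"/{config}"


if __name__ == "__main__":
    for host in ("query.wikidata.org", "commons-query.wikimedia.org"):
        print(host, "->", rewrite_config_path(host, "/custom-config.json"))
```

With this shape, changing a site's configuration only means editing a JSON file in the code repo; the routing rule itself is set up once.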

That sounds like a good optimization to what I was thinking, allowing us to keep a single deploy repository while keeping the flexibility of not having to rely on SRE privileges!

  • code for WDQS and WCQS is kept in sync (this is a single deployment)

Indeed. In my opinion this is a fine con for now, so long as neither UI is going to be drastically different.

I could see this both as a plus or a minus. In general, keeping things in sync is helpful, but I'm sure we'll get into a situation at some point where we would prefer to be able to deploy on only WDQS or only WCQS. I think that having a single deploy repo and not dealing with the duplication is worth paying that price down the road if we ever need to. Or we can revisit this solution when we have a real use case.

Adam

MPhamWMF moved this task from All WDQS-related tasks to GUI on the Wikidata-Query-Service board.

If I understand the above our intent is to use the same microsite deployment repo/branch as wdqs. Looking over things, one open question I see relates to how we will manage per-site resources. Various files such as index.html, embed.html, custom-config.json, logo(-embed).svg, favicon.*, etc. reference wikidata directly.

For the images and json we could have a directory for each site (docroot/wcqs?) to drop the site-specific resources in. Apache is fronting the assets, it wouldn't be too hard to rewrite 404's at the top level to check the site-specific docroot.

For the html files things are a little murkier. There are lots of links to wikidata.org that may or may not have matching commons resources. Some of the links we likely want to keep pointing to wikidata.org. Some templating build step combined with the per-site docroots could plausibly do it, but some effort has to be invested in reviewing and tracking down all the appropriate links. For the initial microsite deployment I'm expecting correcting this is out of scope.

Will need to do a couple steps in puppet to transition the existing wdqs microsite to using the new docroot, then we can update the puppet to expose the second docroot for wcqs.

Change 714622 had a related patch set uploaded (by Ebernhardson; author: Ebernhardson):

[wikidata/query/gui@master] Move wdqs assets into site specific asset folder

https://gerrit.wikimedia.org/r/714622

Change 714623 had a related patch set uploaded (by Ebernhardson; author: Ebernhardson):

[wikidata/query/gui-deploy@production] Support wdqs and wcqs from the same deployment

https://gerrit.wikimedia.org/r/714623

Change 714624 had a related patch set uploaded (by Ebernhardson; author: Ebernhardson):

[operations/puppet@production] query_service: support multiple variants of wdqs microsite

https://gerrit.wikimedia.org/r/714624

Change 714633 had a related patch set uploaded (by Ebernhardson; author: Ebernhardson):

[integration/config@master] wdqs-gui: Adjust build to expect configuration in docroot/

https://gerrit.wikimedia.org/r/714633

These end up needing a somewhat specific ordering.

  1. Puppet patch which makes apache start falling back to the site-specific resources. For example it rewrites favicon.ico into /docroot/$site/favicon.ico. This needs to come first because once the CI patch lands deployments will delete config in the old location.
  2. Patch for CI that builds the deploy repo: it clears out all except a few expected files. Add the new docroot as appropriate
  3. The remaining patches, to wikidata/query/gui and wikidata/query/gui-deploy, to put the files in the new expected locations. Order shouldn't matter at this point.

Let me know if I can help on anything, I did the deployment of wdqs gui and query builder to microsites.

@Ladsgroup The open question here is mostly if this is an acceptable path for your side, the method seems reasonable enough to me but I'd like someone that works with it to give a +1 before we ship it.

I looked at the patches, a brain dump:

  • One rather easy way to look at it, is to move the assets (logo, etc.) to the deploy repo (while keeping a default in the main one), it feels weird changing the gui codebase for something very specific to WMF (the gui is heavily used outside WMF as well). We should change it to support multiple sites but having the docroot for both wdqs and wcqs there seems misplaced to me.
  • speaking of docroot, Can I bikeshed? where does this name come from? it seems very unusual to me (by not conveying much information) but it might be just me not knowing about standard web servers.
  • I suggest splitting the puppet patch to 1- Refactor 2- adding wcqs. That way we can get the first part in and make sure nothing in wmde-side is broken
  • The traffic patch needs an LVS patch on top (maybe the third puppet patch?)
  • It might be me being very conservative, but maybe for a start can we have the files in docroot hard-coded in puppet? There are not many of them and wildcard apache redirects seem scary to me, but it's not a big deal. Just a suggestion. Feel free to ignore.

HTH. Let me know if I can be of any service. Would be more than happy to help get this deployed.

I looked at the patches, a brain dump:

  • One rather easy way to look at it, is to move the assets (logo, etc.) to the deploy repo (while keeping a default in the main one), it feels weird changing the gui codebase for something very specific to WMF (the gui is heavily used outside WMF as well). We should change it to support multiple sites but having the docroot for both wdqs and wcqs there seems misplaced to me.

Sure, I can move it all to the deploy repo. It seemed unlikely to me that anyone else was using this since we had hardcoded wdqs assets in the main repo. It sounds like they should have never been there.

  • speaking of docroot, Can I bikeshed? where does this name come from? it seems very unusual to me (by not conveying much information) but it might be just me not knowing about standard web servers.

In apache httpd the DocumentRoot (shorthand docroot) is the root directory assets are served out of. In this implementation the rewriting forms a conceptually layered document root, where first we reference the global assets and then the site-specific assets. Nginx shortened this to just root in its configuration (maybe? I don't really know the provenance). Basically at some point in the past (perhaps less so today? book ngrams for docroot peak in 2011) docroot was synonymous with "where the web server looks for files". Could call it sites or something else if it aligns more with people's understanding.

  • I suggest splitting the puppet patch to 1- Refactor 2- adding wcqs. That way we can get the first part in and make sure nothing in wmde-side is broken
  • The traffic patch needs an LVS patch on top (maybe the third puppet patch?)
  • It might be me being very conservative, but maybe for a start can we have the files in docroot hard-coded in puppet? There are not many of them and wildcard apache redirects seem scary to me, but it's not a big deal. Just a suggestion. Feel free to ignore.

We can if you prefer. I went with this particular direction (global assets first, then fallback to site-specific assets) to reduce the uncertainty of the redirecting. By checking the global root first we guarantee any file in the main build is always accessed directly; there can't be weird overrides from specific sites that make it do something unexpected. Only files that are completely undefined in the primary repo will pull from the site-specific docroots.
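
The global-first lookup can be sketched roughly as follows. This is only an illustrative Python model of the fallback order, not the Apache rewrite rules from the puppet patch; the function name `resolve_asset` and the directory layout in the example are assumptions.

```python
import tempfile
from pathlib import Path
from typing import Optional


def resolve_asset(request_path: str, global_root: Path, site_root: Path) -> Optional[Path]:
    """Return the file to serve for request_path, or None for a 404.

    The main build (global_root) is always checked first, so a site
    directory can only supply files the main build does not contain.
    No path normalisation is done here; a real web server handles that.
    """
    relative = request_path.lstrip("/")
    for root in (global_root, site_root):
        candidate = root / relative
        if candidate.is_file():
            return candidate
    return None


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        build = Path(tmp) / "build"
        site = Path(tmp) / "sites" / "wdqs"
        build.mkdir()
        site.mkdir(parents=True)
        (build / "index.html").write_text("from the main build")
        (site / "index.html").write_text("ignored: the main build wins")
        (site / "favicon.ico").write_text("site-specific")
        for path in ("/index.html", "/favicon.ico", "/missing.png"):
            print(path, "->", resolve_asset(path, build, site))
```

The point of the ordering is visible in the `index.html` case: even if a site directory ships a file with the same name, the copy from the main build is the one served.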

HTH. Let me know if I can be of any service. Would be more than happy to help get this deployed.

Change 714622 abandoned by Ebernhardson:

[wikidata/query/gui@master] Move wdqs assets into site specific asset folder

Reason:

WDQS specific assets will instead be removed from this repository in favor of moving this repo to be less wdqs specific.

https://gerrit.wikimedia.org/r/714622

Change 714633 merged by jenkins-bot:

[integration/config@master] wdqs-gui: Adjust build to expect configuration in docroot/

https://gerrit.wikimedia.org/r/714633

I looked at the patches, a brain dump:

  • One rather easy way to look at it, is to move the assets (logo, etc.) to the deploy repo (while keeping a default in the main one), it feels weird changing the gui codebase for something very specific to WMF (the gui is heavily used outside WMF as well). We should change it to support multiple sites but having the docroot for both wdqs and wcqs there seems misplaced to me.

Sure, I can move it all to the deploy repo. It seemed unlikely to me that anyone else was using this since we had hardcoded wdqs assets in the main repo. It sounds like they should have never been there.

I understand. The default is our production (mostly) but it's used outside and it's part of WMDE's standard bundled Wikibase release.

  • speaking of docroot, Can I bikeshed? where does this name come from? it seems very unusual to me (by not conveying much information) but it might be just me not knowing about standard web servers.

In apache httpd the DocumentRoot (shorthand docroot) is the root directory assets are served out of. In this implementation the rewriting forms a conceptually layered document root, where first we reference the global assets and then the site-specific assets. Nginx shortened this to just root in its configuration (maybe? I don't really know the provenance). Basically at some point in the past (perhaps less so today? book ngrams for docroot peak in 2011) docroot was synonymous with "where the web server looks for files". Could call it sites or something else if it aligns more with people's understanding.

Ack. Thanks for explanation

  • I suggest splitting the puppet patch to 1- Refactor 2- adding wcqs. That way we can get the first part in and make sure nothing in wmde-side is broken
  • The traffic patch needs an LVS patch on top (maybe the third puppet patch?)
  • It might be me being very conservative, but maybe for a start can we have the files in docroot hard-coded in puppet? There are not many of them and wildcard apache redirects seem scary to me, but it's not a big deal. Just a suggestion. Feel free to ignore.

We can if you prefer. I went with this particular direction (global assets first, then fallback to site-specific assets) to reduce the uncertainty of the redirecting. By checking the global root first we guarantee any file in the main build is always accessed directly; there can't be weird overrides from specific sites that make it do something unexpected. Only files that are completely undefined in the primary repo will pull from the site-specific docroots.

If you're confident that it won't break(TM), I don't mind either way.

Change 717638 had a related patch set uploaded (by Ebernhardson; author: Ebernhardson):

[wikidata/query/gui-deploy@production] Remove top-level custom-config.json

https://gerrit.wikimedia.org/r/717638

Change 717649 had a related patch set uploaded (by Ebernhardson; author: Ebernhardson):

[wikidata/query/gui-deploy@production] Add config for commons query service

https://gerrit.wikimedia.org/r/717649

Relevant patches have been updated:

  • wdqs assets copied from gui to gui-deploy
  • The declaration of wcqs microsite is in a new puppet patch
  • docroot is now called sites
  • deployment should be less risky

This rework also adjusts the dependencies between patches. The order is now:

  1. CI changes to expect sites/ in gui-deploy (merged)
  2. gui-deploy patch to populate sites/wdqs/
  3. puppet patch to configure apache to read sites/wdqs
  4. gui patch to remove wdqs assets from old location. At this point apache is reading from new locations. Verify correctness
  5. gui-deploy patch to remove old custom-config.json

At that point query.wikidata.org is fully transitioned and should be verified correct. Once satisfied the initial wcqs microsite can be deployed:

Critically this only deploys the site internally, before we can add the config for ATS to route traffic we have to decide how to resolve https://phabricator.wikimedia.org/T280006#7325872 which will determine if that config should mirror WDQS or if we need to do something slightly different to support oauth. ATS config also needs the LVS for wcqs which is in progress as part of setting up the new clusters.

This is also missing the favicon.ico, logo.svg, and logo-embed.svg. I'm not sure who I would even ask about what these should be?

The plan looks great and let me know if I need to do anything,

My only note is that the fourth step should be avoided or changed, since it would break things for third parties that don't change the logo (although using Wikidata's logo is wrong, it's better than a broken system; most people don't change the default). One suggestion would be to change the logo and favicon in the gui to something general and merge that instead, but I'm in a similar situation: I don't know whom I should ask for it. I can ask UX from WMDE for the general gui logo, but there's no guarantee it will happen.

Hmm, yes that makes sense that we want to continue building a working, if slightly misleading, site for current external users. The imagery should be replaced at some point, but we don't have to do it now.

If we don't remove the assets from gui in the 4th step, the rewrites won't take effect: in the current patches the rewrite only happens when there is no source file in the main build. The simplest option I can think of is to .gitignore the assets in the gui-deploy repo. Then the build can put those files in place and the git add -A used by CI won't attempt to add those files to the deploy repo. In that case we drop step 4 above, adjust step 2 to remove the top-level imagery, and then move step 2 after step 3 so the apache config is in place when the assets move. I've updated the depends-on markers in the patches so they have the right sequence.

If you can deploy this that would be great, i think everything should be ready now. The remaining patches to transition wdqs gui to the new structure:

  1. puppet patch to configure apache to read sites/wdqs
  2. gui-deploy patch to populate sites/wdqs/ and de-populate imagery from root. At this point site-specific imagery is loading from the per-site dirs. Verify.
  3. gui-deploy patch to remove old custom-config.json.
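
The "Verify" in step 2 could be done by hand, or with something like the following Python sketch. It assumes the site-specific files are the ones named earlier in this task (favicon.ico, logo.svg, logo-embed.svg, custom-config.json) and that a plain HEAD request returning 200 is a good enough check; the helper names are made up for illustration.

```python
from typing import Callable, Dict, Iterable, List
from urllib.error import HTTPError
from urllib.request import Request, urlopen

# Site-specific files mentioned in this task; adjust as needed.
SITE_ASSETS = ("favicon.ico", "logo.svg", "logo-embed.svg", "custom-config.json")


def http_status(url: str) -> int:
    """Return the HTTP status of a HEAD request, or 0 if the host is unreachable."""
    try:
        with urlopen(Request(url, method="HEAD"), timeout=10) as response:
            return response.status
    except HTTPError as err:
        # 4xx/5xx responses are raised as exceptions; report their code.
        return err.code
    except OSError:
        # Covers URLError, DNS failures and timeouts.
        return 0


def check_assets(
    base_url: str,
    assets: Iterable[str] = SITE_ASSETS,
    fetch: Callable[[str], int] = http_status,
) -> Dict[str, int]:
    """Fetch every asset under base_url and return its status code."""
    base = base_url.rstrip("/")
    return {name: fetch(f"{base}/{name}") for name in assets}


def failures(statuses: Dict[str, int]) -> List[str]:
    """Names of assets that did not come back with HTTP 200."""
    return sorted(name for name, status in statuses.items() if status != 200)


if __name__ == "__main__":
    broken = failures(check_assets("https://query.wikidata.org"))
    print("all assets reachable" if not broken else f"broken: {broken}")
```

The `fetch` parameter is there so the check can be exercised without network access, for example by passing a stub that returns fixed status codes.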

Sounds good to me. I can't merge/deploy the puppet patch but I assume @RKemper can? Once that's in, I'll handle the rest.

Change 714624 merged by Ryan Kemper:

[operations/puppet@production] query_service: support multiple variants of wdqs microsite

https://gerrit.wikimedia.org/r/714624

@Ladsgroup Puppet patch is merged now. I haven't merged the gui-deploy patches yet so those two gui-deploy patches should be good to go whenever now that the puppet changes are in place

EDIT: Reverted the puppet patch, we'll need to fix the issue that popped up (T290545) and re-merge it before you're clear to proceed with the gui-deploy patches

Change 719502 had a related patch set uploaded (by Ebernhardson; author: Ebernhardson):

[operations/puppet@production] Revert "Revert "query_service: support multiple variants of wdqs microsite""

https://gerrit.wikimedia.org/r/719502

Change 719502 merged by Ryan Kemper:

[operations/puppet@production] query_service: support multiple variants of wdqs microsite

https://gerrit.wikimedia.org/r/719502

Mentioned in SAL (#wikimedia-operations) [2021-09-08T21:55:07Z] <ryankemper> [WDQS] T280247 Purged varnish to make sure change took effect: echo 'https://query-preview.wikidata.org/' | mwscript purgeList.php and echo 'https://query.wikidata.org/' | mwscript purgeList.php on mwmaint1002

@Ladsgroup https://gerrit.wikimedia.org/r/c/operations/puppet/+/719502 Okay the puppet change is officially merged, so the gui-deploy stuff should be unblocked now.

Change 714623 merged by Ladsgroup:

[wikidata/query/gui-deploy@production] Support wdqs and wcqs from the same deployment

https://gerrit.wikimedia.org/r/714623

Done, now let's wait for half an hour. If you can, it'd be great if you run puppet agent on miscweb.

I confirm the logo is still accessible meaning the apache redirects work now. Merging the final patch.

Change 717638 merged by Ladsgroup:

[wikidata/query/gui-deploy@production] Remove top-level custom-config.json

https://gerrit.wikimedia.org/r/717638

Change 717649 merged by Ladsgroup:

[wikidata/query/gui-deploy@production] Add config for commons query service

https://gerrit.wikimedia.org/r/717649

Change 720072 had a related patch set uploaded (by Ebernhardson; author: Ebernhardson):

[wikidata/query/gui-deploy@production] Import wcqs-beta config from operations/puppet

https://gerrit.wikimedia.org/r/720072

Change 720078 had a related patch set uploaded (by Ebernhardson; author: Ebernhardson):

[operations/puppet@production] trafficserver: Create routing for commons-query.wikimedia.org

https://gerrit.wikimedia.org/r/720078

Change 720072 merged by Ladsgroup:

[wikidata/query/gui-deploy@production] Import wcqs-beta config from operations/puppet

https://gerrit.wikimedia.org/r/720072

Change 720801 had a related patch set uploaded (by Ebernhardson; author: Ebernhardson):

[operations/puppet@production] query_service: Support proxying to microsite from backend

https://gerrit.wikimedia.org/r/720801

This should be all the patches, except for LVS followups, to deploy the microsite and route traffic appropriately between the microsite and wcqs. Now we just (hah!) have to review and ship them. Otherwise we somehow need to create the same levels of protection that the spicerack restart script attempts to provide, seems undesirable to duplicate that effort.

Change 722958 had a related patch set uploaded (by Ryan Kemper; author: Ebernhardson):

[operations/puppet@production] query_service: account for aliases in httpd conf

https://gerrit.wikimedia.org/r/722958

Change 722958 merged by Ryan Kemper:

[operations/puppet@production] query_service: account for aliases in httpd conf

https://gerrit.wikimedia.org/r/722958