Move WDQS UI to microsites
Closed, Resolved, Public

Description

In T264710#6562678, T264710#6573814 and T264710#6586070 as part of T264710: Host static sites on kubernetes it was decided that the query ui and querybuilder ui would be deployed on the "static sites" infrastructure. See https://wikitech.wikimedia.org/wiki/Microsites
This is also further work on the query UI decoupling that was previously covered in T241291: Simplify WDQS Packaging.

And followups to this task:

  • Build step to populate the deploy repo with a built thing, with config and favicon

Event Timeline

Restricted Application added a subscriber: Aklapper. Oct 28 2020, 7:55 PM
Dzahn removed Dzahn as the assignee of this task.

Took this to say I will take care of the puppet part to get these on miscweb. But that may be a subtask and I should have waited for the list. (?)

So for this one I'm not sure we actually need any sub tasks?
The build GUI to be deployed lives in https://gerrit.wikimedia.org/r/admin/repos/wikidata/query/gui-deploy
AFAIK that is all that is needed for the puppet change?

My understanding is that either:

  1. query.wikidata.org will end up needing to point to this static site, and then something there redirects /sparql to the wdqs service?
  2. query.wikidata.org will continue to point to wdqs servers, and nginx there will then pass everything that isn't /sparql to the static site.

From the discussions on T264710 and in IRC I believe the consensus was that option 2 would be easiest?
Need to poke Wikidata-Query-Service team (cc @Gehel ) about which paths other than /sparql also need to be accounted for in order to make the nginx change on the wdqs servers, but perhaps the search team can do that anyway.

Open questions:

  • Are we sure that we want to go with option 2 above, and have edge -> wdqs -> static site?
  • Are the +2 permissions on wikidata/query/gui-deploy okay from the ops side? (we should review this)
  • How will this be exposed internally on the cluster and thus accessible to the wdqs hosts to point to in nginx?

I don't like having services proxy each other internally. Both edge->wdqs->static and edge->static->wdqs add accidental complexity to the architecture. This should really be a traffic routing question: wdqs and wdqs-ui are two separate services. The limitation is that, for historical reasons (tm), they have been deployed behind the same hostname (query.wikidata.org). Introducing 2 different hostnames (sparql.wikidata.org / query.wikidata.org ?) would be the right thing to do, but the transition is probably non-trivial to manage.

We can route different URI subspaces differently at the edge layer, based on URI regexes, as shown here for the split of the API namespace of the primary wiki sites:

https://gerrit.wikimedia.org/r/plugins/gitiles/operations/puppet/+/refs/heads/production/hieradata/common/profile/trafficserver/backend.yaml#263

This sounds perfect.
I'll populate the various parts of this ticket soon that are going to need to happen.

  • Are we sure that we want to go with option 2 above, and have edge -> wdqs -> static site?

No, we do not; we will route at the edge.

From #wikimedia-traffic

2:21 PM <bblack> addshore: https://phabricator.wikimedia.org/T266702#6588396
2:23 PM <addshore> bblack: <3 ty
2:42 PM <ema> addshore: we also already do something similar for wdqs https://gerrit.wikimedia.org/r/plugins/gitiles/operations/puppet/+/refs/heads/production/hieradata/common/profile/trafficserver/backend.yaml#154
2:42 PM <ema> /bigdata/ldf -> wdqs1005.eqiad.wmnet/bigdata/ldf
2:42 PM <ema> everything else to wdqs.discovery.wmnet
2:42 PM <addshore> aaaah nice
2:43 PM <ema> the very important thing is that we don't rewrite the uri_path
2:43 PM <ema> like in /bigdata/ldf -> wdqs1005.eqiad.wmnet/bigdata/ldf
2:43 PM <ema> we don't want to deal with uri rewrites at the caching layer if at all possible
2:46 PM <ema> addshore: so /sparql -> wdqs_service/sparql is great :)
2:46 PM <addshore> ack, thats probably all perfect :)
2:47 PM <addshore> I may copy these logs into the ticket so that I don't loose the links
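
The "don't rewrite the uri_path" rule from the log above can be expressed as a small check. This is only a sketch: the function name is made up for illustration, the `https://` scheme is an assumption (the log gives hosts and paths only), and the second target path is a hypothetical rewrite that does not come from this task.

```python
from urllib.parse import urlsplit


def preserves_path(request_path: str, target_url: str) -> bool:
    """Return True if the origin URL keeps the request path unchanged."""
    return urlsplit(target_url).path == request_path


# Mapping mentioned in the log: same path on both sides, only the host changes.
print(preserves_path("/bigdata/ldf", "https://wdqs1005.eqiad.wmnet/bigdata/ldf"))

# Hypothetical rewrite (NOT from this task): the path changes, so the
# caching layer would have to deal with a URI rewrite.
print(preserves_path("/sparql", "https://wdqs.discovery.wmnet/some/other/path"))
```

The first call prints `True` and the second prints `False`; a rule set where every mapping passes this check only ever swaps the backend host.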

Change 637552 had a related patch set uploaded (by Dzahn; owner: Dzahn):
[operations/puppet@production] create microsite for WDQS UI

https://gerrit.wikimedia.org/r/637552

So looking at the traffic layer change a little more....

And looking at other things in general we need to account for.

There are various things that are currently added for the GUI by the WDQS puppet that are not included in the GUI build repo

These should probably be included in the -deploy repo for the gui so that WMDE can continue to change them without needing a puppet merge
The alternative that would need puppet merges would be to add them to https://gerrit.wikimedia.org/r/c/operations/puppet/+/637552
The -deploy repo is currently manually built, so we need to figure out how to correctly add these files there.

Change 637552 merged by Dzahn:
[operations/puppet@production] create microsite for WDQS UI

https://gerrit.wikimedia.org/r/637552

Hi @Addshore Puppet has an issue cloning from the deployment repo:

fatal: Remote branch master not found in upstream origin

(Oh, another repo using "production" besides operations/puppet)

Change 637807 had a related patch set uploaded (by Dzahn; owner: Dzahn):
[operations/puppet@production] WDQS microsite: use branch production instead of master

https://gerrit.wikimedia.org/r/637807

Change 637807 merged by Dzahn:
[operations/puppet@production] WDQS microsite: use branch production instead of master

https://gerrit.wikimedia.org/r/637807

Mentioned in SAL (#wikimedia-operations) [2020-10-30T23:32:41Z] <mutante> adding query.wikidata.org to TLS cert for webserver-misc-apps.discovery.wmnet T266702

Change 637811 had a related patch set uploaded (by Dzahn; owner: Dzahn):
[operations/puppet@production] ssl: add query.wikidata.org to TLS cert for webserver-misc-apps

https://gerrit.wikimedia.org/r/637811

Change 637811 merged by Dzahn:
[operations/puppet@production] ssl: add query.wikidata.org to TLS cert for webserver-misc-apps

https://gerrit.wikimedia.org/r/637811

I added Apache config and git cloning to the miscweb backends.

Then added query.wikidata.org to the TLS cert they are using.

Now you can already request query.wikidata.org from them internally:

[deploy1001:~] $ curl -H "Host: query.wikidata.org" https://webserver-misc-apps.discovery.wmnet

or

[cumin1001:~] $ httpbb --hosts miscweb1002.eqiad.wmnet,miscweb2002.codfw.wmnet - < wdqs.yaml 
Sending to 2 hosts...
PASS: 1 request sent to each of 2 hosts. All assertions passed.
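
For scripting the same check, the curl call above amounts to a request to the backend with an overridden `Host` header. A minimal Python sketch follows; `build_request` is an illustrative helper, not an existing tool, and the request is only built here, not sent, since the backend is reachable only from inside the production network.

```python
import urllib.request


def build_request(site: str, backend: str, path: str = "/") -> urllib.request.Request:
    """Build a request to `backend` that asks for the virtual host `site`."""
    return urllib.request.Request(f"https://{backend}{path}", headers={"Host": site})


req = build_request("query.wikidata.org", "webserver-misc-apps.discovery.wmnet")
print(req.full_url, req.get_header("Host"))

# From a host with internal access, the request could then be sent with:
#   urllib.request.urlopen(req)
```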

So I now see that the custom-config used to be in the build repo in the production branch.
It was removed in https://gerrit.wikimedia.org/r/c/wikidata/query/gui-deploy/+/606545 as part of T251514: UI for SPARQL Endpoint for Commons
Need to talk to the search platform team to figure out how this change will end up playing with the sdoc query service.

jijiki triaged this task as Medium priority. Nov 10 2020, 4:22 PM

@Addshore Any suggestion who will take the "Deal with the favicon and the custom-config in the GUI build" check box that seems to be next here?

So currently both the WCQS (Wikimedia Commons Query Service) and the WDQS (Wikidata Query Service) are deployed from this same deployment repo.
After a discussion with the team leading WCQS (the search team), we agreed to add the config and favicon for the WDQS back to this deployment repo, and that we take ownership of it (at least of this branch) for the WDQS deployment to microsites for now.
So the next steps forward would be:

  • Add the config and favicon to the deployment repo
  • Update the microsites deployment
  • Do the traffic layer dance

At a later point we will then want to:

  • Figure out our build step which would add the built code to the repo, including the config and favicon :)

Notes from the call:

  • Branches for gui deploy repo - for WDQS and WCQS
  • in the meantime, keep both approaches to deployment of GUI
  • WCQS should have microsite deployment for gui from the start
  • routing for direct queries should be resolved by traffic team
  • https://phabricator.wikimedia.org/T266702#6589774 - endpoint that should be accessible from microsite
  • wdqs service
    • /sparql - main sparql api path
    • /bigdata - main blazegraph path

Are all needed for WDQS and should be passed to blazegraph

Only relevant to OAUTH and the commons query service, so not needed for WDQS yet

Should not be exposed externally

There are various things that are currently added for the GUI by the WDQS puppet that are not included in the GUI build repo

These will be in the build repo being deployed to microsites.

So we will:

  • Add the config and favicon to the build step and commit the stuff to be deployed to a new branch in the deployment repo
  • Switch puppet for wdqs on microsites to read from the new branch
  • Check it works with requests internally etc
  • Coordinate with traffic to make the switch, letting WDQS team know

What can we do to move this forward?

Change 654253 had a related patch set uploaded (by Ladsgroup; owner: Ladsgroup):
[wikidata/query/gui-deploy@production] Add custom-config.json from puppet

https://gerrit.wikimedia.org/r/654253

There are various things that are currently added for the GUI by the WDQS puppet that are not included in the GUI build repo

These will be in the build repo being deployed to microsites.

So we will:

  • Add the config and favicon to the build step and commit the stuff to be deployed to a new branch in the deployment repo

The favicon is already there; I made a patch to add the custom-config.json.

Change 654253 merged by Addshore:
[wikidata/query/gui-deploy@production] Add custom-config.json from puppet

https://gerrit.wikimedia.org/r/654253

I have merged the change.
Now I believe we just need to wait for puppet to grab the latest changes.
Then curl -H "Host: query.wikidata.org" https://webserver-misc-apps.discovery.wmnet/custom-config.json should succeed.

@Dzahn is there any way for us to be able to force an update rather than wait for puppet?

Once this is done we can check it seems to all look good and get onto changing the traffic layer.

[deploy1001:~] $ curl -H "Host: query.wikidata.org" https://webserver-misc-apps.discovery.wmnet/custom-config.json
{
  "api": {
    "sparql": {
      "uri": "/sparql"
    },
    "urlShortener": "wmf"
  },
  "brand": {
    "logo": "logo.svg",
    "favicon": "favicon.ico",
    "title": "Wikidata Query Service",
    "copyrightUrl": "https://www.wikidata.org/wiki/Wikidata:SPARQL_query_service/Copyright"
  },
  "location": {
    "root": "https://query.wikidata.org/",
    "index": "https://query.wikidata.org/"
  }
}
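
A quick sanity check on that response could look like the sketch below. The two rules it applies are assumptions on my side, not requirements stated anywhere in this task: that the SPARQL URI is host-relative (so the GUI keeps talking to query.wikidata.org and the edge decides where /sparql goes), and that the `location` entries are https URLs. The sample is trimmed from the output above.

```python
import json


def check_config(raw: str) -> list:
    """Return a list of problems found in a custom-config.json payload."""
    cfg = json.loads(raw)
    problems = []

    # Assumed rule: the SPARQL endpoint is given as a host-relative path.
    sparql_uri = cfg.get("api", {}).get("sparql", {}).get("uri", "")
    if not sparql_uri.startswith("/"):
        problems.append(f"api.sparql.uri is not host-relative: {sparql_uri!r}")

    # Assumed rule: both location entries are absolute https URLs.
    for key in ("root", "index"):
        value = cfg.get("location", {}).get(key, "")
        if not value.startswith("https://"):
            problems.append(f"location.{key} is not an https URL: {value!r}")

    return problems


SAMPLE = """
{
  "api": {"sparql": {"uri": "/sparql"}, "urlShortener": "wmf"},
  "location": {
    "root": "https://query.wikidata.org/",
    "index": "https://query.wikidata.org/"
  }
}
"""

print(check_config(SAMPLE))
```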

@Dzahn is there any way for us to be able to force an update rather than wait for puppet?

Not without shell access to the miscweb machines. You can ask SRE on IRC, but a puppet run also takes 30 minutes at most, and in the meantime it has already happened, of course.

Also.. it works :)

So now, per T266702#6662363 we need to change the traffic layer to point most of query.wikidata.org to microsites, with the two paths below pointing to the existing wdqs clusters.

/sparql
/bigdata
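
As a sketch of the intended split (not the actual ATS remap rules, which live in the puppet hieradata), the routing decision could be modelled like this. The backend names are the ones used in this task; the regex, its segment-boundary behaviour, and the function name are assumptions for illustration. The existing special case for /bigdata/ldf mentioned earlier is deliberately left out.

```python
import re

# Backend names as they appear in this task.
WDQS_BACKEND = "wdqs.discovery.wmnet"
MICROSITES_BACKEND = "webserver-misc-apps.discovery.wmnet"

# Assumption: /sparql and /bigdata match as whole path segments,
# so "/sparql" and "/bigdata/x" match but "/sparqlfoo" does not.
WDQS_PATHS = re.compile(r"^/(sparql|bigdata)(/|$)")


def pick_backend(uri: str) -> str:
    """Choose a backend from the request URI; the path itself is never rewritten."""
    path = uri.split("?", 1)[0]  # ignore the query string when matching
    return WDQS_BACKEND if WDQS_PATHS.match(path) else MICROSITES_BACKEND


for uri in ("/", "/custom-config.json", "/sparql?query=SELECT", "/bigdata/namespace"):
    print(uri, "->", pick_backend(uri))
```

Only the backend changes per request; the URI is passed through as-is, in line with the earlier advice to avoid rewrites at the caching layer.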

Change 655051 had a related patch set uploaded (by Ladsgroup; owner: Ladsgroup):
[operations/puppet@production] Make query.wikidata.org point to microsite backend instead (for GUI)

https://gerrit.wikimedia.org/r/655051

My knowledge here is not that great, but this should work ^

@Ladsgroup this may interest you :)

1:18 PM <addshore> Hey yall! Its friday so I guess we don't want to deploy it.... but is there any way to test something like this https://gerrit.wikimedia.org/r/c/operations/puppet/+/655051 ?
1:19 PM <addshore> and what's the "right" process to try and get it deployed? puppet deploy window?
2:53 PM <ema> addshore: hey! Usually those changes are fairly safe, but yeah definitely not merging Friday at 4PM :)
2:53 PM <ema> next week!
2:54 PM <ema> to see what the change would do, add the following to the git commit log:
2:54 PM <ema> Hosts: cp3050.esams.wmnet
2:54 PM <ema> and then comment 'check experimental' on the gerrit changeset
2:55 PM <ema> that will give us the puppet catalog diff, including the actual changes to the remap.config file -> https://docs.trafficserver.apache.org/en/latest/admin-guide/files/remap.config.en.html
3:50 PM <jbond42> fyi you can also use ./util/pcc last parse_commit from the root of the puppet repo to kick of a pcc with the current checked out change (must have been submited to gerrit) and it will also parse the Hosts line
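
To make the `Hosts:` convention from the log concrete, here is a small sketch that pulls such lines out of a commit message. It is not how the catalog compiler itself parses them; the function name, the example commit body, and the comma-separated form are assumptions for illustration.

```python
def hosts_from_commit_message(message: str) -> list:
    """Collect host names from every 'Hosts:' line in a commit message."""
    hosts = []
    for line in message.splitlines():
        if line.startswith("Hosts:"):
            # Assumption: several hosts may be given, separated by commas.
            for host in line[len("Hosts:"):].split(","):
                if host.strip():
                    hosts.append(host.strip())
    return hosts


# Hypothetical commit message; only the Hosts line is taken from the log.
MESSAGE = """Make query.wikidata.org point to microsite backend instead (for GUI)

Bug: T266702
Hosts: cp3050.esams.wmnet
"""

print(hosts_from_commit_message(MESSAGE))
```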

Thanks. You can also simply use this form too: https://integration.wikimedia.org/ci/job/operations-puppet-catalog-compiler/build?delay=0sec

I did it and put the result in the patch. Let's merge it next week.

Change 655051 merged by Ema:
[operations/puppet@production] Make query.wikidata.org point to microsite backend instead (for GUI)

https://gerrit.wikimedia.org/r/655051

Change 655697 had a related patch set uploaded (by Ladsgroup; owner: Ladsgroup):
[operations/puppet@production] Fix the mapping

https://gerrit.wikimedia.org/r/655697

Change 655697 merged by Ema:
[operations/puppet@production] ATS: fix wdqs remap rules

https://gerrit.wikimedia.org/r/655697

Ladsgroup updated the task description.

This is done now. Thanks to everyone who helped!

Wow, already done? Now that was quicker than anticipated. nice :)

It would be great if we could add one or two assertions to ./modules/profile/files/httpbb/miscweb/test_miscweb.yaml in operations/puppet.

That file has tests to check that all the existing microsites on miscweb* work, but now that the WDQS UI is hosted there too, it is missing from the tests. Adding it would give complete coverage.

Change 656270 had a related patch set uploaded (by Ladsgroup; owner: Ladsgroup):
[operations/puppet@production] miscweb: Add tests for query.wikidata.org

https://gerrit.wikimedia.org/r/656270

@Dzahn Added tests ^ please take a look when you have time

Change 656270 merged by Dzahn:
[operations/puppet@production] miscweb: Add tests for query.wikidata.org

https://gerrit.wikimedia.org/r/656270

elukey added a subscriber: elukey.

Hi folks, on miscweb1002 I see the following in puppet:

Feb  3 07:40:34 miscweb1002 puppet-agent[31666]: (/Stage[main]/Profile::Microsites::Query_service/Git::Clone[wikidata/query/gui-deploy]/Exec[git_pull_wikidata/query/gui-deploy]/returns) executed successfully (corrective)

It is repeated for each puppet run and it causes the host to pop up in our alerts :)

Found the issue:

diff --git a/maint.html b/maint.html
index 703e17a..e63c70e 100644
--- a/maint.html
+++ b/maint.html
@@ -1 +1 @@
-<html><head><title>Error 503 Service Unavailable</title></head><body><h1>503 Service Unavailable</h1>The service is temporary down for maintenance. Please check back a bit later.</body></html>
\ No newline at end of file
+<html><head><title>Error 503 Service Unavailable</title></head><body><h1>503 Service Unavailable</h1>The service is temporary down for maintenance. Please check back a bit later..</body></html>

I assume this was a test. I am going to check out the correct file so puppet will not complain anymore.

elukey claimed this task.

Thanks @elukey. I don't know where it came from, but I also suspect somebody wanted to test whether you can edit that file manually or whether puppet will overwrite it.