Move WDQS UI to microsites
Open, Medium, Public

Description

In T264710#6562678, T264710#6573814 and T264710#6586070 as part of T264710: Host static sites on kubernetes it was decided that the query ui and querybuilder ui would be deployed on the "static sites" infrastructure. See https://wikitech.wikimedia.org/wiki/Microsites
This is also further work in the query UI decoupling that was previously covered in T241291: Simplify WDQS Packaging

And followups to this task:

  • Build step to populate the deploy repo with a built thing, with config and favicon

Event Timeline

Addshore created this task. Oct 28 2020, 7:55 PM
Restricted Application added a subscriber: Aklapper. Oct 28 2020, 7:55 PM
Dzahn claimed this task. Oct 29 2020, 1:55 AM
Dzahn removed Dzahn as the assignee of this task.

Took this to say I will take care of the puppet part to get these on miscweb. But that may be a subtask and I should have waited for the list. (?)

So for this one I'm not sure we actually need any sub tasks?
The built GUI to be deployed lives in https://gerrit.wikimedia.org/r/admin/repos/wikidata/query/gui-deploy
AFAIK that is all that is needed for the puppet change?

My understanding is that either:

  1. query.wikidata.org will end up needing to point to this static site, and then something there redirects /sparql to the wdqs service?
  2. query.wikidata.org will continue to point to wdqs servers, and nginx there will then pass everything that isn't /sparql to the static site.

From the discussions on T264710 and in IRC I believe the consensus was that option 2 would be easiest?
Need to poke Wikidata-Query-Service team (cc @Gehel ) about which paths other than /sparql also need to be accounted for in order to make the nginx change on the wdqs servers, but perhaps the search team can do that anyway.

Open questions:

  • Are we sure that we want to go with option 2 above, and have edge -> wdqs -> static site?
  • Are the +2 permissions on wikidata/query/gui-deploy okay from the ops side? (we should review this)
  • How will this be exposed internally on the cluster and thus accessible to the wdqs hosts to point to in nginx?

I don't like having services proxying each other internally. Both edge->wdqs->static and edge->static->wdqs add accidental complexity to the architecture. This should really be a traffic routing question: wdqs and wdqs-ui are two separate services. The limitation is that, for historical reasons (tm), they have been deployed behind the same hostname (query.wikidata.org). Introducing 2 different hostnames (sparql.wikidata.org / query.wikidata.org ?) would be the right thing to do, but the transition is probably non-trivial to manage.

BBlack added a subscriber: BBlack. Oct 29 2020, 1:42 PM

We can route different URI subspaces differently at the edge layer, based on URI regexes, as shown here for the split of the API namespace of the primary wiki sites:

https://gerrit.wikimedia.org/r/plugins/gitiles/operations/puppet/+/refs/heads/production/hieradata/common/profile/trafficserver/backend.yaml#263

Addshore added a comment. Edited Oct 29 2020, 2:24 PM

This sounds perfect.
I'll populate the various parts of this ticket soon that are going to need to happen.

  • Are we sure that we want to go with option 2 above, and have edge -> wdqs -> static site?

No we do not, we will route at the edge

From #wikimedia-traffic

2:21 PM <bblack> addshore: https://phabricator.wikimedia.org/T266702#6588396
2:23 PM <addshore> bblack: <3 ty
2:42 PM <ema> addshore: we also already do something similar for wdqs https://gerrit.wikimedia.org/r/plugins/gitiles/operations/puppet/+/refs/heads/production/hieradata/common/profile/trafficserver/backend.yaml#154
2:42 PM <ema> /bigdata/ldf -> wdqs1005.eqiad.wmnet/bigdata/ldf
2:42 PM <ema> everything else to wdqs.discovery.wmnet
2:42 PM <addshore> aaaah nice
2:43 PM <ema> the very important thing is that we don't rewrite the uri_path
2:43 PM <ema> like in /bigdata/ldf -> wdqs1005.eqiad.wmnet/bigdata/ldf
2:43 PM <ema> we don't want to deal with uri rewrites at the caching layer if at all possible
2:46 PM <ema> addshore: so /sparql -> wdqs_service/sparql is great :)
2:46 PM <addshore> ack, thats probably all perfect :)
2:47 PM <addshore> I may copy these logs into the ticket so that I don't loose the links
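
The edge routing discussed in the log above can be sketched roughly as follows. This is only an illustration in Python, not the actual trafficserver configuration: the `ROUTES` table, the `route` function and the choice of webserver-misc-apps.discovery.wmnet as the default backend for the static UI are assumptions made for the example, and which paths besides /sparql must go to wdqs is still an open question in this task. The point it shows is the one ema stresses: the backend is selected by a regex on the path, and the path itself is passed through unchanged.

```python
import re

# Hypothetical routing table: (compiled path regex, backend host).
# The real mapping lives in the trafficserver hieradata, not in Python.
ROUTES = [
    (re.compile(r"^/bigdata/ldf"), "wdqs1005.eqiad.wmnet"),
    (re.compile(r"^/sparql"), "wdqs.discovery.wmnet"),
]

# Assumed default backend for everything else (the static UI).
DEFAULT_BACKEND = "webserver-misc-apps.discovery.wmnet"


def route(path: str) -> str:
    """Pick a backend by path regex and return the origin URL.

    The path is appended as-is: no URI rewriting happens here.
    """
    for pattern, backend in ROUTES:
        if pattern.match(path):
            return f"https://{backend}{path}"
    return f"https://{DEFAULT_BACKEND}{path}"


if __name__ == "__main__":
    for p in ("/sparql", "/bigdata/ldf", "/index.html"):
        print(p, "->", route(p))
```

Because only the host changes, the caching layer never has to reason about two different paths for the same resource.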

Change 637552 had a related patch set uploaded (by Dzahn; owner: Dzahn):
[operations/puppet@production] create microsite for WDQS UI

https://gerrit.wikimedia.org/r/637552

Addshore updated the task description. Oct 29 2020, 7:37 PM

So looking at the traffic layer change a little more....

And looking at other things in general we need to account for.

There are various things that are currently added for the GUI by the WDQS puppet that are not included in the GUI build repo.

These should probably be included in the -deploy repo for the gui so that WMDE can continue to change them without needing a puppet merge
The alternative that would need puppet merges would be to add them to https://gerrit.wikimedia.org/r/c/operations/puppet/+/637552
The -deploy repo is currently manually built, so we need to figure out how to correctly add these files there.

Addshore updated the task description. Oct 29 2020, 7:49 PM

Change 637552 merged by Dzahn:
[operations/puppet@production] create microsite for WDQS UI

https://gerrit.wikimedia.org/r/637552

Dzahn added a comment. Edited Oct 30 2020, 10:50 PM

Hi @Addshore Puppet has an issue cloning from the deployment repo:

fatal: Remote branch master not found in upstream origin

(Oh, another repo using "production" besides operations/puppet)
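
One way to catch this kind of mismatch before a clone fails is to ask the remote which branch its HEAD points at. The sketch below is an illustration only, not what puppet does: the helper names `parse_default_branch` and `remote_default_branch` are made up for the example, and it relies on `git ls-remote --symref <url> HEAD`, which prints a `ref: refs/heads/<branch>` line followed by a tab and `HEAD`. The remote HEAD may or may not be the branch you actually want to deploy, so treat the result as a hint.

```python
import subprocess
from typing import Optional


def parse_default_branch(ls_remote_output: str) -> Optional[str]:
    """Extract the branch name from `git ls-remote --symref <url> HEAD` output.

    Returns None if no symbolic ref line for HEAD is present.
    """
    prefix = "ref: refs/heads/"
    suffix = "\tHEAD"
    for line in ls_remote_output.splitlines():
        if line.startswith(prefix) and line.endswith(suffix):
            return line[len(prefix):-len(suffix)]
    return None


def remote_default_branch(url: str) -> Optional[str]:
    """Query the remote (needs git and network access) for its HEAD branch."""
    result = subprocess.run(
        ["git", "ls-remote", "--symref", url, "HEAD"],
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_default_branch(result.stdout)
```

The parsing is kept separate from the subprocess call so it can be checked against sample output without touching the network.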

Change 637807 had a related patch set uploaded (by Dzahn; owner: Dzahn):
[operations/puppet@production] WDQS microsite: use branch production instead of master

https://gerrit.wikimedia.org/r/637807

Change 637807 merged by Dzahn:
[operations/puppet@production] WDQS microsite: use branch production instead of master

https://gerrit.wikimedia.org/r/637807

Mentioned in SAL (#wikimedia-operations) [2020-10-30T23:32:41Z] <mutante> adding query.wikidata.org to TLS cert for webserver-misc-apps.discovery.wmnet T266702

Change 637811 had a related patch set uploaded (by Dzahn; owner: Dzahn):
[operations/puppet@production] ssl: add query.wikidata.org to TLS cert for webserver-misc-apps

https://gerrit.wikimedia.org/r/637811

Change 637811 merged by Dzahn:
[operations/puppet@production] ssl: add query.wikidata.org to TLS cert for webserver-misc-apps

https://gerrit.wikimedia.org/r/637811

I added Apache config and git cloning to the miscweb backends.

Then added query.wikidata.org to the TLS cert they are using.

Now you can already request query.wikidata.org from them from inside the internal network:

[deploy1001:~] $ curl -H "Host: query.wikidata.org" https://webserver-misc-apps.discovery.wmnet

or

[cumin1001:~] $ httpbb --hosts miscweb1002.eqiad.wmnet,miscweb2002.codfw.wmnet - < wdqs.yaml 
Sending to 2 hosts...
PASS: 1 request sent to each of 2 hosts. All assertions passed.
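
For reference, the curl check above (talk to the backend service name, but ask for the query.wikidata.org virtual host) can be sketched in Python as well. This is a minimal illustration using only the standard library; `build_vhost_request` and `fetch_status` are names invented for the example, and actually sending the request only works from inside the internal network.

```python
import urllib.request


def build_vhost_request(backend: str, vhost: str, path: str = "/") -> urllib.request.Request:
    """Build a request to `backend` that asks for the `vhost` site.

    Equivalent in spirit to: curl -H "Host: <vhost>" https://<backend><path>
    """
    return urllib.request.Request(
        f"https://{backend}{path}",
        headers={"Host": vhost},
    )


def fetch_status(req: urllib.request.Request, timeout: float = 10.0) -> int:
    """Send the request and return the HTTP status code (internal network only)."""
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.status


# Building the request is side-effect free; nothing is sent here.
req = build_vhost_request("webserver-misc-apps.discovery.wmnet", "query.wikidata.org")
```

Calling `fetch_status(req)` from a host such as deploy1001 would then perform the same kind of check as the curl command.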

Dzahn updated the task description. Oct 30 2020, 11:47 PM

So I now see that the custom-config used to be in the build repo in the production branch.
It was removed in https://gerrit.wikimedia.org/r/c/wikidata/query/gui-deploy/+/606545 as part of T251514: UI for SPARQL Endpoint for Commons
Need to talk to the search platform team to figure out how this change will end up playing with the sdoc query service.

jijiki triaged this task as Medium priority. Tue, Nov 10, 4:22 PM
Dzahn added a comment. Tue, Nov 17, 8:30 PM

@Addshore Any suggestion who will take the "Deal with the favicon and the custom-config in the GUI build" check box that seems to be next here?

So currently both the WCQS (Wikimedia Commons Query Service) and the WDQS (Wikidata Query Service) are deployed from this same deployment repo.
After a discussion with the team leading WCQS (search team) we agreed that we can add the config and favicon for the WDQS back to this deployment repo, and that we take ownership of it (at least of this branch) for the WDQS deployment to microsites for now.
So the next steps forward would be:

  • Add the config and favicon to the deployment repo
  • Update the microsites deployment
  • Do the traffic layer dance

At a later point we will then want to:

  • Figure out our build step which would add the built code to the repo, including the config and favicon :)

Addshore updated the task description. Fri, Nov 20, 11:51 AM