Restructure paws away from special networking
Closed, Resolved, Public

Description

PAWS is now in a position where it could be updated to look much like any other Cloud VPS project. Since the move to Magnum it no longer has a manually deployed Kubernetes cluster, and it uses Trove rather than a manually installed DB. Two things it uses are still not cloud-service in flavor: NFS and its haproxy setup. I believe there is work going into making a cloud storage solution, so NFS will be ignored for this task.

On the ingress side of things PAWS has a fairly complex setup and uses a floating IP, which puts it outside the realm of how we would like to see projects use our services. Today that means a floating IP with associated DNS entries, plus an haproxy/acme-chief setup that manages and terminates TLS. All of this could be collapsed into a web proxy pointed at a Magnum cluster member.

This would have a user facing change of:
hub.paws.wmcloud.org
would become:
hub-paws.wmcloud.org

and
public.paws.wmcloud.org
would become:
public-paws.wmcloud.org

This would be announced in advance, both on cloud-announce and as a banner on PAWS itself. Additionally, T329212 will allow us to have a parallel deploy, giving a grace period where both the new and old domains are active. After that, a VM could be set up to redirect anyone who arrives at the old domains to the new domains for a time (T328971).
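As a rough illustration of what the redirect VM would need to do, here is a minimal Python sketch that maps a URL on an old domain to the equivalent URL on the new domain, keeping the path and query string. The function name `redirect_target` and the choice to always redirect to https are assumptions made for this example; the task does not say how the redirect would actually be implemented.

```python
from typing import Optional
from urllib.parse import urlsplit, urlunsplit

# Old-to-new hostnames, as listed in the task description.
HOST_MAP = {
    "hub.paws.wmcloud.org": "hub-paws.wmcloud.org",
    "public.paws.wmcloud.org": "public-paws.wmcloud.org",
}


def redirect_target(url: str) -> Optional[str]:
    """Return the new-domain URL for an old-domain URL, or None if the host is not an old domain.

    Hypothetical helper: always redirecting to https is an assumption.
    """
    parts = urlsplit(url)
    new_host = HOST_MAP.get(parts.hostname or "")
    if new_host is None:
        return None
    # Keep path, query, and fragment so deep links keep working.
    return urlunsplit(("https", new_host, parts.path, parts.query, parts.fragment))
```

A web server on the VM could answer every request with a redirect to `redirect_target(request_url)` for as long as the old names are kept alive.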

We also get some bonus improvements. The acme-chief/haproxy setup has failed in the past (T308383 is one such instance); removing it would prevent that. We would also simplify the structure of PAWS, lowering the barrier to entry for anyone who might be interested in working on it.

After doing this we would have something of a flagship project to point to as an example of how one might want to use Cloud VPS: a clear example of a project using our services that doesn't feel like a toy project.

Event Timeline

Hello @rook, I want to take this issue. Should I change
hub.paws.wmcloud.org
to
hub-paws.wmcloud.org
and
public.paws.wmcloud.org
to
public-paws.wmcloud.org
in all the directories?

Hi @Ayushk21! This is not a very straightforward ticket and, amusingly, it will probably involve no changes to the code in the git repo. Most of this ticket is about first verifying that we can't use a web proxy with the same names that are currently used, and then, if that is the case, asking if we can get that enabled. Indeed, even if we cannot get that enabled we would end up using the floating IP as is (if we could use a web proxy we could drop the floating IP entirely, and that would be neat). Sorry for the confusing ticket; it occurred to me on a Saturday as a way to increase stability, so I took note of it but haven't yet done the testing needed to clean the ticket up.

Mostly I wouldn't recommend this one, as it won't involve any code changes, and basically all changes that might occur would likely be made by other folks outside of PAWS.

Testing in paws-dev shows the same header problem seen in T326217

[W 2023-02-06 12:53:00.475 JupyterHub base:89] Blocking Cross Origin API request.  Referer: https://hub-paws-dev.codfw1dev.wmcloud.org/hub/home, Host: hub-paws-dev.codfw1dev.wmcloud.org, Host URL: http://hub-paws-dev.codfw1dev.wmcloud.org/hub/

Basically, when we terminate TLS in the web proxy, the request reaches the hub over http but the Referer header still says https, and nothing rewrites it from https to http.
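To make the mismatch concrete, here is a simplified Python sketch of a same-origin check of the kind that would produce the warning above. It is not JupyterHub's actual implementation; `is_same_origin` is a hypothetical name, and the comparison rules (scheme, hostname, path prefix) are an assumption based on what the log line reports.

```python
from urllib.parse import urlparse


def is_same_origin(referer: str, host_url: str) -> bool:
    """Return True when the referer has the same scheme and host as host_url and lives under its path."""
    ref = urlparse(referer)
    hub = urlparse(host_url)
    return (
        ref.scheme == hub.scheme
        and ref.hostname == hub.hostname
        and (ref.path + "/").startswith(hub.path)
    )


# Values taken from the log line above: the hub sees plain http behind the proxy.
host_url = "http://hub-paws-dev.codfw1dev.wmcloud.org/hub/"

# The browser sent an https Referer, so the schemes differ and the request is blocked.
blocked = is_same_origin("https://hub-paws-dev.codfw1dev.wmcloud.org/hub/home", host_url)

# With the Referer rewritten to the http hub URL, the check passes.
allowed = is_same_origin("http://hub-paws-dev.codfw1dev.wmcloud.org/hub/", host_url)
```

Under these assumptions the only difference between the blocked and the allowed request is the scheme of the Referer.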

Adding the following to the jupyterhub ingress annotations seems to get it working in paws-dev.

nginx.ingress.kubernetes.io/configuration-snippet: |
  more_set_input_headers "REFERER: http://hub-paws-dev.codfw1dev.wmcloud.org/hub/";
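The annotation above hard-codes the paws-dev hostname, so a different deployment would need its own value. As a small sketch of how the same annotation could be generated per hostname (the helper name `referer_annotation` and its parameters are hypothetical, and nothing in this task says the annotation is templated this way):

```python
from typing import Dict

ANNOTATION_KEY = "nginx.ingress.kubernetes.io/configuration-snippet"


def referer_annotation(hub_host: str, base_path: str = "/hub/") -> Dict[str, str]:
    """Build the ingress annotation that forces the Referer header to the http hub URL.

    Hypothetical helper: mirrors the paws-dev annotation above for an arbitrary host.
    """
    snippet = f'more_set_input_headers "REFERER: http://{hub_host}{base_path}";\n'
    return {ANNOTATION_KEY: snippet}


# Reproduces the value used in paws-dev.
dev_annotation = referer_annotation("hub-paws-dev.codfw1dev.wmcloud.org")
```

Note that the snippet sets the same Referer on every request passing through that ingress, whatever the browser originally sent.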

rook renamed this task from Remove haproxy? to Restructure paws away from special networking. Feb 6 2023, 8:53 PM
rook updated the task description.

"After doing this we have something of a flagship project that we can point to as an example of how one might want to be using cloud VPS. Giving us a clear example of a project that is using our services that doesn't feel like a toy project."

I am personally thrilled by this idea and appreciate the effort to make PAWS an excellent example to follow for other projects!

I appreciate the transition period for updating links. I was curious how many links would be affected, so I ran a query (at least for Meta, where I expected the most on-wiki links to exist; maybe worth running a similar query for other wikis?). Migrating existing links seems pretty doable, as most are on archive pages and so can presumably be ignored: https://quarry.wmcloud.org/query/71641

I've updated what I could from https://quarry.wmcloud.org/query/71641. Thank you for posting.

rook changed the task status from Open to In Progress. Mon, Mar 13, 10:32 AM

Mentioned in SAL (#wikimedia-cloud) [2023-03-13T10:40:31Z] <Rook> Restructure paws away from special networking (Change paws domain name) df16f355de3856c9ef7ef72ea4ae86dc9080723f T328842

Adding a note here that I've updated the links on:

  • mediawiki.org
    • Template:REST API
    • API:REST API/Reference
    • Manual:Pywikibot/PAWS#See_also
    • Wikimedia_Hackathon_2022/Showcase#Blocks_to_Code
  • wikitech
    • PAWS/About_Jupyter_notebooks_hosted_on_PAWS#Example_Uses
    • PAWS/PAWS_and_Pywikibot
    • News/Wiki_Replicas_2020_Redesign#How_should_I_connect_to_databases_in_PAWS?

Thanks for sharing the helpful Quarry link!