Allow access to wdqs.svc.eqiad.wmnet on port 8888
Open, Normal · Public


As far as I can tell, wdqs.svc.eqiad.wmnet will always direct me to an active wdqs server.

Port 8888 was opened on the wdqs servers to allow internal queries to run with a longer timeout (T119941).

Would it be possible to also access wdqs.svc.eqiad.wmnet on port 8888?

Allowing this would allow me to remove the hard coding of an individual machine added in

Event Timeline

Restricted Application added a project: Discovery. · Sep 27 2017, 1:52 PM
Restricted Application added a subscriber: Aklapper.
Addshore updated the task description. · Sep 27 2017, 1:55 PM
Addshore moved this task from Unsorted 💣 to Watching 👀 on the User-Addshore board.

I wonder if it may be more beneficial to use codfw ones for longer tasks, since they are getting less routine traffic now.

BBlack moved this task from Triage to LoadBalancer on the Traffic board. · Oct 23 2017, 2:51 PM
ema triaged this task as Normal priority. · Nov 9 2017, 7:29 AM

Bump as this is probably trivial but needs the right pair of hands to get it done.

Lydia_Pintscher moved this task from incoming to monitoring on the Wikidata board. · Mar 5 2018, 4:15 PM
elukey added a subscriber: elukey. · Jul 2 2018, 12:41 PM
elukey added a comment. · Jul 4 2018, 9:09 AM

After a chat with Discovery we ended up refreshing the list of hosts in the Analytics VLAN firewall (that is meant for traffic from the analytics hosts towards production, like stat1005 to wdqs):

It seems that it is not possible to whitelist only the VIP IP of wdqs.svc.eqiad.wmnet.

It looks like this was the cause of the dashboard breaking again in T218710.

It is a shame that we cannot whitelist wdqs.svc.eqiad.wmnet. I guess we will just have to keep manually changing which server we point at, unless anyone can think of another way?

@Addshore, just saw T218710 and clicked through to here. If you use, you can access wdqs.svc.eqiad.wmnet over HTTP from the analytics VLAN.

Please don't do that. As the page very clearly says, it's "To allow HTTP requests reach the outside world", not to bypass internal restrictions.

Ah, hm ok.

Actually, @elukey why can't we allow the VIP IP? We did this in T221690: Allow analytics VLAN to reach schema.svc.$site.wmnet, no?

elukey added a subscriber: ayounsi. · Edited · Jul 9 2019, 3:30 PM

Not really; I wish my past self had added more info. I asked @ayounsi and he couldn't come up with a reason not to, so in theory we could try modifying the term on the firewall and see how it goes. The config is currently:

elukey@re0.cr1-eqiad> show configuration firewall family inet filter analytics-in4 term wdqs
from {
    destination-address {
        /* wdqs1003 */;
        /* wdqs1004 */;
        /* wdqs1005 */;
        /* wdqs2003 */;
        /* wdqs2001 */;
        /* wdqs2002 */;
    }
    protocol tcp;
    destination-port 8888;
}
then accept;

That explicitly whitelists every target host. I recall that there was a reason behind it, but not what it was :(
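
To make the effect of a per-host term like this concrete, here is a minimal Python model of the match logic. It is only a sketch: the `Term` class is hypothetical, and the addresses are placeholders from the 192.0.2.0/24 documentation range, not the real wdqs or VIP addresses (which are not shown in this task).

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Term:
    """Hypothetical model of one firewall term: destination list + protocol + port."""
    destinations: tuple  # tuple of ipaddress network objects
    protocol: str
    port: int

    def accepts(self, dst_ip, protocol, port):
        """Return True only if destination, protocol and port all match."""
        addr = ipaddress.ip_address(dst_ip)
        return (
            protocol == self.protocol
            and port == self.port
            and any(addr in net for net in self.destinations)
        )


# Placeholder /32 entries standing in for the individually listed wdqs hosts.
WDQS_TERM = Term(
    destinations=tuple(
        ipaddress.ip_network(n)
        for n in ("192.0.2.3/32", "192.0.2.4/32", "192.0.2.5/32")
    ),
    protocol="tcp",
    port=8888,
)

# Placeholder standing in for the service VIP, which is NOT in the list.
PLACEHOLDER_VIP = "192.0.2.99"
```

With a term shaped like this, traffic to a listed host on tcp/8888 is accepted, while traffic to an address that is not listed (such as the placeholder VIP) falls through to whatever the rest of the filter does.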

Changed the following (Cc: @ayounsi):

elukey@re0.cr2-eqiad# show | compare
[edit firewall family inet filter analytics-in4 term wdqs from destination-address] { ... }
+        /* wdqs.svc.eqiad.wmnet */

Now I can see telnet reaching the endpoint from stat1007, but getting connection refused:

elukey@stat1007:~$ telnet wdqs.svc.eqiad.wmnet 8888
telnet: Unable to connect to remote host: Connection refused

I guess that something more is needed?
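
For anyone who wants to repeat this check from a script rather than telnet, a small probe along these lines might help. It is a sketch using only the Python standard library; `probe_tcp` is a hypothetical helper name. A "refused" result usually suggests the packets are getting through and nothing is accepting connections on that address and port, whereas a "timeout" more often points at packets being dropped on the way.

```python
import socket


def probe_tcp(host, port, timeout=3.0):
    """Try a TCP connect and classify the outcome.

    Returns "open", "refused", "timeout", or "error: <details>".
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The remote side (or something in between) actively reset the connection.
        return "refused"
    except socket.timeout:
        # No answer at all within the timeout.
        return "timeout"
    except OSError as exc:
        # DNS failures, unreachable networks, and so on.
        return f"error: {exc}"
```

Usage would be something like `probe_tcp("wdqs.svc.eqiad.wmnet", 8888)` from stat1007.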

elukey added subscribers: WMDE-leszek, Ladsgroup. · Edited · Tue, Aug 6, 10:43 AM

Adding @WMDE-leszek and @Ladsgroup since afaics they were/are working on this :)

The idea would be to move all your scripts to the wdqs.svc.eqiad.wmnet 8888 endpoint if possible, and then clean up the explicit single host settings in the analytics firewall. Let me know your thoughts!
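
As a rough illustration of what moving a script to the service endpoint could look like, the sketch below builds the query URL from configuration instead of a hard-coded host. Several things here are assumptions and not taken from this task: the `WDQS_HOST` and `WDQS_PORT` environment variable names are made up, and the `/bigdata/namespace/wdq/sparql` path should be checked against whatever your script uses today. Plain HTTP is used because the wdqs servers do not terminate SSL.

```python
import os
from urllib.parse import urlencode

DEFAULT_HOST = "wdqs.svc.eqiad.wmnet"
DEFAULT_PORT = 8888
# Assumption: SPARQL path on the wdqs hosts; verify before relying on it.
SPARQL_PATH = "/bigdata/namespace/wdq/sparql"


def sparql_url(query, env=None):
    """Build the query URL, letting the environment override host and port.

    WDQS_HOST and WDQS_PORT are hypothetical variable names for illustration.
    """
    env = os.environ if env is None else env
    host = env.get("WDQS_HOST", DEFAULT_HOST)
    port = int(env.get("WDQS_PORT", DEFAULT_PORT))
    params = urlencode({"query": query, "format": "json"})
    return f"http://{host}:{port}{SPARQL_PATH}?{params}"
```

Keeping the host in one overridable place means that falling back to a single machine, if that is ever needed again, is a configuration change and not a code change.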

Gehel added a comment. · Edited · Tue, Aug 6, 11:59 AM

At the moment, we have a ferm rule to allow access to port 8888 from $DOMAIN_NETWORKS. I think this should be sufficient, but I'm always somewhat lost in our network.

As far as I can see, we don't have an LVS configuration for port 8888, so that needs to be addressed as well.

Side note: since we are expecting heavy queries, we should route those only to the public wdqs endpoint (wdqs.svc.{eqiad|codfw}.wmnet) and NOT to the private cluster (wdqs-internal.svc.{eqiad|codfw}.wmnet).

Change 529053 had a related patch set uploaded (by Mathew.onipe; owner: Mathew.onipe):
[operations/puppet@production] lvs: allow access to wdqs lvs on port 8888

Gehel added a comment. · Thu, Aug 8, 10:07 AM

A few more comments after discussion with @elukey :

  • the use of port 8888 to get extended query timeouts is exceptional and should only ever be used by analytics (or at least, new use cases need to be vetted)
  • not having this go through LVS makes it fairly explicit that this is a hack and should not be used widely
  • if we add an LVS endpoint, we need to ensure that we have some control on who is accessing it
  • the $ANALYTICS_NETWORK ferm alias could be used, but that's more restrictive than what we have now, so we need to check that no other client is using this port
  • not directly related to this task: we don't have SSL termination on the wdqs servers, everything is in the clear, we should probably address that at some point
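
One way to do the "no other client is using this port" check from the list above is to take the client addresses seen on port 8888 (from access logs or similar) and list the ones that fall outside the analytics networks. The sketch below assumes you already have those addresses as strings; `clients_outside` is a hypothetical helper, and the networks in the usage note are placeholder documentation ranges, not the real contents of $ANALYTICS_NETWORK.

```python
import ipaddress


def clients_outside(client_ips, allowed_networks):
    """Return the sorted, de-duplicated client IPs not covered by any allowed network."""
    nets = [ipaddress.ip_network(n) for n in allowed_networks]
    outside = set()
    for ip in client_ips:
        addr = ipaddress.ip_address(ip)
        if not any(addr in net for net in nets):
            outside.add(ip)
    return sorted(outside)
```

For example, `clients_outside(["198.51.100.7", "203.0.113.9"], ["198.51.100.0/24"])` reports only the second address, which is the kind of client that would lose access once the rule is tightened.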

Change 530856 had a related patch set uploaded (by Mathew.onipe; owner: Mathew.onipe):
[operations/puppet@production] wdqs: restrict port 8888 to analytics networks

Change 530856 merged by Gehel:
[operations/puppet@production] wdqs: restrict port 8888 to analytics networks