Page MenuHomePhabricator

Update indirect dependency on github.com/gwicke/kad.git
Open, LowPublic

Description

The limitation rate-limiting module pulls in a fork of kadence from a git master located at github.com/gwicke/kad.git. This fork hasn't been updated in a while and has begun throwing deprecation warnings on npm install:

npm WARN deprecated kad-memstore@0.0.1: Please upgrade to @kadenceproject/kadence - See https://kadence.github.io
npm WARN deprecated kad-fs@0.0.4: Please upgrade to @kadenceproject/kadence - See https://kadence.github.io

This should be cleaned up and hopefully brought up to date.

Event Timeline

Mholloway renamed this task from Update indirect dependency on gwicke/kad.git to Update indirect dependency on github.com/gwicke/kad.git.Jul 25 2018, 9:56 PM
Mholloway updated the task description. (Show Details)

And transferred to wikimedia group.

I've been looking into this and the task ain't easy - the fork significantly diverged from the master, so rebasing it on top of a current master is challenging.

This is increasingly becoming a problem as npm audit is reporting more issues from the dependencies.

Pchelolo raised the priority of this task from Low to Medium.

Given that lack of maintenance, the limiter has started causing real issues (see T212631), I'm raising the priority to 'Normal'

A bit more background:
The limitation library is a rate limiter based on DHT implementation called KAD. Historically, it was using Gabriels fork which was then transferred into the wikimedia org, but the fork has very significantly diverged from the upstream. Given that upstream API has changed, even the repos the fork is based on were deprecated, we should abandon the fork completely.

Ideally, we need to figure out the reasoning behind the 10 additional commits in the fork and attempt to use the upstream in the limitation library.
Additionally, we need to first reevaluate existing third-party libs - in the years that have passed since limitation was created the environment might have changed.

Update: Now npm audit shows 2 vulnerabilities from limitation.

If this takes longer I suggest just updating its dependencies.

Can't any necessary rate limiting be handled by Varnish/Apache Traffic Server, at least in the WMF production context?

Can't any necessary rate limiting be handled by Varnish/Apache Traffic Server, at least in the WMF production context?

We (as in the WMF) are trying to minimise the reliance on Varnish for anything that is not related to caching (routing, rate-limiting, etc), so this would go against that effort.

Update: Now npm audit shows 2 vulnerabilities from limitation.

Apologies for the hold-up. The dependencies have been updated, so now if you rm -rf node_modules && npm install, the vulnerabilities should no longer show. If you still use the deploy repo for your service (i.e. you are not on k8s), you should use --force when building the repo next time in order to pull in the updated dependencies.

I'm raising the priority of this yet again cause the fork has completely diverged from the upstream and it's getting to the point when it's dependencies are not supported either, so we really need to switch away from the fork.

I'm having another related problem. For some reason, lately, npm is sometimes not completely installing kad and its dependencies. In particular I'm finding that I frequently hit an error related to a missing colors module:

Error: Cannot find module 'colors/safe'
    at Function.Module._resolveFilename (internal/modules/cjs/loader.js:582:15)
    at Function.Module._load (internal/modules/cjs/loader.js:508:25)
    at Module.require (internal/modules/cjs/loader.js:637:17)
    at require (internal/modules/cjs/helpers.js:22:18)
    at Object.<anonymous> (/home/mholloway/code/wikimedia/mobileapps/node_modules/limitation/node_modules/kad/lib/logger.js:3:14)
WDoranWMF lowered the priority of this task from High to Low.Nov 7 2019, 8:00 PM

Resetting deactivated assignee account.

@WDoranWMF can you clarify why this was deprioritized to "low" given the problems it's causing elsewhere and previous comments at T200374#5538736 and T298854#7618747? service-runner is pretty critical to nearly all the services we run.

MSantos added subscribers: Nikerabbit, MSantos.

@Nikerabbit cxserver seems to be the only service apart from restbase using this according to codesearch. Since service-runner is unowned, I'll remove MediaWiki-Engineering but happy to discuss this if the change is important to you, please let me know.

@Nikerabbit cxserver seems to be the only service apart from restbase using this according to codesearch. Since service-runner is unowned, I'll remove MediaWiki-Engineering but happy to discuss this if the change is important to you, please let me know.

Not against removing this from cxserver, it wasn't working anyway in a kubernetes environment (where cxserver was deployed) since it is implementing a very simple discovery mechanism that is not a good fit for an environment with ephemeral workloads. In the pathological case, it leads to 1 isolated DHT per pod. I have removed support from the chart myself in 8499d5aa41796bd8bbc754be1041ffc27d877917 in fact.

However, I am worried about the Since service-runner is unowned statement. Has this been raised/communicated/discussed etc?