UDP packet flooding possible with maintenance/purgeChangedPages.php
Closed, Resolved (Public)

Description

Mark has pointed out that the new maintenance/purgeChangedPages.php script could saturate packet buffers in routers and switches if it sends a very large number of UDP packets in a short period of time. The consequence of this would be that some unknown number of packets are silently discarded due to buffer overflow.

A suggested solution is to insert a small artificial delay before sending each packet. Brandon suggested that a 10ms delay should be more than enough to prevent buffer overflows. With that rate limit in place we would effectively throttle the HTCP packet output to 100/s (360,000/hr). At that rate a 10K-item purge would take ~1.7 minutes (100 seconds) of wall clock time to complete and a 150K list would take 25 minutes.
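To make the proposal concrete, here is a minimal Python sketch of the per-packet delay and the resulting time estimate. It is not the MediaWiki implementation (that is PHP); the names `send_throttled` and `estimated_seconds` and the injectable `send`/`sleep` callables are hypothetical, chosen only for illustration.

```python
import time
from typing import Callable, Iterable


def send_throttled(
    packets: Iterable[bytes],
    send: Callable[[bytes], None],
    delay_ms: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Pause delay_ms before each send; return the number of packets sent.

    `send` is a hypothetical callable that transmits one packet; it stands in
    for whatever actually emits the HTCP purge datagram.
    """
    sent = 0
    for packet in packets:
        sleep(delay_ms / 1000.0)  # artificial delay before each packet
        send(packet)
        sent += 1
    return sent


def estimated_seconds(n_packets: int, delay_ms: float = 10.0) -> float:
    """Time spent in the delay alone, ignoring the cost of the send itself."""
    return n_packets * delay_ms / 1000.0
```

With a 10ms delay, `estimated_seconds(10_000)` gives 100 seconds and `estimated_seconds(150_000)` gives 1500 seconds (25 minutes). In real use `send` could wrap a UDP socket, for example `lambda p: sock.sendto(p, (host, port))`, where `sock`, `host` and `port` are placeholders for the actual purge target.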


Version: 1.22.0
Severity: normal

Details

Reference
bz55632

Event Timeline

bzimport raised the priority of this task to High. (Nov 22 2014, 2:28 AM)
bzimport set Reference to bz55632.

Further discussion with Brandon and Mark sets a rate of 200/s (5ms delay) as also acceptable.

Change 89325 had a related patch set uploaded by BryanDavis:
Add HTCP rate limiting to SquidUpdate

https://gerrit.wikimedia.org/r/89325

Change 89842 had a related patch set uploaded by BryanDavis:
Add configurable delay between purgeChangedPages batches

https://gerrit.wikimedia.org/r/89842

Change 89844 had a related patch set uploaded by BryanDavis:
Add configurable delay between purgeChangedPages batches

https://gerrit.wikimedia.org/r/89844

Change 89325 merged by jenkins-bot:
Add configurable delay between purgeChangedPages batches

https://gerrit.wikimedia.org/r/89325

Change 89842 merged by jenkins-bot:
Add configurable delay between purgeChangedPages batches

https://gerrit.wikimedia.org/r/89842

Change 89844 merged by jenkins-bot:
Add configurable delay between purgeChangedPages batches

https://gerrit.wikimedia.org/r/89844

Patch is merged, backported and pushed to cluster.