
KDC performance tuning for TCP requests
Closed, Resolved (Public)

Description

See T329525 for context. @nfraison and I ran some tests; going forward we should:

  • Increase the number of KDC worker daemons
  • Bump net.core.somaxconn to cope with the bursty nature of the tgs-req requests issued by Presto queries
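
For background on the second point: on Linux the backlog a server passes to listen() is silently capped at net.core.somaxconn, so a burst of TCP connections larger than that cap can overflow the accept queue no matter what the daemon asks for. A minimal sketch of that relationship follows. The function names and the numbers in the usage lines are illustrative assumptions and are not taken from the KDC setup or the tests mentioned above.

```python
from pathlib import Path

# Linux-specific location of the sysctl; not available on other systems.
SOMAXCONN_PATH = Path("/proc/sys/net/core/somaxconn")


def read_somaxconn(path=SOMAXCONN_PATH):
    """Return the current net.core.somaxconn value, or None if it cannot be read."""
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None


def effective_backlog(requested, somaxconn):
    """Backlog actually applied: Linux caps the listen() argument at somaxconn."""
    return min(requested, somaxconn)


def burst_fits(burst_size, requested, somaxconn):
    """Rough estimate of whether a burst of pending connections fits the accept queue."""
    return burst_size <= effective_backlog(requested, somaxconn)


if __name__ == "__main__":
    # Hypothetical numbers for illustration only.
    print("current somaxconn:", read_somaxconn())
    print("effective backlog:", effective_backlog(requested=1024, somaxconn=128))
    print("burst of 500 fits:", burst_fits(500, requested=1024, somaxconn=128))
```

Raising the sysctl only lifts the cap; the listening daemon still has to request a large enough backlog itself.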

Event Timeline

MoritzMuehlenhoff triaged this task as Medium priority.
MoritzMuehlenhoff created this task.

Change 889971 had a related patch set uploaded (by Muehlenhoff; author: Muehlenhoff):

[operations/puppet@production] Tweak scalability of KDC requests

https://gerrit.wikimedia.org/r/889971

Change 889971 merged by Muehlenhoff:

[operations/puppet@production] Tweak scalability of KDC requests

https://gerrit.wikimedia.org/r/889971

Change 890388 had a related patch set uploaded (by Muehlenhoff; author: Muehlenhoff):

[operations/puppet@production] Revert "Tweak scalability of KDC requests"

https://gerrit.wikimedia.org/r/890388

Change 890388 merged by Muehlenhoff:

[operations/puppet@production] Revert "Tweak scalability of KDC requests"

https://gerrit.wikimedia.org/r/890388

Change 890389 had a related patch set uploaded (by Muehlenhoff; author: Muehlenhoff):

[operations/puppet@production] Tweak scalability of KDC requests (v2)

https://gerrit.wikimedia.org/r/890389

Change 890389 merged by Muehlenhoff:

[operations/puppet@production] Tweak scalability of KDC requests (v2)

https://gerrit.wikimedia.org/r/890389

Change 890393 had a related patch set uploaded (by Muehlenhoff; author: Muehlenhoff):

[operations/puppet@production] Adapt KDC monitoring to dynamic KDC worker count

https://gerrit.wikimedia.org/r/890393

Change 890393 merged by Muehlenhoff:

[operations/puppet@production] Adapt KDC monitoring to dynamic KDC worker count

https://gerrit.wikimedia.org/r/890393

The number of KDC workers is now configurable via the new profile::kerberos::kdc::workers Hiera setting and has been raised to 8. In addition, the net.core.somaxconn sysctl was bumped to 16k. This fixes the request burst issues observed in T329525.
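
A configurable worker count also changes how many KDC processes a monitoring check should expect. The sketch below shows one way such a check could derive its threshold. It assumes the workers are started through MIT krb5kdc's -w option, where the top-level process acts as a supervisor next to the forked workers; whether the Puppet change does exactly this is not stated in this task. The function names and message strings are hypothetical and do not reflect the actual monitoring code.

```python
def expected_kdc_processes(workers):
    """Expected number of krb5kdc processes for a given worker setting.

    Assumption: with workers enabled (krb5kdc -w N), one supervisor process
    runs alongside N worker processes. With no workers configured, a single
    process serves all requests.
    """
    if workers < 0:
        raise ValueError("workers must be non-negative")
    return workers + 1 if workers > 0 else 1


def check_kdc_processes(observed, workers):
    """Return (ok, message) comparing the observed process count to the expected one."""
    expected = expected_kdc_processes(workers)
    if observed == expected:
        return True, f"OK: {observed} krb5kdc processes"
    return False, f"CRITICAL: {observed} krb5kdc processes, expected {expected}"


if __name__ == "__main__":
    # With the worker count of 8 mentioned above, this sketch expects 9 processes.
    print(check_kdc_processes(observed=9, workers=8))
```

Deriving the threshold from the same setting that configures the daemon keeps the check in step when the worker count changes.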

Change 891310 had a related patch set uploaded (by Muehlenhoff; author: Muehlenhoff):

[operations/puppet@production] Adjust monitoring for KDC processes if worker threads are in use

https://gerrit.wikimedia.org/r/891310

Change 891310 merged by Muehlenhoff:

[operations/puppet@production] Adjust monitoring for KDC processes if worker threads are in use

https://gerrit.wikimedia.org/r/891310