Update WANCache preemptive refresh probability function
Closed, Resolved

Assigned To
 aaron
Authored By
 aaron, Mar 13 2023, 6:20 PM (UTC+0)
Referenced Files
 F36911780: isrd-stochastic-model.php, Mar 15 2023, 1:10 AM (UTC+0)
 F36910527: isrd-stochastic-model.php, Mar 14 2023, 4:18 AM (UTC+0)

Description

The current linear formula can be replaced with a power-based formula. The one from T265386 has already been tested, and a variant that accounts for LOW_TTL can be used.

This would handle high-traffic keys better.
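
As a rough illustration of the difference between the two shapes, here is a minimal Python sketch (the code under discussion is PHP in MediaWiki). It assumes the linear chance is `1 - curTTL / lowTTL` and that the power-based variant raises that value to the 4th power, as the "Pow4" in the simulated function name suggests. The function names and the exact form are assumptions for illustration, not the merged implementation.

```python
def linear_chance(cur_ttl: float, low_ttl: float) -> float:
    """Assumed linear refresh chance: 0 when low_ttl seconds remain, 1 at expiry."""
    if low_ttl <= 0 or cur_ttl >= low_ttl:
        return 0.0  # key is not yet inside the low-TTL window
    if cur_ttl <= 0:
        return 1.0  # treated here as "already expired" (an assumption)
    return 1.0 - cur_ttl / low_ttl


def power_chance(cur_ttl: float, low_ttl: float, exponent: int = 4) -> float:
    """Assumed power-based variant: the linear chance raised to `exponent`."""
    return linear_chance(cur_ttl, low_ttl) ** exponent
```

With 15s left of a 30s low-TTL window, `linear_chance(15, 30)` is 0.5 per request while `power_chance(15, 30)` is 0.0625, so under these assumptions a busy key would see far fewer early refresh attempts with the power-based shape.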

Event Timeline

Restricted Application added a subscriber: Aklapper. Mar 13 2023, 6:20 PM
aaron triaged this task as Medium priority. Mar 13 2023, 7:57 PM
aaron moved this task from Inbox, needs triage to Doing: Goals on the Performance-Team board.

Change 898441 had a related patch set uploaded (by Aaron Schulz; author: Aaron Schulz):

[mediawiki/core@master] objectcache: improve WANObjectCache::worthRefreshExpiring() scalability

https://gerrit.wikimedia.org/r/898441

Running some simulations for the new worthRefreshExpiring():
isStateRefreshDuePow4DelayAwareTimely (ac="ave contention", mc="max contention"):

| Req/s | Regen/s (2ms delay) | Regen/s (8ms delay) | Regen/s (16ms delay) | Regen/s (512ms delay) | Regen/s (1024ms delay) |
| --- | --- | --- | --- | --- | --- |
| 1 | 0.03 [ac=0.00, mc=2] | 0.03 [ac=0.00, mc=2] | 0.03 [ac=0.00, mc=2] | 0.03 [ac=0.02, mc=5] | 0.04 [ac=0.04, mc=6] |
| 2 | 0.03 [ac=0.00, mc=2] | 0.03 [ac=0.00, mc=2] | 0.03 [ac=0.00, mc=3] | 0.03 [ac=0.02, mc=5] | 0.04 [ac=0.04, mc=7] |
| 4 | 0.03 [ac=0.00, mc=2] | 0.03 [ac=0.00, mc=2] | 0.03 [ac=0.00, mc=2] | 0.04 [ac=0.02, mc=5] | 0.05 [ac=0.05, mc=5] |
| 8 | 0.03 [ac=0.00, mc=2] | 0.03 [ac=0.00, mc=2] | 0.03 [ac=0.00, mc=2] | 0.04 [ac=0.02, mc=6] | 0.06 [ac=0.05, mc=5] |
| 16 | 0.03 [ac=0.00, mc=2] | 0.03 [ac=0.00, mc=2] | 0.03 [ac=0.00, mc=2] | 0.05 [ac=0.02, mc=4] | 0.07 [ac=0.06, mc=6] |
| 32 | 0.03 [ac=0.00, mc=2] | 0.03 [ac=0.00, mc=2] | 0.03 [ac=0.00, mc=2] | 0.05 [ac=0.02, mc=6] | 0.08 [ac=0.07, mc=6] |
| 64 | 0.04 [ac=0.00, mc=1] | 0.04 [ac=0.00, mc=2] | 0.04 [ac=0.00, mc=2] | 0.06 [ac=0.03, mc=4] | 0.09 [ac=0.08, mc=7] |
| 128 | 0.04 [ac=0.00, mc=1] | 0.04 [ac=0.00, mc=2] | 0.04 [ac=0.00, mc=3] | 0.06 [ac=0.03, mc=5] | 0.11 [ac=0.09, mc=6] |
| 256 | 0.04 [ac=0.00, mc=1] | 0.04 [ac=0.00, mc=2] | 0.04 [ac=0.00, mc=2] | 0.07 [ac=0.03, mc=6] | 0.13 [ac=0.10, mc=6] |
| 512 | 0.04 [ac=0.00, mc=1] | 0.04 [ac=0.00, mc=2] | 0.04 [ac=0.00, mc=1] | 0.08 [ac=0.03, mc=4] | 0.15 [ac=0.12, mc=6] |
| 1024 | 0.04 [ac=0.00, mc=1] | 0.04 [ac=0.00, mc=1] | 0.04 [ac=0.00, mc=2] | 0.08 [ac=0.04, mc=4] | 0.17 [ac=0.13, mc=6] |
| 2048 | 0.04 [ac=0.00, mc=1] | 0.04 [ac=0.00, mc=1] | 0.04 [ac=0.00, mc=2] | 0.10 [ac=0.04, mc=4] | 0.22 [ac=0.16, mc=7] |
| 4096 | 0.04 [ac=0.00, mc=1] | 0.05 [ac=0.00, mc=2] | 0.04 [ac=0.00, mc=1] | 0.11 [ac=0.04, mc=5] | 0.26 [ac=0.18, mc=7] |
| 8192 | 0.04 [ac=0.00, mc=1] | 0.05 [ac=0.00, mc=2] | 0.04 [ac=0.00, mc=1] | 0.13 [ac=0.05, mc=7] | 0.28 [ac=0.20, mc=8] |
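
For readers who want to reproduce the general shape of such numbers, here is a minimal simulation sketch in Python. It is not the attached isrd-stochastic-model.php, whose details are not shown in this task. It assumes evenly spaced requests, a hypothetical 120s TTL and 30s low TTL, a refresh chance of `(1 - curTTL / lowTTL) ** power`, and it counts contention as the number of regenerations in flight at the moment a new one starts. All names and defaults are illustrative.

```python
import random


def simulate(req_per_sec, regen_delay, power=4, ttl=120.0, low_ttl=30.0,
             duration=1000.0, seed=0):
    """Return (regenerations per second, max concurrent regenerations)."""
    rng = random.Random(seed)
    step = 1.0 / req_per_sec
    expires_at = ttl      # a fresh value is assumed to be stored at t=0
    pending = []          # finish times of regenerations still in flight
    regens = 0
    max_contention = 0
    for i in range(int(duration * req_per_sec)):
        t = i * step
        # Apply regenerations that have finished; each stores a fresh value.
        in_flight = []
        for finish in pending:
            if finish <= t:
                expires_at = max(expires_at, finish + ttl)
            else:
                in_flight.append(finish)
        pending = in_flight
        cur_ttl = expires_at - t
        if cur_ttl <= 0:
            refresh = True    # cache miss: this request must regenerate
        elif cur_ttl < low_ttl:
            refresh = rng.random() < (1.0 - cur_ttl / low_ttl) ** power
        else:
            refresh = False
        if refresh:
            pending.append(t + regen_delay)
            regens += 1
            max_contention = max(max_contention, len(pending))
    return regens / duration, max_contention
```

Calling `simulate(256, 1.0, power=4)` and `simulate(256, 1.0, power=1)` compares the 4th-power and linear shapes under this toy model; the absolute numbers will not match the tables in this task, since the real model's parameters are not given here.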

The old function has:
isStateRefreshDueLinear (ac="ave contention", mc="max contention"):

| Req/s | Regen/s (2ms delay) | Regen/s (8ms delay) | Regen/s (16ms delay) | Regen/s (512ms delay) | Regen/s (1024ms delay) |
| --- | --- | --- | --- | --- | --- |
| 1 | 0.04 [ac=0.00, mc=2] | 0.04 [ac=0.00, mc=3] | 0.04 [ac=0.00, mc=3] | 0.04 [ac=0.02, mc=5] | 0.05 [ac=0.04, mc=5] |
| 2 | 0.04 [ac=0.00, mc=2] | 0.04 [ac=0.00, mc=3] | 0.04 [ac=0.00, mc=3] | 0.05 [ac=0.02, mc=5] | 0.06 [ac=0.05, mc=8] |
| 4 | 0.04 [ac=0.00, mc=2] | 0.04 [ac=0.00, mc=2] | 0.04 [ac=0.00, mc=2] | 0.05 [ac=0.02, mc=6] | 0.07 [ac=0.05, mc=8] |
| 8 | 0.04 [ac=0.00, mc=2] | 0.04 [ac=0.00, mc=2] | 0.04 [ac=0.00, mc=2] | 0.06 [ac=0.03, mc=5] | 0.08 [ac=0.06, mc=8] |
| 16 | 0.05 [ac=0.00, mc=2] | 0.05 [ac=0.00, mc=2] | 0.05 [ac=0.00, mc=2] | 0.07 [ac=0.03, mc=8] | 0.10 [ac=0.07, mc=9] |
| 32 | 0.05 [ac=0.00, mc=2] | 0.05 [ac=0.00, mc=2] | 0.05 [ac=0.00, mc=3] | 0.08 [ac=0.03, mc=7] | 0.14 [ac=0.09, mc=10] |
| 64 | 0.05 [ac=0.00, mc=2] | 0.05 [ac=0.00, mc=2] | 0.05 [ac=0.00, mc=3] | 0.10 [ac=0.04, mc=8] | 0.19 [ac=0.11, mc=15] |
| 128 | 0.05 [ac=0.00, mc=2] | 0.05 [ac=0.00, mc=2] | 0.05 [ac=0.00, mc=3] | 0.14 [ac=0.04, mc=10] | 0.28 [ac=0.15, mc=17] |
| 256 | 0.05 [ac=0.00, mc=2] | 0.05 [ac=0.00, mc=2] | 0.05 [ac=0.00, mc=3] | 0.20 [ac=0.06, mc=13] | 0.46 [ac=0.22, mc=21] |
| 512 | 0.05 [ac=0.00, mc=2] | 0.05 [ac=0.00, mc=2] | 0.05 [ac=0.00, mc=3] | 0.28 [ac=0.07, mc=16] | 0.71 [ac=0.32, mc=29] |
| 1024 | 0.05 [ac=0.00, mc=2] | 0.05 [ac=0.00, mc=3] | 0.06 [ac=0.00, mc=3] | 0.45 [ac=0.11, mc=18] | 1.29 [ac=0.53, mc=41] |
| 2048 | 0.05 [ac=0.00, mc=1] | 0.05 [ac=0.00, mc=2] | 0.06 [ac=0.00, mc=3] | 0.75 [ac=0.16, mc=21] | 2.29 [ac=0.88, mc=76] |
| 4096 | 0.05 [ac=0.00, mc=1] | 0.06 [ac=0.00, mc=3] | 0.06 [ac=0.00, mc=2] | 1.27 [ac=0.26, mc=43] | 4.39 [ac=1.64, mc=120] |
| 8192 | 0.05 [ac=0.00, mc=1] | 0.05 [ac=0.00, mc=1] | 0.06 [ac=0.00, mc=2] | 2.41 [ac=0.48, mc=67] | 8.22 [ac=3.07, mc=198] |
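
To make the gap between the two tables easier to see, the following small Python snippet divides the old (linear) Regen/s values by the new (Pow4) ones for the 1024ms delay column. The numbers are copied from the two tables above; the variable and function names are just illustrative.

```python
REQ_PER_SEC = [1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024, 2048, 4096, 8192]

# Regen/s at 1024ms delay, copied from the tables above.
POW4_1024MS = [0.04, 0.04, 0.05, 0.06, 0.07, 0.08, 0.09,
               0.11, 0.13, 0.15, 0.17, 0.22, 0.26, 0.28]
LINEAR_1024MS = [0.05, 0.06, 0.07, 0.08, 0.10, 0.14, 0.19,
                 0.28, 0.46, 0.71, 1.29, 2.29, 4.39, 8.22]


def regen_ratios(old, new):
    """Element-wise ratio of old to new regeneration rates."""
    return [o / n for o, n in zip(old, new)]


for rate, ratio in zip(REQ_PER_SEC, regen_ratios(LINEAR_1024MS, POW4_1024MS)):
    print(f"{rate:>5} req/s: linear regenerates {ratio:.1f}x as often as pow4")
```

For this column the ratio grows from 1.25x at 1 req/s to roughly 29x at 8192 req/s, which is where the max contention also diverges most (198 versus 8).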

I tried a few different values for the coefficient of the generation time component (which is added to $lowTTL to make $effectiveLowTTL) for a 120s TTL key.
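
The arithmetic behind that adjustment can be sketched in Python as follows. This assumes the component is simply the coefficient multiplied by the key's generation time in seconds, and that the base lowTTL is 30s; the function name is hypothetical, and any capping against the key's TTL is left out because it is not described here.

```python
def effective_low_ttl(low_ttl: float, gen_time: float, coefficient: float) -> float:
    """Assumed form: base low TTL plus a generation-time component (seconds)."""
    return low_ttl + coefficient * gen_time


# Coefficients tried in this task, applied to a 4096ms regeneration.
for coefficient in (0, 16, 32):
    window = effective_low_ttl(30.0, 4.096, coefficient)
    print(f"coefficient={coefficient:>2}: effective low TTL = {window:.3f}s")
```

Under these assumptions, a coefficient of 16 widens the window for a 4096ms regeneration from 30s to about 95.5s, and a coefficient of 32 would push it past the 120s TTL of the test key.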

Results for 0:

isStateRefreshDuePow4DelayAwareTimely (ac="ave contention", mc="max contention", t="total time"):

| Req/s | Regen/s (16ms delay) | Regen/s (512ms delay) | Regen/s (1024ms delay) | Regen/s (4096ms delay) |
| --- | --- | --- | --- | --- |
| 1 (t=2000000) | 0.01 [ac=0.00, mc=2] | 0.01 [ac=0.01, mc=5] | 0.01 [ac=0.01, mc=5] | 0.02 [ac=0.06, mc=13] |
| 2 (t=1000000) | 0.01 [ac=0.00, mc=2] | 0.01 [ac=0.01, mc=5] | 0.01 [ac=0.01, mc=6] | 0.02 [ac=0.07, mc=13] |
| 4 (t=500000) | 0.01 [ac=0.00, mc=2] | 0.01 [ac=0.01, mc=6] | 0.01 [ac=0.01, mc=6] | 0.03 [ac=0.07, mc=15] |
| 8 (t=250000) | 0.01 [ac=0.00, mc=2] | 0.01 [ac=0.01, mc=4] | 0.01 [ac=0.01, mc=6] | 0.03 [ac=0.08, mc=17] |
| 16 (t=125000) | 0.01 [ac=0.00, mc=2] | 0.01 [ac=0.01, mc=5] | 0.01 [ac=0.01, mc=6] | 0.04 [ac=0.09, mc=17] |
| 32 (t=62500) | 0.01 [ac=0.00, mc=2] | 0.01 [ac=0.01, mc=5] | 0.02 [ac=0.01, mc=6] | 0.05 [ac=0.10, mc=17] |
| 64 (t=31250) | 0.01 [ac=0.00, mc=2] | 0.01 [ac=0.01, mc=4] | 0.02 [ac=0.01, mc=6] | 0.06 [ac=0.12, mc=24] |
| 128 (t=15625) | 0.01 [ac=0.00, mc=1] | 0.01 [ac=0.01, mc=5] | 0.02 [ac=0.01, mc=4] | 0.07 [ac=0.13, mc=22] |
| 256 (t=7812) | 0.01 [ac=0.00, mc=2] | 0.02 [ac=0.01, mc=5] | 0.02 [ac=0.01, mc=6] | 0.10 [ac=0.18, mc=29] |
| 512 (t=3906) | 0.01 [ac=0.00, mc=1] | 0.02 [ac=0.01, mc=5] | 0.02 [ac=0.01, mc=6] | 0.12 [ac=0.20, mc=29] |
| 1024 (t=1953) | 0.01 [ac=0.00, mc=1] | 0.02 [ac=0.01, mc=3] | 0.03 [ac=0.02, mc=6] | 0.18 [ac=0.27, mc=33] |
| 2048 (t=976) | 0.01 [ac=0.00, mc=1] | 0.02 [ac=0.01, mc=5] | 0.03 [ac=0.02, mc=5] | 0.25 [ac=0.36, mc=43] |
| 4096 (t=488) | 0.01 [ac=0.00, mc=1] | 0.02 [ac=0.01, mc=3] | 0.03 [ac=0.02, mc=5] | 0.27 [ac=0.41, mc=56] |
| 8192 (t=244) | 0.01 [ac=0.00, mc=1] | 0.02 [ac=0.01, mc=3] | 0.01 [ac=0.01, mc=2] | 0.21 [ac=0.25, mc=32] |

Results for 16:

isStateRefreshDuePow4DelayAwareTimely (ac="ave contention", mc="max contention", t="total time"):

| Req/s | Regen/s (16ms delay) | Regen/s (512ms delay) | Regen/s (1024ms delay) | Regen/s (4096ms delay) |
| --- | --- | --- | --- | --- |
| 1 (t=2000001) | 0.01 [ac=0.00, mc=2] | 0.01 [ac=0.01, mc=4] | 0.01 [ac=0.01, mc=5] | 0.02 [ac=0.07, mc=7] |
| 2 (t=1000000) | 0.01 [ac=0.00, mc=2] | 0.01 [ac=0.01, mc=4] | 0.01 [ac=0.01, mc=5] | 0.02 [ac=0.08, mc=8] |
| 4 (t=500000) | 0.01 [ac=0.00, mc=2] | 0.01 [ac=0.01, mc=4] | 0.01 [ac=0.01, mc=5] | 0.03 [ac=0.08, mc=9] |
| 8 (t=250000) | 0.01 [ac=0.00, mc=2] | 0.01 [ac=0.01, mc=4] | 0.01 [ac=0.01, mc=6] | 0.03 [ac=0.10, mc=9] |
| 16 (t=125000) | 0.01 [ac=0.00, mc=2] | 0.01 [ac=0.01, mc=6] | 0.01 [ac=0.01, mc=6] | 0.03 [ac=0.11, mc=9] |
| 32 (t=62500) | 0.01 [ac=0.00, mc=2] | 0.01 [ac=0.01, mc=5] | 0.02 [ac=0.01, mc=6] | 0.04 [ac=0.12, mc=11] |
| 64 (t=31250) | 0.01 [ac=0.00, mc=2] | 0.01 [ac=0.01, mc=4] | 0.02 [ac=0.01, mc=7] | 0.05 [ac=0.14, mc=8] |
| 128 (t=15625) | 0.01 [ac=0.00, mc=1] | 0.01 [ac=0.01, mc=4] | 0.02 [ac=0.02, mc=5] | 0.05 [ac=0.15, mc=9] |
| 256 (t=7812) | 0.01 [ac=0.00, mc=2] | 0.01 [ac=0.01, mc=3] | 0.02 [ac=0.02, mc=5] | 0.06 [ac=0.17, mc=10] |
| 512 (t=3906) | 0.01 [ac=0.00, mc=1] | 0.02 [ac=0.01, mc=4] | 0.02 [ac=0.02, mc=4] | 0.08 [ac=0.19, mc=9] |
| 1024 (t=1953) | 0.01 [ac=0.00, mc=1] | 0.01 [ac=0.01, mc=2] | 0.02 [ac=0.02, mc=4] | 0.09 [ac=0.23, mc=11] |
| 2048 (t=976) | 0.01 [ac=0.00, mc=1] | 0.01 [ac=0.01, mc=3] | 0.02 [ac=0.02, mc=7] | 0.14 [ac=0.32, mc=15] |
| 4096 (t=488) | 0.01 [ac=0.00, mc=1] | 0.01 [ac=0.01, mc=3] | 0.03 [ac=0.02, mc=4] | 0.15 [ac=0.31, mc=11] |
| 8192 (t=244) | 0.01 [ac=0.00, mc=1] | 0.01 [ac=0.00, mc=1] | 0.03 [ac=0.02, mc=3] | 0.35 [ac=0.59, mc=25] |

Results for 32:

isStateRefreshDuePow4DelayAwareTimely (ac="ave contention", mc="max contention", t="total time"):

| Req/s | Regen/s (16ms delay) | Regen/s (512ms delay) | Regen/s (1024ms delay) | Regen/s (4096ms delay) |
| --- | --- | --- | --- | --- |
| 1 (t=2000000) | 0.01 [ac=0.00, mc=2] | 0.01 [ac=0.01, mc=5] | 0.01 [ac=0.01, mc=5] | 0.02 [ac=0.08, mc=8] |
| 2 (t=1000000) | 0.01 [ac=0.00, mc=2] | 0.01 [ac=0.01, mc=4] | 0.01 [ac=0.01, mc=4] | 0.03 [ac=0.10, mc=8] |
| 4 (t=500000) | 0.01 [ac=0.00, mc=2] | 0.01 [ac=0.01, mc=4] | 0.01 [ac=0.01, mc=6] | 0.03 [ac=0.11, mc=8] |
| 8 (t=250000) | 0.01 [ac=0.00, mc=2] | 0.01 [ac=0.01, mc=5] | 0.02 [ac=0.01, mc=5] | 0.04 [ac=0.13, mc=7] |
| 16 (t=125000) | 0.01 [ac=0.00, mc=2] | 0.01 [ac=0.01, mc=5] | 0.02 [ac=0.01, mc=4] | 0.05 [ac=0.16, mc=8] |
| 32 (t=62500) | 0.01 [ac=0.00, mc=2] | 0.01 [ac=0.01, mc=4] | 0.02 [ac=0.02, mc=4] | 0.06 [ac=0.19, mc=9] |
| 64 (t=31250) | 0.01 [ac=0.00, mc=1] | 0.01 [ac=0.01, mc=4] | 0.02 [ac=0.02, mc=5] | 0.07 [ac=0.23, mc=8] |
| 128 (t=15625) | 0.01 [ac=0.00, mc=2] | 0.01 [ac=0.01, mc=4] | 0.02 [ac=0.02, mc=4] | 0.09 [ac=0.28, mc=8] |
| 256 (t=7812) | 0.01 [ac=0.00, mc=2] | 0.02 [ac=0.01, mc=4] | 0.02 [ac=0.02, mc=4] | 0.12 [ac=0.34, mc=10] |
| 512 (t=3906) | 0.01 [ac=0.00, mc=1] | 0.01 [ac=0.01, mc=2] | 0.02 [ac=0.02, mc=4] | 0.16 [ac=0.42, mc=10] |
| 1024 (t=1953) | 0.01 [ac=0.00, mc=1] | 0.02 [ac=0.01, mc=2] | 0.02 [ac=0.02, mc=3] | 0.22 [ac=0.53, mc=13] |
| 2048 (t=976) | 0.01 [ac=0.00, mc=1] | 0.02 [ac=0.01, mc=3] | 0.03 [ac=0.02, mc=5] | 0.23 [ac=0.61, mc=9] |
| 4096 (t=488) | 0.01 [ac=0.00, mc=2] | 0.02 [ac=0.01, mc=2] | 0.03 [ac=0.02, mc=5] | 0.35 [ac=0.78, mc=11] |
| 8192 (t=244) | 0.01 [ac=0.00, mc=1] | 0.02 [ac=0.01, mc=2] | 0.02 [ac=0.02, mc=3] | 0.44 [ac=0.95, mc=16] |

Just raising lowTTL from 30 to 60, with no generation time factor, yields:

isStateRefreshDuePow4DelayAwareTimely (ac="ave contention", mc="max contention", t="total time"):

| Req/s | Regen/s (16ms delay) | Regen/s (512ms delay) | Regen/s (1024ms delay) | Regen/s (4096ms delay) |
| --- | --- | --- | --- | --- |
| 1 (t=2000000) | 0.01 [ac=0.00, mc=2] | 0.01 [ac=0.01, mc=5] | 0.01 [ac=0.01, mc=6] | 0.02 [ac=0.06, mc=10] |
| 2 (t=1000000) | 0.01 [ac=0.00, mc=2] | 0.01 [ac=0.01, mc=4] | 0.01 [ac=0.01, mc=6] | 0.02 [ac=0.06, mc=10] |
| 4 (t=500000) | 0.01 [ac=0.00, mc=2] | 0.01 [ac=0.01, mc=4] | 0.01 [ac=0.01, mc=6] | 0.02 [ac=0.07, mc=10] |
| 8 (t=250000) | 0.01 [ac=0.00, mc=2] | 0.01 [ac=0.01, mc=4] | 0.01 [ac=0.01, mc=4] | 0.03 [ac=0.08, mc=12] |
| 16 (t=125000) | 0.01 [ac=0.00, mc=2] | 0.01 [ac=0.01, mc=4] | 0.02 [ac=0.01, mc=7] | 0.03 [ac=0.08, mc=10] |
| 32 (t=62500) | 0.01 [ac=0.00, mc=2] | 0.01 [ac=0.01, mc=4] | 0.02 [ac=0.01, mc=5] | 0.03 [ac=0.09, mc=18] |
| 64 (t=31250) | 0.01 [ac=0.00, mc=1] | 0.02 [ac=0.01, mc=3] | 0.02 [ac=0.02, mc=4] | 0.04 [ac=0.10, mc=13] |
| 128 (t=15625) | 0.01 [ac=0.00, mc=2] | 0.02 [ac=0.01, mc=4] | 0.02 [ac=0.02, mc=4] | 0.05 [ac=0.11, mc=12] |
| 256 (t=7812) | 0.01 [ac=0.00, mc=1] | 0.02 [ac=0.01, mc=3] | 0.02 [ac=0.02, mc=4] | 0.06 [ac=0.14, mc=15] |
| 512 (t=3906) | 0.01 [ac=0.00, mc=2] | 0.02 [ac=0.01, mc=2] | 0.02 [ac=0.02, mc=5] | 0.07 [ac=0.15, mc=16] |
| 1024 (t=1953) | 0.01 [ac=0.00, mc=2] | 0.02 [ac=0.01, mc=3] | 0.02 [ac=0.02, mc=4] | 0.08 [ac=0.17, mc=15] |
| 2048 (t=976) | 0.01 [ac=0.00, mc=1] | 0.02 [ac=0.01, mc=4] | 0.02 [ac=0.02, mc=3] | 0.11 [ac=0.22, mc=14] |
| 4096 (t=488) | 0.01 [ac=0.00, mc=1] | 0.02 [ac=0.01, mc=2] | 0.03 [ac=0.02, mc=4] | 0.14 [ac=0.23, mc=16] |
| 8192 (t=244) | 0.01 [ac=0.00, mc=1] | 0.02 [ac=0.01, mc=2] | 0.02 [ac=0.01, mc=2] | 0.08 [ac=0.14, mc=14] |

Change 898441 merged by jenkins-bot:

[mediawiki/core@master] objectcache: improve WANObjectCache::worthRefreshExpiring() scalability

https://gerrit.wikimedia.org/r/898441