ATM only originals are being replicated to codfw; we should also replicate thumbnail containers.
Description
Details
Status | Assigned | Task
---|---|---
Resolved | aaron | T88445 MediaWiki active/active datacenter investigation and work (tracking)
Resolved | aaron | T91869 Implement a replication strategy for Swift
Resolved | aaron | T125791 swiftrepl replication pass for thumbnails eqiad -> codfw
Event Timeline
swiftrepl is running on copper ATM; it copied ~1.4M files in an hour. In eqiad there are ~670M thumbs, so assuming a constant rate that's ~20d for a full copy. I'm also looking at copying the standard thumbnail sizes first, since those make up the majority of requests anyway.
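The ~20d figure follows directly from the two numbers quoted above. A minimal sketch of the arithmetic, assuming the copy rate stays constant (the function name is illustrative and not part of swiftrepl):

```python
def full_copy_days(total_objects: float, objects_per_hour: float) -> float:
    """Days needed to copy total_objects at a constant hourly rate."""
    hours = total_objects / objects_per_hour
    return hours / 24


# ~670M thumbs in eqiad, ~1.4M objects copied in the first hour
days = full_copy_days(670e6, 1.4e6)
print(f"{days:.1f} days")
```

In practice the rate is unlikely to stay constant (auth token expiry, container size, cluster load), so this is a rough lower bound rather than a schedule.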
object/bytes for thumbs in eqiad+codfw:
re: requested sizes, these are the results after ~1.5h of requests. edit: these are not the sizes requested by end users but the sizes involved in thumbnail flush. The sizes requested by end users are at https://phabricator.wikimedia.org/T125791#2003130
$ sort ~/thumbs_requests | sort | uniq -c | sort -nr | head -50
1125 240px
1076 2880px
850 180px
758 120px
644 800px
601 150px
370 1024px
359 256px
349 200px
349 1920px
338 300px
308 320px
305 1280px
290 220px
286 128px
281 1200px
273 100px
262 450px
260 80px
240 75px
230 135px
214 330px
211 440px
208 250px
197 267px
184 400px
179 2560px
163 600px
142 188px
139 1600px
136 877px
134 640px
122 96px
121 144px
117 90px
115 375px
106 480px
105 500px
95 360px
94 270px
93 48px
91 853px
91 67px
86 675px
83 140px
82 160px
79 125px
77 302px
76 720px
74 1599px
$ wc -l thumbs_requests
20214 thumbs_requests
similar distribution after ~150k requests. edit: see https://phabricator.wikimedia.org/T125791#2028898 for the (sampled) distribution of end-user requests
$ sort ~/thumbs_requests | sort | uniq -c | sort -nr | head -50
10148 240px
7539 180px
6571 2880px
6166 120px
4705 800px
3935 150px
3425 300px
3062 200px
2955 320px
2454 220px
2452 1024px
2251 1280px
2197 100px
2164 80px
2159 1920px
2074 1200px
2059 250px
2005 135px
1781 640px
1664 450px
1547 877px
1507 600px
1462 128px
1376 330px
1373 400px
1320 267px
1297 440px
1271 144px
1226 2560px
1192 96px
1189 360px
1156 75px
1140 720px
1011 1600px
984 90px
797 270px
779 160px
778 500px
748 50px
716 302px
712 480px
705 375px
654 60px
644 125px
627 853px
588 192px
572 140px
570 67px
563 678px
560 225px
$ wc -l ~/thumbs_requests
155047 /home/filippo/thumbs_requests
Change 269387 had a related patch set uploaded (by Filippo Giunchedi):
swiftrepl: name-based filter for objects
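To illustrate what a name-based filter could look like, here is a minimal sketch. It is not the code from the patch. It assumes thumbnail object names end in `/<width>px-<file name>` (for example `a/ab/Example.jpg/120px-Example.jpg`); names with extra prefixes such as page numbers are not handled, and `THUMB_WIDTH_RE` and `make_size_filter` are hypothetical names.

```python
import re

# Assumption: the last path component of a thumbnail name starts with "<width>px-".
THUMB_WIDTH_RE = re.compile(r"/(\d+)px-[^/]+$")


def make_size_filter(widths):
    """Return a predicate that accepts only object names with an allowed width."""
    allowed = {int(w) for w in widths}

    def keep(name: str) -> bool:
        match = THUMB_WIDTH_RE.search(name)
        return match is not None and int(match.group(1)) in allowed

    return keep


keep = make_size_filter([120, 220, 180])
names = [
    "a/ab/Example.jpg/120px-Example.jpg",
    "a/ab/Example.jpg/877px-Example.jpg",
    "a/ab/Example.jpg",  # no width component, so it is skipped
]
selected = [n for n in names if keep(n)]
print(selected)
```

A predicate like this can be applied while listing the source container, so that only objects of the chosen sizes are queued for copying.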
Mentioned in SAL [2016-02-12T13:01:28Z] <godog> restart thumbs swiftrepl, auth token expired T125791
after 56M thumbnail requests from ms-fe1001 the size distribution looks like this
$ sort ~/thumbs_requests | uniq -c | sort -nr | head -100
4385244 120px
3065097 220px
1969490 180px
1886039 240px
1853986 800px
1745935 440px
1665711 200px
1571025 720px
1540029 300px
1413915 320px
1411493 330px
1299094 250px
1145884 80px
1104180 640px
1089041 1024px
1062607 1280px
987829 150px
914954 100px
785784 500px
704360 600px
689219 400px
588255 144px
536918 90px
490428 375px
469687 450px
426055 1920px
410155 20px
398832 75px
369575 48px
353330 360px
340015 225px
320753 96px
319464 2880px
298162 280px
281487 160px
276432 64px
268781 270px
262787 130px
262376 60px
261131 170px
258721 135px
240388 260px
238059 50px
231692 1200px
225385 480px
214306 40px
182280 140px
171926 350px
167295 128px
155129 340px
142825 2560px
136440 255px
132809 560px
131267 230px
124329 256px
124311 512px
124180 420px
118168 30px
109164 72px
104952 2000px
104532 1600px
101976 520px
95179 70px
93135 110px
92394 192px
91821 210px
90642 1000px
90328 290px
85960 390px
85426 85px
83356 112px
83251 22px
81624 768px
80045 125px
77366 266px
76299 113px
75019 36px
74135 53px
73209 79px
71948 67px
68259 81px
67546 45px
66293 25px
65425 105px
65078 108px
64903 799px
62825 103px
62795 78px
62040 190px
61529 700px
60352 540px
60020 92px
59723 95px
59562 119px
58528 245px
57580 460px
57440 63px
57121 175px
54833 267px
54516 308px
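When choosing how many sizes to copy first, it helps to know what share of requests the top N sizes cover. The sketch below does the same counting as `sort | uniq -c | sort -nr | head -N` and adds that share. The function name and the sample data are made up for illustration; the real input would be one size per line from a file such as `thumbs_requests`.

```python
from collections import Counter


def top_sizes(sizes, n):
    """Return the n most common sizes and the fraction of requests they cover."""
    counts = Counter(sizes)
    total = sum(counts.values())
    top = counts.most_common(n)
    covered = sum(count for _, count in top)
    share = covered / total if total else 0.0
    return top, share


# Toy data, not taken from the real request log.
sample = ["120px"] * 5 + ["220px"] * 3 + ["180px"] * 2 + ["877px"]
top, share = top_sizes(sample, 2)
print(top, f"{share:.0%}")
```

With a real log, `sizes` could be built with `[line.strip() for line in open(path)]`.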
Mentioned in SAL [2016-02-16T10:50:25Z] <godog> start swiftrepl commons thumbs for top50 popular size T125791
Change 272455 had a related patch set uploaded (by Filippo Giunchedi):
swiftrepl: fix destination container listing limit
Change 272455 merged by Filippo Giunchedi:
swiftrepl: fix destination container listing limit
the initial copy is still ongoing, at 474M of 685M objects.
wrt the codfw switchover:
- we're ~70% of the way there
- assuming imagescalers are mostly cpu-bound, current cpu utilization averages 4% at ~80 rps
- swift requests peak at ~1200 rps
- when we switch to codfw, if every request for a missing thumbnail hits the imagescalers, that's 30% of 1200 rps = ~360 rps, IOW 4.5x the current load
- at 4.5x the current load that translates to ~20% cpu util on the imagescalers
- ditto for network, current utilization is ~8MB/s per machine or ~36MB/s per machine at 4.5x the current load
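The estimate in the list above can be reproduced with a few lines of arithmetic. This is a sketch under the same assumptions as the list: imagescaler CPU and network scale linearly with request rate, and every request for a not-yet-copied thumbnail reaches the imagescalers. The function and key names are illustrative.

```python
def switchover_estimate(missing_fraction, swift_peak_rps, scaler_rps,
                        scaler_cpu_pct, scaler_net_mb_s):
    """Scale current imagescaler load linearly to the expected request rate."""
    expected_rps = missing_fraction * swift_peak_rps
    load_factor = expected_rps / scaler_rps
    return {
        "scaler_rps": expected_rps,
        "load_factor": load_factor,
        "cpu_pct": scaler_cpu_pct * load_factor,
        "net_mb_s": scaler_net_mb_s * load_factor,
    }


# 30% of thumbs missing, 1200 rps swift peak, scalers at 80 rps, 4% cpu, 8 MB/s
estimate = switchover_estimate(0.30, 1200, 80, 4, 8)
for key, value in estimate.items():
    print(f"{key}: {value:.1f}")
```

Using the exact ratio (1 - 474/685, about 31% missing) instead of the rounded 30% gives slightly higher figures, but the conclusion is the same: the imagescalers would stay far below saturation.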