Spent a good part of the day on this in IRC, copying some relevant info here:
* dcausse started a completion suggester build this morning; it has been running all day (14 hours and still going)
* Choosing a random day, 20170312, all completion suggester indices built in 8 hours
** enwiki built 5.35M docs from 03:44:26 to 05:22:16
** about 98 minutes, so roughly 54.7k docs/minute
* ebernhardson took a 5 minute tcpdump of all traffic to/from terbium to codfw cluster hosts
** Bulk indexing requests are reporting an average of ~10ms
** Scroll requests are reporting an average of ~100ms, with peaks of ~200ms
** 3088 /_bulk requests over 5 minutes, average content size of 41kB. This is for all 4 wikis currently indexing
** Hard to filter packets to only one wiki, but assuming 1/4 that's 772 bulks per wiki, 77.2k docs per 5 minutes, 15.4k docs per minute per wiki
** enwiki is ~5.35M docs; at 15.4k docs/minute the build would take roughly 5.8 hours.
** 15.4k docs/minute, vs 54k docs/minute previously. Where did the time go?
** 772 bulks in 5 minutes per wiki is 2.57 round trips per second
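For anyone who wants to re-run the numbers above, here is a quick Python sketch of the arithmetic. It only uses figures from this comment; the even 1/4 split across wikis is the same assumption made above, and the variable names are mine.

```python
from datetime import datetime


def docs_per_minute(docs, start, end, fmt="%H:%M:%S"):
    """Average indexing rate between two same-day timestamps."""
    elapsed = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return docs / (elapsed.total_seconds() / 60)


ENWIKI_DOCS = 5_350_000

# Historical enwiki build on 20170312.
historical_rate = docs_per_minute(ENWIKI_DOCS, "03:44:26", "05:22:16")

# tcpdump sample: 3088 /_bulk requests in 5 minutes across 4 wikis.
BULKS = 3088
WIKIS = 4             # assumption: traffic is split evenly between wikis
DOCS_PER_BULK = 100   # current scroll + bulk batch size
CAPTURE_MINUTES = 5

bulks_per_wiki = BULKS / WIKIS
current_rate = bulks_per_wiki * DOCS_PER_BULK / CAPTURE_MINUTES
round_trips_per_second = bulks_per_wiki / (CAPTURE_MINUTES * 60)
enwiki_hours = ENWIKI_DOCS / current_rate / 60

print(f"historical: {historical_rate:,.0f} docs/minute")
print(f"current:    {current_rate:,.0f} docs/minute per wiki")
print(f"slowdown:   {historical_rate / current_rate:.1f}x")
print(f"round trips: {round_trips_per_second:.2f}/s per wiki")
print(f"enwiki at current rate: {enwiki_hours:.1f} hours")
```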
Possible solutions:
* Increase batch sizes. They are currently very small (only 100 docs per scroll + bulk), so the request/response cycle happens multiple times per second.
* Figure out what happened to make this so much slower...
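To put rough bounds on both ideas, here is a back-of-the-envelope sketch in Python. It compares the per-batch cycle time implied by the tcpdump with the measured scroll and bulk latencies, then projects what a larger batch might give. The linear cost model (a fixed part plus a part proportional to batch size) and the `fixed_fraction` parameter are my own simplifying assumptions, not something measured, and the batch size of 1000 is just an example value.

```python
DOCS_PER_BATCH = 100
CYCLES_PER_SECOND = 772 / 300   # bulks per wiki per second, from the capture
SCROLL_MS = 100                 # average scroll latency from the capture
BULK_MS = 10                    # average bulk latency from the capture

# Wall-clock time per scroll + bulk cycle implied by the observed rate.
cycle_ms = 1000 / CYCLES_PER_SECOND
# Time per cycle not explained by the two measured request latencies.
unaccounted_ms = cycle_ms - (SCROLL_MS + BULK_MS)

# Rate we would see if a cycle cost only the measured scroll + bulk latency.
latency_only_rate = DOCS_PER_BATCH / (SCROLL_MS + BULK_MS) * 60_000


def projected_docs_per_minute(batch_size, fixed_fraction):
    """Hypothetical model: a fraction of today's cycle time is fixed
    overhead, the rest scales linearly with the number of docs."""
    fixed_ms = cycle_ms * fixed_fraction
    per_doc_ms = cycle_ms * (1 - fixed_fraction) / DOCS_PER_BATCH
    new_cycle_ms = fixed_ms + per_doc_ms * batch_size
    return batch_size / new_cycle_ms * 60_000


print(f"cycle time:       {cycle_ms:.0f} ms")
print(f"unaccounted for:  {unaccounted_ms:.0f} ms per cycle")
print(f"latency-only rate: {latency_only_rate:,.0f} docs/minute")
# Bounds for a batch of 1000 docs: no gain if all cost is per-doc,
# 10x if all cost is fixed per-cycle overhead.
print(f"batch=1000 worst: {projected_docs_per_minute(1000, 0.0):,.0f} docs/minute")
print(f"batch=1000 best:  {projected_docs_per_minute(1000, 1.0):,.0f} docs/minute")
```

The latency-only figure comes out close to the historical 54k docs/minute, which may be a coincidence, but it suggests the time unaccounted for in each cycle is worth looking at first.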