Jun 7 2020
Hi Krinkle, I would like to check whether there is any update on this issue. To clarify the request: for the dataset, we are looking for the following fields: "timestamp, key, size of value, TTL, operation". Thank you!
Apr 18 2020
BTW, we can help contribute some log collection scripts if needed.
Hi Krinkle, this is what we are looking for. It would be great if we could have such a dataset, even if it is sampled. Thank you! Meanwhile, may I ask what kind of data is stored in the memcached cluster?
Dec 14 2019
I see, thank you again for explaining and providing the trace!
Dec 13 2019
Hi @lexnasser, thank you for the quick reply! This is helpful. To help me better understand the filtering: what does the is_pageview filter do? Which requests are not pageviews? Thank you!
Hi @Nuria, @lexnasser and everyone else, thank you for the datasets; they are a great asset to the research community!
I have the same question as Daniel: Grafana shows the text cache serving a request rate several times higher than the upload cache (https://grafana.wikimedia.org/d/000000450/varnish-traffic-instance-breakdown?orgId=1&from=now-7d&to=now&var-datasource=esams%20prometheus%2Fops&var-cache_type=upload&var-server=All&var-layer=frontend). However, the text cache is much smaller than the upload cache, which seems odd.
I understand Nuria's point that the upload cache servers have larger volumes because each page has several images, but there are also many pages that have no images at all.
Would it be possible to check this? Thank you!
Dec 11 2019
Dec 10 2019
Hi all, thank you for the quick reply! I have been reading the task thread @mforns pointed out and the 2016 task thread; they are very helpful! The MediaWiki sites also have so much useful information! (But it seems neither site is indexed by Google :( )
I think I don't need a new dataset at this time, so I am changing this to resolved, and I will post questions in the other thread. Thank you for the help! You guys are awesome!