Better response times and throughput for the Pageview API (scaling)
Description
Details
Project | Branch | Lines +/- | Subject
---|---|---|---
operations/puppet | production | +111 -0 | Add a new AQS testing environment to play with Cassandra settings before production. |
Event Timeline
I briefly looked at the Cassandra cluster, and the typical column family read latency (this is just the time Cassandra records for local storage-system reads) is quite high.
At least some component of this is the number of SSTables per read. This column family (apparently typical) hits 10 SSTables to satisfy reads at the 50th percentile, and a whopping 24 at the 99th. That's quite a lot, rotational disks notwithstanding.
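For reference, a quick way to pull these numbers is to scrape `nodetool cfhistograms`; below is a minimal Python sketch that does so, assuming the usual percentile-table layout. The keyspace/table names are placeholders, not the actual AQS schema.

```python
import subprocess

# Hypothetical names; substitute the real AQS keyspace/table.
KEYSPACE = "local_group_default_T_pageviews_per_article_flat"
TABLE = "data"

def sstables_per_read(keyspace, table):
    """Return {percentile: SSTables-per-read} parsed from nodetool output."""
    out = subprocess.check_output(
        ["nodetool", "cfhistograms", keyspace, table]).decode()
    stats = {}
    for line in out.splitlines():
        fields = line.split()
        # Percentile rows look like: "50%  10.00  73.46  1629.72  1597  42"
        if fields and fields[0].endswith("%"):
            stats[fields[0]] = float(fields[1])  # second column is SSTables
    return stats

if __name__ == "__main__":
    for pct, sstables in sorted(sstables_per_read(KEYSPACE, TABLE).items()):
        print("%s of reads touch <= %.0f SSTables" % (pct, sstables))
```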
I did an analysis of DateTieredCompactionStrategy in the RESTBase cluster over here, and found that out-of-order writes are flattening the date-based structure, leaving us with a single, all-encompassing tier. Since DTCS does STCS within tiers, we are effectively running STCS. The situation is similar for AQS.
It would seem that upgrades to SSDs are in the works, which should definitely help.
Other options to improve Cassandra read performance:
- TimeWindowCompactionStrategy (CASSANDRA-9666)
This compaction strategy is currently out-of-tree (though that will likely change); we'd need to build and deploy the jars ourselves. That said, it has seen surprisingly wide adoption, likely because its algorithm is much closer to what people expect of DTCS, and it is easier to reason about. For us, it should solve the problem of out-of-order writes.
- Lower node density (more Cassandra nodes / multi-instance)
This is likely going to be necessary at some point anyway. These nodes have 10TB of raw storage, and running single Cassandra instances this large is going to get painful (think repair, bootstrap, decommission, etc.). Adding more nodes decreases the per-node share of data, resulting in fewer (smaller) SSTables. Particularly if the compaction algorithm isn't behaving ideally, this will do a lot to reduce read latency.
With new nodes on the way, now is really the time to think about this. We could evaluate (1) by putting one of the new nodes in write-sampling mode with a local TWCS override. If it passes muster, then bootstrapping the new nodes with TWCS enabled would save a recompaction later. And moving to multi-instance (2) would be straightforward when adding the new nodes, far less so down the road.
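To make the override concrete, here is a minimal sketch of what a per-table TWCS switch could look like, applied through the Python cassandra-driver. The fully qualified class name (from the out-of-tree jar), the window settings, the keyspace/table, and the contact host are all assumptions for illustration, not the values we would deploy.

```python
from cassandra.cluster import Cluster

# All names below (class, window size, keyspace, host) are assumptions.
ALTER_TWCS = """
ALTER TABLE "local_group_default_T_pageviews_per_article_flat".data
WITH compaction = {
  'class': 'com.jeffjirsa.cassandra.db.compaction.TimeWindowCompactionStrategy',
  'compaction_window_unit': 'DAYS',
  'compaction_window_size': '1'
}
"""

cluster = Cluster(["aqs-test1001.eqiad.wmnet"])  # hypothetical test node
session = cluster.connect()
session.execute(ALTER_TWCS)  # only affects this table's compaction settings
cluster.shutdown()
```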
Happy to help with all this; Let me know!
Removing the Pageview-API tag to prevent automatic Analytics tagging (this belongs to Analytics-Kanban now).
@Nemo_bis: there are many issues; the two main ones are: 1) we need new hardware, and 2) we need to bound queries to a date range that we can return without running into timeouts. https://phabricator.wikimedia.org/T134524
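As an illustration of point 2, a minimal sketch of the kind of date-range bound meant here (the helper name, the maximum span, and the error type are assumptions, not the actual AQS implementation):

```python
from datetime import datetime, timedelta

MAX_SPAN = timedelta(days=365)  # assumed cap; the real bound is still being discussed

class QueryTooWideError(ValueError):
    """Raised when a request would scan more data than we can return in time."""

def bounded_range(start_str, end_str, fmt="%Y%m%d"):
    """Parse and validate a date range before it reaches Cassandra."""
    start = datetime.strptime(start_str, fmt)
    end = datetime.strptime(end_str, fmt)
    if end < start:
        raise ValueError("end date precedes start date")
    if end - start > MAX_SPAN:
        # Fail fast with a controlled error instead of letting Cassandra time out.
        raise QueryTooWideError(
            "requested %d days, maximum is %d" % ((end - start).days, MAX_SPAN.days))
    return start, end

# bounded_range("20150701", "20160501") is accepted;
# bounded_range("20100101", "20160501") raises QueryTooWideError.
```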
Change 288373 had a related patch set uploaded (by Elukey):
Add a new AQS testing environment to play with Cassandra settings before production.
I wanted to speak up and say that I get the throttling notice. I need to use tools to make article traffic queries. It is not a problem for me to wait 15 seconds but when I see an alert like this, it makes me a little afraid that the service is not watched and could go down entirely.
He's speaking of the notice at https://tools.wmflabs.org/massviews (or Langviews); both make batch requests, the former up to 500.
Don't worry @Bluerasberry, it's not going to go down. The throttling I added should ensure you get all the data you need, and the notice is there just to keep users informed of the issue, and of the fact that the kind folks on the Analytics team are working on it :)
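For anyone curious what the client-side throttling amounts to, here is a rough Python sketch in the same spirit: space out the batch of per-article requests and back off when the API signals overload. The endpoint template, delays, and retry policy are illustrative assumptions, not Massviews' actual code.

```python
import time
import requests

# Public per-article pageviews endpoint (daily granularity, all-access/user).
API = ("https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article/"
       "{project}/all-access/user/{article}/daily/{start}/{end}")

def fetch_batch(project, articles, start, end, delay=0.1, max_retries=3):
    """Fetch pageviews for many articles, pausing between requests."""
    results = {}
    for article in articles:
        url = API.format(project=project, article=article, start=start, end=end)
        for attempt in range(max_retries):
            resp = requests.get(url)
            if resp.status_code in (429, 503):
                # Back off and retry instead of hammering an overloaded cluster.
                time.sleep(delay * (2 ** attempt))
                continue
            resp.raise_for_status()
            results[article] = resp.json()
            break
        time.sleep(delay)  # throttle between articles
    return results
```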
@Nemo_bis: the system sends a 500 when it doesn't work, not when there is a controlled error condition.
Change 288373 merged by Elukey:
Add a new AQS testing environment to play with Cassandra settings before production.
> the system sends a 500 when it doesn't work, not when there is a controlled error condition
Exceeding capacity seems a standard condition to take into account, and certainly one that happens frequently for this service (and any service that queries so much data, I guess).
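To illustrate the distinction being argued here (this is not the actual AQS/RESTBase code, which is a Node.js service): a controlled capacity error can be mapped to a 503 with Retry-After instead of surfacing as a bare 500. A hypothetical Python sketch using the cassandra-driver exceptions:

```python
from cassandra import OperationTimedOut, ReadTimeout

def pageviews_response(session, statement, params, retry_after=15):
    """Return (status, headers, body) for a pageview query."""
    try:
        rows = session.execute(statement, params)
        return 200, {}, [row._asdict() for row in rows]
    except (ReadTimeout, OperationTimedOut):
        # The cluster is overloaded or the query scans too much data: a known,
        # expected condition, so answer 503 + Retry-After rather than a bare 500.
        return 503, {"Retry-After": str(retry_after)}, {
            "title": "Service temporarily overloaded",
            "detail": "Narrow the date range or retry later.",
        }
```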
Summary of the last months of work:
- The new AQS cluster has been built and is up and running. All the technical details have been collected in https://wikitech.wikimedia.org/wiki/User:Elukey/Ops/AQS_Settings (still to be completed, but almost done). All the puppet work is completed.
- The new cluster has been load tested by Joseph in https://etherpad.wikimedia.org/p/analytics-aqs-cassandra and we obtained good performance compared to the current cluster. We are striving for at least one order of magnitude more req/s on the new cluster, and for the moment it seems that we are going to meet that expectation.
- We decided to go for the safest migration path, namely creating and loading data to the new AQS cluster without leveraging Cassandra cluster expansion features. This will allow us to have both the new and old AQS clusters up and running, completely independent from each other, at migration time, in order to ensure a clean and smooth rollback via confd/pybal/LVS in case an unexpected regression appears.
- One major problem that we have been working on (Joseph leading the efforts again) is bulk loading the new cluster with one year of pageview data. Cassandra offers a new experimental feature to create SSTables in Hadoop and stream them directly to the cluster. Tests revealed that pure loading time is cut by a factor of 5, but SSTable compaction takes more time than expected. Joseph is currently working on it to figure out whether it is possible to improve the compaction step.
Is this an apples-to-apples comparison using the same compaction strategy? In other words (as I understand it), you are using leveled compaction now; does the sstableloader method create greater compaction load than the previous method did with leveled compaction?
@Eevans: It is indeed using the same compaction strategy (Leveled). I assume the difference is so big because of pre-sorting while loading, or something like that. I have not managed to find a parameter tuning the size of the SSTables generated by the loaders, and I have the feeling they are too small (~10MB) and therefore incur the extra compaction, but I could be wrong.
After a week of almost no progress compacting a month of bulk-loaded data, we wiped the cluster.
We are eager to use the new cluster in production, and loading takes time, so we decided to go with CQL loading even if it's more expensive at load time (it seems much cheaper at compaction time, though).
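For context, a minimal sketch of what the CQL loading path looks like with the Python cassandra-driver (prepared statement plus concurrent execution). The contact point, keyspace, table, and column names are assumptions for illustration, not the actual AQS schema or loader:

```python
from cassandra.cluster import Cluster
from cassandra.concurrent import execute_concurrent_with_args

cluster = Cluster(["aqs1004-a.eqiad.wmnet"])  # hypothetical contact point
session = cluster.connect()

# Illustrative schema: the real AQS table has different columns.
insert = session.prepare(
    'INSERT INTO "local_group_default_T_pageviews_per_article_flat".data '
    "(project, article, timestamp, views) VALUES (?, ?, ?, ?)")

def load(rows, concurrency=100):
    """Write (project, article, timestamp, views) tuples through plain CQL.

    Prepared statement + concurrent execution keeps the per-row cost down while
    still going through the normal write path (memtables, regular compaction)
    instead of streaming pre-built SSTables.
    """
    results = execute_concurrent_with_args(
        session, insert, rows, concurrency=concurrency,
        raise_on_first_error=False)
    return sum(1 for ok, _ in results if ok)  # number of successful writes
```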
Status update:
We loaded four months of data to the cluster and fixed some misconfigurations (like SSTable compression), but after a review of the disk configuration with the Operations team, we decided to move the Cassandra instances' disk partitions to RAID10 arrays (rather than RAID0). This choice will allow us to be more resilient to disk/host failures in the longer term, but of course it will delay the cluster's data backfill by some days.
We have also reached out to the Analytics community to learn which use cases would require more than the past 18 months of data. This will allow us to make a better prediction of how much disk space we'll need during the next months to satisfy our users' needs.
The plan is to re-image one host at a time (so at most two Cassandra instances down at the same time) and then wait for Cassandra recovery/bootstrapping to bring the new instances up to speed again. From our calculations this work should be finished early next week, but I'll keep this task updated.
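A small sketch of how one might wait for the re-imaged instances to rejoin before moving to the next host: poll `nodetool status` until every node reports Up/Normal. The parsing assumes the usual status-table layout; it is an illustration, not the actual procedure we use.

```python
import subprocess
import time

def all_up_normal(host="localhost"):
    """True when every node in `nodetool status` reports state UN (Up/Normal)."""
    out = subprocess.check_output(["nodetool", "-h", host, "status"]).decode()
    states = []
    for line in out.splitlines():
        fields = line.split()
        # Node rows start with a two-letter state: U/D (up/down) + N/L/J/M.
        if fields and len(fields[0]) == 2 and fields[0][0] in "UD" and fields[0][1] in "NLJM":
            states.append(fields[0])
    return bool(states) and all(state == "UN" for state in states)

while not all_up_normal():
    print("cluster still recovering, checking again in 5 minutes")
    time.sleep(300)
print("all Cassandra instances are Up/Normal")
```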
Status update:
We have re-imaged all the Cassandra hosts (aqs100[456]) with the new RAID settings and started the data load. We are currently very close to finishing the load of 14 months of data, optimistically by the end of this week. We have also improved user management, moving away from the default cassandra user for everything (T142073), and investigated whether anything could be done to improve the performance of the current cluster while waiting for the new one (T143873).
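As an aside on the T142073 work, a minimal sketch of the kind of role setup involved (role name, password placeholder, host, and grants are all assumptions, not the actual configuration):

```python
from cassandra.auth import PlainTextAuthProvider
from cassandra.cluster import Cluster

# Hypothetical host and credentials.
auth = PlainTextAuthProvider(username="cassandra", password="REDACTED")
cluster = Cluster(["aqs1004-a.eqiad.wmnet"], auth_provider=auth)
session = cluster.connect()

# Dedicated role for the AQS service instead of the default superuser.
session.execute(
    "CREATE ROLE IF NOT EXISTS aqs WITH PASSWORD = 'REDACTED' AND LOGIN = true")
session.execute(
    'GRANT ALL PERMISSIONS ON KEYSPACE '
    '"local_group_default_T_pageviews_per_article_flat" TO aqs')
# Once the services use the new role, the default 'cassandra' superuser
# can be disabled (LOGIN = false) from another superuser account.
cluster.shutdown()
```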
Next steps:
- complete the data load (optimistically end of this week)
- consistency checks on the loaded data and the Cassandra/RESTBase settings (will require a couple of days max)
- final load tests to make sure that the cluster will be able to correctly sustain the current traffic (will require 3-4 days max); see the sketch after this list
- Cluster switch: we'll add the new nodes behind the AQS LVS load balancer, leaving the current ones running, and then we'll decommission the old hosts if nothing major appears during the following days.
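A rough sketch of the kind of check meant by the "final load tests" bullet above: replay a set of pageview API queries concurrently and report throughput and latency percentiles. Concurrency, timeouts, and the URL list are assumptions for illustration.

```python
import time
from concurrent.futures import ThreadPoolExecutor

import requests

def measure(urls, concurrency=50):
    """Replay a list of pageview API URLs and report throughput and latency."""
    latencies = []

    def hit(url):
        t0 = time.time()
        try:
            requests.get(url, timeout=10)
        finally:
            latencies.append(time.time() - t0)

    start = time.time()
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        list(pool.map(hit, urls))
    elapsed = time.time() - start

    latencies.sort()
    return {
        "req_per_s": len(urls) / elapsed,
        "p50_ms": 1000 * latencies[len(latencies) // 2],
        "p99_ms": 1000 * latencies[int(len(latencies) * 0.99)],
    }
```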
Given the above timeline we should be able to make the switch before the end of the quarter.
Change 310831 had a related patch set uploaded (by Elukey):
Add the new aqs nodes to conftool-data