
Raw webrequest partitions for 2014-12-07T20/2H not marked successful
Closed, Declined | Public

Description

The raw bits webrequest partitions *) for 2014-12-07T20/2H have not
been marked successful.

What happened?

*)

_________________________________________________________________
qchris@stat1002 // jobs: 0 // time: 14:34:32 // exit code: 0
cwd: ~/cluster-scripts
~/cluster-scripts/dump_webrequest_status.sh

  +------------------+--------+--------+--------+--------+
  | Date             |  bits  | mobile |  text  | upload |
  +------------------+--------+--------+--------+--------+
[...]
  | 2014-12-07T18/1H |    X   |    .   |    .   |    .   |
  | 2014-12-07T19/1H |    X   |    .   |    .   |    .   |
  | 2014-12-07T20/1H |    X   |    .   |    .   |    .   |
  | 2014-12-07T21/1H |    X   |    .   |    .   |    .   |
  | 2014-12-07T22/1H |    .   |    .   |    .   |    .   |
  | 2014-12-07T23/1H |    .   |    .   |    .   |    .   |
[...]
  +------------------+--------+--------+--------+--------+


Statuses:

  . --> Partition is ok
  M --> Partition manually marked ok
  X --> Partition is not ok (duplicates, missing, or nulls)

Event Timeline

QChris raised the priority of this task to Medium.
QChris updated the task description.
QChris added a project: Analytics-Clusters.
QChris changed Security from none to None.
QChris added subscribers: Unknown Object (MLST), kevinator, QChris and 2 others.

2014-12-07T18/1H is handled in https://phabricator.wikimedia.org/T77022
2014-12-07T19/1H is handled in https://phabricator.wikimedia.org/T77023

It affects only esams bits caches, but all of them.

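  +----------------------------+---------------------+---------------------+
  | Host                       | Start of issue      | End of issue        |
  +----------------------------+---------------------+---------------------+
  | cp3019.esams.wikimedia.org | 2014-12-07T20:29:08 | 2014-12-07T21:20:13 |
  | cp3020.esams.wikimedia.org | 2014-12-07T20:30:55 | 2014-12-07T21:04:39 |
  | cp3021.esams.wikimedia.org | 2014-12-07T20:30:05 | 2014-12-07T21:10:30 |
  | cp3022.esams.wikimedia.org | 2014-12-07T20:30:23 | 2014-12-07T20:57:31 |
  +----------------------------+---------------------+---------------------+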
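
For anyone who wants to see how long each host was affected, here is a minimal Python sketch that derives per-host durations and the overall window from the host table above. The names `ISSUE_WINDOWS`, `issue_durations` and `overall_window` are made up for illustration and are not part of any cluster script.

```python
from datetime import datetime

# Start and end of the issue per host, copied from the host table above.
ISSUE_WINDOWS = {
    "cp3019.esams.wikimedia.org": ("2014-12-07T20:29:08", "2014-12-07T21:20:13"),
    "cp3020.esams.wikimedia.org": ("2014-12-07T20:30:55", "2014-12-07T21:04:39"),
    "cp3021.esams.wikimedia.org": ("2014-12-07T20:30:05", "2014-12-07T21:10:30"),
    "cp3022.esams.wikimedia.org": ("2014-12-07T20:30:23", "2014-12-07T20:57:31"),
}

TIMESTAMP_FORMAT = "%Y-%m-%dT%H:%M:%S"


def parse(timestamp):
    """Parse a timestamp in the format used in the table."""
    return datetime.strptime(timestamp, TIMESTAMP_FORMAT)


def issue_durations(windows):
    """Return {host: timedelta} with the length of each host's issue."""
    return {host: parse(end) - parse(start) for host, (start, end) in windows.items()}


def overall_window(windows):
    """Return (earliest start, latest end) across all hosts."""
    starts = [parse(start) for start, _ in windows.values()]
    ends = [parse(end) for _, end in windows.values()]
    return min(starts), max(ends)


if __name__ == "__main__":
    for host, duration in issue_durations(ISSUE_WINDOWS).items():
        print(host, duration)
    first_start, last_end = overall_window(ISSUE_WINDOWS)
    print("overall:", first_start.isoformat(), "to", last_end.isoformat())
```

The earliest start and latest end both come from cp3019, which was affected the longest of the four hosts.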

Between 2014-12-07T20:29:08 and 2014-12-07T21:20:14, the esams bits caches have:

  • ~6M duplicates (worth ~4 minutes of overall bits traffic), and
  • ~19M missing lines (worth ~10 minutes of overall bits traffic).
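
As a rough illustration of how duplicate and missing counts like these can be derived, the sketch below assumes each host stamps its log lines with a monotonically increasing sequence number. That assumption, the function name `sequence_stats`, and the sample numbers are all hypothetical; this task does not show the actual check.

```python
def sequence_stats(sequence_numbers):
    """Count duplicates and gaps in one host's sequence numbers.

    Assumes (hypothetically) that a host emits consecutive integers,
    one per log line, so any repeat is a duplicate and any hole
    between the smallest and largest value is a missing line.
    """
    seqs = list(sequence_numbers)
    if not seqs:
        return {"expected": 0, "duplicates": 0, "missing": 0}
    distinct = set(seqs)
    expected = max(distinct) - min(distinct) + 1
    return {
        "expected": expected,
        "duplicates": len(seqs) - len(distinct),
        "missing": expected - len(distinct),
    }


if __name__ == "__main__":
    # Made-up sample: 2 arrives twice, 4 and 5 never arrive.
    print(sequence_stats([1, 2, 2, 3, 6]))
```

A check along these lines only sees holes between the first and last line received, so lines lost at the very start or end of a partition would need a comparison against the neighbouring partitions.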

Hm. There is now an extra varnishkafka instance (statsv) running on all bits servers. I wonder whether the recent increase in bits message loss is related.