
Raw webrequest partitions for 2014-09-28T04:xx:xx not marked successful
Closed, Resolved (Public)

Description

For the hour 2014-09-28T04:xx:xx, none [1] of the four sources'
buckets was marked successful.

What happened?

[1]


qchris@stat1002 jobs: 0 time: 15:39:05 // exit code: 0
cwd: ~
cluster-scripts/dump_webrequest_status.sh

+---------------------+--------+--------+--------+--------+
| Date                |  bits  |  text  | mobile | upload |
+---------------------+--------+--------+--------+--------+

[...]

| 2014-09-28T02:xx:xx |    .   |    .   |    .   |    .   |    
| 2014-09-28T03:xx:xx |    .   |    .   |    .   |    .   |    
| 2014-09-28T04:xx:xx |    X   |    X   |    X   |    X   |    
| 2014-09-28T05:xx:xx |    .   |    .   |    .   |    .   |    
| 2014-09-28T06:xx:xx |    .   |    .   |    .   |    .   |

[...]

+---------------------+--------+--------+--------+--------+

Statuses:

. --> Partition is ok
X --> Partition is not ok (duplicates, missing, or nulls)
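
For anyone who wants to pull the failed hours out of a status table like the one above, here is a minimal sketch. It assumes the layout shown in this task (a "Date" header row followed by one row per hour, with `.` and `X` as the only statuses of interest); the function name `failed_hours` is hypothetical and not part of cluster-scripts.

```python
def failed_hours(table_text):
    """Return {hour: [sources marked X]} for every row with at least one X."""
    sources = None
    failed = {}
    for line in table_text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip borders, "[...]" markers, and blank lines
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if cells[0] == "Date":
            sources = cells[1:]  # e.g. bits, text, mobile, upload
            continue
        if sources is None:
            continue  # data row before any header; ignore it
        bad = [src for src, status in zip(sources, cells[1:]) if status == "X"]
        if bad:
            failed[cells[0]] = bad
    return failed
```

Fed the rows shown above, this would report only 2014-09-28T04:xx:xx, with all four sources listed.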

Version: unspecified
Severity: normal

Details

Reference
bz71425

Event Timeline

bzimport raised the priority of this task from to Needs Triage. Nov 22 2014, 3:48 AM
bzimport set Reference to bz71425.
bzimport added a subscriber: Unknown Object (MLST).

Between 04:52:27 and 04:53:01 (about 35 seconds), a total of ~200K
log lines were dropped.
The drop happened across all datacenters and all webrequest_sources.
No duplicates.

The affected time period closely matches a ZooKeeper timeout on
analytics1021.
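
To illustrate how "missing" and "duplicates" can be told apart, here is a rough sketch. It assumes each sending host stamps its log lines with a sequence number that increases by one per line; that scheme and the name `sequence_stats` are assumptions for illustration, not something stated in this task.

```python
from collections import Counter


def sequence_stats(seqs):
    """Count missing and duplicate sequence numbers for a single host.

    Assumes sequence numbers increase by 1 per log line, so any gap
    between the smallest and largest value seen means dropped lines.
    """
    counts = Counter(seqs)
    if not counts:
        return {"missing": 0, "duplicates": 0}
    expected = max(counts) - min(counts) + 1  # lines we should have seen
    missing = expected - len(counts)          # gaps in the range
    duplicates = sum(n - 1 for n in counts.values())  # extra copies
    return {"missing": missing, "duplicates": duplicates}
```

A drop like the one described here would show up as a non-zero `missing` with `duplicates` at 0. Lines lost at the very start or end of the inspected range would not be visible to this check.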