
Some raw webrequest partitions for 2014-12-10T14/2H not marked successful
Closed, Resolved (Public)

Description

5 of the webrequest partitions *) for 2014-12-10T14/2H have not been
marked successful.

What happened?

*)

_________________________________________________________________
qchris@stat1002 // jobs: 0 // time: 15:23:10 // exit code: 0
cwd: ~
~/cluster-scripts/dump_webrequest_status.sh 600
  +------------------+--------+--------+--------+--------+
  | Date             |  bits  | mobile |  text  | upload |
  +------------------+--------+--------+--------+--------+
[...]
  | 2014-12-10T12/1H |    .   |    .   |    .   |    .   |
  | 2014-12-10T13/1H |    .   |    .   |    .   |    .   |
  | 2014-12-10T14/1H |    X   |    X   |    X   |    X   |
  | 2014-12-10T15/1H |    .   |    X   |    .   |    .   |
  | 2014-12-10T16/1H |    .   |    .   |    .   |    .   |
  | 2014-12-10T17/1H |    .   |    .   |    .   |    .   |
[...]
  +------------------+--------+--------+--------+--------+


Statuses:

  . --> Partition is ok
  M --> Partition manually marked ok
  X --> Partition is not ok (duplicates, missing, or nulls)
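As a rough cross-check of the count in the description, the status table can be tallied mechanically. The following is a minimal Python sketch, not part of `dump_webrequest_status.sh`; the function name `failed_partitions` and the parsing approach are assumptions made for illustration, and the input is the table excerpt as pasted above.

```python
# Column order as shown in the header of the status table.
COLUMNS = ["bits", "mobile", "text", "upload"]

TABLE = """
  +------------------+--------+--------+--------+--------+
  | Date             |  bits  | mobile |  text  | upload |
  +------------------+--------+--------+--------+--------+
[...]
  | 2014-12-10T12/1H |    .   |    .   |    .   |    .   |
  | 2014-12-10T13/1H |    .   |    .   |    .   |    .   |
  | 2014-12-10T14/1H |    X   |    X   |    X   |    X   |
  | 2014-12-10T15/1H |    .   |    X   |    .   |    .   |
  | 2014-12-10T16/1H |    .   |    .   |    .   |    .   |
  | 2014-12-10T17/1H |    .   |    .   |    .   |    .   |
[...]
  +------------------+--------+--------+--------+--------+
"""


def failed_partitions(table_text):
    """Return (date, source) pairs for every cell marked 'X'."""
    failed = []
    for line in table_text.splitlines():
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        # Skip borders, the "[...]" elisions, and the header row.
        if len(cells) != 5 or cells[0] == "Date":
            continue
        date, statuses = cells[0], cells[1:]
        for source, status in zip(COLUMNS, statuses):
            if status == "X":
                failed.append((date, source))
    return failed


failed = failed_partitions(TABLE)
for date, source in failed:
    print(date, source)
print(len(failed), "partitions not ok")
```

For the excerpt above this lists the four 2014-12-10T14/1H partitions plus mobile for 2014-12-10T15/1H, five in total.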

Event Timeline

QChris raised the priority of this task to Needs Triage.
QChris updated the task description.
QChris added subscribers: Unknown Object (MLST), kevinator, QChris and 3 others.

Analytics1021 was dropped from its partition leader role around 2014-12-10T14:18 (which caused the loss of ~1 second's worth of traffic).
Ottomata ran the leader reelection at 2014-12-10T15:27 (which caused the loss of <<1 second's worth of traffic).

The broker logs are given below, followed by stats about the impact
on the partitions.


Here are the logs from analytics1022:

[2014-12-10 14:18:56,028] 11034220240 [ZkClient-EventThread-14-analytics1023.eqiad.wmnet,analytics1024.eqiad.wmnet,analytics1025.eqiad.wmnet/kafka/eqiad] INFO  kafka.controller.ReplicaStateMachine$BrokerChangeListener  - [BrokerChangeListener on Controller 22]: Broker change listener fired for path /brokers/ids with children 22,18,12
[2014-12-10 14:18:56,043] 11034220255 [ZkClient-EventThread-14-analytics1023.eqiad.wmnet,analytics1024.eqiad.wmnet,analytics1025.eqiad.wmnet/kafka/eqiad] INFO  kafka.controller.ReplicaStateMachine$BrokerChangeListener  - [BrokerChangeListener on Controller 22]: Newly added brokers: , deleted brokers: 21, all live brokers: 22,18,12

and ~2 seconds later:

[2014-12-10 14:18:57,965] 11034222177 [ZkClient-EventThread-14-analytics1023.eqiad.wmnet,analytics1024.eqiad.wmnet,analytics1025.eqiad.wmnet/kafka/eqiad] INFO  kafka.controller.ReplicaStateMachine$BrokerChangeListener  - [BrokerChangeListener on Controller 22]: Newly added brokers: 21, deleted brokers: , all live brokers: 21,22,18,12
[2014-12-10 14:18:58,002] 11034222214 [ZkClient-EventThread-14-analytics1023.eqiad.wmnet,analytics1024.eqiad.wmnet,analytics1025.eqiad.wmnet/kafka/eqiad] INFO  kafka.controller.RequestSendThread  - [Controller-22-to-broker-21-send-thread], Controller 22 connected to id:21,host:analytics1021.eqiad.wmnet,port:9092 for sending state change requests

and for the reelection:

[2014-12-10 15:27:30,607] 11038334819 [ZkClient-EventThread-14-analytics1023.eqiad.wmnet,analytics1024.eqiad.wmnet,analytics1025.eqiad.wmnet/kafka/eqiad] INFO  kafka.controller.KafkaController  - [Controller 22]: Starting preferred replica leader election ...
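The "~2 seconds" between broker 21 disappearing from the live broker list and reappearing in it can be read off the timestamps of the two `Newly added brokers` lines. Below is a small Python sketch, assuming the log format shown above; the regular expression and the helper name `parse_broker_change` are illustrative, and the sample lines are abbreviated copies of the log lines above (the thread name is elided with `...`).

```python
import re
from datetime import datetime

# Matches the BrokerChangeListener summary lines of the controller log.
CHANGE_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\].*Newly added brokers: (?P<added>[\d,]*), "
    r"deleted brokers: (?P<deleted>[\d,]*), "
    r"all live brokers: (?P<live>[\d,]*)"
)


def _ids(field):
    """Turn '21,22' into {21, 22}; an empty field gives an empty set."""
    return {int(x) for x in field.split(",") if x}


def parse_broker_change(line):
    """Parse one broker change line, or return None if it does not match."""
    m = CHANGE_RE.search(line)
    if m is None:
        return None
    return {
        "time": datetime.strptime(m.group("ts"), "%Y-%m-%d %H:%M:%S,%f"),
        "added": _ids(m.group("added")),
        "deleted": _ids(m.group("deleted")),
        "live": _ids(m.group("live")),
    }


# Abbreviated copies of the two log lines (thread name elided).
LINES = [
    "[2014-12-10 14:18:56,043] ... [BrokerChangeListener on Controller 22]: "
    "Newly added brokers: , deleted brokers: 21, all live brokers: 22,18,12",
    "[2014-12-10 14:18:57,965] ... [BrokerChangeListener on Controller 22]: "
    "Newly added brokers: 21, deleted brokers: , all live brokers: 21,22,18,12",
]

events = [parse_broker_change(line) for line in LINES]
dropped = next(e for e in events if 21 in e["deleted"])
rejoined = next(e for e in events if 21 in e["added"])
gap = (rejoined["time"] - dropped["time"]).total_seconds()
print(f"broker 21 absent from the live list for {gap:.3f} s")
```

This only measures how long the controller saw broker 21 as gone; it says nothing by itself about how many requests were lost.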

  • bits

Affected period: 2014-12-10T14:18:41 -- 2014-12-10T14:18:46
Duplicates: 0
Missing: 5051

+-----------------------------+---------------------+---------------------+
| Host                        | Start of issue      | End of issue        |
+-----------------------------+---------------------+---------------------+
| cp1056.eqiad.wmnet          | 2014-12-10T14:18:42 | 2014-12-10T14:18:44 |
| cp1057.eqiad.wmnet          | 2014-12-10T14:18:43 | 2014-12-10T14:18:45 |
| cp1069.eqiad.wmnet          | 2014-12-10T14:18:43 | 2014-12-10T14:18:44 |
| cp1070.eqiad.wmnet          | 2014-12-10T14:18:42 | 2014-12-10T14:18:44 |
| cp3019.esams.wikimedia.org  | 2014-12-10T14:18:41 | 2014-12-10T14:18:42 |
| cp3020.esams.wikimedia.org  | 2014-12-10T14:18:42 | 2014-12-10T14:18:42 |
| cp3021.esams.wikimedia.org  | 2014-12-10T14:18:42 | 2014-12-10T14:18:43 |
| cp3022.esams.wikimedia.org  | 2014-12-10T14:18:41 | 2014-12-10T14:18:42 |
| cp4001.ulsfo.wmnet          | 2014-12-10T14:18:43 | 2014-12-10T14:18:45 |
| cp4002.ulsfo.wmnet          | 2014-12-10T14:18:43 | 2014-12-10T14:18:45 |
| cp4003.ulsfo.wmnet          | 2014-12-10T14:18:44 | 2014-12-10T14:18:46 |
| cp4004.ulsfo.wmnet          | 2014-12-10T14:18:43 | 2014-12-10T14:18:45 |
+-----------------------------+---------------------+---------------------+

(cp3022.esams.wikimedia.org had 252 duplicates which got
deduplicated.)
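The "Affected period" given for a partition looks like the span from the earliest per-host start to the latest per-host end. Here is a minimal Python sketch that checks this reading against the bits rows above; `affected_period` is a hypothetical helper, not part of any cluster script.

```python
from datetime import datetime

# (host, start of issue, end of issue) for the bits partition, as listed above.
BITS_ROWS = [
    ("cp1056.eqiad.wmnet", "2014-12-10T14:18:42", "2014-12-10T14:18:44"),
    ("cp1057.eqiad.wmnet", "2014-12-10T14:18:43", "2014-12-10T14:18:45"),
    ("cp1069.eqiad.wmnet", "2014-12-10T14:18:43", "2014-12-10T14:18:44"),
    ("cp1070.eqiad.wmnet", "2014-12-10T14:18:42", "2014-12-10T14:18:44"),
    ("cp3019.esams.wikimedia.org", "2014-12-10T14:18:41", "2014-12-10T14:18:42"),
    ("cp3020.esams.wikimedia.org", "2014-12-10T14:18:42", "2014-12-10T14:18:42"),
    ("cp3021.esams.wikimedia.org", "2014-12-10T14:18:42", "2014-12-10T14:18:43"),
    ("cp3022.esams.wikimedia.org", "2014-12-10T14:18:41", "2014-12-10T14:18:42"),
    ("cp4001.ulsfo.wmnet", "2014-12-10T14:18:43", "2014-12-10T14:18:45"),
    ("cp4002.ulsfo.wmnet", "2014-12-10T14:18:43", "2014-12-10T14:18:45"),
    ("cp4003.ulsfo.wmnet", "2014-12-10T14:18:44", "2014-12-10T14:18:46"),
    ("cp4004.ulsfo.wmnet", "2014-12-10T14:18:43", "2014-12-10T14:18:45"),
]


def affected_period(rows):
    """Return (earliest start, latest end) over all hosts."""
    starts = [datetime.fromisoformat(start) for _, start, _ in rows]
    ends = [datetime.fromisoformat(end) for _, _, end in rows]
    return min(starts), max(ends)


start, end = affected_period(BITS_ROWS)
print(start.isoformat(), "--", end.isoformat())
```

For the bits rows this reproduces the reported period, 2014-12-10T14:18:41 -- 2014-12-10T14:18:46.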


  • mobile 2014-12-10T14

Affected period: 2014-12-10T14:18:42 -- 2014-12-10T14:18:45
Duplicates: 0
Missing: 586

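+-----------------------------+---------------------+---------------------+
| Host                        | Start of issue      | End of issue        |
+-----------------------------+---------------------+---------------------+
| cp1046.eqiad.wmnet          | 2014-12-10T14:18:42 | 2014-12-10T14:18:44 |
| cp1060.eqiad.wmnet          | 2014-12-10T14:18:42 | 2014-12-10T14:18:44 |
| cp3014.esams.wmnet          | 2014-12-10T14:18:43 | 2014-12-10T14:18:45 |
| cp4011.ulsfo.wmnet          | 2014-12-10T14:18:43 | 2014-12-10T14:18:44 |
| cp4019.ulsfo.wmnet          | 2014-12-10T14:18:43 | 2014-12-10T14:18:44 |
+-----------------------------+---------------------+---------------------+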

  • mobile 2014-12-10T15

Affected period: 2014-12-10T15:27:30 -- 2014-12-10T15:27:31
Duplicates: 0
Missing: 70

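+-----------------------------+---------------------+---------------------+
| Host                        | Start of issue      | End of issue        |
+-----------------------------+---------------------+---------------------+
| cp1060.eqiad.wmnet          | 2014-12-10T15:27:30 | 2014-12-10T15:27:31 |
+-----------------------------+---------------------+---------------------+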

  • text

Affected period: 2014-12-10T14:18:42 -- 2014-12-10T14:18:46
Duplicates: 0
Missing: 5054

+-----------------------------+---------------------+---------------------+
| Host                        | Start of issue      | End of issue        |
+-----------------------------+---------------------+---------------------+
| amssq31.esams.wmnet         | 2014-12-10T14:18:43 | 2014-12-10T14:18:44 |
| amssq32.esams.wmnet         | 2014-12-10T14:18:43 | 2014-12-10T14:18:45 |
| amssq34.esams.wmnet         | 2014-12-10T14:18:43 | 2014-12-10T14:18:44 |
| amssq35.esams.wmnet         | 2014-12-10T14:18:43 | 2014-12-10T14:18:44 |
| amssq36.esams.wmnet         | 2014-12-10T14:18:44 | 2014-12-10T14:18:45 |
| amssq37.esams.wmnet         | 2014-12-10T14:18:43 | 2014-12-10T14:18:45 |
| amssq38.esams.wmnet         | 2014-12-10T14:18:43 | 2014-12-10T14:18:44 |
| amssq39.esams.wmnet         | 2014-12-10T14:18:42 | 2014-12-10T14:18:43 |
| amssq40.esams.wmnet         | 2014-12-10T14:18:43 | 2014-12-10T14:18:44 |
| amssq41.esams.wmnet         | 2014-12-10T14:18:43 | 2014-12-10T14:18:44 |
| amssq42.esams.wmnet         | 2014-12-10T14:18:43 | 2014-12-10T14:18:45 |
| amssq43.esams.wmnet         | 2014-12-10T14:18:43 | 2014-12-10T14:18:44 |
| amssq44.esams.wmnet         | 2014-12-10T14:18:44 | 2014-12-10T14:18:46 |
| amssq45.esams.wmnet         | 2014-12-10T14:18:43 | 2014-12-10T14:18:44 |
| amssq46.esams.wmnet         | 2014-12-10T14:18:43 | 2014-12-10T14:18:44 |
| amssq47.esams.wmnet         | 2014-12-10T14:18:42 | 2014-12-10T14:18:44 |
| amssq48.esams.wikimedia.org | 2014-12-10T14:18:43 | 2014-12-10T14:18:45 |
| amssq49.esams.wikimedia.org | 2014-12-10T14:18:43 | 2014-12-10T14:18:44 |
| amssq50.esams.wikimedia.org | 2014-12-10T14:18:43 | 2014-12-10T14:18:44 |
| amssq51.esams.wikimedia.org | 2014-12-10T14:18:43 | 2014-12-10T14:18:44 |
| amssq52.esams.wikimedia.org | 2014-12-10T14:18:43 | 2014-12-10T14:18:44 |
| amssq53.esams.wikimedia.org | 2014-12-10T14:18:43 | 2014-12-10T14:18:44 |
| amssq54.esams.wikimedia.org | 2014-12-10T14:18:43 | 2014-12-10T14:18:44 |
| amssq55.esams.wikimedia.org | 2014-12-10T14:18:43 | 2014-12-10T14:18:44 |
| amssq56.esams.wikimedia.org | 2014-12-10T14:18:43 | 2014-12-10T14:18:45 |
| amssq57.esams.wikimedia.org | 2014-12-10T14:18:43 | 2014-12-10T14:18:45 |
| amssq58.esams.wikimedia.org | 2014-12-10T14:18:42 | 2014-12-10T14:18:43 |
| amssq59.esams.wikimedia.org | 2014-12-10T14:18:42 | 2014-12-10T14:18:43 |
| amssq60.esams.wikimedia.org | 2014-12-10T14:18:43 | 2014-12-10T14:18:44 |
| amssq61.esams.wikimedia.org | 2014-12-10T14:18:43 | 2014-12-10T14:18:45 |
| amssq62.esams.wikimedia.org | 2014-12-10T14:18:42 | 2014-12-10T14:18:44 |
| cp1052.eqiad.wmnet          | 2014-12-10T14:18:42 | 2014-12-10T14:18:43 |
| cp1053.eqiad.wmnet          | 2014-12-10T14:18:44 | 2014-12-10T14:18:45 |
| cp1054.eqiad.wmnet          | 2014-12-10T14:18:43 | 2014-12-10T14:18:44 |
| cp1055.eqiad.wmnet          | 2014-12-10T14:18:43 | 2014-12-10T14:18:45 |
| cp1065.eqiad.wmnet          | 2014-12-10T14:18:42 | 2014-12-10T14:18:43 |
| cp1066.eqiad.wmnet          | 2014-12-10T14:18:42 | 2014-12-10T14:18:43 |
| cp1067.eqiad.wmnet          | 2014-12-10T14:18:43 | 2014-12-10T14:18:45 |
| cp1068.eqiad.wmnet          | 2014-12-10T14:18:43 | 2014-12-10T14:18:44 |
| cp4008.ulsfo.wmnet          | 2014-12-10T14:18:43 | 2014-12-10T14:18:44 |
| cp4009.ulsfo.wmnet          | 2014-12-10T14:18:43 | 2014-12-10T14:18:44 |
| cp4010.ulsfo.wmnet          | 2014-12-10T14:18:43 | 2014-12-10T14:18:44 |
| cp4016.ulsfo.wmnet          | 2014-12-10T14:18:43 | 2014-12-10T14:18:44 |
| cp4017.ulsfo.wmnet          | 2014-12-10T14:18:43 | 2014-12-10T14:18:44 |
| cp4018.ulsfo.wmnet          | 2014-12-10T14:18:43 | 2014-12-10T14:18:44 |
+-----------------------------+---------------------+---------------------+

  • upload

Affected period: 2014-12-10T14:18:41 -- 2014-12-10T14:18:44
Duplicates: 0
Missing: 1834

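+-----------------------------+---------------------+---------------------+
| Host                        | Start of issue      | End of issue        |
+-----------------------------+---------------------+---------------------+
| cp1048.eqiad.wmnet          | 2014-12-10T14:18:41 | 2014-12-10T14:18:42 |
| cp1063.eqiad.wmnet          | 2014-12-10T14:18:43 | 2014-12-10T14:18:44 |
| cp3004.esams.wikimedia.org  | 2014-12-10T14:18:41 | 2014-12-10T14:18:42 |
| cp3006.esams.wikimedia.org  | 2014-12-10T14:18:41 | 2014-12-10T14:18:42 |
| cp3008.esams.wikimedia.org  | 2014-12-10T14:18:41 | 2014-12-10T14:18:41 |
| cp3009.esams.wikimedia.org  | 2014-12-10T14:18:41 | 2014-12-10T14:18:42 |
| cp3010.esams.wikimedia.org  | 2014-12-10T14:18:42 | 2014-12-10T14:18:42 |
| cp3015.esams.wmnet          | 2014-12-10T14:18:41 | 2014-12-10T14:18:42 |
| cp4007.ulsfo.wmnet          | 2014-12-10T14:18:42 | 2014-12-10T14:18:43 |
+-----------------------------+---------------------+---------------------+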
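
To put the per-partition numbers side by side, the missing counts can be totalled per hour. This is a short Python sketch using only the figures reported above; the helper name `missing_by_hour` is hypothetical, and grouping by the hour of the affected period (hour 14 for the leader drop, hour 15 for the reelection) is an assumption based on the timestamps.

```python
from collections import defaultdict

# (partition, start of affected period, missing requests) as reported above.
REPORTED = [
    ("bits", "2014-12-10T14:18:41", 5051),
    ("mobile", "2014-12-10T14:18:42", 586),
    ("mobile", "2014-12-10T15:27:30", 70),
    ("text", "2014-12-10T14:18:42", 5054),
    ("upload", "2014-12-10T14:18:41", 1834),
]


def missing_by_hour(entries):
    """Sum the missing counts per hour of the affected period."""
    totals = defaultdict(int)
    for _, start, missing in entries:
        hour = start[:13]  # e.g. "2014-12-10T14"
        totals[hour] += missing
    return dict(totals)


totals = missing_by_hour(REPORTED)
for hour, missing in sorted(totals.items()):
    print(hour, missing)
print("total", sum(totals.values()))
```

With the figures above this gives 12525 missing requests for hour 14 and 70 for hour 15, 12595 overall.
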
QChris claimed this task.