
Raw webrequest partitions for 2014-10-08T1[89]:xx:xx not marked successful
Closed, Declined (Public)

Description

The two bits partitions [1] for 2014-10-08T1[89]:xx:xx were not marked
successful.

What happened?

[1]


qchris@stat1002 jobs: 0 time: 13:05:11 // exit code: 0
cwd: ~/cluster-scripts
./dump_webrequest_status.sh

+---------------------+--------+--------+--------+--------+
| Date                |  bits  |  text  | mobile | upload |
+---------------------+--------+--------+--------+--------+

[...]

| 2014-10-08T16:xx:xx |    .   |    .   |    .   |    .   |    
| 2014-10-08T17:xx:xx |    .   |    .   |    .   |    .   |    
| 2014-10-08T18:xx:xx |    X   |    .   |    .   |    .   |    
| 2014-10-08T19:xx:xx |    X   |    .   |    .   |    .   |    
| 2014-10-08T20:xx:xx |    .   |    .   |    .   |    .   |    
| 2014-10-08T21:xx:xx |    .   |    .   |    .   |    .   |

[...]

+---------------------+--------+--------+--------+--------+

Statuses:

. --> Partition is ok
X --> Partition is not ok (duplicates, missing, or nulls)

Version: unspecified
Severity: normal
See Also:
https://bugzilla.wikimedia.org/show_bug.cgi?id=71435

Details

Reference
bz71881

Event Timeline

bzimport raised the priority of this task to Needs Triage. (Nov 22 2014, 3:43 AM)
bzimport set Reference to bz71881.
bzimport added a subscriber: Unknown Object (MLST).

Since checking for missing sequence numbers across the hour boundary
(hours 18 and 19 combined) does not make the missing lines go away [1],
this does not seem to be the race condition described in bug 69615.

Since only the esams bits hosts seem to be affected, this might be
another instance of bug 71435.

More investigation needed.

[1]


qchris@stat1002 jobs: 0 time: 12:55:38 // exit code: 0
cwd: ~/refinery
hive -f two_hour_stats.hql -d table=wmf_raw.webrequest -d webrequest_source=bits -d year=2014 -d month=10 -d day=8 -d hourA=18 -d hourB=19
[...]
Total MapReduce CPU Time Spent: 0 days 2 hours 44 minutes 50 seconds 570 msec
OK
hostname sequence_min sequence_max count_actual count_expected count_different count_duplicate count_null_sequence percent_different
cp3020.esams.wikimedia.org 2135978184 2170880606 34085296 34902423 817127 0 0 -2.3411755682406348
cp3022.esams.wikimedia.org 2389095903 2423994513 34670200 34898611 228411 0 0 -0.6544988280479128
cp3019.esams.wikimedia.org 2137887090 2172774146 34532139 34887057 354918 0 0 -1.0173343082507649
Time taken: 265.71 seconds, Fetched: 3 row(s)
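
The derived columns in the output above can be cross-checked by hand. The following is a minimal sketch in Python, not the logic of two_hour_stats.hql itself: the formulas are an assumption inferred from the numbers shown (count_expected as the size of the sequence range, count_different as expected minus actual, percent_different as the relative deviation of actual from expected). The class name HostStats and the ROWS list are hypothetical helpers for illustration; the input values are copied from the three rows above.

```python
from dataclasses import dataclass


@dataclass
class HostStats:
    """One row of the per-host output: sequence range plus lines actually seen."""
    hostname: str
    sequence_min: int
    sequence_max: int
    count_actual: int

    @property
    def count_expected(self) -> int:
        # Assumption: sequence numbers are consecutive per host, so the
        # range [sequence_min, sequence_max] should contain this many lines.
        return self.sequence_max - self.sequence_min + 1

    @property
    def count_different(self) -> int:
        # Assumption: sign convention inferred from the output above
        # (positive when lines are missing).
        return self.count_expected - self.count_actual

    @property
    def percent_different(self) -> float:
        # Relative deviation of actual from expected, in percent
        # (negative when lines are missing).
        return (self.count_actual - self.count_expected) / self.count_expected * 100.0


# Input columns copied from the Hive output for hours 18 and 19 combined.
ROWS = [
    HostStats("cp3020.esams.wikimedia.org", 2135978184, 2170880606, 34085296),
    HostStats("cp3022.esams.wikimedia.org", 2389095903, 2423994513, 34670200),
    HostStats("cp3019.esams.wikimedia.org", 2137887090, 2172774146, 34532139),
]

if __name__ == "__main__":
    for row in ROWS:
        print(row.hostname, row.count_expected, row.count_different,
              row.percent_different)
```

Under these assumed formulas, the recomputed count_expected and count_different agree with the Hive output for all three hosts, so the gaps are real holes in the per-host sequence ranges rather than a reporting artifact.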

kevinator set Security to None.