While the kafkatee zero files and udp2log zero files previously
more or less agreed in number of lines, the kafkatee zero files
currently have ~10% fewer lines than the udp2log zero files. As the
line counts start to diverge with the 2014-03-27 file [1], the
difference seems to have started around 2014-03-26.
Is this difference expected?
P.S.: Running
zgrep '^cp3014.*http://en\.m\.wikipedia\.org/wiki/Mamunur_Rashid' /a/{squid/archive,log/webrequest}/zero/zero.tsv.log-20140419.gz
on stat1002 will give you a harmless-looking log line that is only in the udp2log stream, and not in the kafka stream.
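To make it explicit which of the two files contains such a line, the match count can be printed per file. This is only a rough sketch: the helper name count_matches is made up for illustration, and it assumes the same paths as in the command above.

```shell
#!/bin/sh
# Hypothetical helper: print "<file>: <number of matching lines>"
# for each gzipped log file given after the pattern.
count_matches() {
    pattern=$1
    shift
    for f in "$@"; do
        # Skip files that do not exist on this host.
        [ -e "$f" ] || continue
        # grep -c still prints 0 when nothing matches.
        printf '%s: %s\n' "$f" "$(zcat "$f" | grep -c -- "$pattern")"
    done
}

count_matches '^cp3014.*http://en\.m\.wikipedia\.org/wiki/Mamunur_Rashid' \
    /a/squid/archive/zero/zero.tsv.log-20140419.gz \
    /a/log/webrequest/zero/zero.tsv.log-20140419.gz
```

A non-zero count for the first file together with 0 for the second would correspond to a line that is present only in the udp2log stream.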
[1]
qchris@stat1002 0 11:44:00
cwd: ~
for i in /a/squid/archive/zero/zero.tsv.log-20140[34]*.gz ; do echo "$i: $(zcat $i | wc -l)" ; done
[...]
/a/squid/archive/zero/zero.tsv.log-20140324.gz: 4092113
/a/squid/archive/zero/zero.tsv.log-20140325.gz: 3953345
/a/squid/archive/zero/zero.tsv.log-20140326.gz: 3944141
/a/squid/archive/zero/zero.tsv.log-20140327.gz: 4145574
/a/squid/archive/zero/zero.tsv.log-20140328.gz: 6191020
/a/squid/archive/zero/zero.tsv.log-20140329.gz: 8338602
/a/squid/archive/zero/zero.tsv.log-20140330.gz: 10344328
/a/squid/archive/zero/zero.tsv.log-20140331.gz: 11867766
/a/squid/archive/zero/zero.tsv.log-20140401.gz: 12191221
/a/squid/archive/zero/zero.tsv.log-20140402.gz: 13204630
/a/squid/archive/zero/zero.tsv.log-20140403.gz: 13793041
[...]
qchris@stat1002 0 11:44:15
cwd: ~
for i in /a/log/webrequest/zero/zero.tsv.log-20140[34]*.gz ; do echo "$i: $(zcat $i | wc -l)" ; done
[...]
/a/log/webrequest/zero/zero.tsv.log-20140324.gz: 4087376
/a/log/webrequest/zero/zero.tsv.log-20140325.gz: 3946964
/a/log/webrequest/zero/zero.tsv.log-20140326.gz: 3939808
/a/log/webrequest/zero/zero.tsv.log-20140327.gz: 3918716
/a/log/webrequest/zero/zero.tsv.log-20140328.gz: 5545742
/a/log/webrequest/zero/zero.tsv.log-20140329.gz: 7500078
/a/log/webrequest/zero/zero.tsv.log-20140330.gz: 9303477
/a/log/webrequest/zero/zero.tsv.log-20140331.gz: 10671821
/a/log/webrequest/zero/zero.tsv.log-20140401.gz: 10971750
/a/log/webrequest/zero/zero.tsv.log-20140402.gz: 11885403
/a/log/webrequest/zero/zero.tsv.log-20140403.gz: 12417002
[...]
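For reference, the relative shortfall per day can be computed from the two listings above. The following is a minimal sketch, assuming the same two directories as in [1]; the function name pct_fewer is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print by what percentage the kafkatee line count
# falls short of the udp2log line count.
# $1 = udp2log line count, $2 = kafkatee line count
pct_fewer() {
    awk -v u="$1" -v k="$2" 'BEGIN { printf "%.2f\n", (u - k) * 100 / u }'
}

udp_dir=/a/squid/archive/zero
kafka_dir=/a/log/webrequest/zero

for u in "$udp_dir"/zero.tsv.log-20140[34]*.gz; do
    k="$kafka_dir/$(basename "$u")"
    # Skip days for which either file is missing (also covers an unmatched glob).
    [ -e "$u" ] && [ -e "$k" ] || continue
    printf '%s: %s%% fewer lines in kafkatee\n' "$(basename "$u")" \
        "$(pct_fewer "$(zcat "$u" | wc -l)" "$(zcat "$k" | wc -l)")"
done
```

With the counts listed above, pct_fewer 3944141 3939808 gives 0.11 for the 20140326 files, pct_fewer 4145574 3918716 gives 5.47 for 20140327, and pct_fewer 13793041 12417002 gives 9.98 for 20140403.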
Version: unspecified
Severity: normal
See Also:
https://bugzilla.wikimedia.org/show_bug.cgi?id=71056