
Number of Wikipedia Zero log lines increasing drastically in mid March 2014
Closed, Resolved (Public)


It seems the number of log lines has increased a lot over the
last few weeks from ~2.5M/day to 3.3M/day [1].

Is this increase sane?

(Is the increase related to switching on HTTPS for zero?)


qchris@stat1002 0 21:01:43
cwd: ~
for i in /a/squid/archive/zero/zero.tsv.log-201403* ; do echo "$i: $(zcat "$i" | wc -l)" ; done
/a/squid/archive/zero/zero.tsv.log-20140301.gz: 2572772
/a/squid/archive/zero/zero.tsv.log-20140302.gz: 2550606
/a/squid/archive/zero/zero.tsv.log-20140303.gz: 2687931
/a/squid/archive/zero/zero.tsv.log-20140304.gz: 2749754
/a/squid/archive/zero/zero.tsv.log-20140305.gz: 2669759
/a/squid/archive/zero/zero.tsv.log-20140306.gz: 2733986
/a/squid/archive/zero/zero.tsv.log-20140307.gz: 2680985
/a/squid/archive/zero/zero.tsv.log-20140308.gz: 2517903
/a/squid/archive/zero/zero.tsv.log-20140309.gz: 2577466
/a/squid/archive/zero/zero.tsv.log-20140310.gz: 2845407
/a/squid/archive/zero/zero.tsv.log-20140311.gz: 2945301
/a/squid/archive/zero/zero.tsv.log-20140312.gz: 3010404
/a/squid/archive/zero/zero.tsv.log-20140313.gz: 2871820
/a/squid/archive/zero/zero.tsv.log-20140314.gz: 2880289
/a/squid/archive/zero/zero.tsv.log-20140315.gz: 2744255
/a/squid/archive/zero/zero.tsv.log-20140316.gz: 2771308
/a/squid/archive/zero/zero.tsv.log-20140317.gz: 2958194
/a/squid/archive/zero/zero.tsv.log-20140318.gz: 3192418
/a/squid/archive/zero/zero.tsv.log-20140319.gz: 3352401
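
To put a number on the increase, the listing above can be piped through a small awk helper. This is only a sketch: summarize_counts is a hypothetical name, and it assumes the input is in the "<file>: <count>" format printed by the loop above, in date order.

```sh
# Summarize "<file>: <line count>" lines read from stdin.
# summarize_counts is a hypothetical helper, not part of any existing tooling.
# It compares the first and the last count it sees, so the input must be
# sorted by date (the shell glob above already yields that order).
summarize_counts() {
  awk -F': ' '
    NR == 1 { first = $2 }   # count of the earliest day
    { last = $2 }            # overwritten on every line, ends as the latest day
    END {
      if (NR > 0 && first > 0)
        printf "first=%d last=%d change=%.1f%%\n", first, last, (last - first) * 100 / first
    }
  '
}
```

Fed with the counts for 20140301 (2572772) and 20140319 (3352401), this reports a change of 30.3%.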

Version: unspecified
Severity: normal



Event Timeline

bzimport raised the priority of this task to Medium. Nov 22 2014, 3:04 AM
bzimport set Reference to bz62848.
bzimport added a subscriber: Unknown Object (MLST).

bingle-admin wrote:

Prioritization and scheduling of this bug is tracked on Mingle card

Hi Dan -- can you please triage?



(In reply to Toby Negrin from comment #2)
> Hi Dan -- can you please triage?



Sure, I'll investigate.

- Dan

Dan asked me to review. I'll examine this over the next few days. I need tomorrow to think about it, then probably Monday to analyze and Tuesday to do a second pass.

Since numbers reported by our monitoring went to >5M today, I had a
quick look, just to make sure our infrastructure is not badly broken.

Lines for SSL requests since yesterday skyrocketed.
Lines for carrier 470-01 since yesterday skyrocketed.

So to me it currently does not look like a problem with the analytics infrastructure.

I don't know if this may be related: bug 62980

(In reply to Jesús Martínez Novo (Ciencia Al Poder) from comment #6)
> I don't know if this may be related: bug 62980

Thanks for the pointer!

It's a bit subtle, but the two bugs differ: this bug is about the plain
number of log lines, while bug 62980 is about which of those log lines
get counted as page views.

So to me, they are separate things.

What's the current thinking here? Has there been any more investigation?

(In reply to Greg Grossmeier from comment #8)
> What's the current thinking here? Has there been any more investigation?

We've played whack-a-mole, and will probably need to keep doing so, until the root cause is addressed with the operator.

Is someone still looking into this?

Can we identify which partner (X-CS) is responsible for the increase at that time? I can look into more details once I have that information.
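
One way to approach this would be to count log lines per carrier for a day before and a day after the jump and compare the two. The sketch below is only an illustration: count_by_column is a hypothetical helper, and which tab-separated column of zero.tsv.log holds the X-CS value is an assumption that has to be checked against the actual log format.

```sh
# Count lines per distinct value of one tab-separated column, largest first.
# count_by_column is a hypothetical helper. The column number that holds the
# X-CS (carrier) value is NOT known from this ticket; pass it as $1 after
# checking the log format.
count_by_column() {
  col="$1"
  awk -F'\t' -v col="$col" '
    { n[$col]++ }                                      # tally each value
    END { for (k in n) printf "%d\t%s\n", n[k], k }    # emit "<count>\t<value>"
  ' | sort -k1,1nr
}
```

It could be used as `zcat /a/squid/archive/zero/zero.tsv.log-20140319.gz | count_by_column N`, where N is the (here unknown) column number of the carrier field. Running it on a file from early March as well would show which carrier's share grew.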

Sorry, this ticket is quite old and refers to infrastructure we no longer use to count pageviews for Zero. Closing.