
A substantial number of IP addresses in 1:1000 sampled squid logs do not resolve into geo data, from Nov 2013 onwards
Closed, Declined. Public

Description

There is a large percentage of 'country Unknown' in squid log reports such as
http://stats.wikimedia.org/wikimedia/squids/SquidReportPageViewsPerCountryOverview2014Q4.htm.

This made me halt publication of N/S view/edit ratios for Q4, so there is no update yet for
http://stats.wikimedia.org/wikimedia/squids/PercentagesEditsViewsSouth_2014_Q3.png

Any fix can only be applied to data for the last 3 months, so we need to push this forward.

It seems that only certain IP ranges do not resolve.
I will pursue this further.
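
One way to check whether the failures cluster in specific ranges is to count unresolved addresses per network prefix. The following is only a minimal sketch, not the tooling used for these reports: the function name `unresolved_by_prefix`, the `(ip, country)` record layout, and the use of `None`, an empty string, or `"--"` as the marker for a failed geo lookup are all assumptions made for illustration.

```python
import ipaddress
from collections import Counter


def unresolved_by_prefix(records, prefix_len=16):
    """Count unresolved IPv4 addresses per network prefix.

    records: iterable of (ip_string, country_code) pairs. A country_code of
    None, "" or "--" is treated as a failed geo lookup (assumed convention).
    """
    counts = Counter()
    for ip, country in records:
        if country not in (None, "", "--"):
            continue  # resolved, not interesting here
        try:
            addr = ipaddress.ip_address(ip)
        except ValueError:
            counts["invalid"] += 1  # malformed address field in the log line
            continue
        if addr.version != 4:
            counts["ipv6"] += 1  # keep IPv6 separate; /16 is an IPv4 prefix
            continue
        # strict=False masks the host bits, giving the enclosing network
        net = ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False)
        counts[str(net)] += 1
    return counts


if __name__ == "__main__":
    sample = [
        ("10.1.2.3", None),
        ("10.1.9.9", "--"),
        ("192.0.2.1", "US"),
        ("bogus", None),
    ]
    for prefix, n in unresolved_by_prefix(sample).most_common():
        print(prefix, n)
```

If a handful of prefixes account for most of the unresolved requests, that would support the idea that only certain ranges are missing from the geo database.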

Event Timeline

ezachte claimed this task.
ezachte raised the priority of this task to High.
ezachte updated the task description.
ezachte subscribed.

Do the new UDF functions for handling X-Forwarded-For take care of this problem, Erik?

We are about to remove the sampled logs in favor of pageview counts based on the new pageview definition.

This report has the newest data: https://stats.wikimedia.org/wikimedia/squids/SquidReportPageViewsPerCountryBreakdown.htm