Normalize the domain names while querying for uniques based on last-access cookie
Closed, Duplicate (Public)

Description

When querying for unique last-access clients grouped by uri_host in the webrequest table, some of the host names with low unique counts do not look like Wikimedia projects, and others are mixed-case variants of existing ones (e.g. EN.wikipedia.org). We should normalize these so that, for example, en.wikipedia.org and En.wikipedia.org both map to enwiki.
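
As a rough illustration of the normalization described above, here is a minimal Python sketch. The function name `normalize_host`, the `PROJECT_SUFFIXES` table, and the handling of the mobile `m` label are assumptions made for this example, not existing webrequest or refinery code. Hosts that do not match a known project pattern return `None` so they can be filtered out or bucketed separately.

```python
from typing import Optional

# Hypothetical, deliberately incomplete table mapping a project domain to the
# suffix used in the normalized name (e.g. "wikipedia.org" -> "enwiki").
PROJECT_SUFFIXES = {
    "wikipedia.org": "wiki",
    "wiktionary.org": "wiktionary",
    "wikibooks.org": "wikibooks",
}


def normalize_host(uri_host: str) -> Optional[str]:
    """Map a raw uri_host such as 'EN.wikipedia.org' to a name like 'enwiki'.

    Returns None for hosts that do not look like '<lang>.<project domain>'.
    """
    # Host names are case-insensitive, so lowercase first; also drop
    # surrounding whitespace and a trailing dot.
    host = uri_host.strip().lower().rstrip(".")
    parts = host.split(".")
    if len(parts) < 3:
        return None

    domain = ".".join(parts[-2:])
    suffix = PROJECT_SUFFIXES.get(domain)
    if suffix is None:
        return None  # not a project domain known to this sketch

    prefix = parts[:-2]
    # Assumption: accept '<lang>' or '<lang>.m' (mobile) and nothing else.
    if len(prefix) == 2 and prefix[1] == "m":
        prefix = prefix[:1]
    if len(prefix) != 1:
        return None

    language = prefix[0]
    # Hyphenated language codes are not handled in this sketch.
    if language == "www" or not language.isalpha():
        return None
    return language + suffix
```

Under these assumptions, `normalize_host("EN.wikipedia.org")` and `normalize_host("En.wikipedia.org")` both yield `enwiki`, while an unrelated host such as `example.com` yields `None`. Grouping on the normalized value instead of the raw uri_host would merge the mixed-case rows.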