Username: musikanimal
Full name: Leon Ziemba
Public key:
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC5jAvhIngD3svnIyBaHkhZTPEJc80jM363NfWUaFNcdi7n/VudTa3t8vL9jb1OZBUWnL/gfIW4VeLU4rKsfQkcpw6BpL9Qmr50Ewex9eU2pN3/tu1JN9OGNoJry8q81ZaxpH2wJD0JmCC4nlL84Ie7YjZQdcDpeDp4NL/eqEN30DilejVc34cMFpxcH2UYtJnoHGgSPBNsRvftrSniENKlWBrNF+Gjeg+awidUnlpTfGA0q8AGa5Fo69GkHxAzUymgNgeCY6w2H/HqgFcKT53YWgkViBZC0vi3Y0X0EDxnTgYbbKmSij7JU7Z4qJzzd+Tscd/xcO20hPsAYXcW/nF5 musikanimal@wikimedia.org
I am an engineer for Community Tech. As I understand it, being part of the analytics-privatedata-users access group allows me to connect to the Analytics team's MariaDB slaves. This will be very helpful for work I'm doing right now. My colleagues have been helping me run test queries on enwiki.cu_changes (see T156318), so I was going to eventually ask for prod db access, but if I have access to identical, unsanitized slaves then I won't need it :) We will be doing numerous similar projects in 2017 as part of the community health initiative to counter harassment.
Next, as (mostly) a volunteer effort, I want to identify bots that inflate the pageview stats returned by the RESTBase /metrics/pageviews/top endpoint. I have a system set up on Topviews where users can report false positives, so that I can automatically exclude such pages from the tool. Much of the time this is easy, just compare mobile versus desktop, but other times it's hard to say. Being able to dig deeper and see if there are unreasonable numbers of requests coming from a single IP, or a finite set of IPs, etc., will lend some clarity. Obviously I won't be sharing any private data, but the hope is that I can offer more reliable data by filtering out known false positives. In doing this I'll hopefully also be able to help improve bot detection in general for the Pageviews API, passing on my findings to the Analytics team. I admittedly am not very familiar with the database schema, but I suppose getting access is the first step :) I am under the impression that help from some Analytics team members is at my disposal in pursuing this effort, so if I am unsure about something I won't hesitate to reach out to them first.