
Investigation: Test searching for user agents
Closed, Resolved | Public | 1 Estimated Story Points

Description

It's possible that user agents overlap so much that a search would return too many results to be useful.

Let's take some sample real user agents (maybe from CommTech staff), and run a query to see how many users and how many edits we'd get.

It's okay if this query takes a long time to run -- 24 hours or more. This is a one-time thing, to make sure we're going in the right direction.
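A minimal sketch of the kind of count described above, run against an in-memory SQLite table rather than the production database. The table and column names (`cu_changes`, `cuc_user_text`, `cuc_ip`, `cuc_agent`) are assumptions loosely modeled on CheckUser, and the sample rows and agent strings are invented for illustration only.

```python
import sqlite3


def agent_overlap(conn, agents):
    """For each user agent, return (agent, distinct users, logged edits)."""
    agents = list(agents)
    if not agents:
        return []
    placeholders = ",".join("?" for _ in agents)
    sql = f"""
        SELECT cuc_agent,
               COUNT(DISTINCT cuc_user_text) AS users,
               COUNT(*) AS edits
        FROM cu_changes
        WHERE cuc_agent IN ({placeholders})
        GROUP BY cuc_agent
        ORDER BY cuc_agent
    """
    return conn.execute(sql, agents).fetchall()


def demo():
    # Hypothetical schema and data; the real table has many more columns.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE cu_changes (cuc_user_text TEXT, cuc_ip TEXT, cuc_agent TEXT)"
    )
    rows = [
        ("Alice", "192.0.2.1", "AgentA"),
        ("Alice", "192.0.2.1", "AgentA"),
        ("Bob", "192.0.2.7", "AgentA"),
        ("Carol", "198.51.100.4", "AgentB"),
    ]
    conn.executemany("INSERT INTO cu_changes VALUES (?, ?, ?)", rows)
    return agent_overlap(conn, ["AgentA", "AgentB"])
```

A high user count for a single agent string would suggest that searching by user agent alone returns too much to be useful.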

Event Timeline

DannyH set the point value for this task to 1. Oct 11 2016, 9:32 PM
DannyH moved this task from To Be Estimated/Discussed to Estimated on the Community-Tech board.

Is this ticket still valid, given that T146837 (Add ability to search by user agent from CheckUser interface) now mentions restricting searches to IP + UA only?

Not anymore. On one hand, if we want a new index, it should probably be a multi-column index (both IP and UA). On the other hand, user-agent search without wildcards is really limited, and allowing wildcards would make indexing a different discussion. I am unsure what the best design is here. @DannyH should opine as well.
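To illustrate the trade-off, here is a rough sketch using SQLite's `EXPLAIN QUERY PLAN`. The schema and the index name `cuc_ip_agent` are hypothetical, and MySQL/MariaDB planners will not report plans in the same form, so treat this only as a demonstration of the general principle: an exact IP + UA lookup can use a multi-column index, while a leading-wildcard `LIKE` on the user agent falls back to a scan.

```python
import sqlite3


def query_plans():
    """Return the planner's description for an exact and a wildcard lookup."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical, simplified schema.
    conn.execute(
        "CREATE TABLE cu_changes (cuc_user_text TEXT, cuc_ip TEXT, cuc_agent TEXT)"
    )
    # Multi-column index covering both IP and user agent.
    conn.execute("CREATE INDEX cuc_ip_agent ON cu_changes (cuc_ip, cuc_agent)")

    def plan(sql, params):
        rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
        # The fourth column of each row holds the human-readable plan detail.
        return " | ".join(row[3] for row in rows)

    exact = plan(
        "SELECT cuc_user_text FROM cu_changes WHERE cuc_ip = ? AND cuc_agent = ?",
        ("192.0.2.1", "AgentA"),
    )
    wildcard = plan(
        "SELECT cuc_user_text FROM cu_changes WHERE cuc_agent LIKE ?",
        ("%Firefox%",),
    )
    return exact, wildcard
```

Supporting partial matches efficiently would likely need a different structure (for example a full-text or n-gram index), which is why wildcard support changes the indexing discussion.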

For the record, I've already run into scenarios where searching for an exact match would have been helpful. This is mainly when there's a wave of socks being created during a short period of time.

I am not denying that. What I am saying is we should try to envision a model where we can search by partial strings as well, efficiently.

Just like we can search for IP ranges efficiently.
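One way an IP range search can stay index-friendly is to store each address as a fixed-width hexadecimal string, so that a CIDR range becomes a simple lower/upper bound comparison. The sketch below assumes that representation; the helper names are made up for illustration, and this is not a claim about how CheckUser implements it.

```python
import ipaddress


def to_hex(ip):
    """Encode an IP address as fixed-width uppercase hex (8 or 32 digits)."""
    addr = ipaddress.ip_address(ip)
    width = 8 if addr.version == 4 else 32
    return format(int(addr), f"0{width}X")


def range_to_hex_bounds(cidr):
    """Return the (low, high) hex bounds covering every address in a CIDR range."""
    net = ipaddress.ip_network(cidr, strict=False)
    return to_hex(net.network_address), to_hex(net.broadcast_address)


def in_range(ip, cidr):
    """Check membership with plain string comparison, as a BETWEEN clause would."""
    low, high = range_to_hex_bounds(cidr)
    value = to_hex(ip)
    # Only compare addresses of the same family (same string width).
    return len(value) == len(low) and low <= value <= high
```

Because the strings have a fixed width, their lexical order matches numeric order, so a range lookup can use an ordinary index. User-agent strings have no comparable ordering, which is part of why partial matching is harder there.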

@Huji I copied your suggestion about the index to T147894.

The plan right now is to build the user-agent search without the wildcard ability, and see if there's demand for it once the first version is in use.

Closing this ticket. Thanks!

DannyH claimed this task.
DannyH edited projects, added Community-Tech; removed Community-Tech-Sprint.
DannyH moved this task from Estimated to Archive on the Community-Tech board.