Automatically convert to wildcard searches when applicable
Open, Needs Triage · Public

Description

See T156318#3030343 for context. We should skip the BETWEEN query for an IP range whenever a much more efficient wildcard search returns the same results.

For IPv6, this should be done when the CIDR prefix length is a multiple of 16, e.g. rev_user_text LIKE "2602:306:%" instead of checking 2602:306::/32.

For IPv4, when the prefix length is a multiple of 8, e.g. rev_user_text LIKE "10.0.0.%" instead of 10.0.0.0/24.
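A rough sketch of the two rules above, in Python for illustration only. The function name `cidr_to_like_prefix` is hypothetical, and the sketch assumes that rev_user_text stores IPv6 addresses with every group written out (no `::` compression), no leading zeros, and uppercase hex digits; if the stored form differs, the pattern would need to be adjusted.

```python
import ipaddress
from typing import Optional


def cidr_to_like_prefix(cidr: str) -> Optional[str]:
    """Return a LIKE pattern covering the CIDR range, or None if the
    range does not end on a group boundary (hypothetical helper)."""
    net = ipaddress.ip_network(cidr, strict=False)
    if net.version == 4:
        group_bits, sep, total_groups = 8, ".", 4
        groups = str(net.network_address).split(".")
    else:
        group_bits, sep, total_groups = 16, ":", 8
        # Assumption: stored addresses are fully expanded, uppercase,
        # and have no leading zeros in each group.
        groups = [
            format(int(g, 16), "X")
            for g in net.network_address.exploded.split(":")
        ]

    # Only prefixes that end exactly on a group boundary qualify.
    if net.prefixlen % group_bits != 0:
        return None
    kept = net.prefixlen // group_bits
    # /0 would match everything; a full-length prefix is a single address.
    if kept == 0 or kept == total_groups:
        return None

    # The trailing separator matters: "10.1%" would also match 10.10.x.x.
    return sep.join(groups[:kept]) + sep + "%"


print(cidr_to_like_prefix("2602:306::/32"))
print(cidr_to_like_prefix("10.0.0.0/24"))
print(cidr_to_like_prefix("10.0.0.0/20"))
```

Ranges that return None here (for example a /20) would still go through the BETWEEN query.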

However, in the interface I would not convert the entered value to the wildcard that was actually used on the backend, just to avoid confusion. So if I enter 10.0.0.0/24, it should still say 10.0.0.0/24 after submission.

Event Timeline

Restricted Application added a subscriber: Aklapper. · Feb 15 2017, 7:25 PM

Actually, thinking about this further, it's not clear that this would be faster than the dedicated table + query. They may have similar performance.

I'll run some more tests... but from the little bit of querying I did before it seemed like wildcards were faster. E.g. https://en.wikipedia.org/w/api.php?action=query&list=usercontribs&uclimit=50&ucdir=older&ucuserprefix=2607:FB90

Keep in mind that the API sorts by IP address, not by timestamp.

Gotcha. We are going to support two views, one by date and one by IP. So maybe for the latter we could use wildcards, if they prove to outperform the range query.

Samwilson renamed this task from "Automatically covert to wildcard searches when applicable" to "Automatically convert to wildcard searches when applicable". · Feb 16 2017, 1:09 AM