
Instrument the location search API usage from the iOS app
Closed, Resolved (Public)

Description

To evaluate building new RESTful location APIs on the RESTBase infrastructure, we need to understand current usage so that we can predict scaling needs.

Logging the API calls in the same way that we log page views would be helpful here.

@Tbayer I assume you need the exact API calls being made by the iOS app to be able to do this?

Event Timeline

Yes, so as briefly discussed back in March, if someone posts some example URLs for these API requests, I can try to write a query that counts them as recorded in the webrequest table. (I assume this is about getting a one-time estimate rather than instrumentation in the sense of setting up a permanent mechanism to count them on an ongoing basis.)

@Tbayer yes, this is more about a one-time estimate.

@JoeWalsh are you able to provide examples of the API calls you make for the map view? I think you make a few different ones for search box autocomplete vs. searching vs. panning and zooming.

For populating the map:
en.wikipedia.org/w/api.php?action=query&cirrusIncLinkssW=1000&colimit=50&coprop=type%7Cdim&format=json&generator=search&gsrlimit=50&gsrsearch=nearcoord%3A5021m%2C37.332%2C-122.031&pilimit=50&piprop=thumbnail&pithumbsize=240&prop=coordinates%7Cpageimages%7Cpageterms

For search completion:
en.wikipedia.org/w/api.php?action=query&continue=&coprop=type%7Cdim&format=json&generator=prefixsearch&gpslimit=24&gpsnamespace=0&gpssearch=A&list=search&pilimit=24&piprop=thumbnail&pithumbsize=120&prop=pageterms%7Cpageimages%7Crevisions%7Ccoordinates&redirects=1&rrvlimit=1&rvprop=ids&srinfo=suggestion&srlimit=1&srnamespace=0&sroffset=0&srprop=&srsearch=A&srwhat=text&wbptterms=description

For bounding boxes:
www.wikidata.org/w/api.php?action=wbgetentities&titles=France&sites=enwiki&format=json

Those are specific requests, let me know if you need any other info.
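For readers who want to see what the percent-encoded parameters in the map request contain, here is a minimal sketch that decodes the first example URL with the Python standard library. The `https://` scheme is added as an assumption (the URLs above are posted without one), and the helper names `decode_params` and `parse_nearcoord` are hypothetical, made up for this illustration.

```python
from urllib.parse import urlsplit, parse_qs

# Map-population example from the comment above; the scheme is an assumption.
MAP_URL = (
    "https://en.wikipedia.org/w/api.php?action=query&cirrusIncLinkssW=1000"
    "&colimit=50&coprop=type%7Cdim&format=json&generator=search&gsrlimit=50"
    "&gsrsearch=nearcoord%3A5021m%2C37.332%2C-122.031&pilimit=50"
    "&piprop=thumbnail&pithumbsize=240"
    "&prop=coordinates%7Cpageimages%7Cpageterms"
)


def decode_params(url):
    """Return the query parameters of `url` as a dict of decoded strings."""
    query = urlsplit(url).query
    # keep_blank_values retains parameters such as "continue=" with no value.
    parsed = parse_qs(query, keep_blank_values=True)
    return {name: values[0] for name, values in parsed.items()}


def parse_nearcoord(gsrsearch):
    """Split 'nearcoord:<radius>m,<lat>,<lon>' into (radius_m, lat, lon)."""
    keyword, _, rest = gsrsearch.partition(":")
    if keyword != "nearcoord":
        raise ValueError("not a nearcoord search: " + gsrsearch)
    radius, lat, lon = rest.split(",")
    return int(radius.rstrip("m")), float(lat), float(lon)


params = decode_params(MAP_URL)
radius_m, lat, lon = parse_nearcoord(params["gsrsearch"])
print(params["prop"])
print(radius_m, lat, lon)
```

Decoded this way, the map request is a `generator=search` query whose search string is a `nearcoord` expression made of a radius in metres, a latitude, and a longitude.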

thanks @JoeWalsh

@Tbayer let us know if you need anything else

...

For bounding boxes:
www.wikidata.org/w/api.php?action=wbgetentities&titles=France&sites=enwiki&format=json

Just to double-check: That seems to be a query to the generic API on Wikidata for the full information about the item. (In this example it includes population numbers back to 1901 and a link to the pronunciation of the country's name in Hungarian ;) I guess that's because the Wikibase API does not yet support queries for the value of a particular claim only.
Are there other Wikidata queries from the app that are in the same format but should be disregarded for the present estimate because they serve a different purpose? Or are bounding boxes the only case where the app accesses the wbgetentities module of this API?

@Tbayer there aren't other queries from the app that access `wbgetentities` - only the places map view, to get the bounding box.

@Tbayer if it’s difficult we can probably forgo the Wikidata query. That’s not as important as the other queries.

@Tbayer need any additional information for this task?

Also to help with prioritization: Timing for running these queries is early next quarter (before beginning any work around new location APIs).

Thanks again!

@Fjalapeno & @JoeWalsh Thanks! I think we're good for now. I'll next make some regexes matching these example queries, double-check that they don't catch too much, and run them over a sufficiently long period of time. Will aim to have it done by the beginning of October per your note (it shouldn't be a lot of work, but there's always other timely stuff...).
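As an illustration of the regex approach described here, the sketch below classifies a query string into one of the three request types from the examples posted earlier. The patterns and the names `RULES` and `classify` are hypothetical and are not the regexes actually run against the webrequest data. Each type requires all of its patterns to match, so the result does not depend on parameter order, and a negative example checks that an ordinary search is not caught.

```python
import re

# Hypothetical rules: a request belongs to a type only if every pattern matches.
RULES = {
    "map": [r"[?&]generator=search(&|$)", r"[?&]gsrsearch=nearcoord(%3A|:)"],
    "completion": [r"[?&]generator=prefixsearch(&|$)", r"[?&]gpssearch="],
    "bounding_box": [r"[?&]action=wbgetentities(&|$)", r"[?&]titles=", r"[?&]sites="],
}
COMPILED = {name: [re.compile(p) for p in pats] for name, pats in RULES.items()}


def classify(uri_query):
    """Return the request type matching `uri_query`, or None if none applies."""
    for name, patterns in COMPILED.items():
        if all(p.search(uri_query) for p in patterns):
            return name
    return None


# Query strings taken from the example URLs posted in this task.
MAP_QUERY = (
    "?action=query&cirrusIncLinkssW=1000&colimit=50&coprop=type%7Cdim"
    "&format=json&generator=search&gsrlimit=50"
    "&gsrsearch=nearcoord%3A5021m%2C37.332%2C-122.031&pilimit=50"
    "&piprop=thumbnail&pithumbsize=240"
    "&prop=coordinates%7Cpageimages%7Cpageterms"
)
COMPLETION_QUERY = (
    "?action=query&continue=&coprop=type%7Cdim&format=json"
    "&generator=prefixsearch&gpslimit=24&gpsnamespace=0&gpssearch=A"
    "&list=search&pilimit=24&piprop=thumbnail&pithumbsize=120"
    "&prop=pageterms%7Cpageimages%7Crevisions%7Ccoordinates&redirects=1"
    "&rrvlimit=1&rvprop=ids&srinfo=suggestion&srlimit=1&srnamespace=0"
    "&sroffset=0&srprop=&srsearch=A&srwhat=text&wbptterms=description"
)
BBOX_QUERY = "?action=wbgetentities&titles=France&sites=enwiki&format=json"
# Negative example (made up): a plain text search that must not count as "map".
PLAIN_SEARCH = "?action=query&format=json&generator=search&gsrsearch=Paris"

for query in (MAP_QUERY, COMPLETION_QUERY, BBOX_QUERY, PLAIN_SEARCH):
    print(classify(query))
```

These patterns look only at the query string. The host (for example www.wikidata.org for the bounding-box request) and the app's user agent would have to be filtered separately, since other clients can send requests of the same shape.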

@Tbayer obviously there is a lot going on and this is low priority but just wanted to see if this has made it into view.

FWIW because of OCG and Services team changes, any work based on this research would not begin until well after the new year.

Thanks for the update on the timing! We can chat about this in person at the offsite now, but to record a result about the first part already (populating the map): This seems to be getting a bit less than 400k requests per day on average, with 94-95% of them for a single query string (all parameters identical - is this for some kind of initial coordinate?):

year  month  day  ratio               all_requests
2017  10     1    0.9412016102687794  407137
2017  10     2    0.9428859460987723  350159
2017  10     3    0.9426164060013512  358186
2017  10     4    0.9413738248451068  334424
2017  10     5    0.9433589227508327  336858
2017  10     6    0.9430774211825651  345926
2017  10     7    0.9446154854749603  396591
2017  10     8    0.9425425343546711  415664
2017  10     9    0.9421539447903453  361857
2017  10     10   0.9431132438619899  345511
2017  10     11   0.9462533530989694  339984
2017  10     12   0.9449807633299502  338936
2017  10     13   0.9474694478213578  343838
2017  10     14   0.9464420722345649  387543
2017  10     15   0.9452706184622824  409451

SELECT year, month, day,
  SUM(IF(uri_query = '?action=query&cirrusIncLinkssW=1000&colimit=50&coprop=type%7Cdim&format=json&generator=search&gsrlimit=50&gsrsearch=nearcoord%3A40075000m%2C48.866%2C2.314&pilimit=50&piprop=thumbnail&pithumbsize=240&prop=coordinates%7Cpageimages%7Cpageterms', 1, 0)) / SUM(1) AS ratio,
  SUM(1) AS all_requests
FROM wmf.webrequest
WHERE year = 2017 AND month = 10 AND day <= 15
  AND uri_path = '/w/api.php'
  AND uri_query LIKE '%cirrusIncLinkssW%'
GROUP BY year, month, day
ORDER BY year, month, day
LIMIT 100000;
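
As a quick check of the summary figures quoted in this comment, the daily values posted above can be averaged directly. This is only an arithmetic sketch: the numbers are copied from the query output for 1 to 15 October 2017, and the variable names are made up for illustration.

```python
# (day, ratio, all_requests) for 1-15 October 2017, copied from the output above.
ROWS = [
    (1, 0.9412016102687794, 407137),
    (2, 0.9428859460987723, 350159),
    (3, 0.9426164060013512, 358186),
    (4, 0.9413738248451068, 334424),
    (5, 0.9433589227508327, 336858),
    (6, 0.9430774211825651, 345926),
    (7, 0.9446154854749603, 396591),
    (8, 0.9425425343546711, 415664),
    (9, 0.9421539447903453, 361857),
    (10, 0.9431132438619899, 345511),
    (11, 0.9462533530989694, 339984),
    (12, 0.9449807633299502, 338936),
    (13, 0.9474694478213578, 343838),
    (14, 0.9464420722345649, 387543),
    (15, 0.9452706184622824, 409451),
]

total_requests = sum(requests for _, _, requests in ROWS)
mean_requests = total_requests / len(ROWS)
ratios = [ratio for _, ratio, _ in ROWS]

print(f"mean requests per day: {mean_requests:,.0f}")
print(f"share of the single query string: {min(ratios):.1%} to {max(ratios):.1%}")
```

Over these 15 days the mean comes to roughly 365k requests per day, and the daily share of the single query string stays between 94% and 95%.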
JMinor claimed this task.