
Cache categories, establishing a short-lived area->categories dictionary
Closed, Resolved (Public)

Description

If a picture is uploaded within a certain radius of a previously uploaded picture, retrieve category suggestions from the cache to avoid redundant calls to the MediaWiki API. A third-party library will likely be required for this task. The library we plan to use is https://github.com/varunpant/Quadtree , a Java implementation of a quadtree.

Having gone through the QuadTree documentation and associated articles about quadtrees, my impression of how I would go about this task is:

  1. Construct a QuadTree
public QuadTree(double minX, double minY, double maxX, double maxY)

with min/max values according to
http://stackoverflow.com/questions/15965166/what-is-the-maximum-length-of-latitude-and-longitude

  • Latitude : max/min +90 to -90
  • Longitude : max/min +180 to -180

Where longitude = X and latitude = Y.

  2. For each picture uploaded that has GPS coordinates, set a point in that QuadTree using the method
public void set(double x, double y, Object value)

The Object value is the List of categories that were previously found for that picture.

  3. For each picture uploaded that has GPS coordinates, also call
public Point[] searchWithin(final double xmin, final double ymin, final double xmax, final double ymax)

as searchWithin(x-50, y-50, x+50, y+50) around the new point to see whether any old points lie within that area. I will need to convert the distance in metres to a difference in decimal degrees for this. As accuracy is not that important for this task, I will convert using the formula at http://gis.stackexchange.com/questions/2951/algorithm-for-offsetting-a-latitude-longitude-by-some-amount-of-meters . It states that

111,111 meters (111.111 km) in the y direction is 1 degree (of latitude) and 111,111 * cos(latitude) meters in the x direction is 1 degree (of longitude).
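
The conversion quoted above can be sketched in Java as follows. This is a minimal sketch, not the code from the pull request: the class name `GeoOffsets` and its method names are hypothetical, and the 111,111 m per degree figure is the approximation from the linked answer. The longitude conversion breaks down near the poles, where cos(latitude) approaches zero.

```java
// Hypothetical helper; names are illustrative and not taken from the app's code.
final class GeoOffsets {
    // Approximation from the linked answer: 1 degree of latitude is about 111,111 m.
    private static final double METERS_PER_DEGREE = 111_111.0;

    private GeoOffsets() {}

    // Metres in the y direction -> degrees of latitude.
    static double metersToLatDegrees(double meters) {
        return meters / METERS_PER_DEGREE;
    }

    // Metres in the x direction -> degrees of longitude at the given latitude.
    static double metersToLonDegrees(double meters, double latitudeDeg) {
        return meters / (METERS_PER_DEGREE * Math.cos(Math.toRadians(latitudeDeg)));
    }

    // Returns {xmin, ymin, xmax, ymax} in decimal degrees, with longitude = X and latitude = Y.
    static double[] searchBox(double lon, double lat, double halfSideMeters) {
        double dLat = metersToLatDegrees(halfSideMeters);
        double dLon = metersToLonDegrees(halfSideMeters, lat);
        return new double[] {lon - dLon, lat - dLat, lon + dLon, lat + dLat};
    }
}
```

With a 50 m half-side at the equator, the offset is about 0.00045 degrees in both directions; the four values returned by `searchBox` line up with the xmin, ymin, xmax, ymax arguments of searchWithin.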

Edit 7/1/16: Code for this task complete, pull request submitted.

Event Timeline

josephine_l claimed this task.
josephine_l raised the priority of this task from to Medium.
josephine_l updated the task description.
josephine_l moved this task from Backlog to Week-size tasks on the Commons-App-Android-Upload board.

Thanks! I will get started on that as soon as I can get the bug in this week's task ironed out.

@Nicolas_Raoul , I have drafted a rough plan of how I will be going about this task (see task description). Please let me know if I'm on the right track?

I will try to get a dummy app up by manually setting arbitrary points and categories, and if it works as intended I will go back to the code in the last merged pull request (since the current PR isn't functional) and implement it there.
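
A dummy check along those lines might look like the sketch below. It does not use the Quadtree library itself: `PointStore` is a hypothetical linear-scan stand-in that only mirrors the constructor, set and searchWithin signatures quoted in the task description, so that the example runs on its own. The coordinates and category names are arbitrary, and whether the real library treats box edges as inclusive is an assumption here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the QuadTree: same method shapes, but a plain linear scan.
final class PointStore {
    static final class Point {
        final double x;      // longitude
        final double y;      // latitude
        final Object value;  // list of categories found for the picture

        Point(double x, double y, Object value) {
            this.x = x;
            this.y = y;
            this.value = value;
        }
    }

    private final List<Point> points = new ArrayList<>();
    private final double minX, minY, maxX, maxY;

    PointStore(double minX, double minY, double maxX, double maxY) {
        this.minX = minX;
        this.minY = minY;
        this.maxX = maxX;
        this.maxY = maxY;
    }

    void set(double x, double y, Object value) {
        if (x < minX || x > maxX || y < minY || y > maxY) {
            throw new IllegalArgumentException("Point is outside the store bounds");
        }
        points.add(new Point(x, y, value));
    }

    // Returns every stored point inside the box (edges treated as inclusive in this sketch).
    Point[] searchWithin(double xmin, double ymin, double xmax, double ymax) {
        List<Point> hits = new ArrayList<>();
        for (Point p : points) {
            if (p.x >= xmin && p.x <= xmax && p.y >= ymin && p.y <= ymax) {
                hits.add(p);
            }
        }
        return hits.toArray(new Point[0]);
    }

    // Two arbitrary points; counts how many fall inside a small box around the first one.
    static int dummyCheck() {
        PointStore store = new PointStore(-180, -90, 180, 90);
        store.set(139.70, 35.66, Arrays.asList("Category A", "Category B"));
        store.set(2.35, 48.85, Arrays.asList("Category C"));
        return store.searchWithin(139.69, 35.65, 139.71, 35.67).length;
    }
}
```

Unlike a real quadtree, this stand-in keeps duplicate coordinates as separate entries and scans every point on each search, so it is only useful for checking the surrounding logic.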

@Nicolas_Raoul thanks. :) It appears that the projection of latitude/longitude to X/Y coordinates actually depends on the projection system used? http://gis.stackexchange.com/questions/11626/does-y-mean-latitude-and-x-mean-longitude-in-every-gis-software

I will be using lat = Y and long = X, as suggested in an answer there. This does mean that I will have to flip the lat/long values around, since x usually comes before y but latitude usually comes before longitude...

Indeed, that is a very common cause of bugs in GIS.
OK!
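
One way to make the lat/long flip harder to get wrong is to wrap the pair in a small value type that exposes x() and y() explicitly. This is only a sketch: the `LatLon` class is hypothetical, not something from the app or the Quadtree library. Its range check catches a swapped pair only when the longitude lies outside -90 to +90.

```java
// Hypothetical value type; keeps the (lat, long) argument order in one place.
final class LatLon {
    final double latitude;
    final double longitude;

    LatLon(double latitude, double longitude) {
        if (latitude < -90 || latitude > 90) {
            throw new IllegalArgumentException("Latitude must be within -90..+90");
        }
        if (longitude < -180 || longitude > 180) {
            throw new IllegalArgumentException("Longitude must be within -180..+180");
        }
        this.latitude = latitude;
        this.longitude = longitude;
    }

    // Longitude maps to X.
    double x() {
        return longitude;
    }

    // Latitude maps to Y.
    double y() {
        return latitude;
    }
}
```

Calls into the tree can then be written as set(p.x(), p.y(), categories), so the ordering decision lives in this class only.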

@Nicolas_Raoul - Now that I can get the QuadTree methods to work (roughly speaking) with my data, I need to decide how I want to implement the storage. I looked at the storage options for Android...

  • SharedPreferences only allows storage of primitive data types so that is out
  • Saving as a file (on internal or external storage) doesn't seem to work well for our purpose since it needs to read a whole stream from the file, but maybe I could get it to work?
  • Using a SQLite DB. The problem is that QuadTree stores its own data as nodes, and I don't know whether that is compatible with a standard SQLite table. Technically I shouldn't need SQLite to query or update the data, since QuadTree itself has methods for that. But how do I make the QuadTree data 'persist' across app lifecycles?

We don't need to store it: it is a very short-lived cache. No problem if it disappears every minute.

@Nicolas_Raoul - Is it okay for the cache to only return one nearby point with one nearby list of categories, or do we have to return all the points, weed out unique categories, and suggest all of them?

Second solution sounds best, if there is no big difficulty with it.
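
The second solution could be sketched as below, assuming the category lists attached to all nearby points have already been collected into a list of string lists. `CategoryMerger` and `mergeUnique` are hypothetical names; a LinkedHashSet drops duplicates while keeping the order in which categories were first seen.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper; not taken from the app's code.
final class CategoryMerger {
    private CategoryMerger() {}

    // Merges the category lists of all nearby points into one list without duplicates.
    static List<String> mergeUnique(List<List<String>> categoryLists) {
        Set<String> unique = new LinkedHashSet<>();
        for (List<String> categories : categoryLists) {
            unique.addAll(categories);
        }
        return new ArrayList<>(unique);
    }
}
```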

I can't seem to figure out why only one point is ever returned by the QuadTree. But given that we are only looking within an area roughly 100m across in our searchWithin() call, the overlap of categories returned is likely to be very high. So for the time being my code only returns the 'first' category list found (as we will only find one list attached to one point), until we can figure out the QuadTree point return issue.

josephine_l moved this task from Doing to Done on the Commons-App-Android-Upload board.