
Verify if the Python recommendation API can support the use cases of the nodejs one
Open, Stalled, Needs Triage, Public

Description

The recommendation API written in nodejs, and exposed via Restbase, seems to be called by the Android app in this way (data from https://w.wiki/6sxS):

# top 10

Uri Path,Hits
-------------
/api/rest_v1/data/recommendation/description/addition/en,199
/api/rest_v1/data/recommendation/caption/addition/en,176
/api/rest_v1/data/recommendation/caption/addition/de,37
/api/rest_v1/data/recommendation/description/addition/de,31
/api/rest_v1/data/recommendation/description/translation/from/de/to/en,27
/api/rest_v1/data/recommendation/description/addition/fr,21
/api/rest_v1/data/recommendation/description/addition/ru,20
/api/rest_v1/data/recommendation/caption/addition/ru,18
/api/rest_v1/data/recommendation/description/addition/es,18
/api/rest_v1/data/recommendation/description/translation/from/fr/to/en,18
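
The paths in this table appear to follow two shapes: `{target}/addition/{lang}` and `{target}/translation/from/{source}/to/{lang}`. As a rough aid for scoping what a replacement service would need to accept, here is a minimal Python sketch that parses them. The names `RecommendationRequest` and `parse_path` are hypothetical, and allowing `caption` on the translation shape is an assumption (the top 10 only shows `description` there).

```python
import re
from typing import NamedTuple, Optional

# Common prefix of every path in the table above.
PREFIX = "/api/rest_v1/data/recommendation"


class RecommendationRequest(NamedTuple):
    """Hypothetical container for the pieces encoded in a request path."""
    target: str                 # "description" or "caption"
    action: str                 # "addition" or "translation"
    source_lang: Optional[str]  # only set for translation requests
    target_lang: str


# Assumption: language codes are lowercase letters and hyphens.
_ADDITION = re.compile(
    r"^/(?P<target>description|caption)/addition/(?P<lang>[a-z-]+)$"
)
_TRANSLATION = re.compile(
    r"^/(?P<target>description|caption)/translation"
    r"/from/(?P<source>[a-z-]+)/to/(?P<lang>[a-z-]+)$"
)


def parse_path(path: str) -> Optional[RecommendationRequest]:
    """Parse a recommendation path; return None if it matches neither shape."""
    if not path.startswith(PREFIX):
        return None
    rest = path[len(PREFIX):]
    match = _ADDITION.match(rest)
    if match:
        return RecommendationRequest(
            match["target"], "addition", None, match["lang"]
        )
    match = _TRANSLATION.match(rest)
    if match:
        return RecommendationRequest(
            match["target"], "translation", match["source"], match["lang"]
        )
    return None
```

For example, `parse_path("/api/rest_v1/data/recommendation/description/translation/from/de/to/en")` yields a request with target `description`, action `translation`, source `de` and target language `en`.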

The Python API offers, afaics, the following:

https://recommend.wmflabs.org/types/translation/
https://recommend.wmflabs.org/api/spec

In this task I'd like to explore the possibility of moving the Android app to the Python API once it runs on Lift Wing. That would allow us to deprecate the nodejs recommendation API and focus on a single product.

Event Timeline

@Isaac if you have time and patience, would you mind having a chat about this task?

@elukey sure anytime! for what it's worth, as part of an analysis a few years back, I translated the logic into Python to simulate calls for some of the services so my guess is that it's not hard to port over the whole thing. Example: https://github.com/geohci/wiki-prioritization/blob/master/recommendation_evaluation/suggested_edits/SE_imagecaptions.py#L157

Just leaving some thoughts here on what it would mean to migrate from nodejs to Python. My understanding of the nodejs service is that all but one of the endpoints are purely MediaWiki API calls + rule-based logic:

  • That exception is the article-creation-morelike endpoint (REST endpoint; code), which does use the model/database mentioned in the README but is not receiving any traffic per your statistics in the task description and presumably is quite outdated. I would assume this can be dropped, greatly reducing the complexity.
  • The other article creation endpoint (code) also isn't seeing traffic so can be dropped. It seems to be what exists on the Python service right now and is pure API calls + logic.
  • There are two core modules that are left and seeing traffic: description (code) and caption (code). I have already ported both into Python for an analysis a few years back, so the core code essentially already exists (caption; description). The remaining work would be stripping out my unnecessary statistical logging and putting a simple framework (FastAPI or similar) around it. As I said, these are just MediaWiki/Wikibase API calls plus filtering rules, so they are quite simple, unlikely to cause issues with dependencies or resources, and should have a very low maintenance overhead once implemented.
  • As far as I can tell, most of the other code in the repo is around templates for API calls or error logging, which I assume can be removed as LiftWing has its own logging infrastructure/standards and each service is independent.
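
To illustrate the "API calls plus filtering rules" pattern described in the list above, here is a minimal Python sketch of what such a filtering rule might look like. It is not taken from the nodejs service or from the ported analysis code. It assumes the entities were already fetched in the shape returned by the Wikibase `wbgetentities` module (each entity carrying a `descriptions` map keyed by language code), and the function names are hypothetical.

```python
from typing import Any, Dict, List

# Assumed input: the "entities" object of a wbgetentities response, e.g.
# {"Q...": {"descriptions": {"en": {"language": "en", "value": "..."}}}}
Entities = Dict[str, Dict[str, Any]]


def has_description(entity: Dict[str, Any], lang: str) -> bool:
    """True if the entity has a non-empty description in `lang`."""
    value = entity.get("descriptions", {}).get(lang, {}).get("value", "")
    return bool(value.strip())


def addition_candidates(entities: Entities, lang: str) -> List[str]:
    """IDs of entities that lack a description in `lang`."""
    return [
        entity_id
        for entity_id, entity in entities.items()
        if not has_description(entity, lang)
    ]


def translation_candidates(
    entities: Entities, source_lang: str, target_lang: str
) -> List[str]:
    """IDs of entities described in `source_lang` but not in `target_lang`."""
    return [
        entity_id
        for entity_id, entity in entities.items()
        if has_description(entity, source_lang)
        and not has_description(entity, target_lang)
    ]
```

Because the rules are pure functions over API responses, they can be unit-tested without network access, and the web framework around them only needs to map request parameters to these calls.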

elukey changed the task status from Open to Stalled. Jul 13 2023, 2:48 PM
elukey removed elukey as the assignee of this task.

Thanks a lot for the input, Isaac! I think this task should be taken forward by whoever owns (or will own) the nodejs recommendation-api service. Setting the status to Stalled for the moment.