The Language Team wants improvements to the content translation recommendation system.
This is the epic for the ML team for this task.
Subject | Repo | Branch | Lines +/- |
---|---|---|---|
Update repo docs to match LiftWing API GW docs | research/recommendation-api | master | +62 -51 |
@calbon @kevinbazira should we keep this task open? If so, what are the next steps and/or subtasks?
We are working on T338471 to figure out if the old recommendation-api service can be deprecated.
hey all (not sure who exactly to tag but maybe I'll start with @kevinbazira just because I know you did a lot of good work on this) -- I'm working on some planning for improvements to our recommender systems for next fiscal year around what topic filters we provide to editors. Content Translation is of special interest but Android's SuggestedEdits is important too. The recommendation logic for both of these systems is still hosted on GapFinder as far as I can tell, but deploying any improvements is going to require moving them to a proper service (LiftWing). Does anyone know why this effort to move Content Translation's recommendation API over to LiftWing (along with Android's endpoints T340854) stalled last year?
Hi @Isaac thank you for following up on this. The Content Translation recommendation API is now live in LiftWing production. It can be accessed through:
1. External endpoint:
curl -s "https://api.wikimedia.org/service/lw/recommendation/v1/api?s=en&t=fr&n=3&article=Apple"
2. Internal endpoint:
curl "https://recommendation-api-ng.discovery.wmnet:31443/api?s=en&t=fr&n=3&article=Apple"
Feel free to explore these endpoints, and if you encounter any edge cases, please don't hesitate to let us know. :)
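For anyone scripting against the external endpoint, here is a minimal Python sketch of the same request as the curl call above. The function names, the User-Agent string, and the assumption that the response body is JSON are illustrative and not taken from the service documentation; only the base URL and the s, t, n, and article parameters come from the example above.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Base URL taken from the external endpoint shown above.
LIFTWING_BASE = "https://api.wikimedia.org/service/lw/recommendation/v1/api"


def build_recommendation_url(source: str, target: str, count: int, seed: str) -> str:
    """Build the request URL using the s, t, n and article parameters."""
    params = {"s": source, "t": target, "n": count, "article": seed}
    return LIFTWING_BASE + "?" + urlencode(params)


def fetch_recommendations(source: str, target: str, count: int, seed: str):
    """Send the request and parse the body (assumed here to be JSON)."""
    url = build_recommendation_url(source, target, count, seed)
    # Hypothetical User-Agent value; replace with contact details for your tool.
    request = Request(url, headers={"User-Agent": "example-script/0.1 (contact: you@example.org)"})
    with urlopen(request, timeout=30) as response:
        return json.load(response)


if __name__ == "__main__":
    # Prints the URL only; call fetch_recommendations(...) to hit the live service.
    print(build_recommendation_url("en", "fr", 3, "Apple"))
```

Running the script only prints the URL, so it can be checked against the curl example before making a live request.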
Ahh this is great news @kevinbazira ! @KartikMistry is there any reason from the Content Translation side why we can't switch over to the LiftWing endpoint? My read is that the code is quite simple -- e.g., if I go to Content Translation on Spanish Wikipedia, the tool hits this endpoint:
https://recommend.wmflabs.org/types/translation/v1/articles?source=en&target=es&seed=Music%20Modernization%20Act|Felony%20disenfranchisement&search=morelike&application=CX
To switch to LiftWing, the client would instead have to hit:
https://api.wikimedia.org/service/lw/recommendation/v1/api?s=en&t=es&article=Music%20Modernization%20Act|Felony%20disenfranchisement&search=morelike&application=CX
So it's just a change of base URL, and I think a few parameters will have to change names too, but the result is in the exact same format and will return the exact same results (so no change is needed from the UI side yet).
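As a rough illustration of that switch, the Python sketch below rewrites a legacy request URL into the LiftWing form. The rename table (source to s, target to t, seed to article) is inferred only from the two example URLs above, and the helper name is hypothetical; any other parameters are passed through unchanged, which is an assumption rather than documented behavior.

```python
from urllib.parse import parse_qsl, quote, urlencode, urlsplit

LIFTWING_BASE = "https://api.wikimedia.org/service/lw/recommendation/v1/api"

# Parameter renames inferred from the two example URLs in this thread.
RENAMED = {"source": "s", "target": "t", "seed": "article"}


def to_liftwing_url(legacy_url: str) -> str:
    """Rewrite a recommend.wmflabs.org request URL to the LiftWing endpoint."""
    pairs = parse_qsl(urlsplit(legacy_url).query, keep_blank_values=True)
    # Rename known parameters; leave everything else (search, application) as-is.
    renamed = [(RENAMED.get(key, key), value) for key, value in pairs]
    # quote_via=quote encodes spaces as %20; safe="|" keeps the seed separator readable.
    return LIFTWING_BASE + "?" + urlencode(renamed, quote_via=quote, safe="|")


if __name__ == "__main__":
    legacy = (
        "https://recommend.wmflabs.org/types/translation/v1/articles"
        "?source=en&target=es"
        "&seed=Music%20Modernization%20Act|Felony%20disenfranchisement"
        "&search=morelike&application=CX"
    )
    print(to_liftwing_url(legacy))
```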
This is great news! CC @ngkountas I think we need to do some more changes apart from changing some parameters here though.
The responses from the two endpoints Isaac listed above are identical. If this is the case, it seems like a straightforward switch.
@ngkountas I think you're right. Let me know if you have a task for making the switch because once you all have completed the transfer and verified it's working for a bit, I'll look into deprecating the current endpoint that you're using (as it's long overdue for being shut down).
I created a ticket focused on the changes of the endpoints in Content and Section Translation: T365347: Update endpoints used in Content and Section Translation to use the LiftWing version of the Recommendation API
Hi @Pginer-WMF, do you have an estimate of the expected traffic the Content and Section Translation features will generate on LiftWing after the migration? This information will enable us to tune production to handle the expected load effectively.
I don't have direct visibility on the number of requests the tool makes to the current Recommendation API. Maybe @KCVelaga_WMF has some numbers at hand.
If an estimate is useful: the tool has been used to publish 30K-50K articles per month. Not everyone who accesses the tool ends up publishing a translation, so I'd expect visits to the dashboard to be 3 times larger. Visitors are shown the suggestions view (where the Recommendation API is requested) when they have no translations in progress. Visitors can also change views and refresh the suggestions view, which generates more requests.
Thank you for sharing the estimates @Pginer-WMF, we are going to work on T365554: Run load tests for the rec-api-ng and update production resources to meet expected load
@Pginer-WMF is right.
For the past 90 days, here are the numbers from cx event logs:
event | n_events | Recommendation API |
---|---|---|
dashboard_open | 96423 | yes |
dashboard_translation_start | 84475 | no |
editor_segment_add | 66954 | no |
dashboard_translation_continue | 5112 | no |
dashboard_search | 3473 | yes |
dashboard_translation_discard | 1178 | no |
dashboard_discard_suggestion | 919 | no |
dashboard_refresh_suggestions | 528 | yes |
I am not very confident that I know all the steps where the tool makes requests to the Recommendation API. Based on the table above, it might be ~100K requests over 3 months. But if other steps also use the API, for example when an editor adds or loads a section, those would add to the count.
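To make the arithmetic behind that ~100K figure explicit, here is a small Python sketch that sums the three events marked "yes" in the table and converts the total to average rates. It assumes one API request per event and an even spread over the 90 days, neither of which is confirmed in this thread, so peak load could be noticeably higher.

```python
# Event counts from the 90-day table above, limited to rows marked "yes".
API_EVENTS = {
    "dashboard_open": 96423,
    "dashboard_search": 3473,
    "dashboard_refresh_suggestions": 528,
}

WINDOW_DAYS = 90
SECONDS_PER_DAY = 24 * 60 * 60

# Assumption: each event triggers exactly one Recommendation API request.
total_requests = sum(API_EVENTS.values())
requests_per_day = total_requests / WINDOW_DAYS
requests_per_second = requests_per_day / SECONDS_PER_DAY

if __name__ == "__main__":
    print(f"total requests in {WINDOW_DAYS} days: {total_requests}")
    print(f"average per day: {requests_per_day:.0f}")
    print(f"average per second: {requests_per_second:.4f}")
```

The total comes to 100,424 requests, or roughly 1,100 per day on average.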
@kevinbazira will we have a page on the API Gateway to link to for documentation purposes (something like this)? I've been helping a few folks who are migrating user scripts and other uses of the old Cloud VPS endpoint to the new LiftWing one and it'd be nice to have a page showing the expected parameters etc.
Yes @Isaac, here's the API Gateway documentation for the content translation recommendation API hosted on LiftWing:
https://api.wikimedia.org/wiki/Lift_Wing_API/Reference/Get_content_translation_recommendation
Thank you for supporting users who are migrating to the new endpoint!
here's the API Gateway documentation for the content translation recommendation API hosted on LiftWing: https://api.wikimedia.org/wiki/Lift_Wing_API/Reference/Get_content_translation_recommendation
:face-palm: I completely missed that. Thanks @kevinbazira ! For posterity, I used Global Search (query) to find user-script usages of the old API and either left a message or asked Amir S. to fix them (as he had been involved with their creation):
When the Cloud VPS endpoint is finally deprecated (sometime between July 1 and 15 likely per T367549), I'll return a basic note to any requests there. You can see what it'll say at this staging instance: https://gapfinder-deprecation.wmcloud.org/
I don't have any insight into toolforge tools that use the endpoint but hopefully the deprecation note will be sufficient to help them move over if they exist.
Super! The deprecation message you prepared is thorough. It will help users transition smoothly.
Regarding other tools that use the old endpoint, I used the search term recommend.wmflabs.org across GitHub, GitLab, and CodeSearch. Here are the results:
Platform | Search URL | Results |
---|---|---|
GitHub | https://github.com/search?q=recommend.wmflabs.org&type=code | 94 files |
GitLab | https://gitlab.wikimedia.org/search?search=recommend.wmflabs.org&nav_source=navbar | 0 files |
CodeSearch | https://codesearch.wmcloud.org/search/?q=recommend.wmflabs.org&files=&excludeFiles=&repos= | 7 files (overlap with GitHub) |
I can start reaching out to these users to inform them about the migration and point them to the new resources. Let me know if you would like me to proceed with that!
Thanks for digging this up @kevinbazira ! I glanced through and much of it was the Content Translation Extension (which Language is working on porting) or copies of configuration from that repo. I did leave a message around the beta-labs settings from CodeSearch because that felt like something that should be removed when appropriate (T365347#9910134) and could possibly be missed. The only piece I wasn't sure about was the uMatrix code. And then maybe some other random uses but they all seemed to be older, unmaintained repos. Feel free to reach out obviously where you feel useful but based on the searches you pulled together, I feel like we're in a pretty good place!
Change #1049489 had a related patch set uploaded (by Kevin Bazira; author: Kevin Bazira):
[research/recommendation-api@master] Update repo docs to match LiftWing API GW docs
Change #1049489 merged by jenkins-bot:
[research/recommendation-api@master] Update repo docs to match LiftWing API GW docs
Closing this task as we completed migrating the content translation recommendation API to LiftWing, and the Language Team is making improvements to it in T369484: Modernize recommendation API.