Run load tests for the rec-api-ng and update production resources to meet expected load
Open, Needs Triage, Public, 3 Estimated Story Points

Assigned To

Authored By

	kevinbazira
	May 22 2024, 6:42 AM

Description

In T308164, the ML team migrated the Content Translation Recommendation API to LiftWing.

The Product Team shared (T308164#9815882) an estimate of the expected traffic that the Content and Section Translation features will generate on LiftWing.

We are going to run load tests, measure the rec-api-ng performance, and then tune production to handle the expected load effectively.

Details

	Subject	Repo	Branch	Lines +/-
	locust: use multiple payloads for load testing	research/recommendation-api	master	+50 -7
	test: add locust load test	research/recommendation-api	master	+160 -0

Related Objects

Status	Assigned	Task
Open	None	T296994 Observations from research study for Section Translation on Thai Wikipedia
Open	None	T293648 Content Translation Recommendations API
Open	kevinbazira	T308164 Migrate Content Translation Recommendation API to Lift Wing
Open	kevinbazira	T365554 Run load tests for the rec-api-ng and update production resources to meet expected load

Event Timeline

kevinbazira created this task. May 22 2024, 6:42 AM

Restricted Application added a subscriber: Aklapper. May 22 2024, 6:43 AM

kevinbazira mentioned this in T308164: Migrate Content Translation Recommendation API to Lift Wing. May 22 2024, 6:44 AM

Isaac subscribed. Fri, May 24, 3:24 PM

Change #1035868 had a related patch set uploaded (by Kevin Bazira; author: Kevin Bazira):

[research/recommendation-api@master] test: add locust load test

https://gerrit.wikimedia.org/r/1035868

gerritbot added a project: Patch-For-Review. Mon, May 27, 1:01 PM

klausman set the point value for this task to 3. Tue, May 28, 2:27 PM

klausman moved this task from Unsorted to In Progress on the Machine-Learning-Team board.

Change #1035868 merged by jenkins-bot:

[research/recommendation-api@master] test: add locust load test

https://gerrit.wikimedia.org/r/1035868

Maintenance_bot removed a project: Patch-For-Review. Thu, May 30, 2:31 PM

Change #1038346 had a related patch set uploaded (by Kevin Bazira; author: Kevin Bazira):

[research/recommendation-api@master] locust: use multiple payloads for load testing

https://gerrit.wikimedia.org/r/1038346

gerritbot added a project: Patch-For-Review. Mon, Jun 3, 1:44 PM

Change #1038346 merged by jenkins-bot:

[research/recommendation-api@master] locust: use multiple payloads for load testing

https://gerrit.wikimedia.org/r/1038346

Maintenance_bot removed a project: Patch-For-Review. Wed, Jun 5, 1:30 PM

I ran load tests for the rec-api-ng hosted on LiftWing using the locust configuration in the repo. Over a 60s run, the API received 33 requests (about 0.58 requests/s) and processed all of them without failures. The median response time was 1800 ms and the maximum was 3566 ms:

Timestamp	Type	Name	Request Count	Failure Count	Median Response Time	Average Response Time	Min Response Time	Max Response Time	Average Content Size	Requests/s	Failures/s	50%	66%	75%	80%	90%	95%	98%	99%	99.9%	99.99%	100%
20240606090807	GET	/api/	33	0	1800	1979.909090909091	767	3566	336.3939393939394	0.5763204548178809	0.0	1800	2000	2600	2700	3100	3400	3600	3600	3600	3600	3600
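As a rough cross-check of the row above, a few of its figures can be combined in Python. This is only a sketch: the values are typed in from the row and not read from the Locust CSV file, the function name `summarize` is hypothetical, and the in-flight estimate uses Little's law, which is an assumption added here and not something the load test reported.

```python
# Values copied from the Locust stats row above (response times in milliseconds).
row = {
    "Request Count": 33,
    "Failure Count": 0,
    "Average Response Time": 1979.909090909091,
    "Requests/s": 0.5763204548178809,
}


def summarize(row):
    """Derive the measured time span, failure rate and mean in-flight requests."""
    rps = row["Requests/s"]
    # Time spanned by the recorded requests, in seconds.
    duration_s = row["Request Count"] / rps
    # Fraction of requests that failed.
    failure_rate = row["Failure Count"] / row["Request Count"]
    # Little's law (assumption): mean in-flight requests = throughput * mean latency.
    in_flight = rps * row["Average Response Time"] / 1000.0
    return {
        "duration_s": duration_s,
        "failure_rate": failure_rate,
        "in_flight": in_flight,
    }


summary = summarize(row)
print(summary)
```

For this row the sketch gives a span of roughly 57 s and a little over one request in flight on average, so the measured throughput reflects light concurrency and not a saturated service.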

Additionally, I sent the same request to the rec-api instance hosted on wmflabs.org and to the rec-api-ng instance hosted on LiftWing. Both returned identical recommendations, and the wall-clock times were in the same range (3.9s and 3.3s respectively):

# rec-api hosted on wmflabs.org
$ time curl -s "https://recommend.wmflabs.org/api/?s=en&t=fr&n=3&article=Apple"
[{"pageviews": 1067, "title": "White_currant", "wikidata_id": "Q621670", "rank": 499.0}, {"pageviews": 33, "title": "Sphaceloma_perseae", "wikidata_id": "Q7576474", "rank": 498.0}, {"pageviews": 84, "title": "Cadra_calidella", "wikidata_id": "Q5016600", "rank": 497.0}]

real	0m3.913s
user	0m0.028s
sys	0m0.011s

# rec-api-ng hosted on LiftWing
$ time curl -s "https://api.wikimedia.org/service/lw/recommendation/v1/api?s=en&t=fr&n=3&article=Apple"
[{"pageviews": 1067, "title": "White_currant", "wikidata_id": "Q621670", "rank": 499.0}, {"pageviews": 33, "title": "Sphaceloma_perseae", "wikidata_id": "Q7576474", "rank": 498.0}, {"pageviews": 84, "title": "Cadra_calidella", "wikidata_id": "Q5016600", "rank": 497.0}]

real	0m3.281s
user	0m0.050s
sys	0m0.009s
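A single `time curl` per host is sensitive to network noise, so the comparison above could be repeated a few times per endpoint. The following is a minimal sketch under stated assumptions: the two URLs are copied from the curl commands above, while the helper names (`fetch`, `time_calls`, `compare`, `main`), the repeat count, and the `User-Agent` string are hypothetical and not part of the repo's locust setup.

```python
import statistics
import time
import urllib.request

# URLs copied from the curl commands above.
WMFLABS_URL = "https://recommend.wmflabs.org/api/?s=en&t=fr&n=3&article=Apple"
LIFTWING_URL = (
    "https://api.wikimedia.org/service/lw/recommendation/v1/api"
    "?s=en&t=fr&n=3&article=Apple"
)


def fetch(url):
    """Download the response body; raises on HTTP or network errors."""
    # Hypothetical User-Agent value, added as an assumption for illustration.
    request = urllib.request.Request(url, headers={"User-Agent": "rec-api-timing-sketch/0.1"})
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.read()


def time_calls(call, repeats=5):
    """Return the wall-clock duration in seconds of each of `repeats` calls."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        call()
        durations.append(time.perf_counter() - start)
    return durations


def compare(urls, repeats=5, fetcher=fetch):
    """Map each label to its (median, max) latency in seconds."""
    results = {}
    for label, url in urls.items():
        durations = time_calls(lambda: fetcher(url), repeats)
        results[label] = (statistics.median(durations), max(durations))
    return results


def main():
    """Time both endpoints over the network and print one line per host."""
    urls = {"wmflabs": WMFLABS_URL, "liftwing": LIFTWING_URL}
    for label, (median, worst) in compare(urls).items():
        print(f"{label}: median {median:.2f}s, max {worst:.2f}s")
```

Calling `main()` performs the real requests. The `fetcher` parameter exists so the timing logic can be exercised with a stand-in function without touching either service.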

Based on these results, the resources currently allocated to the rec-api-ng instance on LiftWing will be able to handle the expected load.
