
Assess runtime performance impact of pydantic data models in the RRLA model-server
Closed, Resolved · Public

Description

A new version of the knowledge integrity system was released by the Research Team, with some data models now being pydantic-based. This introduces additional runtime type checking for every request, potentially impacting performance.

We are going to assess this performance impact by re-running benchmarks for the RRLA model-server hosted on LiftWing and comparing results with previous versions.

Event Timeline

I have been working on updating knowledge-integrity in the RRLA model-server. When I tried running it locally, I got dependency conflicts between the pydantic version required by kserve's fastapi and the one required by knowledge-integrity, as shown below:

ERROR: Cannot install -r revert_risk_model/model_server/revertrisk/requirements.txt (line 1), knowledge-integrity[revertrisk]==0.6.0 and kserve because these package versions have conflicting dependencies.

The conflict is caused by:
    knowledge-integrity[revertrisk] 0.6.0 depends on pydantic<3.0.0 and >=2.1.1
    knowledge-integrity 0.6.0 depends on pydantic<3.0.0 and >=2.1.1
    fastapi 0.95.2 depends on pydantic!=1.7, !=1.7.1, !=1.7.2, !=1.7.3, !=1.8, !=1.8.1, <2.0.0 and >=1.6.2
    knowledge-integrity[revertrisk] 0.6.0 depends on pydantic<3.0.0 and >=2.1.1
    knowledge-integrity 0.6.0 depends on pydantic<3.0.0 and >=2.1.1
    fastapi 0.95.1 depends on pydantic!=1.7, !=1.7.1, !=1.7.2, !=1.7.3, !=1.8, !=1.8.1, <2.0.0 and >=1.6.2
    knowledge-integrity[revertrisk] 0.6.0 depends on pydantic<3.0.0 and >=2.1.1
    knowledge-integrity 0.6.0 depends on pydantic<3.0.0 and >=2.1.1
    fastapi 0.95.0 depends on pydantic!=1.7, !=1.7.1, !=1.7.2, !=1.7.3, !=1.8, !=1.8.1, <2.0.0 and >=1.6.2

I see that this is caused by the fastapi version installed with kserve, which only supports pydantic versions <2.0.0.
You can try installing a newer version of fastapi, e.g. fastapi==0.109.0, and see if this solves the issue.
To do this, you can insert the fastapi dependency before the kserve one in requirements.txt.
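For example, the top of requirements.txt would then look like this (the fastapi pin is the suggestion above; the other pins mirror the ones already in the file):

fastapi==0.109.0
kserve==0.11.2
knowledge-integrity[revertrisk]==0.6.0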

Thank you for the suggestion @isarantopoulos. I tried fastapi==0.109.0 and ran into the error below. It looks like kserve 0.11.2 doesn't support it.

ERROR: Cannot install -r revert_risk_model/model_server/revertrisk/requirements.txt (line 3) and fastapi==0.109.0 because these package versions have conflicting dependencies.

The conflict is caused by:
    The user requested fastapi==0.109.0
    kserve 0.11.2 depends on fastapi<0.96.0 and >=0.95.0

I see that kserve does not yet support pydantic v2, and there is work in that direction.
This means that we will have to loosen the knowledge-integrity dependency to allow pydantic versions < 2.0.0 to be installed.

I pinged Muniza about the possibility of loosening the knowledge-integrity constraint to allow for pydantic < 2.0.0 and here is her response:

On Slack, Muniza wrote:

... there are breaking code changes between pydantic v1 and v2 so it won't be possible to just loosen constraints. We'd need to downgrade pydantic in KI to v1 which would require some code changes but more importantly pydantic v1 is supposed to be considerably slower than v2 which might impact the latency of the models.
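For context on why constraints can't simply be loosened, the v1-to-v2 API break looks roughly like this (illustrative example only, not actual KI code):

from pydantic import BaseModel, field_validator  # v2 API; v1 imported `validator`

class Revision(BaseModel):
    rev_id: int

    # pydantic v1 used @validator("rev_id"); v2 renames it to field_validator
    @field_validator("rev_id")
    @classmethod
    def rev_id_positive(cls, v: int) -> int:
        if v <= 0:
            raise ValueError("rev_id must be positive")
        return v

print(Revision(rev_id=42).model_dump())  # the v1 equivalent was .dict()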

Since knowledge-integrity 0.6.0 now requires pydantic >=2.1.1 while the fastapi version pinned by kserve only supports pydantic < 2.0.0, we will need to wait until kserve supports a fastapi release compatible with pydantic >= 2.0.0 before we can effectively run the KI 0.6.0 performance assessment.

Yesterday I had a discussion about this with Muniza. There is a recent PR, opened just last week, to upgrade to a newer fastapi and support pydantic v2 in kserve:
https://github.com/kserve/kserve/pull/3374
https://github.com/kserve/kserve/issues/3373

It's unclear when these will be completed and merged into kserve. If it doesn't take too long, we can wait. But if it takes a few months, we would need to downgrade pydantic to v1, which would require some changes to the code, and v1 is said to be considerably slower than v2.

Maybe we can ask about the estimated timeline for supporting pydantic v2 in the upcoming Kserve community meeting next Wednesday, so that we can decide our next step.

At the moment it would be great if we could do one of the following:

  • have the ability to skip/disable pydantic validation through a flag or something similar. I'm not aware if this is possible, so we'll have to investigate whether it's feasible (see the sketch after this list).
  • add support for v1 and have the ability to switch between the two. Again, I'm not aware if this is possible, but it makes sense from a software engineering point of view.
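On the first option: pydantic v2 exposes model_construct(), which builds model instances without running validation, so a skip-validation flag could plausibly be wired up along these lines (a minimal sketch; the Revision fields and the flag are hypothetical, not actual KI code):

import os

from pydantic import BaseModel

# Hypothetical flag; not an existing KI or kserve setting.
SKIP_VALIDATION = os.environ.get("KI_SKIP_VALIDATION") == "1"

class Revision(BaseModel):
    rev_id: int
    lang: str

def make_revision(data: dict) -> Revision:
    if SKIP_VALIDATION:
        # model_construct() skips all runtime type checking and coercion
        return Revision.model_construct(**data)
    return Revision(**data)  # full pydantic validation

The trade-off is that model_construct() trusts its input completely, so it would only make sense for benchmarking or already-validated data.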

I would advise against rewriting the code to support v1 instead of v2, as Muniza has done great work that we would have to redo when kserve moves to v2.
Also, waiting for kserve support would leave us blocked for the time being. Kserve release cycles happen every ~3-6 months, which means we would have to wait for at least 5-6 months.

If you think that my suggestions make sense I can help in assessing the above options. Otherwise we still need to coordinate on which approach to follow.

Following a discussion with Ilias, we will keep an eye on the progress of https://github.com/kserve/kserve/pull/3374. Once the PR is merged, we will use the pre-release version for testing.

The PR for pydantic v2 in kserve has been merged! We can use this commit https://github.com/kserve/kserve/commit/426fe21da0612ea6ef4a116b5114270313e02bbb to test the RRLA model-server :)

Following the workflow we use to build LiftWing model-servers, which involves installing the pip dependencies listed in the requirements.txt file, I added the above pre-release commit to the RRLA requirements.txt file. When I ran pip install -r requirements.txt, the error below was thrown:

Collecting kserve@ git+https://github.com/kserve/kserve.git@426fe21da0612ea6ef4a116b5114270313e02bbb
  Cloning https://github.com/kserve/kserve.git (to revision 426fe21da0612ea6ef4a116b5114270313e02bbb) to /tmp/pip-install-uwt31r3m/kserve_7e7029202b4b49449c96d8b0f6a3185d
  Running command git clone -q https://github.com/kserve/kserve.git /tmp/pip-install-uwt31r3m/kserve_7e7029202b4b49449c96d8b0f6a3185d
  Running command git rev-parse -q --verify 'sha^426fe21da0612ea6ef4a116b5114270313e02bbb'
  Running command git fetch -q https://github.com/kserve/kserve.git 426fe21da0612ea6ef4a116b5114270313e02bbb
  Running command git checkout -q 426fe21da0612ea6ef4a116b5114270313e02bbb
    ERROR: Command errored out with exit status 1:
     command: /home/thevenv/bin/python3 -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-uwt31r3m/kserve_7e7029202b4b49449c96d8b0f6a3185d/setup.py'"'"'; __file__='"'"'/tmp/pip-install-uwt31r3m/kserve_7e7029202b4b49449c96d8b0f6a3185d/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' egg_info --egg-base /tmp/pip-pip-egg-info-nd559lke
         cwd: /tmp/pip-install-uwt31r3m/kserve_7e7029202b4b49449c96d8b0f6a3185d/
    Complete output (5 lines):
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "/usr/lib/python3.9/tokenize.py", line 392, in open
        buffer = _builtin_open(filename, 'rb')
    FileNotFoundError: [Errno 2] No such file or directory: '/tmp/pip-install-uwt31r3m/kserve_7e7029202b4b49449c96d8b0f6a3185d/setup.py'
    ----------------------------------------
WARNING: Discarding git+https://github.com/kserve/kserve.git@426fe21da0612ea6ef4a116b5114270313e02bbb. Command errored out with exit status 1: python setup.py egg_info Check the logs for full command output.
ERROR: Could not find a version that satisfies the requirement kserve (unavailable)
ERROR: No matching distribution found for kserve (unavailable)

Since both setup.py and wheel files are missing, this kserve pre-release commit cannot be installed directly via pip.

I have looked into ways to manually build and install this kserve commit locally but these may not align with the standard installation process for LiftWing model-servers, potentially leading to compatibility issues and inconsistencies.

Unfortunately, until this commit is incorporated into an official kserve release, we may not be able to seamlessly use it in our workflow and effectively test the RRLA model-server.

This happens because the kserve repository root is not a Python package; as the error message tells us, there is no setup.py there. The Python package lives in the subdirectory python/kserve.
To install it, in the requirements.txt file replace this:

kserve==0.11.2

with this

-e "git+https://github.com/kserve/kserve.git@426fe21da0612ea6ef4a116b5114270313e02bbb#egg=kserve&subdirectory=python/kserve"

Note: the quotes are required, otherwise the subdirectory won't be used. For more info you can check the pip VCS support documentation (I got the example from there).

Thanks @isarantopoulos, earlier on I was missing the python/kserve subdirectory. After changing:

kserve @ git+https://github.com/kserve/kserve.git@426fe21da0612ea6ef4a116b5114270313e02bbb

to

kserve @ git+https://github.com/kserve/kserve.git@426fe21da0612ea6ef4a116b5114270313e02bbb#egg=kserve&subdirectory=python/kserve

the pre-release commit installed successfully. I also checked to confirm the newly installed fastapi and pydantic versions:

pip list | grep -E '(fastapi|pydantic)'
fastapi                   0.108.0
pydantic                  2.6.3
pydantic_core             2.16.3
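Equivalently, a quick sanity check from inside the virtualenv (a minimal sketch, nothing RRLA-specific):

import fastapi
import pydantic

print(fastapi.__version__)  # expected: 0.108.0
print(pydantic.VERSION)     # expected: 2.6.3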

Now that we are able to install both kserve and KI v6 without encountering the dependency conflicts from T355742#9485732, I am going to prepare an environment to assess the runtime performance impact of KI v6.

I noticed that in KI v6, pydantic data models were added to the BaseRevision class in the knowledge integrity schema. The get_revision method used in RRLA relies on the Revision class, which inherits from BaseRevision. Since RRLA uses this method in the preprocess step, I have compared the runtime of the preprocess step for two model servers: one running KI v5 (commit: 026c11a7b3bdb6bd16ef8826bc23b782e8c4e8c8) and another running KI v6 (commit: c8de64b8766e10223eabed73dad1bb2ac68c6b03). Below are the results, showing preprocess runtimes for the sample inputs we use in RRLA load tests and test environments (P58692):

| Request Payload | KI v5 Preprocess Runtime (s) | KI v6 Preprocess Runtime (s) |
|---|---|---|
| {"lang": "es", "rev_id": 144593484} | 0.1010565758 | 0.1377308369 |
| {"lang": "de", "rev_id": 224199451} | 0.09622120857 | 0.1309299469 |
| {"lang": "ru", "rev_id": 123744978} | 0.1045227051 | 0.1144728661 |
| {"lang": "de", "rev_id": 224285471} | 0.1131651402 | 0.1167194843 |
| {"lang": "en", "rev_id": 1096349097} | 0.1421649456 | 0.1646904945 |
| {"lang": "pl", "rev_id": 67533865} | 0.1243362427 | 0.1153821945 |
| {"lang": "en", "rev_id": 1096728668} | 0.1668889523 | 0.169686079 |
| {"lang": "en", "rev_id": 1096851393} | 0.122885704 | 0.1490731239 |
| {"lang": "pl", "rev_id": 67538140} | 0.106388092 | 0.1116890907 |
| {"lang": "en", "rev_id": 1096609909} | 0.1272881031 | 0.1196594238 |
| {"lang": "es", "rev_id": 144616722} | 0.1168558598 | 0.1163015366 |
| {"lang": "uk", "rev_id": 36418681} | 0.1185092926 | 0.1388361454 |
| {"lang": "ru", "rev_id": 123727072} | 0.1143059731 | 0.141433239 |
| {"lang": "en", "rev_id": 1096855066} | 0.1432712078 | 0.1390919685 |
| {"lang": "ru", "rev_id": 123758382} | 0.1263678074 | 0.1209347248 |
| Average | 0.1216151873 | 0.132442077 |

In general, for most request payloads, the preprocess runtime for KI v6 is slightly higher than that of KI v5. However, there are some exceptional cases where KI v6 performs better than KI v5.

On average, there is a slight difference in preprocess runtime, with KI v5 being slightly faster (0.12 s) than KI v6 (0.13 s).
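For reference, per-payload runtimes like those in the table can be collected with a small harness along these lines (a minimal sketch, not the exact code used here; it assumes a kserve-style model object with an async preprocess method, and n repeats can be used for averaging):

import time

async def preprocess_runtime(model, payload: dict, n: int = 1) -> float:
    """Mean wall-clock seconds spent in the model server's preprocess step."""
    total = 0.0
    for _ in range(n):
        start = time.perf_counter()
        await model.preprocess(payload)  # assumes an async preprocess(payload) hook
        total += time.perf_counter() - start
    return total / n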

@kevinbazira this is very helpful, thank you!

Please correct me if I'm wrong but I assume the preprocessing time for each payload was calculated over multiple requests? If so, are those values in the Preprocess Runtime (s) column average preprocessing times?

@MunizaA, we're happy to hear that the information provided was helpful. For more context, the preprocessing time for each payload was recorded after every request made in both the KI v5 and v6 environments, and the last row of the table gives the average of each Preprocess Runtime (s) column across all payloads.

Change 1010672 had a related patch set uploaded (by Kevin Bazira; author: Kevin Bazira):

[machinelearning/liftwing/inference-services@main] RRLA: upgrade KI from v5 to v6

https://gerrit.wikimedia.org/r/1010672

Change 1010672 merged by jenkins-bot:

[machinelearning/liftwing/inference-services@main] RRLA: upgrade KI from v5 to v6

https://gerrit.wikimedia.org/r/1010672

FYI @MunizaA :)

The new RRLA model server featuring KI v0.6 has been deployed to ML-staging. I used wrk to conduct load testing and compare performance between the old and new versions. The results for the previous version are under P59447, and the results for the new version are under P59464. From these results, it's clear that the new KI version does not affect performance metrics such as average latency and RPS.
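For reference, load tests of this kind can be run with a wrk invocation along the following lines (hypothetical: the host, script name, duration, and connection counts are placeholders, not the exact parameters behind P59447/P59464; the Lua script passed via -s supplies the POST method, headers, and JSON payload):

wrk -t 2 -c 10 -d 60s --latency \
  -s revertrisk_post.lua \
  https://inference-staging.example.org/v1/models/revertrisk-language-agnostic:predict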