Maniphest T319178

Decide external URL scheme (on API GW) for models on Lift Wing
Closed, Resolved · Public

Assigned To

Authored By

	klausman
	Oct 3 2022, 10:46 AM

Description

As currently set up, reaching an inference model on Lift Wing directly (i.e. without involving the API GW) requires the following URLs and Host headers:

URL                                                                                Host
---------------------------------------------------------------------------------- ----------------------------------------------------------------------
https://inference.discovery.wmnet:30443/v1/models/enwiki-articlequality:predict    enwiki-articlequality.revscoring-articlequality.wikimedia.org
https://inference.discovery.wmnet:30443/v1/models/enwiki-articletopic:predict      enwiki-articletopic.revscoring-articletopic.wikimedia.org
https://inference.discovery.wmnet:30443/v1/models/enwiki-damaging:predict          enwiki-damaging.revscoring-editquality-damaging.wikimedia.org
https://inference.discovery.wmnet:30443/v1/models/enwiki-draftquality:predict      enwiki-draftquality.revscoring-draftquality.wikimedia.org
https://inference.discovery.wmnet:30443/v1/models/enwiki-drafttopic:predict        enwiki-drafttopic.revscoring-drafttopic.wikimedia.org
https://inference.discovery.wmnet:30443/v1/models/enwiki-goodfaith:predict         enwiki-goodfaith.revscoring-editquality-goodfaith.wikimedia.org
https://inference.discovery.wmnet:30443/v1/models/translatewiki-reverted:predict   translatewiki-reverted.revscoring-editquality-reverted.wikimedia.org
https://inference.discovery.wmnet:30443/v1/models/outlink-topic-model:predict      outlink-topic-model.articletopic-outlink.wikimedia.org

Note that while there is superficial repetition in the Host header (e.g. the drafttopic token appears twice), the two parts mean different things. The basic form of the header is:

isvc-service-name.k8s-namespace.wikimedia.org

Since the first and second sub-parts of the header refer to different things that may have similar naming, the same strings can show up twice.
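As an illustration of the header form above, here is a minimal Python sketch that derives the internal URL and Host header from an inference service name and its namespace. The function name `internal_target` is hypothetical, and the sketch assumes the model name in the URL path always equals the isvc name, which holds for every row of the table above but is not guaranteed in general.

```python
from typing import Tuple

# Internal endpoint taken from the table above.
INTERNAL_BASE = "https://inference.discovery.wmnet:30443"


def internal_target(isvc_name: str, namespace: str) -> Tuple[str, str]:
    """Return (url, host_header) for a direct Lift Wing call.

    Assumes the model name in the URL path equals the isvc name.
    """
    url = f"{INTERNAL_BASE}/v1/models/{isvc_name}:predict"
    # Host header form: isvc-service-name.k8s-namespace.wikimedia.org
    host = f"{isvc_name}.{namespace}.wikimedia.org"
    return url, host


url, host = internal_target("enwiki-drafttopic", "revscoring-drafttopic")
print(url)
print(host)
```

The two arguments are kept separate on purpose: even when they share a token such as drafttopic, one names the service and the other the Kubernetes namespace.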

For the outside world (i.e. API GW users), the above scheme is not very useful since it exposes too many implementation details and is more complex than the end user really needs. Since we are not the only tenants of the API GW, there also is a fixed prefix we will have to use:

https://api.wikimedia.org/service/lw/

Everything after lw is for us to decide, but we should do so with several things in mind:

The scheme we use should be logical and simple to understand
It should allow us to construct both the internal URL and Host header with relative ease and without encoding too much static mapping in the API GW config
It should allow us to expand in the future, e.g. to run non-inference services on LW
Adding a new ML service that doesn't use any existing namespaces or libraries (like revscoring) should be straightforward. Ideally, it would "just work" without touching the API GW config.
The scheme should avoid requiring us to change the internal scheme too much. Renaming a few pods or even namespaces is fine, but we should not have to dig into kserve/istio/k8s code and config too deeply.

The scheme we decide on will likely be with us for years and cannot easily be deprecated or changed, so we must be confident that it will serve us and the users well.

Related Objects

Status	Assigned	Task
Resolved	None	T272917 Lift Wing proof of concept
Resolved	klausman	T288789 API Gateway Integration
Resolved	klausman	T319178 Decide external URL scheme (on API GW) for models on Lift Wing

Event Timeline

klausman created this task. Oct 3 2022, 10:46 AM

Restricted Application added a subscriber: Aklapper. Oct 3 2022, 10:46 AM

klausman triaged this task as High priority. Oct 3 2022, 10:47 AM

There are two things to keep in mind when querying Lift Wing. Let's pick an example:

https://inference.svc.codfw.wmnet:30443/v1/models/enwiki-goodfaith:predict

Host: enwiki-goodfaith.revscoring-editquality-goodfaith.wikimedia.org

The above are details about how to query the internal endpoint of LiftWing. From curl it would look like the following:

curl "https://inference.svc.codfw.wmnet:30443/v1/models/enwiki-goodfaith:predict" -X POST -d @input.json -i -H "Host: enwiki-goodfaith.revscoring-editquality-goodfaith.wikimedia.org" --http1.1

The two things to keep in mind are related to routing:

The URL https://inference.svc.codfw.wmnet:30443/v1/models/enwiki-goodfaith:predict is needed so that we hit the Istio Gateway endpoint, referenced by inference.svc.codfw.wmnet. The URI is needed by KServe to invoke the right model and generate a score (for example, if you invoke enwiki-goodfaith:predict on a damaging model, you'll get an error).

The Host header is needed by the Istio Gateway to decide what backend needs to serve the URI /v1/models/enwiki-goodfaith:predict.

There may be a way to simplify this, but so far we haven't found one. This dual "config" between Istio and KServe complicates things a little, but we should find a way to hide the complexity from the external user (so the API GW needs to offer some configuration to route a request properly without requiring the user to pass extra details).
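To show the two routing inputs side by side, here is a small Python sketch that mirrors the curl call above. It only builds the request object and does not send it. The helper name `build_internal_request` is hypothetical, and the `{"rev_id": ...}` payload is an assumption about what input.json contains.

```python
import json
import urllib.request


def build_internal_request(model: str, namespace: str, rev_id: int) -> urllib.request.Request:
    """Build (but do not send) a POST to the internal Lift Wing endpoint."""
    # URL: reaches the Istio Gateway; the path tells KServe which model to invoke.
    url = f"https://inference.svc.codfw.wmnet:30443/v1/models/{model}:predict"
    # Host header: tells the Istio Gateway which backend serves that path.
    host = f"{model}.{namespace}.wikimedia.org"
    # Assumed payload shape; the real input.json is not shown in this task.
    body = json.dumps({"rev_id": rev_id}).encode("utf-8")
    return urllib.request.Request(url, data=body, headers={"Host": host}, method="POST")


req = build_internal_request("enwiki-goodfaith", "revscoring-editquality-goodfaith", 123555)
print(req.get_method(), req.full_url)
print("Host:", req.get_header("Host"))
```

The model name appears in both the path and the Host header, which is exactly the duplication an external scheme would ideally hide.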

Aklapper removed a subscriber: Machine-Learning-Team. Oct 7 2022, 7:02 AM

calbon moved this task from Unsorted to Active Tasks on the Machine-Learning-Team board. Oct 25 2022, 2:26 PM

calbon edited projects, added Machine-Learning-Team (Active Tasks); removed Machine-Learning-Team.

klausman claimed this task. Oct 25 2022, 2:42 PM

klausman moved this task from Parked to In Progress on the Machine-Learning-Team (Active Tasks) board. Oct 25 2022, 2:54 PM

calbon moved this task from Active Tasks to In Progress on the Machine-Learning-Team board. Oct 25 2022, 6:24 PM

calbon edited projects, added Machine-Learning-Team; removed Machine-Learning-Team (Active Tasks).

klausman added a parent task: T288789: API Gateway Integration. Nov 22 2022, 3:45 PM

After some discussion, we have decided that the API-GW side URL scheme for LW should look like:

/lw/inference/v1/models/[model name]:predict

So, for example, to reach the enwiki-articlequality model, you would use:

https://api.wikimedia.org/service/lw/inference/v1/models/enwiki-articlequality:predict

Or, as a curl command line:

curl -s "https://api.wikimedia.org/service/lw/inference/v1/models/enwiki-articlequality:predict" -X POST -d '{ "rev_id": 123555 }'

This scheme is relatively light for the API GW to implement (a prototype/WIP change is https://gerrit.wikimedia.org/r/c/operations/deployment-charts/+/844452).
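To make the translation concrete, here is a rough Python sketch of how an external path under this scheme could be mapped to the internal URL and Host header. It is not the logic of the linked change, which may well work differently. The `translate` function and the `NAMESPACES` lookup are hypothetical; the lookup exists because the model name alone does not determine the namespace.

```python
from typing import Dict, Tuple

EXTERNAL_PREFIX = "/service/lw/inference/v1/models/"
INTERNAL_BASE = "https://inference.discovery.wmnet:30443"

# Hypothetical lookup table, filled with a few rows from the task description.
# "enwiki-damaging" lives in "revscoring-editquality-damaging", so the
# namespace cannot simply be derived from the model name.
NAMESPACES: Dict[str, str] = {
    "enwiki-articlequality": "revscoring-articlequality",
    "enwiki-damaging": "revscoring-editquality-damaging",
    "outlink-topic-model": "articletopic-outlink",
}


def translate(external_path: str) -> Tuple[str, str]:
    """Map an external API GW path to (internal_url, host_header)."""
    if not external_path.startswith(EXTERNAL_PREFIX):
        raise ValueError(f"not a Lift Wing inference path: {external_path}")
    rest = external_path[len(EXTERNAL_PREFIX):]
    model, sep, verb = rest.partition(":")
    if sep != ":" or verb != "predict" or model not in NAMESPACES:
        raise ValueError(f"unknown model or verb: {rest}")
    # The path after the prefix is reused unchanged on the internal side.
    url = f"{INTERNAL_BASE}/v1/models/{model}:predict"
    host = f"{model}.{NAMESPACES[model]}.wikimedia.org"
    return url, host


print(translate("/service/lw/inference/v1/models/enwiki-articlequality:predict"))
```

Because the external path keeps the internal `/v1/models/[model name]:predict` shape, only the Host header needs to be computed, which is what keeps the gateway side light in this sketch.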

Note that this currently requires getting a JWT from https://api.wikimedia.org/wiki/Special:AppManagement. In the future, the API GW may allow POST access in an anonymous fashion.
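For completeness, a Python sketch of the same external call with a token attached. It builds the request without sending it. The helper name `build_external_request` is hypothetical, and passing the JWT as an `Authorization: Bearer` header is an assumption that this task does not confirm.

```python
import json
import urllib.request


def build_external_request(model: str, rev_id: int, token: str) -> urllib.request.Request:
    """Build (but do not send) a POST to the API GW endpoint for a model."""
    url = f"https://api.wikimedia.org/service/lw/inference/v1/models/{model}:predict"
    body = json.dumps({"rev_id": rev_id}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        # Assumption: the JWT is sent as a standard Bearer token.
        headers={"Authorization": f"Bearer {token}"},
        method="POST",
    )


# "PLACEHOLDER_JWT" stands in for a real token from Special:AppManagement.
req = build_external_request("enwiki-articlequality", 123555, "PLACEHOLDER_JWT")
print(req.get_method(), req.full_url)
```

Unlike the internal call, no Host header is set here: under this scheme the caller only needs the model name and a token.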

Further progress on the concrete config changes for the API GW will be tracked in T288789 (or, more likely, one of its child tasks).

klausman moved this task from In Progress to Complete Q3 2022/23 on the Machine-Learning-Team board. Nov 25 2022, 9:49 AM

calbon closed this task as Resolved. Jan 24 2023, 3:37 PM
