
Investigate Explainer for Revert-Risk model
Open, Needs Triage, Public

Description

KServe provides the ability to attach an Explainer to an Inference Service in order to provide an explanation for a prediction given by an ML model. The explanation can be invoked using the :explain endpoint.

KServe integrates with the Alibi Explainer and the AI Explainability 360 (AIX360) toolkit.

Related links:
https://github.com/kserve/kserve/tree/master/python/aixexplainer
https://github.com/kserve/kserve/tree/master/python/alibiexplainer
https://github.com/kserve/kserve/tree/master/docs/samples/explanation

Event Timeline

Before proceeding with attaching an explainer to the revert-risk isvc, we should test the explanation algorithm of interest on a statbox to see whether the explanations the model returns make sense to us.

Also, the Anchors algorithm is just one candidate; Alibi provides many black-box (BB) explanation algorithms, and they have different capabilities and restrictions. For example, Anchors may need access to training data (this needs more investigation, but if so, it's not ideal for us).

See https://docs.seldon.io/projects/alibi/en/stable/overview/algorithms.html#model-explanations for the overview of current algorithms.
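
To make the training-data point concrete, here is a rough, untested sketch of what using Alibi's AnchorTabular would look like; the predict function, feature names, and training sample below are placeholders rather than the real revert-risk setup:

import numpy as np
from alibi.explainers import AnchorTabular

# Placeholder predictor: in practice this would wrap the revert-risk classifier
# (or call its REST endpoint) and return the predicted class per row.
def predict_fn(x):
    return np.zeros(x.shape[0], dtype=int)

feature_names = ["user_age", "user_edit_count", "page_edit_count"]   # hypothetical features
X_train = np.random.rand(1000, len(feature_names))                   # stand-in for real training data

explainer = AnchorTabular(predictor=predict_fn, feature_names=feature_names)
# This is the restriction mentioned above: Anchors needs (a sample of) the
# training data to discretise features and build its perturbation space.
explainer.fit(X_train, disc_perc=(25, 50, 75))

explanation = explainer.explain(X_train[0], threshold=0.95)
print("Anchor:   ", " AND ".join(explanation.anchor))
print("Precision:", explanation.precision)
print("Coverage: ", explanation.coverage)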

@isarantopoulos mentioned that we can also explore AI Explainability 360 (AIX360), another open-source explainability library that KServe integrates with.

Kserve example: https://kserve.github.io/website/0.10/modelserving/explainer/aix/mnist/aix/

achou renamed this task from "Investigate AlibiExplainer for Revert-Risk model" to "Investigate Explainer for Revert-Risk model". Feb 21 2023, 3:58 PM

Previously, we tested the TreeSHAP algorithm for Multilingual model explainability (from here: https://shap.readthedocs.io/en/latest/). It is supported by the tools provided in the task description. The main benefit is that it works with our classifiers and provides local explainability (so we can have an explanation for each specific sample without any other data needed).
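
For reference, a minimal sketch of that TreeSHAP usage, with a stand-in gradient-boosting classifier in place of the actual revert-risk model:

import shap
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

# Stand-in model and data; the real test used the Multilingual revert-risk classifier.
X, y = make_classification(n_samples=500, n_features=8, random_state=0)
model = GradientBoostingClassifier().fit(X, y)

# TreeSHAP is white-box: it works directly on the tree model, needs no
# background dataset, and each sample gets its own local attribution.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X[:1])   # local explanation for a single sample
print(shap_values)                           # one contribution per feature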

Additionally, the explainability output the user ultimately receives should be discussed based on how the end user will use those values. Raw SHAP values may not be the best choice; additional postprocessing may be needed (for example, rescaling the values or grouping the scores that describe one entity: text, media, user features, etc.).
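
To illustrate the kind of postprocessing meant here, a purely hypothetical sketch that groups per-feature SHAP values by the entity they describe and rescales them (feature names, values, and the group mapping are made up):

import numpy as np

feature_names = ["text_len_change", "num_media_added", "user_edit_count", "user_is_anon"]
shap_values = np.array([0.12, -0.03, -0.30, 0.05])   # raw SHAP values for one sample (made up)

groups = {
    "text": ["text_len_change"],
    "media": ["num_media_added"],
    "user": ["user_edit_count", "user_is_anon"],
}

# Sum contributions per group, then rescale by the total absolute contribution
# so the user sees one score per entity instead of raw per-feature values.
grouped = {g: sum(shap_values[feature_names.index(f)] for f in fs) for g, fs in groups.items()}
total = sum(abs(v) for v in grouped.values()) or 1.0
rescaled = {g: v / total for g, v in grouped.items()}
print(rescaled)   # approximately {'text': 0.3, 'media': -0.075, 'user': -0.625}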

@isarantopoulos I was able to run the kserve AIX explainer example on ml-sandbox \o/

First, I deployed the InferenceService from aix-explainer.yaml (with the namespace changed to kserve-test):

aikochou@ml-sandbox:~$ kubectl apply -f aix/aix-explainer.yaml -n kserve-test

Checked that the predictor and explainer pods are running:

aikochou@ml-sandbox:~$ kubectl get po -n kserve-test
NAME                                                              READY   STATUS    RESTARTS   AGE
aix-explainer-explainer-default-7mhqd-deployment-54569d597mchqc   2/2     Running   0          44m
aix-explainer-predictor-default-5g7rf-deployment-bd878cd74qjmcq   2/2     Running   0          44m

Set some env variables:

MODEL_NAME=aix-explainer
INGRESS_HOST=$(minikube ip)
INGRESS_PORT=$(kubectl -n istio-system get service istio-ingressgateway -o jsonpath='{.spec.ports[?(@.name=="http2")].nodePort}')
SERVICE_HOSTNAME=$(kubectl get inferenceservice -n kserve-test ${MODEL_NAME} -o jsonpath='{.status.url}' | cut -d "/" -f 3)

Then I created a venv to install matplotlib, requests, and scikit-learn, and ran the query_explain.py script:

(venv) aikochou@ml-sandbox:~/aix$ python query_explain.py http://${INGRESS_HOST}:${INGRESS_PORT}/v1/models/${MODEL_NAME}:explain ${SERVICE_HOSTNAME}
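
For context, this is roughly what such a query script does (a simplified sketch, not the upstream query_explain.py itself): it posts the instances to the :explain endpoint and sets the Host header to SERVICE_HOSTNAME so the Istio ingress gateway routes the request to the right InferenceService. The real script presumably also prepares the input data and plots the result (hence scikit-learn and matplotlib).

import sys
import requests

explain_url, service_hostname = sys.argv[1], sys.argv[2]

payload = {"instances": ["placeholder input"]}   # placeholder; the real example builds its own input
resp = requests.post(explain_url, json=payload, headers={"Host": service_hostname})
print(resp)           # e.g. <Response [200]>
print(resp.json())    # explanation returned by the AIX explainer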

Returned results:

************************************************************
************************************************************
************************************************************
starting query
Sending Explain Query
TIME TAKEN:  14.321728706359863
<Response [200]>

Explanation of autos:
['low', 0.011684259246210107]
['government', -0.011531991783027254]
['most', -0.007102600869778604]
['worst', 0.0068011313331184]
['be', -0.006723790631208644]
['cannot', -0.005791967075218682]
['them', 0.005194493737558092]
['If', -0.004526687592100251]
['will', 0.004459571232696949]
['model', 0.0042881826395665]

Explanation of hockey:
...
...

Tree SHAP is a white-box method, meaning we need to load the revert-risk model into the explainer to use it. This makes the explainer effectively another predictor: it has to do everything the predictor does, while also requiring a lot of resources. As a result, we don't fully utilize the advantages that KServe provides for explainers. It would be better to use a black-box method instead.
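
To sketch the black-box alternative: the explainer then only needs a prediction function, which can simply call the predictor's own :predict endpoint instead of loading the model a second time. The URL, payload format, and the choice of KernelSHAP below are assumptions, just to illustrate the shape:

import numpy as np
import requests
import shap

PREDICT_URL = "http://revertrisk-predictor/v1/models/revertrisk:predict"   # hypothetical URL

def predict_fn(x):
    # The explainer treats the model as a black box: it only sees this function,
    # which forwards the instances to the predictor over HTTP.
    resp = requests.post(PREDICT_URL, json={"instances": x.tolist()})
    return np.array(resp.json()["predictions"])

background = np.zeros((1, 8))                              # stand-in background data
explainer = shap.KernelExplainer(predict_fn, background)   # model-agnostic (black-box) SHAP
shap_values = explainer.shap_values(np.random.rand(1, 8))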

I've read the following doc: https://kserve.github.io/website/0.10/modelserving/data_plane/v2_protocol/

It seems that upstream suggests migrating from v1 to v2, but they also add:

Note on changes between V1 & V2

V2 protocol does not currently support the explain endpoint like V1 protocol does. If this is a feature you wish to have in the V2 protocol, please submit a github issue.

No idea when/if we'll have to migrate away from v1, but it is worth noting that the :explain functionality may not be available in the future if we don't ask upstream for it :(