Connect Outlink topic model to eventgate
Closed, Resolved (Public)

Description

We have deployed the Outlink topic model to Lift Wing in T287056.

A prime motivation for productionizing this model is to have these topics available for Growth's tools. My understanding is that the ORES articletopic model is connected via eventgate, and the goal is to replace the ORES enwiki model with this language-agnostic model.

This task serves to track the status and details of the support for eventgate for the Outlink topic model. We'll use the event module developed in T301878 to implement it.

Event Timeline

Change 828481 had a related patch set uploaded (by AikoChou; author: AikoChou):

[machinelearning/liftwing/inference-services@main] outlink: add code to send events to EventGate

https://gerrit.wikimedia.org/r/828481

There are some high-level points to discuss before proceeding, in my opinion. We can definitely use the events.py module, but we first need to figure out which EventGate schema/stream to use for the events. At the moment we have a schema called mediawiki.revision-score, and two streams following it (defined in MediaWiki's config):

We are currently (and temporarily) sending events from LiftWing to the test stream, but the idea is to create new ones called mediawiki.revision-score-$MODEL in the future. I am not 100% sure what the scope of the revision-score schema was, or whether we want to keep using it in the future. A chat with the Data Engineering and Research teams is probably needed. I'll help if you want; this is something that we have already discussed (even if briefly) in T301878.

The schema of mediawiki.revision-score that ORES uses is as follows:

{
    "awesomeness": {
        "model_name": "awesomeness",
        "model_version": "1.0.1",
        "prediction": ["yes", "mostly"],
        "probability": {
            "yes": 0.99,
            "mostly": 0.90,
            "hardly": 0.01
        }
    }
}

The schema is used for all revscoring-based models (articlequality, damaging, goodfaith, articletopic, etc.). In the future, each revscoring-based model on Lift Wing will send events to its own stream named mediawiki.revision-score-$MODEL, but the schema will remain the same.

For Outlink topic model, one question is: do we want to follow the current revision-score schema or create a new schema? If we follow the revision-score schema the way the ORES articletopic model does, the prediction field is a list of the topics with probability higher than 0.5, and the probability field is a key/value map from topic name to probability for all topics.

For example,

{
    "outlink-topic-model": {
        "model_name": "outlink-topic-model",
        "model_version": "v202012",
        "prediction": [
            "Culture.Biography.Biography*",
            "Culture.Biography.Women",
            "Geography.Regions.Americas.North_America",
            "Culture.Literature",
            "History_and_Society.History"
        ],
        "probability": {
            "Culture.Biography.Biography*": 0.9566442370414734, 
            "Culture.Biography.Women": 0.9019306898117065,
            "Geography.Regions.Americas.North_America": 0.7248802781105042,
            "Culture.Literature": 0.5621865391731262,
            "History_and_Society.History": 0.5078218579292297,
            "History_and_Society.Society": 0.18714269995689392,
            "Culture.Media.Media*": 0.14415885508060455,
            ...
            "Geography.Regions.Africa.Central_Africa": 0.00042731568100862205
        }
    }
}

The page_title, database, and other metadata will be carried in the event as well (see full schema).

In this case, when the Lift Wing API gets a revision-create event, there will be no threshold input, so it always uses the default value of 0.5 for the threshold.
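To make the mapping concrete, here is a minimal sketch of how a full topic-to-probability map could be shaped into the revision-score layout shown above. This is not the code from the patch: the function name to_revision_score is hypothetical, and treating "higher than 0.5" as a strict comparison is an assumption based on the wording above.

```python
from typing import Dict, List, TypedDict


class Score(TypedDict):
    model_name: str
    model_version: str
    prediction: List[str]
    probability: Dict[str, float]


def to_revision_score(
    probabilities: Dict[str, float],
    model_name: str = "outlink-topic-model",
    model_version: str = "v202012",
    threshold: float = 0.5,
) -> Dict[str, Score]:
    """Shape topic probabilities like the revision-score example (hypothetical helper)."""
    # Sort topics from most to least probable so the output is deterministic.
    ranked = sorted(probabilities.items(), key=lambda kv: kv[1], reverse=True)
    # Assumption: a topic is "predicted" only when strictly above the threshold.
    prediction = [topic for topic, p in ranked if p > threshold]
    return {
        model_name: {
            "model_name": model_name,
            "model_version": model_version,
            "prediction": prediction,
            # Every topic is kept here, whether or not it passed the threshold.
            "probability": dict(ranked),
        }
    }
```

With no threshold passed in, as in the revision-create case, the default of 0.5 applies.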

Another question is: do we want to change the output schema for all requests or just for revision-create events? If the latter, is it a good idea to have two different output schemas for on-demand requests and the streaming use case?

@Isaac What are your thoughts on this? :)

For Outlink topic model, one question is: do we want to follow the current revision-score schema or create a new schema?

@AikoChou thanks for asking -- I'll think a bit on this but my default response is to keep the same schema unless you have a strong reason not to (for instance, if the event size is too big I'm happy to work to help reduce). We'd like to see this model used where the existing articletopic model is used so using the same schema helps it be a direct 1:1 replacement.

Another question is: do we want to change the output schema for all requests or just for revision-create events? If the latter, is it a good idea to have two different output schemas for on-demand requests and the streaming use case?

Hmm...I'm not sure I'm fully understanding. My assumption is that an event would only be triggered for the revision-create events and not for the on-demand requests? Perhaps moot because I don't think we need to change the schema.

In this case, when the Lift Wing API gets a revision-create event, there will be no threshold input, so it always uses the default value of 0.5 for the threshold.

Yeah, that's still our recommendation for best use of the model so a reasonable default behavior.

@Isaac thanks for answering. I was asking because ORES articletopic has more or less the same output schema for the revision-create events and the on-demand requests. You can see that the result of https://ores.wikimedia.org/v3/scores/enwiki?models=articletopic&revids=1112902771 contains a prediction field and a probability field like I described. My concern is that end users might expect the same output schema from EventStreams (External Stream, a Publishing Service) and the LiftWing endpoint. They might first try the Lift Wing endpoint to see what the output looks like, and then build an application to consume the mediawiki.revision-score-articletopic-outlink (just an example name) stream.

For now, the output for on-demand requests looks like this:

{
   "prediction":{
      "article":"https://en.wikipedia.org/wiki/Wings of Fire (novel series)",
      "results":[
         {
            "topic":"Culture.Literature",
            "score":1.0000100135803223
         },
         {
            "topic":"Culture.Media.Books",
            "score":0.9954004287719727
         },
         {
            "topic":"Culture.Media.Media*",
            "score":0.8479777574539185
         },
         {
            "topic":"Culture.Biography.Women",
            "score":0.523430347442627
         }
      ]
   }
}

The number of the items in results depends on the threshold the user gives.
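As a rough illustration of why the number of results varies, the sketch below builds a response in the on-demand shape from a topic-to-score map. The helper name to_on_demand_response is hypothetical, and whether the real service compares with a strict or inclusive threshold is an assumption here.

```python
from typing import Any, Dict


def to_on_demand_response(
    article_url: str,
    scores: Dict[str, float],
    threshold: float = 0.5,
) -> Dict[str, Any]:
    """Build the on-demand response shape (hypothetical helper, not the service code)."""
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    # Only topics above the user-supplied threshold are returned, so a lower
    # threshold yields a longer results list.
    results = [
        {"topic": topic, "score": score}
        for topic, score in ranked
        if score > threshold
    ]
    return {"prediction": {"article": article_url, "results": results}}
```

Unlike the revision-score layout, topics below the threshold are dropped entirely, so the full probability map cannot be recovered from this response.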

If we change the output schema, I think we can still keep the article link in the result:

{
    "article": "https://en.wikipedia.org/wiki/Wings of Fire (novel series)",
    "prediction": [
        "Culture.Literature",
        "Culture.Media.Books",
        "Culture.Media.Media*",
        "Culture.Biography.Women"
    ],
    "probability": {
        "Culture.Biography.Biography*": 0.4566442370414734,
        "Culture.Biography.Women": 0.523430347442627,
        "Geography.Regions.Africa.Central_Africa": 0.00042731568100862205
        ...<all the topic names and probabilities>
    }
}

What do you think? Or do you think we should leave the on-demand output as it is now?

Change 828481 merged by jenkins-bot:

[machinelearning/liftwing/inference-services@main] outlink: add code to send events to EventGate

https://gerrit.wikimedia.org/r/828481

ahh sorry @achou -- I meant to get you feedback on this and it slipped. in general I'm okay with whatever ML Platform decides but agree that aligning output schema across LiftWing and events would be ideal. also if the article key is causing headache as far as alignment, let me know and we can figure out a good solution there. I think it'd be okay to drop it or use whatever the existing ORES models return.

@Isaac Got it! For the moment I only aligned the event output with the existing ORES model. I think it's not super urgent to change the output for on-demand requests now and it doesn't seem very relevant to this task. If in the future we deem it necessary, I'll open a task for that. :)

Change 848245 had a related patch set uploaded (by AikoChou; author: AikoChou):

[operations/deployment-charts@master] ml-services: update outlink Docker images

https://gerrit.wikimedia.org/r/848245

Change 848245 merged by jenkins-bot:

[operations/deployment-charts@master] ml-services: update outlink Docker images

https://gerrit.wikimedia.org/r/848245

Change 848291 had a related patch set uploaded (by AikoChou; author: AikoChou):

[operations/deployment-charts@master] ml-services: add EventGate settings for outlink

https://gerrit.wikimedia.org/r/848291

Change 848291 merged by jenkins-bot:

[operations/deployment-charts@master] ml-services: add EventGate settings for outlink

https://gerrit.wikimedia.org/r/848291

Change 848348 had a related patch set uploaded (by AikoChou; author: AikoChou):

[operations/deployment-charts@master] ml-services: move eventgate env variables to outlink predictor

https://gerrit.wikimedia.org/r/848348

Change 848348 merged by Elukey:

[operations/deployment-charts@master] ml-services: move eventgate env variables to outlink predictor

https://gerrit.wikimedia.org/r/848348

Change 848355 had a related patch set uploaded (by AikoChou; author: AikoChou):

[operations/deployment-charts@master] ml-services: add MODEL_VERSION env to outlink

https://gerrit.wikimedia.org/r/848355

Change 848355 merged by Elukey:

[operations/deployment-charts@master] ml-services: add MODEL_VERSION env to outlink

https://gerrit.wikimedia.org/r/848355

Current status:

  • code changes for the model server are done.
  • created a Benthos config to process revision-create events from Wikipedia domains and call the outlink isvc.
  • tested the above config on stat1004 and verified that the revision-score events landed on the "mediawiki.revision-score-test" stream.

Next steps:

  • check out the outlink metrics on Grafana when Benthos is running.
  • check out the new page state stream that DE team is working on and see how to align with it.

This task is done, so I'm going to mark it as RESOLVED. We'll follow up on the DE team's new page state stream and open other tasks when needed.

Some notes:

I tested a streaming workflow with Benthos, going from Kafka revision-create events to Lift Wing to the EventGate revision-score-test stream. Here is the Benthos config I used. I wrote a simple pipeline that only processes events from Wikipedia domains (excluding testwiki) and pages with namespace=0 (Line 40-45).
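For readers without access to the config, the filtering rule can be sketched in Python as below. This is not the Benthos config itself. The field names meta.domain, database, and page_namespace are assumptions about the revision-create event layout, and should_score is a hypothetical name.

```python
from typing import Any, Dict


def should_score(event: Dict[str, Any]) -> bool:
    """Return True when a revision-create event should be sent to the model."""
    # Assumed field: meta.domain holds the wiki's host name.
    domain = event.get("meta", {}).get("domain", "")
    return (
        domain.endswith(".wikipedia.org")        # Wikipedia projects only
        and event.get("database") != "testwiki"  # skip the test wiki
        and event.get("page_namespace") == 0     # main (article) namespace only
    )
```

An event that fails any of the three checks would simply be dropped before the call to the outlink service.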

I used the following command to launch a benthos instance:

aikochou@stat1004:~/benthos$ ./benthos -c ./outlink-config.yaml

and used the following command to verify if outlink events landed in revision-score-test:

aikochou@stat1004:~$ kafkacat -C -b kafka-jumbo1001.eqiad.wmnet:9092 -t eqiad.mediawiki.revision-score-test -o -5 -e -q | jq .

I also checked the traffic and latency in the Grafana dashboard here.

Thanks @achou ! Exciting to see how close we are! Do me a favor and subscribe me when you open up a task for the official stream so I can alert the Growth team and begin to figure out with them how to best consume that stream.

I just realized I didn't mark this task as RESOLVED. I've created task T328899 to work on a new stream for the outlink topic model. Another task, T328276, aims to replace the ORES articletopic model in CirrusSearch with outlink topic model predictions.