
achou (AikoChou)
Machine Learning Engineer

Projects

User does not belong to any projects.

User Details

User Since
Feb 15 2022, 2:51 PM (114 w, 22 m)
Availability
Available
IRC Nick
aiko
LDAP User
Unknown
MediaWiki User
AChou-WMF

Recent Activity

Today

achou added a comment to T356102: Allow calling revertrisk language agnostic and revert risk multilingual APIs in a pre-save context.

Thanks, that is what I am proposing as well. @achou, how feasible do you think this is from your side? It would involve accepting a POST with all the features (https://gitlab.wikimedia.org/repos/research/knowledge_integrity/-/blob/main/knowledge_integrity/featureset.py?ref_type=heads) needed.

Tue, Apr 23, 3:13 PM · Research, Machine-Learning-Team
achou moved T356045: Test revertrisk-multilingual with GPU from In Progress to Ready To Go on the Machine-Learning-Team board.
Tue, Apr 23, 12:04 PM · Patch-For-Review, Machine-Learning-Team

Fri, Apr 19

achou committed rMLIS0706f1a55693: revertrisk: add support for base model's payloads in batch model.
Fri, Apr 19, 9:29 AM

Tue, Apr 16

achou created M335: training workflow.
Tue, Apr 16, 2:15 PM
achou added a comment to T356102: Allow calling revertrisk language agnostic and revert risk multilingual APIs in a pre-save context.

@kostajh @XiaoXiao-WMF thanks for tagging. Sorry I was unaware of the discussion here. The ML team is currently in the middle of quarterly planning. I will bring up the proposal during our planning and get back to you shortly!

Tue, Apr 16, 1:31 PM · Research, Machine-Learning-Team

Fri, Apr 12

achou created P60462 test.
Fri, Apr 12, 10:15 AM

Thu, Apr 11

achou committed rMLIS901a1b20990b: revertrisk: use the Pytorch base image for RRML GPU inference.
Thu, Apr 11, 3:35 PM

Tue, Apr 9

achou added a comment to T356045: Test revertrisk-multilingual with GPU.

I built an RRML image locally using the Pytorch 2.2.x base image from T360638.

Tue, Apr 9, 10:03 AM · Patch-For-Review, Machine-Learning-Team
achou moved T356045: Test revertrisk-multilingual with GPU from Ready To Go to In Progress on the Machine-Learning-Team board.
Tue, Apr 9, 8:24 AM · Patch-For-Review, Machine-Learning-Team

Mon, Apr 8

achou committed rMLIS25333d8fb60c: revertrisk: update KI to v0.6 for RRML and RR-wikidata.
Mon, Apr 8, 3:47 PM

Fri, Apr 5

achou closed T360423: Deploy RevertRisk language-agnostic with knowledge integrity v0.6.0 as Resolved.

We have deployed the new RRLA model server to production.

Fri, Apr 5, 7:00 PM · Patch-For-Review, Machine-Learning-Team
achou moved T360423: Deploy RevertRisk language-agnostic with knowledge integrity v0.6.0 from In Progress to 2023-2024 Q4 Done on the Machine-Learning-Team board.
Fri, Apr 5, 7:00 PM · Patch-For-Review, Machine-Learning-Team
achou updated the task description for T360423: Deploy RevertRisk language-agnostic with knowledge integrity v0.6.0.
Fri, Apr 5, 6:56 PM · Patch-For-Review, Machine-Learning-Team

Thu, Apr 4

achou closed T361234: Fix locust load testing for Revert Risk models as Resolved.

This task is complete. I've created T361881 to follow up on the above test results issue.

Thu, Apr 4, 8:13 PM · Patch-For-Review, Machine-Learning-Team
achou moved T361234: Fix locust load testing for Revert Risk models from Unsorted to 2023-2024 Q4 Done on the Machine-Learning-Team board.
Thu, Apr 4, 8:11 PM · Patch-For-Review, Machine-Learning-Team
achou created T361881: Investigate the inconsistent load test results (locust) for revertrisk.
Thu, Apr 4, 8:08 PM · Machine-Learning-Team
achou moved T358744: Deploy RR-language-agnostic batch version to prod from Ready To Go to In Progress on the Machine-Learning-Team board.
Thu, Apr 4, 7:54 PM · Patch-For-Review, Machine-Learning-Team
achou moved T356045: Test revertrisk-multilingual with GPU from Blocked to Ready To Go on the Machine-Learning-Team board.
Thu, Apr 4, 7:54 PM · Patch-For-Review, Machine-Learning-Team
achou moved T360406: Error handling in Batch Predictions for RevertRisk Models from In Progress to 2023-2024 Q4 Done on the Machine-Learning-Team board.
Thu, Apr 4, 7:53 PM · Patch-For-Review, Machine-Learning-Team
achou moved T351278: Improving error message for Revertrisk models from In Progress to 2023-2024 Q4 Done on the Machine-Learning-Team board.
Thu, Apr 4, 7:53 PM · Patch-For-Review, Machine-Learning-Team
achou moved T358748: Prep work for (re)training workflow sprint from Ready To Go to 2023-2024 Q4 Done on the Machine-Learning-Team board.
Thu, Apr 4, 7:52 PM · Machine-Learning-Team
achou closed T358748: Prep work for (re)training workflow sprint as Resolved.
Thu, Apr 4, 7:52 PM · Machine-Learning-Team
achou added a comment to T355742: Assess runtime performance impact of pydantic data models in the RRLA model-server.

FYI @MunizaA :)

The new RRLA model server featuring KI v0.6 has been deployed to ML-staging. I used wrk to run load tests and compare the performance of the old and new versions. The results for the previous version are in P59447, and the results for the new version are in P59464. These results show that the new KI version does not affect performance metrics such as average latency and RPS.

Thu, Apr 4, 7:50 PM · Patch-For-Review, Machine-Learning-Team
achou added a comment to T360423: Deploy RevertRisk language-agnostic with knowledge integrity v0.6.0.

The new RRLA model server featuring KI v0.6 has been deployed to ML-staging. I used wrk to run load tests and compare the performance of the old and new versions. The results for the previous version are in P59447, and the results for the new version are in P59464. These results show that the new KI version does not affect performance metrics such as average latency and RPS.

Thu, Apr 4, 7:45 PM · Patch-For-Review, Machine-Learning-Team
achou updated the task description for T360423: Deploy RevertRisk language-agnostic with knowledge integrity v0.6.0.
Thu, Apr 4, 7:25 PM · Patch-For-Review, Machine-Learning-Team
achou updated subscribers of T358744: Deploy RR-language-agnostic batch version to prod.

I am reposting here what I wrote previously, as the issue is more related to deployment.

Thu, Apr 4, 7:25 PM · Patch-For-Review, Machine-Learning-Team
achou closed T360406: Error handling in Batch Predictions for RevertRisk Models as Resolved.

This task is complete. Check out these examples:

Thu, Apr 4, 7:15 PM · Patch-For-Review, Machine-Learning-Team
achou closed T360406: Error handling in Batch Predictions for RevertRisk Models, a subtask of T358744: Deploy RR-language-agnostic batch version to prod, as Resolved.
Thu, Apr 4, 7:15 PM · Patch-For-Review, Machine-Learning-Team
achou updated the task description for T360406: Error handling in Batch Predictions for RevertRisk Models.
Thu, Apr 4, 6:58 PM · Patch-For-Review, Machine-Learning-Team
achou edited P59464 [load test] revertrisk language agnostic (KI v0.6).
Thu, Apr 4, 12:25 PM
achou created P59464 [load test] revertrisk language agnostic (KI v0.6).
Thu, Apr 4, 12:24 PM
achou closed T351278: Improving error message for Revertrisk models as Resolved.

This task is complete. Check out these examples of new error messages:

$ curl "https://inference-staging.svc.codfw.wmnet:30443/v1/models/revertrisk-language-agnostic:predict" -d '{"rev_id": 15925124, "lang": "ro"}' -H "Host: revertrisk-language-agnostic.revertrisk.wikimedia.org" --http1.1 -k |  jq '.'
{
  "detail": "Could not make prediction for revision 15925124 (ro). Reason: revision_missing"
}
Thu, Apr 4, 12:19 PM · Patch-For-Review, Machine-Learning-Team
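
A client that receives the error response shown above may want the revision ID, language, and reason as separate fields. The following is a minimal sketch, assuming the message always follows the wording of that single example; `DETAIL_PATTERN` and `parse_error_detail` are hypothetical names for illustration and are not part of the model server or of Knowledge Integrity.

```python
import json
import re

# Pattern inferred from the one example message above (assumption):
# other error messages from the service may be worded differently.
DETAIL_PATTERN = re.compile(
    r"Could not make prediction for revision (?P<rev_id>\d+) "
    r"\((?P<lang>[^)]+)\)\. Reason: (?P<reason>\w+)"
)


def parse_error_detail(body: str):
    """Return rev_id, lang and reason from an error body, or None if it does not match."""
    detail = json.loads(body).get("detail", "")
    match = DETAIL_PATTERN.search(detail)
    if match is None:
        return None
    return {
        "rev_id": int(match["rev_id"]),
        "lang": match["lang"],
        "reason": match["reason"],
    }


# Response body copied from the curl example above.
sample = (
    '{"detail": "Could not make prediction for revision 15925124 (ro). '
    'Reason: revision_missing"}'
)
print(parse_error_detail(sample))
```

Returning `None` for unrecognized messages keeps the caller in charge of deciding how to handle wording the pattern does not cover.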
achou committed rMLIS891bacff86f4: revertrisk: error handling for batch requests.
Thu, Apr 4, 9:40 AM
achou created P59447 [load test] revertrisk language agnostic.
Thu, Apr 4, 9:33 AM

Wed, Apr 3

achou updated subscribers of T360406: Error handling in Batch Predictions for RevertRisk Models.

@kevinbazira posed a question - how can end users switch between batch and non-batch requests?

Wed, Apr 3, 4:20 PM · Patch-For-Review, Machine-Learning-Team
achou committed rMLIS7af63a06e2ee: locust: fix missing host header for revertrisk load tests.
Wed, Apr 3, 11:13 AM

Tue, Apr 2

achou moved T355656: Investigate how to implement batch inference for revertrisk-multilingual from Ready To Go to Backlog/Lift Wing on the Machine-Learning-Team board.
Tue, Apr 2, 9:40 AM · Patch-For-Review, Machine-Learning-Team

Thu, Mar 28

achou updated subscribers of T361234: Fix locust load testing for Revert Risk models.

@isarantopoulos do you remember the config values in locust.conf when you ran the revertrisk tests? I can't reproduce the result in revertrisk_stats.csv. I haven't deployed RRLA to staging yet, so it's the same model you tested.

Thu, Mar 28, 4:59 PM · Patch-For-Review, Machine-Learning-Team
achou created P58996 load test #2.
Thu, Mar 28, 4:07 PM
achou created P58995 load test #1.
Thu, Mar 28, 4:06 PM
achou created T361238: Update and fix locust load testing for revscoring models.
Thu, Mar 28, 2:38 PM · Machine-Learning-Team
achou created T361234: Fix locust load testing for Revert Risk models.
Thu, Mar 28, 2:23 PM · Patch-For-Review, Machine-Learning-Team

Wed, Mar 27

achou created P58959 load testing.
Wed, Mar 27, 2:46 PM

Tue, Mar 26

achou moved T355656: Investigate how to implement batch inference for revertrisk-multilingual from In Progress to Ready To Go on the Machine-Learning-Team board.
Tue, Mar 26, 4:07 PM · Patch-For-Review, Machine-Learning-Team
achou claimed T360423: Deploy RevertRisk language-agnostic with knowledge integrity v0.6.0.
Tue, Mar 26, 4:06 PM · Patch-For-Review, Machine-Learning-Team
achou moved T360423: Deploy RevertRisk language-agnostic with knowledge integrity v0.6.0 from Ready To Go to In Progress on the Machine-Learning-Team board.
Tue, Mar 26, 4:06 PM · Patch-For-Review, Machine-Learning-Team
achou set the point value for T360423: Deploy RevertRisk language-agnostic with knowledge integrity v0.6.0 to 2.
Tue, Mar 26, 4:02 PM · Patch-For-Review, Machine-Learning-Team
achou moved T360423: Deploy RevertRisk language-agnostic with knowledge integrity v0.6.0 from Backlog/Lift Wing to Ready To Go on the Machine-Learning-Team board.
Tue, Mar 26, 4:00 PM · Patch-For-Review, Machine-Learning-Team
achou created P58922 dep.
Tue, Mar 26, 1:57 PM
achou committed rMLIS050e347821c5: revertrisk: improve error messages.
Tue, Mar 26, 11:12 AM

Mon, Mar 25

achou created P58905 rr-ml-gpu.
Mon, Mar 25, 12:48 PM
achou created P58904 docker history.
Mon, Mar 25, 12:47 PM

Mar 22 2024

achou created P58900 debug.
Mar 22 2024, 4:58 PM
achou created P58899 docker-pkg-build.log.
Mar 22 2024, 4:41 PM
achou created P58898 docker-pkg build.
Mar 22 2024, 3:51 PM

Mar 20 2024

achou created P58825 HF image.
Mar 20 2024, 3:45 PM

Mar 19 2024

achou created P58818 HF model server.
Mar 19 2024, 3:26 PM
achou updated Other Assignee for T360406: Error handling in Batch Predictions for RevertRisk Models, removed: achou.
Mar 19 2024, 2:57 PM · Patch-For-Review, Machine-Learning-Team
achou added a comment to T360423: Deploy RevertRisk language-agnostic with knowledge integrity v0.6.0.

It would be nice to wait for an additional patch (improving error messages) to be merged.

Mar 19 2024, 12:05 PM · Patch-For-Review, Machine-Learning-Team
achou created T360423: Deploy RevertRisk language-agnostic with knowledge integrity v0.6.0.
Mar 19 2024, 12:00 PM · Patch-For-Review, Machine-Learning-Team
achou added a comment to T360177: Support building and running of articletopic-outlink model-server via Makefile.

I'm running into the error below, which is caused by a missing events module. This module is used to generate and send a topic prediction event to EventGate. It turns out the module lives in python/events.py, and the model-server can't locate it because it is not being run as a Python module.

Traceback (most recent call last):
  File "/home/inference-services/outlink-topic-model/model-server/model.py", line 6, in <module>
    import events
ModuleNotFoundError: No module named 'events'
make[1]: *** [Makefile:97: run-server] Error 1
make[1]: Leaving directory '/home/inference-services'
make: *** [Makefile:76: articletopic-outlink] Error 2
Mar 19 2024, 10:29 AM · Machine-Learning-Team
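
The failure mode in the traceback above, a top-level module that lives in a directory not on the import path, can be reproduced in isolation. The following is a minimal sketch, assuming a temporary directory standing in for the repository's python/ folder; `import_events_from` is a hypothetical helper and is not the fix applied in the repository. Adding the directory to `PYTHONPATH` before starting the server would have a similar effect.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def import_events_from(directory: str):
    """Import a top-level module named `events` from `directory`."""
    if directory not in sys.path:
        sys.path.insert(0, directory)  # searched before other entries
    sys.modules.pop("events", None)  # drop any previously loaded `events`
    importlib.invalidate_caches()
    return importlib.import_module("events")


with tempfile.TemporaryDirectory() as repo:
    # Hypothetical stand-in for python/events.py (assumption for illustration).
    python_dir = Path(repo) / "python"
    python_dir.mkdir()
    (python_dir / "events.py").write_text("LOADED = True\n")

    try:
        import events  # expected to fail: python/ is not on sys.path yet
    except ModuleNotFoundError as exc:
        print(f"before: {exc}")

    events = import_events_from(str(python_dir))
    print(f"after: events.LOADED = {events.LOADED}")
```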
achou added a parent task for T360406: Error handling in Batch Predictions for RevertRisk Models: T358744: Deploy RR-language-agnostic batch version to prod.
Mar 19 2024, 9:36 AM · Patch-For-Review, Machine-Learning-Team
achou added a subtask for T358744: Deploy RR-language-agnostic batch version to prod: T360406: Error handling in Batch Predictions for RevertRisk Models.
Mar 19 2024, 9:36 AM · Patch-For-Review, Machine-Learning-Team
achou created T360406: Error handling in Batch Predictions for RevertRisk Models.
Mar 19 2024, 9:35 AM · Patch-For-Review, Machine-Learning-Team

Mar 15 2024

achou moved T351278: Improving error message for Revertrisk models from Ready To Go to In Progress on the Machine-Learning-Team board.
Mar 15 2024, 4:28 PM · Patch-For-Review, Machine-Learning-Team

Mar 14 2024

achou committed rMLIS764d93c97325: revertrisk-ml: add a RevertRiskMultilingualGPU object.
Mar 14 2024, 12:08 PM
achou moved T358744: Deploy RR-language-agnostic batch version to prod from Backlog/Lift Wing to Ready To Go on the Machine-Learning-Team board.
Mar 14 2024, 10:40 AM · Patch-For-Review, Machine-Learning-Team
achou moved T359793: Add a util function in python to detect GPU from In Progress to 2023-2024 Q3 Done on the Machine-Learning-Team board.
Mar 14 2024, 10:39 AM · Machine-Learning-Team

Mar 13 2024

achou committed rMLIS6a640444c922: Add a util function to detect GPU in resource_utils module.
Mar 13 2024, 9:42 AM

Mar 12 2024

achou created P58771 gpu_is_available.
Mar 12 2024, 2:07 PM
achou committed rMLIS510811b2b2cf: Makefile: install requirements.txt for python/*_utils.
Mar 12 2024, 1:13 PM
achou claimed T359793: Add a util function in python to detect GPU.
Mar 12 2024, 8:52 AM · Machine-Learning-Team

Mar 5 2024

achou created P58530 detect_amd_gpu using pyopencl.
Mar 5 2024, 7:33 PM
achou added a comment to T355742: Assess runtime performance impact of pydantic data models in the RRLA model-server.

The PR for pydantic v2 in kserve has been merged! We can use this commit https://github.com/kserve/kserve/commit/426fe21da0612ea6ef4a116b5114270313e02bbb to test the RRLA model-server :)

Mar 5 2024, 7:10 PM · Patch-For-Review, Machine-Learning-Team
achou created P58516 (An Untitled Masterwork).
Mar 5 2024, 5:00 PM
achou created P58515 (An Untitled Masterwork).
Mar 5 2024, 5:00 PM
achou created P58479 docker history --no-trunc revertrisk-ml-gpu:2.
Mar 5 2024, 2:42 PM
achou created P58478 docker history --no-trunc revertrisk-ml-gpu:1.
Mar 5 2024, 2:41 PM

Mar 1 2024

achou moved T355656: Investigate how to implement batch inference for revertrisk-multilingual from Ready To Go to In Progress on the Machine-Learning-Team board.
Mar 1 2024, 12:46 PM · Patch-For-Review, Machine-Learning-Team

Feb 29 2024

achou created T358748: Prep work for (re)training workflow sprint.
Feb 29 2024, 10:35 AM · Machine-Learning-Team
achou added a comment to T351278: Improving error message for Revertrisk models.

Knowledge Integrity v0.6.0 improved error representations by introducing an Error data class and distinct error codes for the various situations that can arise when fetching revisions from the MediaWiki API. (See https://gitlab.wikimedia.org/repos/research/knowledge_integrity/-/blob/main/knowledge_integrity/mediawiki.py?ref_type=heads#L14-26)

Feb 29 2024, 10:12 AM · Patch-For-Review, Machine-Learning-Team
achou added a subtask for T348153: Q3 2024 Goal: Lift Wing users can request multiple predictions using a single request.: T358744: Deploy RR-language-agnostic batch version to prod.
Feb 29 2024, 9:55 AM · Goal, Machine-Learning-Team
achou added a parent task for T358744: Deploy RR-language-agnostic batch version to prod: T348153: Q3 2024 Goal: Lift Wing users can request multiple predictions using a single request.
Feb 29 2024, 9:55 AM · Patch-For-Review, Machine-Learning-Team
achou created T358744: Deploy RR-language-agnostic batch version to prod.
Feb 29 2024, 9:50 AM · Patch-For-Review, Machine-Learning-Team
achou renamed T355656: Investigate how to implement batch inference for revertrisk-multilingual from Implement batch prediction for revertrisk-multilingual to Investigate how to implement batch inference for revertrisk-multilingual.
Feb 29 2024, 9:32 AM · Patch-For-Review, Machine-Learning-Team
achou moved T351278: Improving error message for Revertrisk models from In Progress to Ready To Go on the Machine-Learning-Team board.
Feb 29 2024, 9:27 AM · Patch-For-Review, Machine-Learning-Team

Feb 27 2024

achou committed rMLIS33bd6fc3af60: revertrisk-multilingual: bump torch and transormers version.
Feb 27 2024, 1:31 PM

Feb 23 2024

achou moved T351278: Improving error message for Revertrisk models from Ready To Go to In Progress on the Machine-Learning-Team board.
Feb 23 2024, 12:37 PM · Patch-For-Review, Machine-Learning-Team
achou claimed T351278: Improving error message for Revertrisk models.
Feb 23 2024, 12:37 PM · Patch-For-Review, Machine-Learning-Team
achou moved T356045: Test revertrisk-multilingual with GPU from In Progress to Blocked on the Machine-Learning-Team board.
Feb 23 2024, 12:36 PM · Patch-For-Review, Machine-Learning-Team
achou moved T351278: Improving error message for Revertrisk models from Backlog/Lift Wing to Ready To Go on the Machine-Learning-Team board.
Feb 23 2024, 12:36 PM · Patch-For-Review, Machine-Learning-Team

Feb 21 2024

achou added a comment to T355742: Assess runtime performance impact of pydantic data models in the RRLA model-server.

Following a discussion with Ilias, we will keep an eye on the progress of https://github.com/kserve/kserve/pull/3374. Once the PR is merged, we will use the pre-release version for testing.

Feb 21 2024, 2:57 PM · Patch-For-Review, Machine-Learning-Team
achou added a comment to T356045: Test revertrisk-multilingual with GPU.

The latest changes to requirements.txt still resulted in a failed docker image build. Therefore, the torch version conflict between the knowledge integrity and inference services repos was not the cause of the failure.

Feb 21 2024, 2:47 PM · Patch-For-Review, Machine-Learning-Team
achou committed rMLISa787e43f587c: revertrisk-multilingual: reorder requirements.txt.
Feb 21 2024, 12:49 PM

Feb 14 2024

achou created P56759 build.
Feb 14 2024, 12:53 PM
achou moved T356501: Support running revertrisk-multilingual model-server via Makefile from In Progress to 2023-2024 Q3 Done on the Machine-Learning-Team board.
Feb 14 2024, 11:08 AM · Machine-Learning-Team
achou closed T356501: Support running revertrisk-multilingual model-server via Makefile as Resolved.
Feb 14 2024, 11:07 AM · Machine-Learning-Team
achou committed rMLIS31b4d86059c2: revertrisk: use GPU for revertrisk-multilingual.
Feb 14 2024, 10:22 AM

Feb 13 2024

achou committed rMLIS76ce78827ed8: Makefile: add support for revertrisk-multilingual.
Feb 13 2024, 4:33 PM
achou moved T355656: Investigate how to implement batch inference for revertrisk-multilingual from In Progress to Ready To Go on the Machine-Learning-Team board.
Feb 13 2024, 3:16 PM · Patch-For-Review, Machine-Learning-Team