
Nov 6 2025

gkyziridis edited P84751 Revert Risk Threshold Analysis Results.
Nov 6 2025, 10:26 AM

Nov 5 2025

gkyziridis edited P84751 Revert Risk Threshold Analysis Results.
Nov 5 2025, 3:06 PM
gkyziridis edited P84751 Revert Risk Threshold Analysis Results.
Nov 5 2025, 1:43 PM
gkyziridis edited P84751 Revert Risk Threshold Analysis Results.
Nov 5 2025, 12:56 PM
gkyziridis edited P84751 Revert Risk Threshold Analysis Results.
Nov 5 2025, 11:39 AM
gkyziridis edited P84751 Revert Risk Threshold Analysis Results.
Nov 5 2025, 11:36 AM
gkyziridis edited P84751 Revert Risk Threshold Analysis Results.
Nov 5 2025, 11:20 AM
gkyziridis edited P84751 Revert Risk Threshold Analysis Results.
Nov 5 2025, 11:04 AM
gkyziridis edited P84751 Revert Risk Threshold Analysis Results.
Nov 5 2025, 10:49 AM
gkyziridis edited P84751 Revert Risk Threshold Analysis Results.
Nov 5 2025, 10:34 AM
gkyziridis edited P84751 Revert Risk Threshold Analysis Results.
Nov 5 2025, 10:27 AM

Nov 4 2025

gkyziridis edited P84751 Revert Risk Threshold Analysis Results.
Nov 4 2025, 5:48 PM
gkyziridis edited P84751 Revert Risk Threshold Analysis Results.
Nov 4 2025, 5:48 PM
gkyziridis edited P84751 Revert Risk Threshold Analysis Results.
Nov 4 2025, 4:01 PM
gkyziridis edited P84751 Revert Risk Threshold Analysis Results.
Nov 4 2025, 3:58 PM
gkyziridis edited P84751 Revert Risk Threshold Analysis Results.
Nov 4 2025, 3:55 PM
gkyziridis edited P84751 Revert Risk Threshold Analysis Results.
Nov 4 2025, 3:50 PM
gkyziridis edited P84751 Revert Risk Threshold Analysis Results.
Nov 4 2025, 3:48 PM
gkyziridis edited P84751 Revert Risk Threshold Analysis Results.
Nov 4 2025, 3:40 PM
gkyziridis edited P84751 Revert Risk Threshold Analysis Results.
Nov 4 2025, 3:25 PM
gkyziridis edited P84751 Revert Risk Threshold Analysis Results.
Nov 4 2025, 3:14 PM
gkyziridis edited P84751 Revert Risk Threshold Analysis Results.
Nov 4 2025, 3:07 PM
gkyziridis edited P84751 Revert Risk Threshold Analysis Results.
Nov 4 2025, 3:04 PM
gkyziridis edited P84751 Revert Risk Threshold Analysis Results.
Nov 4 2025, 3:00 PM
gkyziridis moved T408607: AI/ML Infrastructure Request: Assistance in Rolling out Revert Risk to wikis that don't have damaging/goodfaith models from Unsorted to In Progress on the Machine-Learning-Team board.
Nov 4 2025, 2:58 PM · Patch-For-Review, MediaWiki-Recent-changes, PersonalDashboard, Moderator-Tools-Team, Machine-Learning-Team
gkyziridis edited P84751 Revert Risk Threshold Analysis Results.
Nov 4 2025, 2:53 PM
gkyziridis edited P84751 Revert Risk Threshold Analysis Results.
Nov 4 2025, 2:49 PM
gkyziridis edited P84751 Revert Risk Threshold Analysis Results.
Nov 4 2025, 2:46 PM
gkyziridis edited P84751 Revert Risk Threshold Analysis Results.
Nov 4 2025, 2:44 PM
gkyziridis edited P84751 Revert Risk Threshold Analysis Results.
Nov 4 2025, 2:37 PM
gkyziridis edited P84751 Revert Risk Threshold Analysis Results.
Nov 4 2025, 2:33 PM
gkyziridis added a comment to T408607: AI/ML Infrastructure Request: Assistance in Rolling out Revert Risk to wikis that don't have damaging/goodfaith models.

✅ The RevertRisk-Threshold-Analysis is finished running for all the wikis listed here: https://phabricator.wikimedia.org/P84306.

Nov 4 2025, 2:31 PM · Patch-For-Review, MediaWiki-Recent-changes, PersonalDashboard, Moderator-Tools-Team, Machine-Learning-Team
gkyziridis edited P84751 Revert Risk Threshold Analysis Results.
Nov 4 2025, 2:27 PM
gkyziridis edited P84751 Revert Risk Threshold Analysis Results.
Nov 4 2025, 2:19 PM
gkyziridis edited P84751 Revert Risk Threshold Analysis Results.
Nov 4 2025, 2:13 PM
gkyziridis edited P84751 Revert Risk Threshold Analysis Results.
Nov 4 2025, 2:09 PM
gkyziridis edited P84751 Revert Risk Threshold Analysis Results.
Nov 4 2025, 2:08 PM
gkyziridis edited P84751 Revert Risk Threshold Analysis Results.
Nov 4 2025, 2:02 PM
gkyziridis edited P84751 Revert Risk Threshold Analysis Results.
Nov 4 2025, 1:58 PM
gkyziridis edited P84751 Revert Risk Threshold Analysis Results.
Nov 4 2025, 1:53 PM
gkyziridis edited P84751 Revert Risk Threshold Analysis Results.
Nov 4 2025, 1:48 PM
gkyziridis edited P84751 Revert Risk Threshold Analysis Results.
Nov 4 2025, 1:45 PM
gkyziridis edited P84751 Revert Risk Threshold Analysis Results.
Nov 4 2025, 1:43 PM
gkyziridis edited P84751 Revert Risk Threshold Analysis Results.
Nov 4 2025, 1:40 PM
gkyziridis edited P84751 Revert Risk Threshold Analysis Results.
Nov 4 2025, 1:36 PM
gkyziridis edited P84751 Revert Risk Threshold Analysis Results.
Nov 4 2025, 1:32 PM
gkyziridis edited P84751 Revert Risk Threshold Analysis Results.
Nov 4 2025, 1:27 PM
gkyziridis edited P84751 Revert Risk Threshold Analysis Results.
Nov 4 2025, 1:23 PM
gkyziridis edited P84751 Revert Risk Threshold Analysis Results.
Nov 4 2025, 1:21 PM
gkyziridis updated the language for P84751 Revert Risk Threshold Analysis Results from autodetect to shell.
Nov 4 2025, 1:17 PM
gkyziridis updated the language for P84751 Revert Risk Threshold Analysis Results from text to autodetect.
Nov 4 2025, 1:17 PM
gkyziridis updated the language for P84751 Revert Risk Threshold Analysis Results from shell to text.
Nov 4 2025, 1:16 PM
gkyziridis edited P84751 Revert Risk Threshold Analysis Results.
Nov 4 2025, 1:15 PM
gkyziridis created P84751 Revert Risk Threshold Analysis Results.
Nov 4 2025, 1:12 PM
gkyziridis claimed T408607: AI/ML Infrastructure Request: Assistance in Rolling out Revert Risk to wikis that don't have damaging/goodfaith models.
Nov 4 2025, 11:55 AM · Patch-For-Review, MediaWiki-Recent-changes, PersonalDashboard, Moderator-Tools-Team, Machine-Learning-Team

Nov 3 2025

gkyziridis added a comment to T367048: Update kserve to 0.15.2.

Thank you so much for working on that one @klausman!

Nov 3 2025, 1:01 PM · Essential-Work, Patch-For-Review, Machine-Learning-Team

Oct 24 2025

gkyziridis added a comment to T406217: Export retrained Tone-check model to an S3 bucket.

After some discussions with people from the DE team, I am pasting here some ideas and good practices which answer the above comments.

Oct 24 2025, 3:02 PM · Patch-For-Review, Machine-Learning-Team

Oct 23 2025

gkyziridis updated subscribers of T407155: [SPIKE] Define process for validating Tone Check model eval data for languages staff members do not speak.

I checked for dewiki and that still contains lots of English text. What was the comment regex used for dewiki? I'm not an active editor, but I'm sure we can ask WMDE for quick suggestions for more dewiki-specific comment-terms indicating a tone-issue.

This does indeed happen in dewiki; I noticed the same thing when digging into the training data.
The above list of wiki samples was parsed directly from the training data, so we did not use any regexes or preprocessing steps: these are the data that were fed into the model to train it to capture peacock language. Since these data are used in the training process, we assume that they contain high-quality signals for training a model to capture peacock tone.
The problem that arises now is that, on the one hand, these data contain clear signals for peacock language (since they were selected as training data, we can get high-probability predictions); on the other hand, the revision diffs of these samples may not contain "tone-check/peacock" related text, which is what we want for Annotool.
As @AikoChou says:

We can use samples from the model's training/evaluation datasets in order to assure that the data samples that we'll provide on Annotool will be very "tone-check/peacock" related.

Oct 23 2025, 11:40 AM · Machine-Learning-Team, EditCheck, VisualEditor
gkyziridis moved T406179: Q2 FY2025-26 Goal: Host Wikidata Revert Risk model on LiftWing from Unsorted to In Progress on the Machine-Learning-Team board.
Oct 23 2025, 10:12 AM · Patch-For-Review, OKR-Work, Goal, Wikimedia Enterprise - Content Integrity, Wikimedia Enterprise, Wikidata, Lift-Wing, Machine-Learning-Team
gkyziridis claimed T406179: Q2 FY2025-26 Goal: Host Wikidata Revert Risk model on LiftWing.
Oct 23 2025, 10:12 AM · Patch-For-Review, OKR-Work, Goal, Wikimedia Enterprise - Content Integrity, Wikimedia Enterprise, Wikidata, Lift-Wing, Machine-Learning-Team
gkyziridis claimed T406217: Export retrained Tone-check model to an S3 bucket.
Oct 23 2025, 10:11 AM · Patch-For-Review, Machine-Learning-Team
gkyziridis moved T406217: Export retrained Tone-check model to an S3 bucket from Unsorted to In Progress on the Machine-Learning-Team board.
Oct 23 2025, 10:10 AM · Patch-For-Review, Machine-Learning-Team
gkyziridis added a comment to T406179: Q2 FY2025-26 Goal: Host Wikidata Revert Risk model on LiftWing.

I am pasting here some useful information and links based on our meeting with @Trokhymovych.

Oct 23 2025, 10:03 AM · Patch-For-Review, OKR-Work, Goal, Wikimedia Enterprise - Content Integrity, Wikimedia Enterprise, Wikidata, Lift-Wing, Machine-Learning-Team

Oct 22 2025

gkyziridis updated the language for P84250 Error during running export_model_s3 via WMFKubernetesPodOperator on airflow from shell to python.
Oct 22 2025, 12:04 PM
gkyziridis created P84250 Error during running export_model_s3 via WMFKubernetesPodOperator on airflow.
Oct 22 2025, 11:58 AM

Oct 20 2025

gkyziridis added a comment to T406217: Export retrained Tone-check model to an S3 bucket.
  1. Using the airflow:2025-06-23-122527-798105179248020e79ffe44a8cb442d8675e9db7 tag, it never loaded in the Airflow UI (it was parsing the code forever).
Oct 20 2025, 2:26 PM · Patch-For-Review, Machine-Learning-Team
gkyziridis added a comment to T406217: Export retrained Tone-check model to an S3 bucket.

Can I do the opposite? Use wmf_airflow_common.clients.s3 directly in WMFKubernetesPodOperator?

Oct 20 2025, 11:51 AM · Patch-For-Review, Machine-Learning-Team
gkyziridis added a comment to T406217: Export retrained Tone-check model to an S3 bucket.

Q: What do you think of this architecture, which builds the model_export functionality in the ml-pipelines repo and not in the Airflow-DAGs repo?
A: If it works for you, then all the better. If you think that this is something that could benefit everyone, then wmf_airflow_common might be a better place for it, but if it is ml-related and you're happy with it, then by all means.

Oct 20 2025, 11:27 AM · Patch-For-Review, Machine-Learning-Team

Oct 17 2025

gkyziridis added a project to T406217: Export retrained Tone-check model to an S3 bucket: Patch-For-Review.
Oct 17 2025, 2:34 PM · Patch-For-Review, Machine-Learning-Team
gkyziridis updated subscribers of T406217: Export retrained Tone-check model to an S3 bucket.

Hey @brouberol, we are currently working on this task and there are some open questions.
Here is the idea:
We will build a kokkuri image which will include the logic for copying the trained model from the PVC to an S3 bucket.
This logic will be implemented in the ml-pipelines repo, following the same logic and CI/CD processes as the retraining image. More precisely, this code will be containerised via kokkuri pipelines during the gitlab-ci process, and the generated image will be pushed to the machine-learning/ml-pipelines/ docker-registry.
This image will be run via WMFKubernetesPodOperator in the tone-check retraining DAG as the last step of the pipeline.
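
The copy step described above could look roughly like the following. This is only a minimal sketch under stated assumptions, not the actual ml-pipelines code: the function names, the bucket name and the key prefix are hypothetical, and the client is assumed to be any object exposing an `upload_file(Filename, Bucket, Key)` method, as a boto3 S3 client does.

```python
from pathlib import Path


def plan_uploads(model_dir, prefix):
    """Map every file under model_dir (e.g. the PVC mount) to an S3 key."""
    root = Path(model_dir)
    return [
        (path, f"{prefix.rstrip('/')}/{path.relative_to(root).as_posix()}")
        for path in sorted(root.rglob("*"))
        if path.is_file()
    ]


def export_model(client, model_dir, bucket, prefix):
    """Upload each planned file and return the list of uploaded keys.

    `client` is assumed to expose upload_file(Filename, Bucket, Key),
    which is the call signature of a boto3 S3 client.
    """
    uploaded = []
    for path, key in plan_uploads(model_dir, prefix):
        client.upload_file(str(path), bucket, key)
        uploaded.append(key)
    return uploaded
```

Inside the container, the client would typically be created with `boto3.client("s3", ...)`; the endpoint and credentials depend on the cluster setup and are deliberately left out of this sketch.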

Questions
  • Do we need specific permissions for copying files from the PVC to an S3 bucket on the POD side and/or in the container itself?
  • Do we need something specific when creating the S3 bucket that we will use for exporting the models?
  • What do you think of this architecture, which builds the model_export functionality in the ml-pipelines repo and not in the Airflow-DAGs repo?
Oct 17 2025, 11:04 AM · Patch-For-Review, Machine-Learning-Team

Oct 16 2025

gkyziridis updated the task description for T406217: Export retrained Tone-check model to an S3 bucket.
Oct 16 2025, 10:13 AM · Patch-For-Review, Machine-Learning-Team

Oct 15 2025

gkyziridis added a comment to T407155: [SPIKE] Define process for validating Tone Check model eval data for languages staff members do not speak.

How will samples be validated?
Validation of text quality (how tone-check/peacock related a sample is) is a manual process: the gathered samples are translated and reviewed one by one. A native speaker can judge the quality of the text diff and whether it contains peacock words consistent with the model outcome (prediction and probability). Otherwise, we can use translation.
Validation of data cleaning is done by postprocessing methods during sampling.

Oct 15 2025, 2:36 PM · Machine-Learning-Team, EditCheck, VisualEditor

Oct 13 2025

gkyziridis added a comment to T367048: Update kserve to 0.15.2.

Trying to build the Apple Silicon version (reproducing the current one in inference-services) following the Readme.md, I get the following error:

$ docker build --target production -f .pipeline/huggingface/blubber_m1.yaml -t hf:kserve-m1 .
[+] Building 2.8s (18/24)                                                                                                                                                                                                                                                                                                                                                                                        docker:desktop-linux
 => [internal] load build definition from blubber_m1.yaml                                                                                                                                                                                                                                                                                                                                                                        
 => => transferring dockerfile: 1.74kB                                                                                                                                                                                                                                                                                                                                                                                           
 => resolve image config for docker-image://docker-registry.wikimedia.org/repos/releng/blubber/buildkit:v0.23.0                                                                                                                                                                                                                                                                                                                  
 => CACHED docker-image://docker-registry.wikimedia.org/repos/releng/blubber/buildkit:v0.23.0@sha256:6b1535a39497bb6c5e0a733595721a91cee33dba99ab59d8323d077665073a53                                                                                                                                                                                                                                                            
 => => resolve docker-registry.wikimedia.org/repos/releng/blubber/buildkit:v0.23.0@sha256:6b1535a39497bb6c5e0a733595721a91cee33dba99ab59d8323d077665073a53                                                                                                                                                                                                                                                                       
 => local://dockerfile                                                                                                                                                                                                                                                                                                                                                                                                           
 => => transferring dockerfile: 1.74kB                                                                                                                                                                                                                                                                                                                                                                                           
 => local://context                                                                                                                                                                                                                                                                                                                                                                                                              
 => => transferring context: 34B                                                                                                                                                                                                                                                                                                                                                                                                 
 => [internal] load metadata for docker.io/arm64v8/debian:stable                                                                                                                                                                                                                                                                                                                                                                 
 => [internal] load build context                                                                                                                                                                                                                                                                                                                                                                                                
 => => transferring context: 261B                                                                                                                                                                                                                                                                                                                                                                                                
 => CACHED [build  1/10] FROM docker.io/arm64v8/debian:stable@sha256:5168358b65bf8037b57bb3f7bdf8abf59c3853d7a0c1b846f65ad764b57aa356                                                                                                                                                                                                                                                                                            
 => => resolve docker.io/arm64v8/debian:stable@sha256:5168358b65bf8037b57bb3f7bdf8abf59c3853d7a0c1b846f65ad764b57aa356                                                                                                                                                                                                                                                                                                           
 => CACHED [build  2/10] RUN apt-get update && apt-get install -y "build-essential" "git" "python3-pip" "python3-dev" "python3-setuptools" "python3-venv" && rm -rf /var/lib/apt/lists/*                                                                                                                                                                                                                                         
 => CACHED [build  3/10] RUN (getent group "65533" || groupadd -o -g "65533" -r "somebody") && (getent passwd "65533" || useradd -l -o -m -d "/home/somebody" -r -g "65533" -u "65533" "somebody") && mkdir -p "/srv/app" && chown "65533":"65533" "/srv/app" && mkdir -p "/opt/lib" && chown "65533":"65533" "/opt/lib"                                                                                                       
 => CACHED [build  4/10] RUN (getent group "900" || groupadd -o -g "900" -r "runuser") && (getent passwd "900" || useradd -l -o -m -d "/home/runuser" -r -g "900" -u "900" "runuser")                                                                                                                                                                                                                                            
 => CACHED [build  5/10] WORKDIR /srv/app                                                                                                                                                                                                                                                                                                                                                                                        
 => CACHED [build  6/10] RUN git "clone" "--branch" "apple-silicon" "https://github.com/wikimedia/kserve.git" "kserve_repo"                                                                                                                                                                                                                                                                                                      
 => CACHED [build  7/10] COPY --chown=65533:65533 [src/models/huggingface_modelserver/requirements_apple_silicon.txt, src/models/huggingface_modelserver/]                                                                                                                                                                                                                                                                       
 => CACHED [build  8/10] RUN python3 "-m" "venv" "/opt/lib/venv" "--system-site-packages"                                                                                                                                                                                                                                                                                                                                        
 => CACHED [build  9/10] RUN python3 "-m" "pip" "install" "-U" "setuptools!=60.9.0" && python3 "-m" "pip" "install" "-U" "wheel" "tox" "pip"                                                                                                                                                                                                                                                                                     
 => ERROR [build 10/10] RUN python3 "-m" "pip" "install" "-r" "src/models/huggingface_modelserver/requirements_apple_silicon.txt"                                                                                                                                                                                                                                                                                                
 => CANCELED [production 2/8] RUN apt-get update && apt-get install -y "python3" "python3-distutils" "python3-pip" "python3-setuptools" && rm -rf /var/lib/apt/lists/*                                                                                                                                                                                                                                                           
------                                                                                                                                                                                                                                                                                                                                                                                                                                
 > [build 10/10] RUN python3 "-m" "pip" "install" "-r" "src/models/huggingface_modelserver/requirements_apple_silicon.txt":                                                                                                                                                                                                                                                                                                           
0.308 Processing ./kserve_repo/python/kserve (from -r src/models/huggingface_modelserver/requirements_apple_silicon.txt (line 1))                                                                                                                                                                                                                                                                                                     
0.309   Installing build dependencies: started                                                                                                                                                                                                                                                                                                                                                                                        
0.752   Installing build dependencies: finished with status 'done'                                                                                                                                                                                                                                                                                                                                                                    
0.752   Getting requirements to build wheel: started                                                                                                                                                                                                                                                                                                                                                                                  
0.862   Getting requirements to build wheel: finished with status 'done'                                                                                                                                                                                                                                                                                                                                                              
0.863   Preparing metadata (pyproject.toml): started                                                                                                                                                                                                                                                                                                                                                                                  
1.030   Preparing metadata (pyproject.toml): finished with status 'done'                                                                                                                                                                                                                                                                                                                                                              
1.033 Obtaining file:///srv/app/kserve_repo/python/huggingfaceserver (from -r src/models/huggingface_modelserver/requirements_apple_silicon.txt (line 2))                                                                                                                                                                                                                                                                             
1.034   Installing build dependencies: started                                                                                                                                                                                                                                                                                                                                                                                        
1.288   Installing build dependencies: finished with status 'done'                                                                                                                                                                                                                                                                                                                                                                    
1.288   Checking if build backend supports build_editable: started
1.398   Checking if build backend supports build_editable: finished with status 'done'
1.399   Getting requirements to build editable: started
1.462   Getting requirements to build editable: finished with status 'done'
1.463   Preparing editable metadata (pyproject.toml): started
1.574   Preparing editable metadata (pyproject.toml): finished with status 'done'
1.579 INFO: pip is looking at multiple versions of kserve to determine which version is compatible with other requirements. This could take a while.
1.651 ERROR: Package 'kserve' requires a different Python: 3.13.5 not in '<3.13,>=3.9'
------
ERROR: failed to solve: process "/bin/sh -c python3 \"-m\" \"pip\" \"install\" \"-r\" \"src/models/huggingface_modelserver/requirements_apple_silicon.txt\"" did not complete successfully: exit code: 1
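
The last pip error comes down to a version-range check: the interpreter in the image reports 3.13.5, while kserve declares '<3.13,>=3.9'. The sketch below reproduces that comparison with plain release tuples. It is a simplified illustration that assumes purely numeric versions (pip itself evaluates full PEP 440 specifiers), and the helper names are hypothetical.

```python
def parse_release(version: str, width: int = 3) -> tuple:
    """Turn '3.13.5' into (3, 13, 5), zero-padding short versions like '3.9'."""
    parts = [int(part) for part in version.split(".")]
    parts += [0] * (width - len(parts))
    return tuple(parts)


def satisfies(version: str, lower_inclusive: str, upper_exclusive: str) -> bool:
    """Check lower_inclusive <= version < upper_exclusive on release tuples."""
    v = parse_release(version)
    return parse_release(lower_inclusive) <= v < parse_release(upper_exclusive)


# Constraint from the log: '<3.13,>=3.9'
print(satisfies("3.13.5", "3.9", "3.13"))  # False: the version in the log
print(satisfies("3.12.4", "3.9", "3.13"))  # True: an example 3.12 release
```

Under this reading, any 3.13.x interpreter is rejected, so the build can only pass with a Python between 3.9 and 3.12 or with a kserve release that widens the range.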

Trying to update to kserve==0.15.2 for huggingface by adding it to the requirements.txt.
It seems that I can build the docker image locally:

$ docker build --target production -f .pipeline/huggingface/blubber.yaml --platform=linux/amd64 -t hf:kserve .
[+] Building 238.5s (19/19) FINISHED                                                                                                                                                                                                                                                                                                                                                                             docker:desktop-linux
 => [internal] load build definition from blubber.yaml                                                                                                                                                                                                                                                                                                                                                                           
 => => transferring dockerfile: 1.69kB                                                                                                                                                                                                                                                                                                                                                                                           
 => resolve image config for docker-image://docker-registry.wikimedia.org/repos/releng/blubber/buildkit:v1.1.0                                                                                                                                                                                                                                                                                                                   
 => CACHED docker-image://docker-registry.wikimedia.org/repos/releng/blubber/buildkit:v1.1.0@sha256:ed64673e7c362f5f264f92ea36bf379f47c90699ee36d32ed6a96aba7de858f2                                                                                                                                                                                                                                                             
 => => resolve docker-registry.wikimedia.org/repos/releng/blubber/buildkit:v1.1.0@sha256:ed64673e7c362f5f264f92ea36bf379f47c90699ee36d32ed6a96aba7de858f2                                                                                                                                                                                                                                                                        
 => [internal] load .dockerignore                                                                                                                                                                                                                                                                                                                                                                                                
 => => transferring context: 88B                                                                                                                                                                                                                                                                                                                                                                                                 
 => [internal] load build context                                                                                                                                                                                                                                                                                                                                                                                                
 => => transferring context: 1.71MB                                                                                                                                                                                                                                                                                                                                                                                              
 => [build]      🌐 docker-registry.wikimedia.org/amd-pytorch23:2.3.0rocm6.0-2@sha256:61bca3a10ca76c96aa663f53b0adc9598b9f07b735846059e441f24f5ca77084                                                                                                                                                                                                                                                                          
 => => resolve docker-registry.wikimedia.org/amd-pytorch23:2.3.0rocm6.0-2@sha256:61bca3a10ca76c96aa663f53b0adc9598b9f07b735846059e441f24f5ca77084                                                                                                                                                                                                                                                                                
 => => sha256:2418fb0183302501eff228114f00be99215561edf1137cd3e1c122cc463df2e7 2.50GB / 2.50GB                                                                                                                                                                                                                                                                                                                                 
 => => extracting sha256:2418fb0183302501eff228114f00be99215561edf1137cd3e1c122cc463df2e7                                                                                                                                                                                                                                                                                                                                       
 => [production] 🖥️ # (getent group "65533" || groupadd -o -g "65533" -r "somebody") && (getent passwd "65533" || useradd -l -o -m -d "/home/somebody" -r -g "65533" -u "65533" "somebody") && mkdir -p "/srv/app" && chown "65533":"65533" "/srv/app" && mkdir -p "/opt/lib" && chown "65533":"65533" "/opt/lib"                                                                                                                
 => [build]      🖥️ # apt-get update && apt-get install -y "build-essential" "git" "python3-venv" && rm -rf /var/lib/apt/lists/*                                                                                                                                                                                                                                                                                                
 => [production] 🖥️ # (getent group "900" || groupadd -o -g "900" -r "runuser") && (getent passwd "900" || useradd -l -o -m -d "/home/runuser" -r -g "900" -u "900" "runuser")                                                                                                                                                                                                                                                   
 => [build]      🖥️ # (getent group "65533" || groupadd -o -g "65533" -r "somebody") && (getent passwd "65533" || useradd -l -o -m -d "/home/somebody" -r -g "65533" -u "65533" "somebody") && mkdir -p "/srv/app" && chown "65533":"65533" "/srv/app" && mkdir -p "/opt/lib" && chown "65533":"65533" "/opt/lib"
 => [build]      🖥️ # (getent group "900" || groupadd -o -g "900" -r "runuser") && (getent passwd "900" || useradd -l -o -m -d "/home/runuser" -r -g "900" -u "900" "runuser")                                                                                                                                                                                                                                                   
 => [build]      📂 [src/models/huggingface_modelserver/requirements.txt] -> src/models/huggingface_modelserver/                                                                                                                                                                                                                                                                                                                  
 => [build]      🖥️ @65533 $ python3 "-m" "venv" "/opt/lib/venv"                                                                                                                                                                                                                                                                                                                                                                 
 => [build]      🖥️ @65533 $ python3 "-m" "pip" "install" "-U" "setuptools!=60.9.0" && python3 "-m" "pip" "install" "-U" "wheel" "tox" "pip"                                                                                                                                                                                                                                                                                     
 => [build]      🖥️ @65533 $ python3 "-m" "pip" "install" "-r" "src/models/huggingface_modelserver/requirements.txt"                                                                                                                                                                                                                                                                                                            
 => [production] 📦 {build}[/opt/lib/python/site-packages] -> /opt/lib/python/site-packages                                                                                                                                                                                                                                                                                                                                       
 => [production] 📦 {build}[/opt/lib/venv/lib/python3.11/site-packages/] -> /opt/lib/venv/lib/python3.11/site-packages/                                                                                                                                                                                                                                                                                                           
 => [production] 📂 [src/models/huggingface_modelserver/entrypoint.sh] -> .                                                                                                                                                                                                                                                                                                                                                       
 => exporting to image                                                                                                                                                                                                                                                                                                                                                                                                           
 => => exporting layers                                                                                                                                                                                                                                                                                                                                                                                                          
 => => exporting manifest sha256:abf4a516922b56ace4d0ee806c59913e9632463a81e411abb79682f37cd79cfd                                                                                                                                                                                                                                                                                                                                
 => => exporting config sha256:3dd855bb5159730dc5a12da47537ebb2512bc538758dd6c3f189933dbef9ee91                                                                                                                                                                                                                                                                                                                                  
 => => exporting attestation manifest sha256:be3ea92b76c3e132319b83a34b5544607895f5cf1b10f1004a856c4d81fbff9e                                                                                                                                                                                                                                                                                                                   
 => => exporting manifest list sha256:705dd69b803362093c919214a0f748222e50dc01c5960059042c5f93e4dace66                                                                                                                                                                                                                                                                                                                          
 => => naming to docker.io/library/hf:kserve
Oct 13 2025, 12:52 PM · Essential-Work, Patch-For-Review, Machine-Learning-Team
gkyziridis updated the task description for T396495: Build model training pipeline for tone check using WMF ML Airflow instance.
Oct 13 2025, 8:59 AM · Data-Platform-SRE (2026.01.23 - 2026.02.13), Essential-Work, Editing-team (Tracking), Machine-Learning-Team

Oct 7 2025

gkyziridis closed T403236: Fix revscoring load tests to match staging deployments as Resolved.
Oct 7 2025, 1:59 PM · Essential-Work, Machine-Learning-Team
gkyziridis updated the task description for T403236: Fix revscoring load tests to match staging deployments.
Oct 7 2025, 12:47 PM · Essential-Work, Machine-Learning-Team
gkyziridis added a comment to T403236: Fix revscoring load tests to match staging deployments.

Locust test results for all revscoring models

gkyziridis@stat1010:~/inference-services/test/locust$ MODEL_LOCUST_DIR=revscoring make run-locust-test
# users = 35
# spawn-rate = 2
# run-time = 120s

MODEL=revscoring my_locust_venv/bin/locust --headless --csv results/revscoring
[2025-10-07 12:31:35,632] stat1010/INFO/locust.main: Run time limit set to 120 seconds
[2025-10-07 12:31:35,632] stat1010/INFO/locust.main: Starting Locust 2.31.5
[2025-10-07 12:31:35,633] stat1010/INFO/locust.runners: Ramping to 35 users at a rate of 2.00 per second
[2025-10-07 12:31:52,653] stat1010/INFO/locust.runners: All users spawned: {"EnwikiArticlequality": 5, "EnwikiArticletopic": 5, "EnwikiDamaging": 5, "EnwikiDraftquality": 5, "EnwikiDrafttopic": 5, "EnwikiGoodfaith": 5, "ViwikiReverted": 5} (35 total users)
[2025-10-07 12:33:35,096] stat1010/INFO/locust.main: --run-time limit reached, shutting down
Load test results are within the threshold
[2025-10-07 12:33:35,188] stat1010/INFO/locust.main: Shutting down (exit code 0)
Type     Name                                       # reqs      # fails |    Avg    Min    Max    Med |   req/s  failures/s
---------------------------------------------------------------------------------------------------------------------------
POST     /v1/models/enwiki-articlequality:predict      134     0(0.00%) |   1032    369   4788    710 |    1.12        0.00
POST     /v1/models/enwiki-articletopic:predict        183     0(0.00%) |    168    114    705    140 |    1.53        0.00
POST     /v1/models/enwiki-damaging:predict            133     0(0.00%) |   1175    154   7984    480 |    1.11        0.00
POST     /v1/models/enwiki-draftquality:predict        140     0(0.00%) |    970    157   4897    460 |    1.17        0.00
POST     /v1/models/enwiki-drafttopic:predict          174     0(0.00%) |    211    111    865    150 |    1.46        0.00
POST     /v1/models/enwiki-goodfaith:predict           139     0(0.00%) |   1015    163   4293    420 |    1.16        0.00
POST     /v1/models/viwiki-reverted:predict            162     0(0.00%) |    279    125   1246    170 |    1.36        0.00
---------------------------------------------------------------------------------------------------------------------------
         Aggregated                                   1065     0(0.00%) |    642    111   7984    270 |    8.91        0.00

Response time percentiles (approximated)
Type     Name                                          50%    66%    75%    80%    90%    95%    98%    99%  99.9% 99.99%   100% # reqs
---------------------------------------------------------------------------------------------------------------------------------------
POST     /v1/models/enwiki-articlequality:predict      720    830    970   1100   2900   3800   4100   4300   4800   4800   4800    134
POST     /v1/models/enwiki-articletopic:predict        140    160    170    180    260    290    370    520    710    710    710    183
POST     /v1/models/enwiki-damaging:predict            480    960   1500   2000   3300   4200   4900   6800   8000   8000   8000    133
POST     /v1/models/enwiki-draftquality:predict        460    960   1300   1800   2400   3600   4000   4600   4900   4900   4900    140
POST     /v1/models/enwiki-drafttopic:predict          150    170    230    250    340    660    700    790    870    870    870    174
POST     /v1/models/enwiki-goodfaith:predict           420    720   1100   1500   3200   3900   4100   4200   4300   4300   4300    139
POST     /v1/models/viwiki-reverted:predict            170    240    310    330    640    760    900    920   1200   1200   1200    162
---------------------------------------------------------------------------------------------------------------------------------------
         Aggregated                                    270    460    660    780   1600   3100   4000   4300   6800   8000   8000   1065
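
As a quick sanity check on the table above, the Aggregated row can be reproduced from the per-endpoint rows: the total request count is a plain sum, and the aggregated average is the request-weighted mean of the per-endpoint averages. The sketch below uses only the numbers from this run; because the per-endpoint averages are already rounded to whole milliseconds, the result is approximate.

```python
# (request count, average response time in ms) per endpoint,
# copied from the 2025-10-07 run above.
per_endpoint = {
    "enwiki-articlequality": (134, 1032),
    "enwiki-articletopic": (183, 168),
    "enwiki-damaging": (133, 1175),
    "enwiki-draftquality": (140, 970),
    "enwiki-drafttopic": (174, 211),
    "enwiki-goodfaith": (139, 1015),
    "viwiki-reverted": (162, 279),
}

total_requests = sum(count for count, _ in per_endpoint.values())

# Weight each endpoint's average by how many requests it served.
weighted_avg = (
    sum(count * avg for count, avg in per_endpoint.values()) / total_requests
)

print(total_requests, round(weighted_avg))  # -> 1065 642
```

Both values match the Aggregated row (1065 requests, 642 ms average). The same weighting does not apply to the median or the percentiles, which cannot be derived from the per-endpoint rows.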

Oct 7 2025, 12:40 PM · Essential-Work, Machine-Learning-Team
gkyziridis created P83650 Locust test results for all revscoring models..
Oct 7 2025, 12:35 PM
gkyziridis closed T405358: Add LiftWing streams data to event_sanitized (increase data retention) as Resolved.
Oct 7 2025, 11:03 AM · Lift-Wing, Machine-Learning-Team
gkyziridis added a comment to T405358: Add LiftWing streams data to event_sanitized (increase data retention).

Thanks for clearing that up! I assume data eng are the ones that deploy this but we could open the patch for it.

Oct 7 2025, 10:36 AM · Lift-Wing, Machine-Learning-Team
gkyziridis added a comment to T405358: Add LiftWing streams data to event_sanitized (increase data retention).

Can we verify that the tables exist before we resolve this? I ran a quick check and the table event_sanitized.mediawiki_page_outlink_topic_prediction_change_v1 doesn't seem to exist. If I understand the documentation correctly, there is a cron job that runs every hour, but perhaps something else is needed the first time?

Oct 7 2025, 9:33 AM · Lift-Wing, Machine-Learning-Team
gkyziridis moved T405358: Add LiftWing streams data to event_sanitized (increase data retention) from 2025-2026 Q1 Done to In Progress on the Machine-Learning-Team board.
Oct 7 2025, 8:39 AM · Lift-Wing, Machine-Learning-Team
gkyziridis added a comment to T405358: Add LiftWing streams data to event_sanitized (increase data retention).

So this way, both tables are exported to the event_sanitized schema, keeping all of their columns.

Oct 7 2025, 8:08 AM · Lift-Wing, Machine-Learning-Team

Oct 6 2025

gkyziridis closed T405358: Add LiftWing streams data to event_sanitized (increase data retention) as Resolved.
Oct 6 2025, 1:42 PM · Lift-Wing, Machine-Learning-Team
gkyziridis updated the task description for T406217: Export retrained Tone-check model to an S3 bucket.
Oct 6 2025, 12:06 PM · Patch-For-Review, Machine-Learning-Team
gkyziridis moved T403236: Fix revscoring load tests to match staging deployments from Ready To Go to In Progress on the Machine-Learning-Team board.
Oct 6 2025, 11:50 AM · Essential-Work, Machine-Learning-Team
gkyziridis added a comment to T403088: Investigate revscoring-editquality-damaging alert triggered by MW API fetch errors.

One thing that stands out from the logs is that multiple concurrent requests hit the same revisions. For example, there are 107 requests for the same rev_id: 41731385, and each of them fetches from the MW API. A caching mechanism would help here by reducing redundant API calls and lowering the risk of timeouts under load.
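
To make the caching idea concrete, here is a minimal sketch of an in-memory cache that also coalesces concurrent requests for the same rev_id into a single fetch. It is not taken from inference-services: the RevisionCache class, the fake_fetch coroutine, and the 60-second TTL are all assumptions for illustration, and it assumes the MW API fetch is an async coroutine.

```python
import asyncio
import time


class RevisionCache:
    """In-memory TTL cache that shares one in-flight fetch per rev_id."""

    def __init__(self, fetch, ttl_seconds=60.0):
        self._fetch = fetch      # hypothetical coroutine: rev_id -> data
        self._ttl = ttl_seconds  # assumed TTL, tune to the real workload
        self._entries = {}       # rev_id -> (expiry timestamp, data)
        self._in_flight = {}     # rev_id -> asyncio.Task currently fetching

    async def get(self, rev_id):
        entry = self._entries.get(rev_id)
        if entry and entry[0] > time.monotonic():
            return entry[1]  # fresh cache hit, no API call

        task = self._in_flight.get(rev_id)
        if task is None:
            # First caller for this rev_id starts the fetch; later callers
            # arriving before it finishes await the same task.
            task = asyncio.ensure_future(self._fetch(rev_id))
            self._in_flight[rev_id] = task
            task.add_done_callback(lambda _: self._in_flight.pop(rev_id, None))

        data = await task
        self._entries[rev_id] = (time.monotonic() + self._ttl, data)
        return data


async def demo():
    calls = 0

    async def fake_fetch(rev_id):  # stand-in for the real MW API request
        nonlocal calls
        calls += 1
        await asyncio.sleep(0.01)
        return {"rev_id": rev_id}

    cache = RevisionCache(fake_fetch)
    results = await asyncio.gather(*(cache.get(41731385) for _ in range(107)))
    return calls, len(results)


calls, n_results = asyncio.run(demo())
print(calls, n_results)  # -> 1 107
```

In this sketch, 107 concurrent lookups for the same rev_id result in one upstream call. A per-pod cache like this would not deduplicate requests across replicas; a shared cache would be needed for that.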

Oct 6 2025, 9:37 AM · Essential-Work, Machine-Learning-Team
gkyziridis updated the task description for T403236: Fix revscoring load tests to match staging deployments.
Oct 6 2025, 8:16 AM · Essential-Work, Machine-Learning-Team

Oct 3 2025

gkyziridis claimed T403236: Fix revscoring load tests to match staging deployments.
Oct 3 2025, 4:00 PM · Essential-Work, Machine-Learning-Team
gkyziridis added a comment to T403236: Fix revscoring load tests to match staging deployments.

Locust test results

gkyziridis@stat1010:~/inference-services/test/locust$ MODEL_LOCUST_DIR=revscoring make run-locust-test
# users = 30
# spawn-rate = 2
# run-time = 120s

MODEL=revscoring my_locust_venv/bin/locust --headless --csv results/revscoring
[2025-10-03 15:35:03,793] stat1010/INFO/locust.main: Run time limit set to 120 seconds
[2025-10-03 15:35:03,793] stat1010/INFO/locust.main: Starting Locust 2.31.5
[2025-10-03 15:35:03,794] stat1010/INFO/locust.runners: Ramping to 30 users at a rate of 2.00 per second
[2025-10-03 15:35:17,813] stat1010/INFO/locust.runners: All users spawned: {"EnwikiArticlequality": 5, "EnwikiArticletopic": 5, "EnwikiDamaging": 5, "EnwikiDraftquality": 5, "EnwikiDrafttopic": 5, "EnwikiGoodfaith": 5} (30 total users)
[2025-10-03 15:37:03,242] stat1010/INFO/locust.main: --run-time limit reached, shutting down
Load test results are within the threshold
[2025-10-03 15:37:03,400] stat1010/INFO/locust.main: Shutting down (exit code 0)
Type     Name                                       # reqs      # fails |    Avg    Min    Max    Med |   req/s  failures/s
---------------------------------------------------------------------------------------------------------------------------
POST     /v1/models/enwiki-articlequality:predict      144     0(0.00%) |    910    410   4292    660 |    1.20        0.00
POST     /v1/models/enwiki-articletopic:predict        179     0(0.00%) |    189    114    696    150 |    1.50        0.00
POST     /v1/models/enwiki-damaging:predict            130     0(0.00%) |   1373    166  10566    530 |    1.09        0.00
POST     /v1/models/enwiki-draftquality:predict        146     0(0.00%) |    934    166   3963    490 |    1.22        0.00
POST     /v1/models/enwiki-drafttopic:predict          178     0(0.00%) |    199    107    762    150 |    1.49        0.00
POST     /v1/models/enwiki-goodfaith:predict           136     0(0.00%) |   1068    173   5963    450 |    1.14        0.00
---------------------------------------------------------------------------------------------------------------------------
         Aggregated                                    913     0(0.00%) |    723    107  10566    330 |    7.64        0.00

Response time percentiles (approximated)
Type     Name                                          50%    66%    75%    80%    90%    95%    98%    99%  99.9% 99.99%   100% # reqs
---------------------------------------------------------------------------------------------------------------------------------------
POST     /v1/models/enwiki-articlequality:predict      660    810    910    970   1400   3100   4100   4200   4300   4300   4300    144
POST     /v1/models/enwiki-articletopic:predict        150    170    180    210    300    380    680    690    700    700    700    179
POST     /v1/models/enwiki-damaging:predict            530    960   2000   2900   3500   4400   7700   8300  11000  11000  11000    130
POST     /v1/models/enwiki-draftquality:predict        510    910   1200   1600   2400   3000   3400   3600   4000   4000   4000    146
POST     /v1/models/enwiki-drafttopic:predict          150    170    180    210    320    660    720    730    760    760    760    178
POST     /v1/models/enwiki-goodfaith:predict           450    590   1100   1800   3300   4300   4900   5900   6000   6000   6000    136
---------------------------------------------------------------------------------------------------------------------------------------
         Aggregated                                    330    550    710    850   1900   3200   4200   4500  11000  11000  11000    913

Oct 3 2025, 3:50 PM · Essential-Work, Machine-Learning-Team
gkyziridis created P83594 Locust (load) tests for revscoring models.
Oct 3 2025, 3:39 PM
gkyziridis added a comment to T403236: Fix revscoring load tests to match staging deployments.

@gkyziridis the error is quite self-explanatory: the file is just not there :)

It seems that it has been renamed in https://gerrit.wikimedia.org/r/c/machinelearning/liftwing/inference-services/+/1192833 from test/locust/data/articles_lang_and_title.csv to test/locust/data/articles_lang_title_id.csv

I'm not sure whether all tests are actually being run or whether it is just evaluating the imports.

Oct 3 2025, 3:19 PM · Essential-Work, Machine-Learning-Team
gkyziridis added a comment to T403236: Fix revscoring load tests to match staging deployments.
(.venv) gkyziridis@stat1010:~/inference-services/test/locust$ MODEL=edit_check locust
MODEL: edit-check
models.edit-check
Traceback (most recent call last):
  File "/srv/home/gkyziridis/.venv/bin/locust", line 8, in <module>
    sys.exit(main())
  File "/srv/home/gkyziridis/.venv/lib/python3.9/site-packages/locust/main.py", line 112, in main
    docstring, _user_classes, shape_classes = load_locustfile(_locustfile)
  File "/srv/home/gkyziridis/.venv/lib/python3.9/site-packages/locust/util/load_locustfile.py", line 70, in load_locustfile
    loader.exec_module(imported)
  File "<frozen importlib._bootstrap_external>", line 790, in exec_module
  File "<frozen importlib._bootstrap>", line 228, in _call_with_frames_removed
  File "/srv/home/gkyziridis/inference-services/test/locust/locustfile.py", line 13, in <module>
    _models = importlib.import_module(f"models.{model}")
  File "/usr/lib/python3.9/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1030, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1007, in _find_and_load
  File "<frozen importlib._bootstrap>", line 972, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 228, in _call_with_frames_removed
  File "<frozen importlib._bootstrap>", line 1030, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1007, in _find_and_load
  File "<frozen importlib._bootstrap>", line 986, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 680, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 790, in exec_module
  File "<frozen importlib._bootstrap>", line 228, in _call_with_frames_removed
  File "/srv/home/gkyziridis/inference-services/test/locust/models/__init__.py", line 1, in <module>
    from .article_country import *  # noqa
  File "/srv/home/gkyziridis/inference-services/test/locust/models/article_country/__init__.py", line 1, in <module>
    from .article_country import ArticleCountry  # noqa
  File "/srv/home/gkyziridis/inference-services/test/locust/models/article_country/article_country.py", line 5, in <module>
    articles = pd.read_csv("data/articles_lang_and_title.csv")
  File "/srv/home/gkyziridis/.venv/lib/python3.9/site-packages/pandas/io/parsers/readers.py", line 1026, in read_csv
    return _read(filepath_or_buffer, kwds)
  File "/srv/home/gkyziridis/.venv/lib/python3.9/site-packages/pandas/io/parsers/readers.py", line 620, in _read
    parser = TextFileReader(filepath_or_buffer, **kwds)
  File "/srv/home/gkyziridis/.venv/lib/python3.9/site-packages/pandas/io/parsers/readers.py", line 1620, in __init__
    self._engine = self._make_engine(f, self.engine)
  File "/srv/home/gkyziridis/.venv/lib/python3.9/site-packages/pandas/io/parsers/readers.py", line 1880, in _make_engine
    self.handles = get_handle(
  File "/srv/home/gkyziridis/.venv/lib/python3.9/site-packages/pandas/io/common.py", line 873, in get_handle
    handle = open(
FileNotFoundError: [Errno 2] No such file or directory: 'data/articles_lang_and_title.csv'
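
The traceback shows that models/__init__.py imports article_country, which reads its CSV at import time, so running with MODEL=edit_check fails on data belonging to a different model. One possible way to avoid that coupling is to defer the read until the data is first used. The sketch below illustrates the idea only: load_rows is a hypothetical helper, it uses the standard csv module instead of pandas to stay self-contained, and the column names in the demo file are made up.

```python
import csv
import tempfile
from functools import lru_cache
from pathlib import Path


@lru_cache(maxsize=None)
def load_rows(csv_path):
    """Read the CSV on first use (not at import time) and cache the rows."""
    path = Path(csv_path)
    if not path.is_file():
        # Fail with a message that names the file the test actually needs.
        raise FileNotFoundError(f"Locust test data not found: {path}")
    with path.open(newline="") as handle:
        return tuple(csv.DictReader(handle))


# Demo with a throwaway file; the columns here are assumptions, not the
# real layout of test/locust/data/articles_lang_title_id.csv.
with tempfile.TemporaryDirectory() as tmp:
    data_file = Path(tmp) / "articles_lang_title_id.csv"
    data_file.write_text("lang,title,page_id\nen,Example,1\n")
    rows = load_rows(str(data_file))
    cached_again = load_rows(str(data_file)) is rows

print(rows[0]["title"], cached_again)  # -> Example True
```

With this pattern, a user class would call load_rows inside its task or on_start method, so importing the models package for one model no longer depends on another model's data file being present.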

Oct 3 2025, 2:49 PM · Essential-Work, Machine-Learning-Team
gkyziridis added a comment to T403236: Fix revscoring load tests to match staging deployments.

We could remove the zhwiki deployment and free up some resources. In staging it makes sense to have one deployment for each family of models, or an additional one if there is something different about it that would be worth testing (e.g. an additional deployment for wikidata).

I totally agree, I already removed it in the patch above.

Oct 3 2025, 11:52 AM · Essential-Work, Machine-Learning-Team
gkyziridis added a comment to T403236: Fix revscoring load tests to match staging deployments.
$ kube_env revscoring-editquality-goodfaith ml-staging-codfw
$ kubectl get pods
NAME                                                              READY   STATUS    RESTARTS   AGE
enwiki-goodfaith-predictor-00001-deployment-59dbbf5966-jxkpb      3/3     Running   0          41s
zhwiki-goodfaith-predictor-default-00029-deployment-79c9fdvdnl4   3/3     Running   0          36d
Oct 3 2025, 11:18 AM · Essential-Work, Machine-Learning-Team

Oct 2 2025

gkyziridis removed a project from T406217: Export retrained Tone-check model to an S3 bucket: Goal.
Oct 2 2025, 11:12 AM · Patch-For-Review, Machine-Learning-Team
gkyziridis created T406217: Export retrained Tone-check model to an S3 bucket.
Oct 2 2025, 11:11 AM · Patch-For-Review, Machine-Learning-Team

Oct 1 2025

gkyziridis added a comment to T405358: Add LiftWing streams data to event_sanitized (increase data retention).

For revert_risk_prediction_change I used the following schema:

mediawiki_page_revert_risk_prediction_change_v1:
    dt 
    revision:
        rev_id
    predicted_classification:
        model_name
        model_version
        predictions
        probabilities

I'm not sure, but I thought we could keep the rev_id for further use (joins, etc.).
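
To illustrate what keeping only these fields means for a single event, here is a small sketch that projects a nested event onto the schema above. It is not the actual sanitization job: keep_allowed is a hypothetical helper, and the sample event (including the extra performer and rev_parent_id fields and all values) is made up for the example.

```python
# Fields to keep, mirroring the schema listed above.
ALLOWLIST = {
    "dt": True,
    "revision": {"rev_id": True},
    "predicted_classification": {
        "model_name": True,
        "model_version": True,
        "predictions": True,
        "probabilities": True,
    },
}


def keep_allowed(event, allowlist):
    """Return a copy of event containing only the allowlisted fields."""
    kept = {}
    for field, rule in allowlist.items():
        if field not in event:
            continue
        if isinstance(rule, dict):
            # Nested struct: recurse, keeping only its allowlisted subfields.
            if isinstance(event[field], dict):
                kept[field] = keep_allowed(event[field], rule)
        else:
            kept[field] = event[field]
    return kept


# Made-up event; values and the non-allowlisted fields are illustrative only.
event = {
    "dt": "2025-10-01T14:00:00Z",
    "performer": {"user_text": "ExampleUser"},
    "revision": {"rev_id": 123, "rev_parent_id": 122},
    "predicted_classification": {
        "model_name": "example-model",
        "model_version": "1",
        "predictions": ["false"],
        "probabilities": {"true": 0.1, "false": 0.9},
    },
}

sanitized = keep_allowed(event, ALLOWLIST)
print(sanitized["revision"], "performer" in sanitized)  # -> {'rev_id': 123} False
```

Because rev_id survives the projection, the sanitized rows can still be joined against other revision-level tables.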

Oct 1 2025, 2:36 PM · Lift-Wing, Machine-Learning-Team

Sep 25 2025

gkyziridis added a comment to T400423: Q1 FY2025-26 Goal: Enable volunteer evaluation of Tone Check model in additional languages.

I have already reviewed and applied ad-hoc post-processing to the languages below.
Please click the link on each wiki to retrieve the new postprocessed data.
✅ : Done
❌ : Not Yet

Sep 25 2025, 11:18 AM · OKR-Work, Goal, Machine-Learning-Team

Sep 24 2025

gkyziridis added a comment to T400423: Q1 FY2025-26 Goal: Enable volunteer evaluation of Tone Check model in additional languages.

Current Status:

  • hewiki: clean dataset obtained
  • arwiki: need to find more samples (use training data if needed)
  • rowiki: pending
Sep 24 2025, 3:15 PM · OKR-Work, Goal, Machine-Learning-Team