Deploy "add a link" to 13th round of wikis
Closed, Resolved (Public)

Description

  • Training models
    • Novial Wikipedia nov
    • N'Ko Wikipedia nqo
    • Nouormand Wikipedia nrm
    • Northern Sotho Wikipedia nso
    • Navajo Wikipedia nv
    • Nyanja Wikipedia ny
    • Occitan Wikipedia oc
    • Livvi-Karelian Wikipedia olo
    • Oromo Wikipedia om
    • Oriya Wikipedia or
    • Ossetic Wikipedia os
    • Punjabi Wikipedia pa
    • Pangasinan Wikipedia pag
    • Pampanga Wikipedia pam
    • Papiamento Wikipedia pap
    • Picard Wikipedia pcd
    • Pennsylvania German Wikipedia pdc
    • Palatine German Wikipedia pfl
    • Pali Wikipedia pi (see T308138#8708597)
    • Norfuk / Pitkern Wikipedia pih
    • Piedmontese Wikipedia pms
    • Western Punjabi Wikipedia pnb
    • Pontic Wikipedia pnt
    • Pashto Wikipedia ps
  • Models verification
  • Publish Datasets
  • Populate the excluded section titles, except for the nov, nrm, and nv wikis (see T308138#9090826)
  • Deploy back-end
  • Check how the model works on the wikis
  • In Search, use hasrecommendation:link to find articles
  • Test them on https://api.wikimedia.org/service/linkrecommendation/apidocs/#/default/get_v1_linkrecommendations__project___domain___page_title_
  • Inform communities
  • Deploy front-end
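
The API check in the list above can be scripted. The following is a minimal sketch, assuming the request path mirrors the operation name in the apidocs link (v1/linkrecommendations/{project}/{domain}/{page_title}) and that "domain" is the wiki's language code. The helper name linkrec_url and the sample title are hypothetical.

```shell
# Build the request URL for one article (hypothetical helper).
# $1 = project (e.g. wikipedia), $2 = domain (language code), $3 = page title
linkrec_url() {
    # Assumption: spaces in titles are written as underscores; any other
    # special characters would still need percent-encoding.
    title=$(printf '%s' "$3" | tr ' ' '_')
    printf 'https://api.wikimedia.org/service/linkrecommendation/v1/linkrecommendations/%s/%s/%s\n' \
        "$1" "$2" "$title"
}

linkrec_url wikipedia oc "Example page"
```

The printed URL can then be passed to an HTTP client such as curl to inspect the recommendations returned for that article.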

Event Timeline

24/24 models were trained successfully in the 13th round of wikis.

Model evaluation has been completed and below are the backtesting results:

Wiki      Precision@0.5   Recall@0.5
novwiki   0.88            0.61
nqowiki   0.73            0.11
nrmwiki   0.87            0.56
nsowiki   0.96            0.40
nvwiki    0.99            0.80
nywiki    0.91            0.67
ocwiki    0.89            0.66
olowiki   0.92            0.51
omwiki    0.84            0.53
orwiki    0.71            0.22
oswiki    0.79            0.28
pawiki    0.74            0.29
pagwiki   0.92            0.69
pamwiki   0.94            0.76
papwiki   0.88            0.60
pcdwiki   0.92            0.75
pdcwiki   0.88            0.73
pflwiki   0.98            0.79
piwiki    0.00            0.00
pihwiki   0.91            0.77
pmswiki   0.94            0.69
pnbwiki   0.80            0.53
pntwiki   0.93            0.81
pswiki    0.76            0.47

CCing @MGerlach, in case he would like to add comments on the backtesting evaluation.

The backtesting results look fine for most languages, with these exceptions:

  • piwiki's precision and recall are both 0.00.
  • orwiki (0.71) and pawiki (0.74) have a precision slightly below the recommended 0.75.
  • nqowiki has a slightly low precision (0.73) and a low recall (0.11).
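
The precision check behind this list can be reproduced from the backtesting numbers. This is a minimal sketch, assuming the 0.75 recommended precision mentioned above; the function names backtesting_rows and below_precision are hypothetical and not part of the evaluation pipeline.

```shell
# Backtesting rows from this round: wiki, precision@0.5, recall@0.5.
backtesting_rows() {
    cat <<'EOF'
novwiki 0.88 0.61
nqowiki 0.73 0.11
nrmwiki 0.87 0.56
nsowiki 0.96 0.40
nvwiki 0.99 0.80
nywiki 0.91 0.67
ocwiki 0.89 0.66
olowiki 0.92 0.51
omwiki 0.84 0.53
orwiki 0.71 0.22
oswiki 0.79 0.28
pawiki 0.74 0.29
pagwiki 0.92 0.69
pamwiki 0.94 0.76
papwiki 0.88 0.60
pcdwiki 0.92 0.75
pdcwiki 0.88 0.73
pflwiki 0.98 0.79
piwiki 0.00 0.00
pihwiki 0.91 0.77
pmswiki 0.94 0.69
pnbwiki 0.80 0.53
pntwiki 0.93 0.81
pswiki 0.76 0.47
EOF
}

# Print the wikis whose precision (column 2) is below the threshold in $1.
below_precision() {
    awk -v min="$1" '($2 + 0) < (min + 0) { print $1 }'
}

backtesting_rows | below_precision 0.75
```

With a threshold of 0.75 this prints nqowiki, orwiki, pawiki and piwiki, the four wikis discussed above.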

I talked to @MGerlach about these results, and we agreed that orwiki, pawiki, and nqowiki should be deployed but piwiki should not.

@kostajh, we published datasets for the 23 of 24 models that passed the evaluation in this round.

@kevinbazira thanks!

elukey moved this task from In Progress to Watching on the Machine-Learning-Team board.
elukey added a subscriber: kevinbazira.
Sgs subscribed.

I ran this script for adding the link-recommendation task type and populating the excluded sections entries:

PHAB=T308138
for WIKI in novwiki nqowiki nrmwiki nsowiki nvwiki nywiki ocwiki olowiki omwiki orwiki oswiki pawiki pagwiki pamwiki papwiki pcdwiki pdcwiki pflwiki pihwiki pmswiki pnbwiki pntwiki pswiki; do
    # Canonical server URL of the wiki, used for the verification links printed below.
    ORIGIN=$(mwscript getConfiguration.php "$WIKI" --settings 'wgCanonicalServer' --format json | jq --raw-output '.wgCanonicalServer')
    # Add the link-recommendation task type to the wiki's NewcomerTasks.json.
    mwscript extensions/GrowthExperiments/maintenance/changeWikiConfig.php "$WIKI" \
            --page MediaWiki:NewcomerTasks.json \
            --create-only \
            --json \
            --summary "Growth features configuration boilerplate ([[phab:$PHAB]])" \
            link-recommendation \
            '{ "type": "link-recommendation", "group": "easy" }'
    # Collect the unique section titles for this wiki with probability > 0.25
    # and store them as link-recommendation.excludedSections.
    # "$(cat)" turns the JSON array piped in on stdin into the final argument.
    jq "select(.wiki==\"$WIKI\" and .probability > 0.25) | .section" wiki_sections.jsonl \
        | jq --slurp --compact-output "unique" \
        | mwscript extensions/GrowthExperiments/maintenance/changeWikiConfig.php "$WIKI" \
            --page MediaWiki:NewcomerTasks.json \
            --json \
            --summary "machine-generated configuration for excluding sections from link recommendations ([[phab:$PHAB]]), feel free to improve" \
            link-recommendation.excludedSections \
            "$(cat)"
    # Print links for checking the resulting page and its latest diff.
    echo "$ORIGIN/wiki/MediaWiki:NewcomerTasks.json"
    echo "$ORIGIN/w/index.php?title=MediaWiki:NewcomerTasks.json&diff=next"
    echo "Press <Enter> to continue"
    read -r # give time for manual verification
done

Note that the script didn't populate excludedSections for novwiki, nrmwiki and nvwiki because these wikis were not present in wiki_sections.jsonl. This might also be the case for other wikis that showed the same problem in prior rounds. I have asked how to add more wikis to the sections file in T306792#9090825.
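
A check like the following could be run before the script to spot wikis with no entries in the sections file. It is only a sketch: the helper name missing_wikis is hypothetical, and it assumes each record's wiki name has already been extracted to one line on stdin.

```shell
# Print every wiki from the argument list that never appears on stdin.
# stdin: one wiki database name per line (e.g. the .wiki field of each record).
missing_wikis() {
    present=$(sort -u)
    for wiki in "$@"; do
        # -x matches whole lines only, -F treats the name as a fixed string.
        printf '%s\n' "$present" | grep -qxF -- "$wiki" || printf '%s\n' "$wiki"
    done
}
```

Assuming the same wiki_sections.jsonl layout the script relies on, the input could come from `jq --raw-output '.wiki' wiki_sections.jsonl | missing_wikis novwiki nqowiki nrmwiki`, followed by the remaining wikis of the round.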

I plan to enable the cronjob tomorrow on all wikis in this round, except the ones without excludedSections.

Change 948631 had a related patch set uploaded (by Sergio Gimeno; author: Sergio Gimeno):

[operations/mediawiki-config@master] GrowthExperiments: enable AddLink backend 13th round of wikis

https://gerrit.wikimedia.org/r/948631

Sgs updated the task description.
Sgs changed the task status from Open to In Progress. Aug 14 2023, 5:31 PM
Sgs edited projects, added Growth-Team (Sprint 0 (Growth Team)); removed Growth-Team.

Change 948631 merged by jenkins-bot:

[operations/mediawiki-config@master] GrowthExperiments: enable AddLink backend 13th round of wikis

https://gerrit.wikimedia.org/r/948631

Mentioned in SAL (#wikimedia-operations) [2023-08-15T13:06:14Z] <urbanecm@deploy1002> Started scap: Backport for [[gerrit:948631|GrowthExperiments: enable AddLink backend 13th round of wikis (T308138)]]

Mentioned in SAL (#wikimedia-operations) [2023-08-15T13:07:53Z] <urbanecm@deploy1002> sgimeno and urbanecm: Backport for [[gerrit:948631|GrowthExperiments: enable AddLink backend 13th round of wikis (T308138)]] synced to the testservers mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet, and mw-debug kubernetes deployment (accessible via k8s-experimental XWD option)

Mentioned in SAL (#wikimedia-operations) [2023-08-15T13:17:02Z] <urbanecm@deploy1002> Finished scap: Backport for [[gerrit:948631|GrowthExperiments: enable AddLink backend 13th round of wikis (T308138)]] (duration: 10m 47s)

Sgs triaged this task as Medium priority. Aug 18 2023, 4:23 PM

Change 951897 had a related patch set uploaded (by Sergio Gimeno; author: Sergio Gimeno):

[operations/mediawiki-config@master] GrowthExperiments: enable AddLink frontend 13th round of wikis

https://gerrit.wikimedia.org/r/951897

All wikis have a good number of results except pagwiki, which was showing ~20 results at the time of the last check. I think we can proceed with all of them. cc @Trizek-WMF

Let's go then.
Can we schedule a deployment for Sept 6th?

Sure, but should we release the 12th round (T308137) first, or both rounds together?

Change 954004 had a related patch set uploaded (by Sergio Gimeno; author: Sergio Gimeno):

[operations/mediawiki-config@master] GrowthExperiments: enable AddLink backend for swwiki

https://gerrit.wikimedia.org/r/954004

Change 954004 merged by jenkins-bot:

[operations/mediawiki-config@master] GrowthExperiments: enable AddLink backend for swwiki

https://gerrit.wikimedia.org/r/954004

Mentioned in SAL (#wikimedia-operations) [2023-08-31T14:09:02Z] <sgimeno@deploy1002> Started scap: Backport for [[gerrit:954004|GrowthExperiments: enable AddLink backend for swwiki (T308138 T308139)]]

Mentioned in SAL (#wikimedia-operations) [2023-08-31T14:10:43Z] <sgimeno@deploy1002> sgimeno: Backport for [[gerrit:954004|GrowthExperiments: enable AddLink backend for swwiki (T308138 T308139)]] synced to the testservers mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet, and mw-debug kubernetes deployment (accessible via k8s-experimental XWD option)

Mentioned in SAL (#wikimedia-operations) [2023-08-31T14:16:37Z] <sgimeno@deploy1002> Finished scap: Backport for [[gerrit:954004|GrowthExperiments: enable AddLink backend for swwiki (T308138 T308139)]] (duration: 07m 34s)

Change 948144 had a related patch set uploaded (by Sergio Gimeno; author: Sergio Gimeno):

[operations/mediawiki-config@master] GrowthExperiments: enable add a link in 12 and 13th round of wikis

https://gerrit.wikimedia.org/r/948144

Change 951897 abandoned by Sergio Gimeno:

[operations/mediawiki-config@master] GrowthExperiments: enable AddLink frontend 13th round of wikis

Reason:

Squashed in Ie11b4524bb796429e55bbf8e0ce45110ce9d110c

https://gerrit.wikimedia.org/r/951897

Change 948144 merged by jenkins-bot:

[operations/mediawiki-config@master] GrowthExperiments: enable add a link in 12 and 13th round of wikis

https://gerrit.wikimedia.org/r/948144

Mentioned in SAL (#wikimedia-operations) [2023-09-06T20:03:03Z] <taavi@deploy1002> Started scap: Backport for [[gerrit:948144|GrowthExperiments: enable add a link in 12 and 13th round of wikis (T308137 T308138)]]

Mentioned in SAL (#wikimedia-operations) [2023-09-06T20:04:40Z] <taavi@deploy1002> taavi and sgimeno: Backport for [[gerrit:948144|GrowthExperiments: enable add a link in 12 and 13th round of wikis (T308137 T308138)]] synced to the testservers mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet, and mw-debug kubernetes deployment (accessible via k8s-experimental XWD option)

Mentioned in SAL (#wikimedia-operations) [2023-09-06T20:13:20Z] <taavi@deploy1002> Finished scap: Backport for [[gerrit:948144|GrowthExperiments: enable add a link in 12 and 13th round of wikis (T308137 T308138)]] (duration: 10m 16s)