Deploy "add a link" to 16th round of wikis
Closed, Resolved · Public · 1 Estimated Story Points

Description

  • Training models
    • Samoan Wikipedia sm
    • Shona Wikipedia sn see T308142#8804657
    • Somali Wikipedia so
    • Albanian Wikipedia sq
    • Serbian Wikipedia sr
    • Sranan Tongo Wikipedia srn
    • Swati Wikipedia ss
    • Southern Sotho Wikipedia st
    • Saterland Frisian Wikipedia stq
    • Sundanese Wikipedia su
    • Silesian Wikipedia szl
    • Sakizaya Wikipedia szy see T308142#8804657
    • Tamil Wikipedia ta
    • Tulu Wikipedia tcy
    • Telugu Wikipedia te
    • Tetum Wikipedia tet
    • Tajik Wikipedia tg
    • Thai Wikipedia th
  • Models verification
  • Publish Datasets
  • Populate the excluded section titles
  • Deploy back-end
  • Check how the model works on the wikis
  • In Search, use hasrecommendation:link to find articles
  • Test them on https://api.wikimedia.org/service/linkrecommendation/apidocs/#/default/get_v1_linkrecommendations__project___domain___page_title_
  • Inform communities
  • Deploy front-end
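
For the "Test them on" step, a request URL for a single article can be assembled as sketched below. This is a sketch under assumptions: the path layout (`/v1/linkrecommendations/{project}/{domain}/{page_title}`) is inferred from the apidocs anchor in the link above, `linkrec_url` is a made-up helper name, and the page title is only an example. Titles with non-ASCII characters would also need percent-encoding, which this sketch does not do.

```shell
#!/bin/sh
# Hypothetical helper: build the link-recommendation request URL for one article.
# $1: language subdomain of the wiki (e.g. "sq" for sqwiki)
# $2: page title; spaces are replaced with underscores
# The path layout is an assumption inferred from the apidocs anchor.
linkrec_url() {
    printf 'https://api.wikimedia.org/service/linkrecommendation/v1/linkrecommendations/wikipedia/%s/%s\n' \
        "$1" "$(printf '%s' "$2" | tr ' ' '_')"
}

url=$(linkrec_url sq "Tirana")
echo "$url"

# To actually query the service (needs network access), something like:
#   curl --silent "$url"
```

The helper only builds the URL, so it can be checked offline before any request is sent.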

Event Timeline

19/19 models were trained successfully in the 16th round of wikis.

Model evaluation has been completed and below are the backtesting results:

Wiki      Precision@0.5  Recall@0.5
smwiki    0.87           0.68
snwiki    0.64           0.16
sowiki    0.71           0.39
sqwiki    0.89           0.58
srwiki    0.90           0.47
srnwiki   0.98           0.77
sswiki    0.92           0.38
stwiki    0.99           0.82
stqwiki   0.90           0.72
suwiki    0.98           0.81
swwiki    0.88           0.63
szlwiki   0.96           0.81
szywiki   0.65           0.32
tawiki    0.72           0.01
tcywiki   0.88           0.11
tewiki    0.79           0.13
tetwiki   0.84           0.69
tgwiki    0.90           0.61
thwiki    0.72           0.21

CCing @MGerlach, in case he would like to add comments on the backtesting evaluation.

The backtesting results look fine for most languages, with these exceptions:

  • snwiki (0.64), sowiki (0.71), szywiki (0.65), tawiki (0.72), and thwiki (0.72) have a precision below the recommended 0.75

I talked to @MGerlach about these results and we agreed that sowiki, tawiki, and thwiki should be published, but snwiki and szywiki should not.
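
The precision check above can be reproduced with a short awk filter. This is only a sketch: the scores are copied from the backtesting results in this task, 0.75 is the recommended precision mentioned above, and `below_precision` is a made-up helper name. It only flags candidates; the decision about which flagged wikis to publish anyway was made by discussion, not by the filter.

```shell
#!/bin/sh
# Hypothetical helper: print the wikis whose precision is below a cutoff.
# Reads "<wiki> <precision> <recall>" lines on stdin; $1 is the cutoff.
below_precision() {
    awk -v cutoff="$1" 'NF >= 2 && $2 + 0 < cutoff + 0 { print $1 }'
}

# Scores copied from the backtesting results in this task.
SCORES='smwiki 0.87 0.68
snwiki 0.64 0.16
sowiki 0.71 0.39
sqwiki 0.89 0.58
srwiki 0.90 0.47
srnwiki 0.98 0.77
sswiki 0.92 0.38
stwiki 0.99 0.82
stqwiki 0.90 0.72
suwiki 0.98 0.81
swwiki 0.88 0.63
szlwiki 0.96 0.81
szywiki 0.65 0.32
tawiki 0.72 0.01
tcywiki 0.88 0.11
tewiki 0.79 0.13
tetwiki 0.84 0.69
tgwiki 0.90 0.61
thwiki 0.72 0.21'

# Join the flagged wikis onto one space-separated line.
flagged=$(printf '%s\n' "$SCORES" | below_precision 0.75 | paste -sd' ' -)
echo "Below recommended precision: $flagged"
```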

kevinbazira added a subscriber: kostajh.

@kostajh, we published datasets for the 17 of 19 models that passed the evaluation in this round.

elukey moved this task from In Progress to Watching on the Machine-Learning-Team board.
elukey added a subscriber: kevinbazira.

I moved Swahili Wikipedia (sw) to an earlier group.

KStoller-WMF moved this task from Triaged to Backlog on the Growth-Team board.
Trizek-WMF set Due Date to Nov 22 2023, 5:00 PM. (Nov 8 2023, 5:23 PM)

I ran this script for adding the link-recommendation task type and populating the excluded sections entries:

PHAB=T308142
for WIKI in smwiki sowiki sqwiki srwiki srnwiki sswiki stwiki stqwiki suwiki szlwiki tawiki tcywiki tewiki tetwiki tgwiki thwiki; do
    # Look up the wiki's canonical server so the verification links below point at the right site
    ORIGIN=$(mwscript getConfiguration.php "$WIKI" --settings 'wgCanonicalServer' --format json | jq --raw-output '.wgCanonicalServer')
    # Step 1: add the link-recommendation task type
    mwscript extensions/GrowthExperiments/maintenance/changeWikiConfig.php "$WIKI" \
            --page MediaWiki:NewcomerTasks.json \
            --create-only \
            --json \
            --summary "Growth features configuration boilerplate ([[phab:$PHAB]])" \
            link-recommendation \
            '{ "type": "link-recommendation", "group": "easy" }'
    # Step 2: collect this wiki's section titles with probability > 0.25, deduplicate them,
    # and pass the resulting JSON array (read from stdin by cat) as the excludedSections value
    jq "select(.wiki==\"$WIKI\" and .probability > 0.25) | .section" wiki_sections.jsonl \
        | jq --slurp --compact-output "unique" \
        | mwscript extensions/GrowthExperiments/maintenance/changeWikiConfig.php "$WIKI" \
            --page MediaWiki:NewcomerTasks.json \
            --json \
            --summary "machine-generated configuration for excluding sections from link recommendations ([[phab:$PHAB]]), feel free to improve" \
            link-recommendation.excludedSections \
            "$(cat)"
    # Print the config page and its latest diff for review
    echo "$ORIGIN/wiki/MediaWiki:NewcomerTasks.json"
    echo "$ORIGIN/w/index.php?title=MediaWiki:NewcomerTasks.json&diff=next"
    echo "Press <Enter> to continue"
    read # give time for manual verification
done

Note that the script didn't populate excludedSections for stqwiki because that wiki has no entries in wiki_sections.jsonl; see T345562.
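
A gap like this could be spotted before running the loop by checking which wikis have no rows in the sections file. The sketch below is hypothetical: `missing_wikis` is a made-up helper, the sample rows are invented for illustration, and the grep pattern assumes one JSON object per line with a `"wiki"` field, as the jq filter in the script implies. A jq-based check would be more robust against formatting differences.

```shell
#!/bin/sh
# Hypothetical helper: print every wiki that has no entry in a sections file.
# $1: path to a JSON-lines file; remaining arguments: wiki names to check.
# Assumes each line is a JSON object containing "wiki": "<name>".
missing_wikis() {
    file="$1"
    shift
    for wiki in "$@"; do
        grep -Eq "\"wiki\"[[:space:]]*:[[:space:]]*\"$wiki\"" "$file" || echo "$wiki"
    done
}

# Invented sample rows, for illustration only (not real wiki_sections.jsonl data).
sample=$(mktemp)
cat > "$sample" <<'EOF'
{"wiki": "smwiki", "section": "Example A", "probability": 0.9}
{"wiki": "sowiki", "section": "Example B", "probability": 0.4}
EOF

missing=$(missing_wikis "$sample" smwiki sowiki stqwiki)
echo "No section data for: $missing"
rm -f "$sample"
```

Against the real file, the call would be along the lines of `missing_wikis wiki_sections.jsonl smwiki sowiki ...` with the same wiki list as the loop.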

Change 974169 had a related patch set uploaded (by Sergio Gimeno; author: Sergio Gimeno):

[operations/mediawiki-config@master] GrowthExperiments: enable AddLink backend for 16,17th rounds of wikis

https://gerrit.wikimedia.org/r/974169

Sgs set the point value for this task to 1. (Nov 14 2023, 4:47 PM)

Change 974169 merged by jenkins-bot:

[operations/mediawiki-config@master] GrowthExperiments: enable AddLink backend for 16,17th rounds of wikis

https://gerrit.wikimedia.org/r/974169

Mentioned in SAL (#wikimedia-operations) [2023-11-15T14:39:11Z] <awight@deploy2002> Started scap: Backport for [[gerrit:974169|GrowthExperiments: enable AddLink backend for 16,17th rounds of wikis (T308142 T308143)]]

Mentioned in SAL (#wikimedia-operations) [2023-11-15T14:41:55Z] <awight@deploy2002> sgimeno and awight: Backport for [[gerrit:974169|GrowthExperiments: enable AddLink backend for 16,17th rounds of wikis (T308142 T308143)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)

Mentioned in SAL (#wikimedia-operations) [2023-11-15T14:47:27Z] <awight@deploy2002> Finished scap: Backport for [[gerrit:974169|GrowthExperiments: enable AddLink backend for 16,17th rounds of wikis (T308142 T308143)]] (duration: 08m 16s)

Change 976804 had a related patch set uploaded (by Sergio Gimeno; author: Sergio Gimeno):

[operations/mediawiki-config@master] GrowthExperiments: enable AddLink frontend for 16,17th rounds of wikis

https://gerrit.wikimedia.org/r/976804

Change 976804 merged by jenkins-bot:

[operations/mediawiki-config@master] GrowthExperiments: enable AddLink frontend for 16,17th rounds of wikis

https://gerrit.wikimedia.org/r/976804

Mentioned in SAL (#wikimedia-operations) [2023-11-27T08:09:42Z] <taavi@deploy2002> Started scap: Backport for [[gerrit:976804|GrowthExperiments: enable AddLink frontend for 16,17th rounds of wikis (T308142 T308143)]]

Mentioned in SAL (#wikimedia-operations) [2023-11-27T08:18:49Z] <taavi@deploy2002> taavi and sgimeno: Backport for [[gerrit:976804|GrowthExperiments: enable AddLink frontend for 16,17th rounds of wikis (T308142 T308143)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)

Mentioned in SAL (#wikimedia-operations) [2023-11-27T08:29:36Z] <taavi@deploy2002> Finished scap: Backport for [[gerrit:976804|GrowthExperiments: enable AddLink frontend for 16,17th rounds of wikis (T308142 T308143)]] (duration: 19m 54s)

Etonkovidova subscribed.

Checked selected wikis from the list - leaving in the Test in Production column for monitoring during this week.