Deploy "add a link" to 17th round of wikis
Closed, Resolved | Public | 1 Estimated Story Points

Description

  • Training models
    • Tigrinya Wikipedia ti see T308143#8827377
    • Turkmen Wikipedia tk
    • Tagalog Wikipedia tl
    • Tswana Wikipedia tn
    • Tongan Wikipedia to
    • Tok Pisin Wikipedia tpi
    • Turkish Wikipedia tr
    • Tsonga Wikipedia ts
    • Tatar Wikipedia tt
    • Twi Wikipedia tw
    • Tahitian Wikipedia ty
    • Tuvinian Wikipedia tyv
    • Udmurt Wikipedia udm
    • Uyghur Wikipedia ug
    • Urdu Wikipedia ur see T308143#8827377
    • Uzbek Wikipedia uz
    • Venda Wikipedia ve
    • Venetian Wikipedia vec
    • Veps Wikipedia vep
    • West Flemish Wikipedia vls
    • Volapük Wikipedia vo
  • Models verification
  • Publish Datasets
  • Populate the excluded section titles
  • Deploy back-end
  • Check how the model works on the wikis
  • In Search, use hasrecommendation:link to find articles
  • Test them on https://api.wikimedia.org/service/linkrecommendation/apidocs/#/default/get_v1_linkrecommendations__project___domain___page_title_
  • Inform communities
  • Deploy front-end
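
The two manual checks in the list above (the hasrecommendation:link search and the Link Recommendation API test) can be sketched as URL builders. This is a minimal sketch, not tooling used in this task: the function names linkrec_url and search_url are hypothetical, the API path shape is inferred from the apidocs link above, and the page title is assumed to be URL-safe already.

```shell
# Sketch: build request URLs for the two manual checks.
# Function names are illustrative, not part of any Wikimedia tooling.

# URL for the Link Recommendation API, following the path shape
# /v1/linkrecommendations/{project}/{domain}/{page_title} from the API docs.
# Assumes the page title is already URL-safe (underscores, no spaces).
linkrec_url() {
    local project=$1 domain=$2 title=$3
    printf 'https://api.wikimedia.org/service/linkrecommendation/v1/linkrecommendations/%s/%s/%s\n' \
        "$project" "$domain" "$title"
}

# URL for a MediaWiki search API query that lists articles which have
# link recommendations, given the wiki's canonical server URL.
search_url() {
    local origin=$1
    printf '%s/w/api.php?action=query&list=search&srsearch=hasrecommendation:link&format=json\n' \
        "$origin"
}

# Example (needs network access, so it is left commented out):
# curl --silent "$(linkrec_url wikipedia tr Ankara)"
# curl --silent "$(search_url https://tr.wikipedia.org)"
```

Keeping the URL construction separate from the request makes it easy to loop over the wikis of a round and inspect the URLs before sending anything.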

Event Timeline

I moved Tumbuka Wikipedia (tum) to an earlier batch because its community is interested in the feature.

21/21 models were trained successfully in the 17th round of wikis.

Model evaluation has been completed and below are the backtesting results:

Wiki      Precision@0.5  Recall@0.5
tiwiki    0.54           0.50
tkwiki    0.74           0.27
tlwiki    0.81           0.52
tnwiki    0.90           0.74
towiki    0.94           0.73
tpiwiki   0.80           0.69
trwiki    0.75           0.35
tswiki    0.89           0.59
ttwiki    0.93           0.38
twwiki    0.80           0.61
tywiki    0.97           0.85
tyvwiki   0.78           0.39
udmwiki   0.83           0.33
ugwiki    0.88           0.53
urwiki    0.62           0.23
uzwiki    0.80           0.30
vewiki    0.99           0.93
vecwiki   0.96           0.75
vepwiki   0.87           0.38
vlswiki   0.84           0.55
vowiki    0.98           0.43

CCing @MGerlach, in case he would like to add comments on the backtesting evaluation.

The conclusion from the backtesting results is that most of the languages look fine, except:

  • tiwiki (0.54), tkwiki (0.74), and urwiki (0.62) have a precision below the recommended 0.75

Talked to @MGerlach about these results and agreed that tkwiki should be published but tiwiki and urwiki shouldn't.
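
The precision check behind this conclusion can be reproduced with a short filter over the backtesting results. This is a minimal sketch: the helper names backtesting_results and below_threshold are hypothetical, and the data is copied from the results above as "wiki precision recall" per line.

```shell
# Sketch: list wikis whose backtesting precision is below a threshold.

# Backtesting results from this round: wiki, Precision@0.5, Recall@0.5.
backtesting_results() {
    cat <<'EOF'
tiwiki 0.54 0.50
tkwiki 0.74 0.27
tlwiki 0.81 0.52
tnwiki 0.90 0.74
towiki 0.94 0.73
tpiwiki 0.80 0.69
trwiki 0.75 0.35
tswiki 0.89 0.59
ttwiki 0.93 0.38
twwiki 0.80 0.61
tywiki 0.97 0.85
tyvwiki 0.78 0.39
udmwiki 0.83 0.33
ugwiki 0.88 0.53
urwiki 0.62 0.23
uzwiki 0.80 0.30
vewiki 0.99 0.93
vecwiki 0.96 0.75
vepwiki 0.87 0.38
vlswiki 0.84 0.55
vowiki 0.98 0.43
EOF
}

# Print "wiki precision" for every row whose precision is strictly below $1.
below_threshold() {
    awk -v min="$1" '$2 + 0 < min + 0 { print $1, $2 }'
}

backtesting_results | below_threshold 0.75
```

With the 0.75 threshold this prints tiwiki 0.54, tkwiki 0.74 and urwiki 0.62, one per line. The filter only flags candidates; the decision to publish tkwiki anyway was a manual one.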

kevinbazira added a subscriber: kostajh.

@kostajh, we published datasets for the 19 of 21 models that passed the evaluation in this round.

elukey moved this task from In Progress to Watching on the Machine-Learning-Team board.
elukey added a subscriber: kevinbazira.
KStoller-WMF triaged this task as Medium priority.
KStoller-WMF moved this task from Triaged to Backlog on the Growth-Team board.
Trizek-WMF set Due Date to Nov 22 2023, 5:00 PM. (Nov 8 2023, 5:23 PM)

I ran this script for adding the link-recommendation task type and populating the excluded sections entries:

PHAB=T308143
for WIKI in tkwiki tlwiki tnwiki towiki tpiwiki trwiki tswiki ttwiki twwiki tywiki tyvwiki udmwiki ugwiki uzwiki vewiki vecwiki vepwiki vlswiki vowiki; do
    # Canonical server URL of the wiki, used for the verification links printed below
    ORIGIN=$(mwscript getConfiguration.php "$WIKI" --settings 'wgCanonicalServer' --format json | jq --raw-output '.wgCanonicalServer')
    # Add the link-recommendation task type to the wiki's task configuration
    mwscript extensions/GrowthExperiments/maintenance/changeWikiConfig.php "$WIKI" \
            --page MediaWiki:NewcomerTasks.json \
            --create-only \
            --json \
            --summary "Growth features configuration boilerplate ([[phab:$PHAB]])" \
            link-recommendation \
            '{ "type": "link-recommendation", "group": "easy" }'
    # Collect this wiki's sections with probability > 0.25 into a deduplicated JSON array;
    # "$(cat)" reads that array from the pipe and passes it as the value argument
    jq "select(.wiki==\"$WIKI\" and .probability > 0.25) | .section" wiki_sections.jsonl \
        | jq --slurp --compact-output "unique" \
        | mwscript extensions/GrowthExperiments/maintenance/changeWikiConfig.php "$WIKI" \
            --page MediaWiki:NewcomerTasks.json \
            --json \
            --summary "machine-generated configuration for excluding sections from link recommendations ([[phab:$PHAB]]), feel free to improve" \
            link-recommendation.excludedSections \
            "$(cat)"
    echo "$ORIGIN/wiki/MediaWiki:NewcomerTasks.json"
    echo "$ORIGIN/w/index.php?title=MediaWiki:NewcomerTasks.json&diff=next"
    echo "Press <Enter> to continue"
    read # give time for manual verification
done

Note that the script didn't populate excludedSections for towiki and tywiki because these wikis were not present in wiki_sections.jsonl; see T345562. It also didn't populate excluded sections for vowiki, because the probabilities of its entries in wiki_sections.jsonl were too low (none exceeded the 0.25 threshold used in the script).
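
A pre-check along these lines could show in advance which wikis would end up with an empty excludedSections list. This is a minimal sketch: the function name count_excluded_sections is hypothetical, and it assumes each line of wiki_sections.jsonl is a JSON object with wiki, section and probability fields, as the jq filter in the script above implies.

```shell
# Sketch: count the distinct sections of one wiki that clear the
# probability threshold in a JSON Lines file. A count of 0 means the
# excludedSections list for that wiki would stay empty.
count_excluded_sections() {
    local file=$1 wiki=$2 threshold=${3:-0.25}
    jq --arg wiki "$wiki" --argjson min "$threshold" \
        'select(.wiki == $wiki and .probability > $min) | .section' "$file" \
        | jq --slurp 'unique | length'
}

# Example (assumes wiki_sections.jsonl is in the current directory):
# for WIKI in towiki tywiki vowiki; do
#     echo "$WIKI: $(count_excluded_sections wiki_sections.jsonl "$WIKI")"
# done
```

The count is 0 both when a wiki is missing from the file and when none of its entries clear the threshold, so the two cases described above look the same here and still need a manual look at the data.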

Change 974169 had a related patch set uploaded (by Sergio Gimeno; author: Sergio Gimeno):

[operations/mediawiki-config@master] GrowthExperiments: enable AddLink backend for 16,17th rounds of wikis

https://gerrit.wikimedia.org/r/974169

Sgs edited projects, added Growth-Team (Sprint 2 (Growth Team)); removed Growth-Team.
Sgs updated the task description.
Sgs set the point value for this task to 1. (Nov 14 2023, 4:47 PM)

Change 974169 merged by jenkins-bot:

[operations/mediawiki-config@master] GrowthExperiments: enable AddLink backend for 16,17th rounds of wikis

https://gerrit.wikimedia.org/r/974169

Mentioned in SAL (#wikimedia-operations) [2023-11-15T14:39:11Z] <awight@deploy2002> Started scap: Backport for [[gerrit:974169|GrowthExperiments: enable AddLink backend for 16,17th rounds of wikis (T308142 T308143)]]

Mentioned in SAL (#wikimedia-operations) [2023-11-15T14:41:55Z] <awight@deploy2002> sgimeno and awight: Backport for [[gerrit:974169|GrowthExperiments: enable AddLink backend for 16,17th rounds of wikis (T308142 T308143)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)

Mentioned in SAL (#wikimedia-operations) [2023-11-15T14:47:27Z] <awight@deploy2002> Finished scap: Backport for [[gerrit:974169|GrowthExperiments: enable AddLink backend for 16,17th rounds of wikis (T308142 T308143)]] (duration: 08m 16s)

Change 976804 had a related patch set uploaded (by Sergio Gimeno; author: Sergio Gimeno):

[operations/mediawiki-config@master] GrowthExperiments: enable AddLink frontend for 16,17th rounds of wikis

https://gerrit.wikimedia.org/r/976804

Change 976804 merged by jenkins-bot:

[operations/mediawiki-config@master] GrowthExperiments: enable AddLink frontend for 16,17th rounds of wikis

https://gerrit.wikimedia.org/r/976804

Mentioned in SAL (#wikimedia-operations) [2023-11-27T08:09:42Z] <taavi@deploy2002> Started scap: Backport for [[gerrit:976804|GrowthExperiments: enable AddLink frontend for 16,17th rounds of wikis (T308142 T308143)]]

Mentioned in SAL (#wikimedia-operations) [2023-11-27T08:18:49Z] <taavi@deploy2002> taavi and sgimeno: Backport for [[gerrit:976804|GrowthExperiments: enable AddLink frontend for 16,17th rounds of wikis (T308142 T308143)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)

Mentioned in SAL (#wikimedia-operations) [2023-11-27T08:29:36Z] <taavi@deploy2002> Finished scap: Backport for [[gerrit:976804|GrowthExperiments: enable AddLink frontend for 16,17th rounds of wikis (T308142 T308143)]] (duration: 19m 54s)

Etonkovidova subscribed.

Checked selected wikis from the list; everything works as expected. Leaving the task in the Test in Production column to monitor it during this week.