
Add translation based 'morelike' API for missing articles
9fd5d81ac2fa · Unpublished

Authored by bmansurov on Oct 23 2018, 4:10 PM.

Unpublished Commit

Not On Permanent Ref: This commit is not an ancestor of any permanent ref.

Description

Add translation based 'morelike' API for missing articles

To find articles similar to 'kitob' but missing from uzwiki, visit [1].
The results look something like this:

[{"wikidata_id": "Q33057", "score": 0.0067295},
{"wikidata_id": "Q82", "score": 0.00572587}]

(At most 10 recommendations are returned.) The higher the score, the more
similar the article is to 'kitob'.
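
As a rough illustration of how a client might consume this output, the
sketch below parses the sample response shown above and orders it by
score. The function name rank_recommendations and the limit parameter
are hypothetical and not part of this change; the commit does not say
whether the API already returns results sorted.

```python
import json

# Sample response copied from the commit message above.
sample_response = """
[{"wikidata_id": "Q33057", "score": 0.0067295},
 {"wikidata_id": "Q82", "score": 0.00572587}]
"""


def rank_recommendations(raw_json, limit=10):
    """Return recommendations sorted by descending score, capped at `limit`.

    Hypothetical client-side helper; not part of the service itself.
    """
    items = json.loads(raw_json)
    ranked = sorted(items, key=lambda item: item["score"], reverse=True)
    return ranked[:limit]


ranked = rank_recommendations(sample_response)
for item in ranked:
    # A higher score means the article is more similar to the seed.
    print(item["wikidata_id"], item["score"])
```

Sorting on the client keeps the example independent of whatever order
the service happens to return.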

Recommendations are imported from [2] using [3]. The table schemas are
at [4]. Once tables are created, you can add languages you want to
support to the language table, e.g.

INSERT INTO language (id, code) VALUES (NULL, 'ru'), (NULL, 'uz');
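
The NULL ids in the statement above suggest that the id column is
auto-generated. The sketch below tries the same INSERT against an
in-memory SQLite database. The table definition here is a hypothetical
stand-in, assumed for illustration only; the real schema lives in [4]
and targets MySQL, where this behaviour depends on the column being
declared AUTO_INCREMENT.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical stand-in schema; the real one is in
# scripts/article-recommendation.sql and may differ.
conn.execute(
    "CREATE TABLE language ("
    "id INTEGER PRIMARY KEY AUTOINCREMENT, "
    "code TEXT NOT NULL UNIQUE)"
)

# Same statement as in the commit message: NULL lets the database assign ids.
conn.execute("INSERT INTO language (id, code) VALUES (NULL, 'ru'), (NULL, 'uz')")

rows = conn.execute("SELECT id, code FROM language ORDER BY id").fetchall()
print(rows)
conn.close()
```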

Then you can import recommendations into the article_recommendation
table like so:

python scripts/article-recommendation-data-importer.py --source='ru' --target='uz' --tsv=./predictions-02012018-07312018_ruwiki-uzwiki.tsv

[1] /uz.wikipedia.org/v1/article/morelike/translation/kitob
[2] https://github.com/wikimedia/research-translation-recommendation-predictions
[3] scripts/article-recommendation-data-importer.py
[4] scripts/article-recommendation.sql

Bug: T201192
Change-Id: I4428f1ebce13b7bd3e379db704007debff63b92b

Details

Committed
bmansurov on Oct 24 2018, 8:42 PM
Parents
rMSRA3c04fd3082bb: Add MySQL connection info
Branches
Unknown
Tags
Unknown
References
refs/changes/01/450601/3
ChangeId
I4428f1ebce13b7bd3e379db704007debff63b92b