|Resolved||leila||T171224 [Objective 9.1.1] Article expansion recommendations|
|Resolved||leila||T183039 Gather labels as ground truth for translation and synonym section classifiers|
|Resolved||diego||T184213 Gather labels as ground truth for section synonym detection|
Please find the data to be labeled here: https://drive.google.com/drive/folders/1pzR3P16ck7FyrE7QgIpcSx1TPumTGA9u?usp=sharing
These are candidates for synonyms, stratified by section TF-IDF similarity and fastText distance. For more details about the procedure, please check the code here: https://github.com/digitalTranshumant/wmf-interlanguage/blob/master/Synonyms.ipynb
@bmansurov: We now need to upload the sheets, keeping only columns A (Sec_B) and B (Sec_B), and ask volunteers to tag each pair with one of these three categories: synonym, related, not related.
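For reference, a minimal sketch of how a sheet could be trimmed before upload. It assumes the sheet has been exported as CSV text with a header row; the function name `make_labeling_sheet` and the added `label` column are hypothetical and not part of the notebook. Columns are selected by position (A and B) rather than by name.

```python
import csv
import io


def make_labeling_sheet(source_csv: str) -> str:
    """Keep only the first two columns (A and B) and append an empty label column.

    Assumption: `source_csv` is CSV text whose first row is a header.
    """
    reader = csv.reader(io.StringIO(source_csv))
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for i, row in enumerate(reader):
        if not row:
            continue  # skip blank lines in the export
        pair = row[:2]  # columns A and B only; similarity scores are dropped
        # Header row gets the column title, data rows get an empty cell to fill in.
        writer.writerow(pair + (["label"] if i == 0 else [""]))
    return out.getvalue()
```

Dropping the similarity columns this way also keeps volunteers from seeing the scores while they tag.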
We (@leila and I) have updated the labels; we will now use: same, overlap, and different. We have translated them into Spanish and asked staff and the community for help translating these labels into the other 4 languages.
We have also added 3 columns to collect separate assessments in case reviewers disagree.
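How the three assessments get combined is not decided in this task. One possible approach is a simple majority vote over the same / overlap / different labels, leaving ties unresolved for a follow-up review. A rough sketch, where `resolve_label` is a hypothetical helper and the tie-handling rule is an assumption:

```python
from collections import Counter

LABELS = {"same", "overlap", "different"}


def resolve_label(assessments):
    """Return the majority label among reviewer assessments, or None if unresolved.

    Assumption: empty cells mean the reviewer did not assess the pair.
    """
    votes = [a.strip().lower() for a in assessments if a and a.strip()]
    unknown = [v for v in votes if v not in LABELS]
    if unknown:
        raise ValueError(f"unexpected label(s): {unknown}")
    if not votes:
        return None  # nobody assessed this pair
    (top, count), *rest = Counter(votes).most_common()
    if rest and rest[0][1] == count:
        return None  # tie between labels: needs another reviewer
    return top
```

For example, `resolve_label(["same", "same", "overlap"])` yields `"same"`, while three different answers yield `None`.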