
Onboarding Task: getting familiar with the machine learning models for Citation Need
Closed, Resolved (Public)

Description

General Suggestions for Onboarding
Applicants are encouraged to read the documentation about the Research Project [1][2], and become familiar with the codebase for the machine learning models [3], as well as with basic notions and functions of the Keras library for Python [4].

Actual Onboarding Task
As an onboarding task to get started with the project, we suggest becoming familiar with the machine learning component of the framework. Clone the repository [3], install the required libraries, and use the provided models to classify a few statements taken from Wikipedia articles. Models are available for English, Italian, and French Wikipedia. Feel free to choose your favorite language. Please ping @Miriam here if you need any help.
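To get a feel for what "classifying a few statements" involves before opening the repository [3], here is a minimal sketch of the usual preprocessing step for a Keras text model: turning each statement into a fixed-length list of integer word ids. The helper names (`tokenize`, `build_vocab`, `encode`), the reserved ids, and the toy vocabulary are assumptions made for illustration. The repository ships its own scripts and dictionaries, and its input format may differ.

```python
import re

# Assumed reserved ids for this sketch: 0 pads short statements, 1 marks unknown words.
PAD, UNK = 0, 1


def tokenize(statement):
    """Lowercase a statement and split it into word tokens."""
    return re.findall(r"\w+", statement.lower())


def build_vocab(statements):
    """Assign each distinct token an integer id, starting after the reserved ids."""
    vocab = {}
    for statement in statements:
        for token in tokenize(statement):
            vocab.setdefault(token, len(vocab) + 2)
    return vocab


def encode(statement, vocab, max_len):
    """Map tokens to ids, truncate to max_len, and right-pad with PAD."""
    ids = [vocab.get(token, UNK) for token in tokenize(statement)][:max_len]
    return ids + [PAD] * (max_len - len(ids))


# Toy statements, written for this example rather than taken from Wikipedia.
statements = [
    "The city has a population of two million.",
    "Paris is the capital of France.",
]
vocab = build_vocab(statements)
batch = [encode(s, vocab, max_len=10) for s in statements]
```

With Keras installed, a batch like this would typically be converted to a NumPy array and passed to a model loaded with `keras.models.load_model(...)`, and `model.predict(...)` would return one score per statement. Check the repository's README for the actual entry point and the vocabulary files that match each language model.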

[1] https://meta.wikimedia.org/wiki/Research:Identification_of_Unsourced_Statements
[2] https://arxiv.org/pdf/1902.11116.pdf
[3] https://github.com/mirrys/citation-needed-paper
[4] https://keras.io
[5] https://phabricator.wikimedia.org/

Event Timeline

Hi @Miriam, @Samwalton9 !

I found this project really interesting and was wondering if it would suit my skill level.

So, I'm comfortable with basic Python and while I can understand OOP, I do not have much practical experience with it. I've also picked up NumPy, Pandas and Matplotlib. In addition, I have a decent amount of exposure to NLP models at a conceptual level.

Hi @Lucideuclid - many thanks for your interest in this project :)

Please find the detailed instructions for the initial task here: https://phabricator.wikimedia.org/T234519

Hi all. I am also an Outreachy applicant interested in this project.

Hi @Miriam and @Samwalton9, I'm an Outreachy applicant and I would like to know how I can start contributing.

@Lorryaze: Hi and welcome! Did you read the description of this task before commenting? If yes, do you have a more specific question?