Onboarding Task: getting familiar with the machine learning models for Citation Need
Closed, Resolved (Public)

Assigned To

None

Authored By

	Miriam
	Sep 24 2019, 11:33 AM

Description

General Suggestions for Onboarding
Applicants are encouraged to read the documentation about the Research Project [1][2], and become familiar with the codebase for the machine learning models [3], as well as with basic notions and functions of the Keras library for Python [4].

Actual Onboarding Task
As an onboarding task to get started with the project, we suggest becoming familiar with the machine learning component of the framework. Clone the repository [3], install the required libraries, and use the provided models to classify a few statements taken from Wikipedia articles. Models are available for English, Italian, and French Wikipedia. Feel free to choose your favorite language. Please ping @Miriam here if you need any help.
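
For orientation, the classification step might look roughly like the sketch below. It is not taken from the repository [3]: the function names, the whitespace tokenizer, the vocabulary dictionary, the padding/unknown ids, and the 0.5 threshold are all assumptions made for illustration. The released models may expect different preprocessing or additional inputs, so the repository's own scripts remain the reference. Only `keras.models.load_model` and `model.predict` are real Keras calls here.

```python
from typing import Callable, Dict, List, Sequence, Tuple

PAD_ID = 0  # assumed id used to pad short statements
UNK_ID = 1  # assumed id for words missing from the vocabulary


def encode_statement(text: str, vocab: Dict[str, int], max_len: int) -> List[int]:
    """Lowercase, split on whitespace, map words to ids, then truncate or pad to max_len."""
    ids = [vocab.get(token, UNK_ID) for token in text.lower().split()]
    ids = ids[:max_len]
    return ids + [PAD_ID] * (max_len - len(ids))


def classify_statements(
    statements: Sequence[str],
    vocab: Dict[str, int],
    max_len: int,
    predict: Callable[[List[List[int]]], Sequence[float]],
    threshold: float = 0.5,  # assumed decision threshold
) -> List[Tuple[str, float, bool]]:
    """Return (statement, score, needs_citation) for each statement.

    `predict` is any callable that maps a batch of id sequences to one
    score per statement, for example a wrapper around a Keras model.
    """
    batch = [encode_statement(s, vocab, max_len) for s in statements]
    scores = predict(batch)
    return [(s, float(p), float(p) >= threshold) for s, p in zip(statements, scores)]


def keras_predict_fn(model_path: str) -> Callable[[List[List[int]]], Sequence[float]]:
    """Wrap a saved Keras model as a `predict` callable.

    Assumes the model takes a single padded id matrix as input and that the
    last output column is the "citation needed" score; both are assumptions.
    """
    import numpy as np
    from keras.models import load_model  # imported lazily; requires Keras

    model = load_model(model_path)

    def predict(batch: List[List[int]]) -> Sequence[float]:
        return model.predict(np.array(batch))[:, -1]

    return predict
```

With a model file and vocabulary from the repository, one would pass `keras_predict_fn(path_to_model)` as `predict`; the path and vocabulary format are placeholders to be replaced with whatever the repository actually provides.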

[1] https://meta.wikimedia.org/wiki/Research:Identification_of_Unsourced_Statements
[2] https://arxiv.org/pdf/1902.11116.pdf
[3] https://github.com/mirrys/citation-needed-paper
[4] https://keras.io
[5] https://phabricator.wikimedia.org/

Related Objects

Status	Assigned	Task
Declined	None	T199190 [2.4] Improve unsourced statement identification tools and algorithms
Resolved	AikoChou	T233707 A system for releasing data dumps from a classifier detecting unsourced sentences in Wikipedia
Resolved	None	T233709 Onboarding Task: getting familiar with the machine learning models for Citation Need
Resolved	None	T234519 Your first task: classify sample statements using Citation Needed Models
Resolved	None	T234606 Your second task: classify statements within an article

Event Timeline

Miriam created this task. Sep 24 2019, 11:33 AM

srishakatux moved this task from Backlog to Microtasks on the Outreachy (Round 19) board. Sep 24 2019, 6:42 PM

Pgadige01 subscribed. Oct 2 2019, 7:15 AM

AikoChou subscribed. Oct 2 2019, 10:07 AM

Lucideuclid subscribed. Oct 2 2019, 7:30 PM

Hi @Miriam, @Samwalton9!

I found this project really interesting and was wondering if it would suit my skill level.

So, I'm comfortable with basic Python, and while I can understand OOP, I do not have much practical experience with it. I've also picked up NumPy, Pandas and Matplotlib. In addition, I have a decent amount of exposure to NLP models at a conceptual level.

H_bushro subscribed. Oct 2 2019, 8:15 PM

Meloju subscribed. Oct 2 2019, 11:10 PM

Shamima19 subscribed. Oct 3 2019, 9:32 AM

Hi @Lucideuclid - many thanks for your interest in this project :)

Please find the detailed instructions for the initial task here: https://phabricator.wikimedia.org/T234519

Lilneus subscribed. Oct 3 2019, 2:02 PM

Thank you so much! @Miriam

Hi all. I am also an Outreachy applicant interested in this project.

Ibia-ahmad subscribed. Oct 4 2019, 7:05 PM

Juju.ba98 subscribed. Oct 6 2019, 11:42 AM

Ghassanmas subscribed. Oct 6 2019, 11:47 AM

KalindiFonda subscribed. Oct 6 2019, 1:01 PM

Hi @Miriam and @Samwalton9, I'm an Outreachy applicant and I would like to know how I can start contributing.

@Lorryaze: Hi and welcome! Did you read the task description of this task before you commented on this task? If yes, do you have a more specific question?

Unit-ade subscribed. Oct 10 2019, 2:27 AM

Ferculell subscribed. Oct 15 2019, 2:50 AM

Unit-ade mentioned this in T237422: A System for releasing periodic data dumps from the citation needed model. Nov 5 2019, 3:17 PM

Samwalton9-WMF closed this task as Resolved. Nov 6 2019, 9:46 AM

Samwalton9-WMF claimed this task.

Samwalton9-WMF removed Samwalton9-WMF as the assignee of this task.

Samwalton9-WMF closed subtask T234519: Your first task: classify sample statements using Citation Needed Models as Resolved.

Samwalton9-WMF closed subtask T234606: Your second task: classify statements within an article as Resolved.