[Open Question] Automatically detecting accounts that do paid editing activity
Open, Medium, Public

Assigned To

None

Authored By

	leila
	Feb 18 2017, 5:42 PM

Description

Investigate whether reliable prediction models can be built to detect accounts with paid editing activity. From some offline discussions, we know:

The patterns of paid editors are fairly clear, and it would be useful to have them picked up by AI.
If we ever get good at enforcing the TOU, then the paid editing pattern will shift from one sock for multiple jobs to one sock per job. Such algorithms could be even more useful in that case.
Check out the discussion at https://en.wikipedia.org/wiki/User_talk:Jytdog/Archive_16#meta:Grants:IdeaLab.2FBot_to_detect_and_tag_advocacy_editing

Current detection steps
The following steps are taken by users (in group XX) to identify accounts that do paid editing:

How prevalent are the paid edit accounts?
Let's define paid edit accounts (for now, loosely) as those accounts that have made at least one edit associated with paid editing since their creation. Do we have a sense of how prevalent such accounts are?

Related Objects

Mentioned Here: T120170: [Epic] Paid editing (COI) detection model

Event Timeline

leila created this task. Feb 18 2017, 5:42 PM

Restricted Application added a subscriber: Aklapper. Feb 18 2017, 5:42 PM

The patterns that can be fairly easily recognised by clueful New Page Reviewers could indeed be recognised by a bot. Matching similar evidence across multiple pages and multiple accounts is generally beyond the scope of the average patroller, but a bot using some form of semantic searching or syntax recognition could probably do it.

Thanks, @Kudpung, for chiming in. It would be helpful to list those recognizable clues as soon as we have a research page up.

@srijan would you be interested in having a chat about this project? I had a chat with @Cervisiarius and we think it's very much aligned with your expertise and interests. (For others reading this comment, Srijan has a strong research interest and background in using machine learning to identify malicious users and activities on the web. Last year, for example, he did research on hoax detection on English Wikipedia.)

@leila @Cervisiarius this sounds super interesting! It would be nice to discuss more.

leila updated the task description. Mar 7 2017, 6:33 PM

@Kudpung @Doc_James @Jytdog Can you help with expanding the "Current detection steps" and "How prevalent are the paid edit accounts?" sections under Description? If the answer to either question is not known, feel free to say we don't know or to provide best estimates (estimates in specific fields are also welcome, especially for the second question). I'll be adding a couple more sections over time; feel free to add content to them as well. Thank you! :)

On a separate note, both Srijan and Cervisiarius are excited to help on the research front. I'm happy that we have enough people with different expertise on board to look into this further. :)

leila edited projects, added Research; removed Research-Freezer. Mar 7 2017, 6:40 PM

With respect to how prevalent undisclosed paid editing is, that is an excellent research question. How does one accurately measure an activity that those carrying it out are trying to keep secret? Especially in an environment where even research on the topic is looked at negatively by many members of arbcom.

We have maybe 50 individuals involved in paid editing listed here: https://www.upwork.com/o/profiles/browse/?q=Wikipedia
We have a number of companies involved in paid editing here, a list which I will expand: https://en.wikipedia.org/wiki/User:Doc_James/Paid_Editing_Companies
We have a huge list of concerns here: https://en.wikipedia.org/wiki/Wikipedia:Conflict_of_interest/Noticeboard

I guess one could take a random selection of articles within a specific topic area and analyse them. I would estimate that 20% of articles on corporations and living people are paid for, but that is just a ballpark figure. The volunteer community may address half of the concerns.
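A minimal sketch of how such a sampling estimate could be turned into a number with an uncertainty range. It assumes the sampled articles have been hand-labelled by reviewers as paid or not; the helper names (`sample_articles`, `wilson_interval`) and the counts used are hypothetical and for illustration only, not measured data. The interval is a standard Wilson score interval for a proportion.

```python
import math
import random


def sample_articles(article_titles, k, seed=0):
    """Draw a simple random sample of k titles without replacement."""
    rng = random.Random(seed)
    return rng.sample(list(article_titles), k)


def wilson_interval(flagged, n, z=1.96):
    """Wilson score interval for a proportion (z=1.96 is roughly 95%)."""
    if n <= 0:
        raise ValueError("sample size must be positive")
    p = flagged / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


# Hypothetical counts, not real data: 200 sampled articles,
# 40 of them judged by reviewers to show signs of paid editing.
low, high = wilson_interval(flagged=40, n=200)
print(f"estimated share: {40 / 200:.0%}, 95% interval: {low:.1%} to {high:.1%}")
```

The width of the interval gives a rough sense of how many articles would need to be reviewed before a ballpark figure like the one above could be confirmed or ruled out. It says nothing about labelling errors, which are likely to matter a lot for an activity that is deliberately hidden.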

• Capt_Swing subscribed. Mar 23 2017, 7:29 PM

Liridon subscribed. Aug 2 2017, 6:05 PM

For when we come back to this open question: Article Wizard is now redesigned to help editors disclose COI and paid editing. (Check this comment for more details.) The link to the page that helps the editor navigate through reporting COI or paid editing is here.

The signs of paid editing:

A document is currently being drafted which will serve as a tutorial for New Page Reviewers. Almost complete, it probably contains all that is needed to feed an AI system.

See: Identifying PR

leila edited projects, added Research-Freezer; removed Research. Jul 11 2019, 12:20 AM

leila removed leila as the assignee of this task. Mar 18 2020, 11:42 PM

leila removed a subscriber: • Tbayer.

Harej added a project: Wikimedia-Medicine. Jan 17 2024, 3:57 AM

Harej moved this task from Backlog: Other to Backlog: COI Detection on the Wikimedia-Medicine board. Jan 23 2024, 5:54 PM

Harej removed a project: Wikimedia-Medicine. Jan 29 2024, 10:27 PM
