Improve features for wikibase vandalism detection model
Open, Normal · Public

Description

The current set of features for the Wikidata vandalism detection model is good, but it can be improved.

Event Timeline

Restricted Application added a project: User-Ladsgroup. May 15 2018, 10:34 AM
Restricted Application added a subscriber: Aklapper.

We are working on this with @Lea_Lacroix_WMDE to get feedback and improve them.

Halfak added a subscriber: Halfak. Jun 25 2018, 9:43 PM

Any updates here?

Restricted Application added a project: artificial-intelligence. Jun 25 2018, 9:43 PM

Any updates here?

I just had a meeting with Wikidata's communication manager. She is starting the process and it takes some time.

Aaand now I made the landing pages for the feedback: https://www.wikidata.org/wiki/Wikidata:ORES

And the announcement for feedback will be done on Monday, July 2nd :)

Vvjjkkii renamed this task from Improve features for wikibase vandalism detection model to yxcaaaaaaa. Jul 1 2018, 1:09 AM
Vvjjkkii triaged this task as High priority.
Vvjjkkii removed Ladsgroup as the assignee of this task.
Vvjjkkii updated the task description.
Vvjjkkii removed a subscriber: Aklapper.
CommunityTechBot assigned this task to Ladsgroup.
CommunityTechBot raised the priority of this task from High to Needs Triage.
CommunityTechBot renamed this task from yxcaaaaaaa to Improve features for wikibase vandalism detection model.
CommunityTechBot added a subscriber: Aklapper.
Halfak added a comment. Aug 3 2018, 4:15 PM

{{merged}}

Restricted Application added a project: Scoring-platform-team. Jan 25 2019, 1:20 PM

@Ladsgroup let's sit down together and flesh this out :)

@Ladsgroup Can you add the list of existing features as we discussed?

Sure:

is_client_move,
is_client_delete,
is_merge_into,
is_merge_from,
is_revert,
is_restore,
is_item_creation,
sex_or_gender_changed,
country_of_citizenship_changed,
member_of_sports_team_changed,
date_of_birth_changed,
image_changed,
signature_changed,
commons_category_changed,
official_website_changed,
en_label_changed,
is_human,
is_blp,
comment_longest_repeated_char,
comment_uppercase_ratio,
comment_numbers_ratio,
comment_whitespace_ratio,
comment_english_bad_words,
comment_english_informals,
comment_longest_repeated_uppercase_char,
comment_has_url,
comment_has_first_person_pronouns_en,
comment_has_second_person_pronouns_en,
comment_has_do_or_dont_en,
log(wikibase.revision.parent.claims + 1),
log(wikibase.revision.parent.properties + 1),
log(wikibase.revision.parent.aliases + 1),
log(wikibase.revision.parent.sources + 1),
log(wikibase.revision.parent.qualifiers + 1),
log(wikibase.revision.parent.badges + 1),
log(wikibase.revision.parent.labels + 1),
log(wikibase.revision.parent.sitelinks + 1),
log(wikibase.revision.parent.descriptions + 1),
wikibase.revision.diff.sitelinks_added,
wikibase.revision.diff.sitelinks_removed,
wikibase.revision.diff.sitelinks_changed,
wikibase.revision.diff.labels_added,
wikibase.revision.diff.labels_removed,
wikibase.revision.diff.labels_changed,
wikibase.revision.diff.descriptions_added,
wikibase.revision.diff.descriptions_removed,
wikibase.revision.diff.descriptions_changed,
wikibase.revision.diff.aliases_added,
wikibase.revision.diff.aliases_removed,
wikibase.revision.diff.properties_added,
wikibase.revision.diff.properties_removed,
wikibase.revision.diff.properties_changed,
wikibase.revision.diff.claims_added,
wikibase.revision.diff.claims_removed,
wikibase.revision.diff.claims_changed,
wikibase.revision.diff.identifiers_changed,
wikibase.revision.diff.sources_added,
wikibase.revision.diff.sources_removed,
wikibase.revision.diff.qualifiers_added,
wikibase.revision.diff.qualifiers_removed,
wikibase.revision.diff.badges_added,
wikibase.revision.diff.badges_removed,
wikibase.revision.diff.proportion_of_qid_added,
wikibase.revision.diff.proportion_of_language_added,
wikibase.revision.diff.proportion_of_links_added,
revision.comment.suggests_section_edit,
revision.comment.has_link,
revision.user.is_bot,
revision.user.has_advanced_rights,
revision.user.is_admin,
revision.user.is_trusted,
revision.user.is_patroller,
revision.user.is_curator,
revision_oriented.revision.user.is_anon,
log(temporal.revision.user.seconds_since_registration + 1)

These are all of the features. Tell me if any of them is not clear enough.
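
To make a few of the names above more concrete, here is a rough sketch of how the comment-based ratios and the log-scaled counts could be computed. This is only an illustration under my own assumptions, not the model's actual code: the helper names (`longest_repeated_char`, `ratio`, `comment_features`, `log_scaled`) are hypothetical, and the real definitions may differ (for example, a ratio might be taken over letters only rather than over all characters).

```python
import math
from itertools import groupby


def longest_repeated_char(text: str) -> int:
    """Length of the longest run of one repeated character (0 for empty text)."""
    return max((len(list(run)) for _, run in groupby(text)), default=0)


def ratio(text: str, predicate) -> float:
    """Share of characters in `text` that satisfy `predicate`.

    Assumption: the denominator is the full comment length.
    """
    if not text:
        return 0.0
    return sum(1 for ch in text if predicate(ch)) / len(text)


def comment_features(comment: str) -> dict:
    """Hypothetical versions of four comment_* features from the list."""
    return {
        "comment_longest_repeated_char": longest_repeated_char(comment),
        "comment_uppercase_ratio": ratio(comment, str.isupper),
        "comment_numbers_ratio": ratio(comment, str.isdigit),
        "comment_whitespace_ratio": ratio(comment, str.isspace),
    }


def log_scaled(count: int) -> float:
    """log(count + 1), the shape used for the parent-revision counts.

    The +1 keeps the value defined (and equal to 0) when the count is 0.
    """
    return math.log(count + 1)


if __name__ == "__main__":
    print(comment_features("LOOOL 42"))
    print(log_scaled(0), log_scaled(9))
```

The log scaling compresses large counts, so an item with 1,000 claims does not dominate one with 10; the comment ratios stay in the range 0 to 1 regardless of comment length.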

Halfak triaged this task as Normal priority.