Spam blacklist lets edits through
Closed, Resolved · Public

Description

background thread

Maybe we're all missing something obvious here, but we expected this blacklist to prevent this change. Either there is a bug, or we need a better understanding of how this tool works. :)

Event Timeline

Ammarpad subscribed.

By the time the edit was made (and in fact, up to now) there was already a similar link in the article. SpamBlacklist does not prevent the addition of a link if it is already present in the text. For the blacklist to work correctly, you have to remove all references to the link. After that, no one will be able to add it again.
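The behaviour described above (only links that are new in an edit get checked) can be illustrated with a rough sketch. This is not the extension's code, which is written in PHP; it is a simplified Python model of the comment above. The helper names `extract_links` and `blocked_additions`, the crude URL regex, and the `example-spam.com` pattern are all made up for illustration.

```python
import re

# Hypothetical helper: a crude URL extractor, for illustration only.
URL_RE = re.compile(r"https?://[^\s\"'<>|\]]+")


def extract_links(text):
    """Return the set of URLs found in a piece of text."""
    return set(URL_RE.findall(text))


def blocked_additions(old_text, new_text, patterns):
    """Return links that are new in this edit and match a blacklist pattern.

    Links already present in old_text are never checked, which is the
    behaviour described in the comment above (an assumption of this sketch).
    """
    added = extract_links(new_text) - extract_links(old_text)
    compiled = [re.compile(p, re.IGNORECASE) for p in patterns]
    return sorted(
        link for link in added if any(rx.search(link) for rx in compiled)
    )


# Made-up blacklist entry for the demonstration.
patterns = [r"\bexample-spam\.com\b"]

# Case 1: the link is already on the page, so an edit that keeps it passes.
old = "See http://example-spam.com/ for details."
new = "See http://example-spam.com/ for details. Also some new text."
print(blocked_additions(old, new, patterns))  # []

# Case 2: once the link has been removed, adding it again is caught.
clean = "See the docs for details."
readd = "See http://example-spam.com/ for details."
print(blocked_additions(clean, readd, patterns))  # ['http://example-spam.com/']
```

Under this model, a page that already contains a blacklisted link can still be edited freely as long as the edit adds no further blacklisted links, which is why all existing references have to be removed first.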

@Ammarpad: Thanks for responding. If I understand what you're saying correctly, the version preceding the edit (HTML, JSON) contains a "similar link" to the one introduced in the offending edit. Does this mean a link that also matches the same blacklist pattern? Or is there some weaker similarity test in play here? Can you pinpoint the specific link that is considered to be similar?

Open the item, copy the value of 'official website' and paste it here.

Before the edit, the official website was "https://chaturbate.com/" which is (as I understand it) correct. I expect that the spam blacklist would NOT prevent that from being added, and this appears to be behaving as expected.
After the edit there was a second value for official website "http://chaturbateme.com/" which is not correct and which I expected to be prevented by the spam blacklist entry:

\bchaturbateme\.com\b

Are you saying that there is some (undocumented) similarity test that allows editors to add the spam site, notwithstanding the spam blacklist, because the hostname is similar to the correct one?
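As a sanity check on the pattern itself, here is a small sketch that tests the blacklist entry quoted above against the two URLs from this item. It uses Python's `re` module rather than the PHP regex engine the extension runs on, and the function name `matches_blacklist` is made up for illustration, so treat it as a check of the pattern only, not of the extension.

```python
import re

# The blacklist entry quoted above, compiled with Python's re module.
# Assumption: word boundaries behave the same here as in the extension.
PATTERN = re.compile(r"\bchaturbateme\.com\b")


def matches_blacklist(url):
    """Return True if the URL contains a match for the blacklist entry."""
    return PATTERN.search(url) is not None


for url in ("https://chaturbate.com/", "http://chaturbateme.com/"):
    print(url, matches_blacklist(url))
```

Only the second URL matches, so the pattern on its own does distinguish the two hostnames; whatever let the edit through would have to lie elsewhere.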

Hi! The spam blacklist is an important tool for Wikidata and other projects. If there's a big hole in its functionality, we'd like to know more about the bug so we can work around it.

To clarify, it would be really helpful to know either:

  • When we plan to fix this bug; or
  • If we're not fixing it (soon), what the nature of the bug is, so that we can devise workarounds.

@Ammarpad Hi! I was wondering if you intended to clarify your earlier comments. Is this a bug in the code or a deliberate deviation from documented behaviour?

@Ammarpad Hi! I was wondering if you planned to respond to my questions.

Hi @Bovlb, it's been 3 years and I'm not sure I can remember what happened there. However, I boldly tested the link now on Wikidata and it was correctly rejected by the spam filter.

I think we should consider this fixed in the meantime (assuming there was a bug in the first place), unless you have a new working example (i.e. one where the filter fails)?

Boldly closing per last comment. If someone can still reproduce, please reopen with a clear example. Thanks!