
Requests from a specific network are blocked
Closed, Resolved, Public


A hosting company contacted the German Volunteer Email Response Team (ticket #2015081910012338) as they are having problems reaching Wikipedia from some of their networks. It seems that refuses the connections e. g. from their subnet

Are requests from that network blocked by the WMF infrastructure?

Traceroute output and contact information for the hoster are available in the linked OTRS ticket. If you don’t have access to that ticket, please contact me via Wikimail. (I don’t want to post this information in a public issue.)

Event Timeline

Ireas raised the priority of this task from to Needs Triage.
Ireas updated the task description. (Show Details)
Ireas added a subscriber: Ireas.
akosiaris triaged this task as Unbreak Now! priority.
akosiaris set Security to None.

Yes, there has indeed been a block due to a user abusing the API. There was an effort to report this, but the hosting company never answered. In good faith, we will remove the block, but someone from the hosting company should reach out to us so that the complaint does not get lost and the abusing user gets notified. This way, we can avoid a future block.

As the person who initially started the conversation about the block and reported this ISP for hosting abusers, I would very much like to be in on any conversations about the unblock or with the ISP. @akosiaris could you please CC me into these threads?

Thanks. I informed our contact at the ISP and asked him to contact you to solve this issue.

akosiaris changed the task status from Open to Stalled. Aug 26 2015, 6:09 PM

The block has been removed. Setting this to stalled while waiting to be contacted by the hosting company/ISP.

akosiaris lowered the priority of this task from Unbreak Now! to Low. Aug 26 2015, 6:10 PM

For reference, if I don't see a contact email in my inbox by EOW, I'm going to ask that it be reinstated.


Let's hope that we won't have to get to that, but I am of the same mind. In the meantime, we should watch whether the abusive behavior starts again.

Noting that we've been contacted by the hosting company.

That's nice. I haven't. Could you CC me in?

Wait, you have. Doh, checking emails from top to bottom ;p

@Ironholds_backup - The contact is detailed at the top of this ticket, it came through OTRS. My unconfirmed suspicion at this point is that the traffic's coming from a user that thinks what they're doing is legitimate, and they'll probably start back up if they haven't already, and this is all just a matter of lack of communication or awareness.

The blocked IP looks like a shared web server. Why does a web server need a connection to Wikipedia?


To share, remix and build upon? We're not talking about the social construct of "editing Wikipedia from TOR, open proxies and colos is not permitted" - we're talking about general reuse. This is part of our mission and should only be encouraged. Sometimes, however, practical problems arise - like when a certain host is responsible for 1/4 of all fulltext searches in order to accomplish something that can be done much faster and more sanely with a pagelinks dump.

There's a dark side to that general openness WRT heavy/obvious remote proxy caches. In the past, people have put up what amounts to complete (logos and legal notices and all) and mostly-live-updating mirrors of our site under alternate domain names, with the apparent purpose of using them to go phishing when combined with other attacks on the user. We can't stop it in the general case, but it is a reason to care about that kind of traffic.

As far as I know, this was about customers using the Wikipedia API.

@MaxSem, that's not what you generally do from a shared web server. The logical thing would be to use a local machine, or at least a VPS/dedicated server.

Yes, you could be accessing Wikipedia from an unrelated web server (e.g. for connectivity tests). Or they could be nice guys wishing to mirror our dumps. In fact, they were producing 10% of WMF traffic.

Ireas, they were accessing both the API and article pages (although I expect the latter would be hitting the caches), with a fake User-Agent. The details are in the restricted T109380.

My point is, if they went so far as to complain to the hosting company, they could just as well be questioned about their motives, given their unusual usage pattern.

The user in question has contacted both me and @Ironholds privately, providing an explanation for the behavior and noting that he has already taken a mitigation step. They did request more information on how they could continue using the API via their application without causing harm; that information, plus some generic guidelines and a note on possible use of the XML dumps, was sent to them. I am not expecting any further communication on their part as of now.

Unless someone objects, I think we can resolve this.
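
For readers wondering what "using the API without causing harm" can look like in practice, here is a minimal sketch. It is not the guidance that was actually sent to the user, which is not reproduced in this task. It only builds a request (nothing is sent) with a descriptive User-Agent and the MediaWiki API's `maxlag` parameter; the tool name, contact address and the `build_query_request` helper are hypothetical placeholders.

```python
import urllib.parse
import urllib.request

# Assumption: the English Wikipedia API endpoint is used for illustration.
API_URL = "https://en.wikipedia.org/w/api.php"

# Hypothetical tool name and contact details; a real client should
# identify itself honestly so operators can reach its owner.
USER_AGENT = "ExampleTool/0.1 (https://example.org/tool; tool-owner@example.org)"


def build_query_request(title):
    """Build (but do not send) a polite API request for one page title."""
    params = {
        "action": "query",
        "titles": title,
        "prop": "info",
        "format": "json",
        # Ask the servers to reject the request when replication lag is
        # high, so the client can back off and retry later.
        "maxlag": "5",
    }
    url = API_URL + "?" + urllib.parse.urlencode(params)
    return urllib.request.Request(url, headers={"User-Agent": USER_AGENT})


req = build_query_request("Main Page")
print(req.full_url)
```

A client along these lines would also issue requests serially rather than in parallel, and prefer the XML dumps for bulk work.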

Just as a note:
I am running a small wiki for the students attending my lecture on the same shared hosting servers we are talking about here. This wiki was also affected by the block, since it used InstantCommons (which may be one of the main use cases for API access from servers).

Apologies for the disruption; we will try to aim for smaller CIDR blocks with future issues.
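
To illustrate why the size of the CIDR block matters, here is a small sketch using Python's standard `ipaddress` module. The addresses come from a documentation-only range and are purely hypothetical; the range that was actually blocked is not stated in this task, and `is_blocked` is an illustrative helper, not WMF tooling.

```python
import ipaddress


def is_blocked(client_ip, blocked_networks):
    """Return True if client_ip falls inside any blocked CIDR range."""
    addr = ipaddress.ip_address(client_ip)
    return any(addr in ipaddress.ip_network(net) for net in blocked_networks)


# Hypothetical addresses from the 203.0.113.0/24 documentation range.
abuser = "203.0.113.57"
bystander = "203.0.113.80"

wide_block = ["203.0.113.0/24"]     # blocks the whole subnet
narrow_block = ["203.0.113.57/32"]  # blocks a single address

print(ipaddress.ip_network(wide_block[0]).num_addresses)
print(is_blocked(bystander, wide_block))
print(is_blocked(bystander, narrow_block))
```

With the wide block, the unrelated address is caught as well; with the narrow block, only the abusing address is. Note that on shared hosting several customers can sit behind the same IP, so even a single-address block may affect bystanders.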

Agreed with akosiaris that this can be considered resolved.