uses a fake user-agent
Closed, Resolved; Public

Description

(core) contains this comment:

# we fake being Firefox because some webservers block unknown
# clients, e.g. gives a 403
# when using the PyWikipediaBot user agent.
'User-agent': 'Mozilla/5.0 (X11; U; Linux i686; de; rv:1.8) Gecko/20051128 SUSE/1.5-0.1 Firefox/1.5',

Which was added to compat in Jan 2007 (and copied to core):

The URL mentioned in that comment now returns an HTTP 404, and the new URL returns an HTTP 200 when retrieved using core master (requests) and 2.0 (httplib2), so the justification for this fake user agent no longer applies.

This is likely because the user-agent is now more 'normal', e.g. in 2.0:

$ python shell
Welcome to the Pywikibot interactive shell!
>>> from pywikibot.comms.http import user_agent
>>> user_agent()
'shell Pywikibot/2.0rc4 (g5802) httplib2/0.9.1 Python/'

Faking the user-agent should be an option, default disabled, or only used for servers known to be problematic.
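
One way the proposal above could look, as a rough sketch only: the function name `choose_user_agent` and its parameters `fake_enabled` and `problem_hosts` are hypothetical and are not existing Pywikibot settings. The fake string is returned only when the option is switched on or the target host is listed as problematic; otherwise the real user agent is used.

```python
from urllib.parse import urlsplit


def choose_user_agent(url, real_agent, fake_agent,
                      fake_enabled=False, problem_hosts=()):
    """Return the user-agent string to send for ``url``.

    Hypothetical helper: faking is off by default and is applied only
    when ``fake_enabled`` is set or the host is in ``problem_hosts``.
    """
    # urlsplit().hostname is already lower-cased; it is None for
    # URLs without a network location, hence the fallback to ''.
    host = urlsplit(url).hostname or ''
    known_bad = {h.lower() for h in problem_hosts}
    if fake_enabled or host in known_bad:
        return fake_agent
    return real_agent
```

With this shape, a caller that never sets the option keeps sending the real user agent, and a per-host list limits faking to servers that are known to reject unknown clients.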

Also, the fake user-agent should be semi-auto-updating. The hard-coded one is so old (2005) that it is likely to cause problems: browser sniffers will assume the client is too old to render the page correctly, and will fall back to a degraded version or redirect to a 'not supported' message.
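
A minimal sketch of what "semi-auto-updating" might mean, under stated assumptions: the browser version is kept as a single value that can be refreshed (from configuration or elsewhere), and a staleness check flags it once it is too old. The names `firefox_user_agent` and `is_stale`, and the 180-day default, are illustrative assumptions; this is not the `get_fake_user_agent` function from the merged change.

```python
import datetime


def firefox_user_agent(version):
    """Build a desktop Firefox-style user-agent for a major version.

    Hypothetical helper; the platform token is fixed to Linux x86_64
    for simplicity.
    """
    return ('Mozilla/5.0 (X11; Linux x86_64; rv:{0}.0) '
            'Gecko/20100101 Firefox/{0}.0'.format(version))


def is_stale(last_updated, today, max_age_days=180):
    """Return True if the stored version is older than max_age_days."""
    return (today - last_updated).days > max_age_days


# The 2005 build date in the old string would be flagged immediately.
old = datetime.date(2005, 11, 28)
print(is_stale(old, datetime.date(2014, 11, 22)))
```

A stale result could trigger a warning or a refresh of the stored version, so the fake string never drifts years behind current browsers.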

See Also: T68102: use one library for all http requests

Event Timeline

bzimport raised the priority of this task from to Low. Nov 22 2014, 3:40 AM
bzimport set Reference to bz69204.
bzimport added a subscriber: Unknown Object (????).
jayvdb set Security to None.
jayvdb added a subscriber: MtDu.
@MtDu, you might want to try this one, as it should be very easy for you to code as you've built the fake user agent function.

I'll go ahead and claim this, as I built the fake user agent function. I'll do this after I finish my current task.
Thanks for making this a GCI task for me!

Don't worry. I'll try to do as many pywikibot tasks as I can. Even after GCI ends. :)

Change 264928 had a related patch set uploaded (by MtDu):
Use new get_fake_user_agent function for User-agent

Change 264928 merged by jenkins-bot:
Use new get_fake_user_agent function for User-agent