
weblinkchecker.LinkChecker throws IndexError: tuple index out of range
Closed, Resolved, Public

Description

Originally from: http://sourceforge.net/p/pywikipediabot/bugs/1148/
Reported by: masti01
Created on: 2010-03-17 21:49:18
Subject: weblinkchecker.py - annoying exceptions
Assigned to: xqt
Original description:
While processing external links, weblinkchecker often throws this exception:

$ python2.6 pwb.py scripts/weblinkchecker.py -lang:pl -family:wikipedia -page:Mistrzostwa_Europy_w_Piłce_Siatkowej_Mężczyzn_1997
Exception while processing URL http://www.cev.lu/mmp-cgi/show.pl?cmd=tmpl&id=851&id2=150&id3=359&id4=4&id5=2&state=p_prj_game_summary&key=0 in page Mistrzostwa Europy w Piłce Siatkowej Mężczyzn 1997
Exception in thread Mistrzostwa Europy w Piłce Siatkowej Mężczyzn 1997 - http://www.cev.lu/mmp-cgi/show.pl?cmd=tmpl&id=851&id2=150&id3=359&id4=4&id5=2&state=p_prj_game_summary&key=0:
Traceback (most recent call last):
File "/usr/lib64/python2.6/threading.py", line 525, in __bootstrap_inner
    self.run()
File "weblinkchecker.py", line 492, in run
    ok, message = linkChecker.check()
File "weblinkchecker.py", line 423, in check
    msg = error[1]
IndexError: tuple index out of range

Details

Reference
bz55282

Event Timeline

bzimport raised the priority of this task to Needs Triage. Nov 22 2014, 2:31 AM
bzimport set Reference to bz55282.
bzimport added a subscriber: Unknown Object (????).

I have inserted additional debugging information for now. Do you have the page on which this error occurs?

  • assigned_to: nobody --> xqt

It looks like the errors are timeouts. For example:
### DEBUG information for #2972249
Exception while processing URL http://www.football.fo/index.asp?id={C3466C04-98FF-405A-AA70-93EB417A19F4} in page 1. deild Wysp Owczych kobiet (2009)
Exception in thread 1. deild Wysp Owczych kobiet (2009) - http://www.football.fo/index.asp?id={C3466C04-98FF-405A-AA70-93EB417A19F4}:
Traceback (most recent call last):
File "/usr/lib64/python2.6/threading.py", line 525, in __bootstrap_inner
    self.run()
File "weblinkchecker.py", line 496, in run
    ok, message = linkChecker.check()
File "weblinkchecker.py", line 427, in check
    raise IndexError, error
IndexError: timed out

The page really is slow to load. So what needs fixing is that the error should produce a proper message instead of an IndexError.
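
To illustrate the kind of fix meant here, a minimal sketch, assuming Python 3 and using only the standard library. `describe_error` is a hypothetical helper name, not code from weblinkchecker.py. It avoids indexing `error[1]` blindly and instead checks how many arguments the exception actually carries:

```python
import socket


def describe_error(error):
    """Return a readable message for an exception, whatever its args look like.

    Hypothetical helper for illustration only; not taken from weblinkchecker.py.
    """
    args = getattr(error, "args", ())
    if len(args) >= 2:
        # Classic (errno, message) form, e.g. (111, 'Connection refused').
        return str(args[1])
    if len(args) == 1:
        # Single-argument form, e.g. socket.timeout('timed out').
        return str(args[0])
    # No arguments at all: fall back to the exception class name.
    return type(error).__name__


print(describe_error(socket.timeout("timed out")))
print(describe_error(OSError(111, "Connection refused")))
print(describe_error(OSError()))
```

With a helper like this, a timeout would be reported as its own message rather than surfacing as "tuple index out of range".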

The original is http://sourceforge.net/p/legacy_/tracker/?func=detail&atid=603138&aid=2972249&group_id=93107

This resulted in some debugging being added in bb400fd6, which refers to '2972249' from that URL.

socket.error is a messy exception that can't easily be inspected consistently across platforms and Python versions, but requests makes life easier.

jayvdb set Security to None.
jayvdb renamed this task from weblinkchecker.py throws IndexError: tuple index out of range to weblinkchecker.LinkChecker throws IndexError: tuple index out of range. Jan 21 2016, 3:19 PM
jayvdb triaged this task as Lowest priority.
Xqt claimed this task.

This bug is too old, and the related code is no longer present in the current implementation.