
display of my wordpress.com blog on Planet Wikimedia is broken
Closed, ResolvedPublic

Description

Author: arnomane

Description:
My blog is listed on de.planet.wikimedia.org with http://arnomane.wordpress.com/category/wiki-de/feed/

  1. The blog title URL does not link to the entry on my blog; it is simply empty.
  2. My older blog entries disappear, although much older entries from others are still listed.

My blog feed seems to be perfectly fine; at least, my personal feed reader displays it without any trouble. The feed URL also follows the Meta-Wiki recommendations for feed URLs of wordpress.com blogs.


Version: unspecified
Severity: normal

Details

Reference
bz22937

Event Timeline

bzimport raised the priority of this task to Medium (Nov 21 2014, 11:01 PM).
bzimport set Reference to bz22937.
bzimport added a subscriber: Unknown Object (MLST).

Just looking at what is different between your feed and the others, maybe planet doesn't handle 301 redirects properly (just a random uninformed guess, probably wrong).

p.s. Anyone else find the "x-nanana: nocache" header that wordpress sends funny?

Ziko's blog seems to show the same symptoms as Arnomane's (at least the empty links). As I already said at <URI:http://meta.wikimedia.org/wiki/Planet_Wikimedia#Problems>, I installed Planet 2.0 locally on my box and downloaded the configuration files from Subversion. Running that, the links were properly preserved.

So what is the version of Planet running on the server?

Thanks. Unfortunately, the version seems to be set to 2.0 in the 2.0 release as well as in the nightly builds. The difference seems to lie in the PKG-INFO file.

Wait! :-) Diffing the nightly builds against the 2.0 release showed that the *only* changes are the VERSION line in PKG-INFO and some tweaks in examples/fancy/config.ini, neither of which is likely to cause any problems.

At the moment, I'm out of guesses as to what could be wrong. I can't confirm the 301 issue, as both Arnomane's and Ziko's feeds return plain 200s. And my knowledge of Python doesn't extend far enough to assess whether there were recent significant changes in the libraries used by Planet.

jeluf wrote:

I get the following error message, no matter whether I'm using vanilla 2.0 or the nightly build.

ERROR:planet.runner:Update of http://arnomane.wordpress.com/tag/wiki-de/feed/ failed
Traceback (most recent call last):
  File "/usr/local/planet/NEW/planet-2.0/planet/__init__.py", line 246, in run
    channel.update()
  File "/usr/local/planet/NEW/planet-2.0/planet/__init__.py", line 654, in update
    self.update_entries(info.entries)
  File "/usr/local/planet/NEW/planet-2.0/planet/__init__.py", line 760, in update_entries
    item.update(entry)
  File "/usr/local/planet/NEW/planet-2.0/planet/__init__.py", line 883, in update
    if item.type == 'text/html':
  File "/usr/local/planet/NEW/planet-2.0/planet/feedparser.py", line 238, in __getattr__
    raise AttributeError, "object has no attribute '%s'" % key
AttributeError: object has no attribute 'type'
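
The traceback shows an attribute lookup (`item.type`) failing on a dict-like object whose `__getattr__` raises `AttributeError` for missing keys. The sketch below reproduces that failure mode in Python 3 and shows one defensive way to read the attribute. `FeedDict` and `is_html` are hypothetical stand-ins, not Planet's or feedparser's actual code, and this is not the fix that was applied in this ticket.

```python
class FeedDict(dict):
    """Hypothetical stand-in for a dict that also allows attribute access."""

    def __getattr__(self, key):
        # Only called when normal attribute lookup fails.
        try:
            return self[key]
        except KeyError:
            raise AttributeError("object has no attribute '%s'" % key)


def is_html(item):
    """Return True only if the item declares the type 'text/html'."""
    # getattr with a default swallows the AttributeError for a missing 'type'.
    return getattr(item, "type", None) == "text/html"
```

An entry without a `type` key then simply counts as not being HTML instead of aborting the whole feed update.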

If I run it at nightshade.toolserver.org (Linux; vanilla 2.0 + svn checkout + s!/usr/local/planet/wikimedia/en/!/home/timl/src/planet-en-data/!g), I get:

timl@nightshade:~/src/planet-en-data$ ../planet-2.0/planet.py config.ini
ERROR:planet:Error 404 while updating feed http://sopoforic.blogspot.com/feeds/posts/default/-/wikipedia%20for%20republication?orderby=published
ERROR:planet:Error 404 while updating feed http://aartedepilotarumfogao.blogspot.com/feeds/posts/default
ERROR:planet:Error 404 while updating feed http://pathos.ca/blog/?cat=3&feed=rss2
ERROR:planet:Error 404 while updating feed http://feeds.feedburner.com/PostGeographic-Wiki?format=xml
ERROR:planet:Error 404 while updating feed http://luke.faraone.cc/tag/wikipedia/feed/
ERROR:planet:Error 500 while updating feed http://blog.ut7.in/feeds/posts/default/-/wikipedia?alt=rss
ERROR:planet:Error 404 while updating feed http://giggyisms.blogspot.com/feeds/posts/default/-/wiki
ERROR:planet:Error 404 while updating feed http://anonymous101-wiki.blogspot.com/feeds/posts/default
ERROR:planet:Error 404 while updating feed http://davidarussell.co.uk/tag/wikimedia/feed/
ERROR:planet:Error 500 while updating feed http://www.martinp23.com/category/wikimedia/feed
ERROR:planet:Error 503 while updating feed http://beta.blogger.com/feeds/12152702/posts/full/-/wiki
ERROR:planet:Error 500 while updating feed http://freelayers.org/tag/wiki/feed/atom/
ERROR:planet:Error 500 while updating feed http://blog.mets501.com/feed/
timl@nightshade:~/src/planet-en-data$ python --version
Python 2.5.2
timl@nightshade:~/src/planet-en-data$

If I run the same (after cleaning up cache and output directory) on wolfsbane (Solaris):

timl@wolfsbane:~/src/planet-en-data$ ../planet-2.0/planet.py config.ini
/home/timl/src/planet-2.0/planet/__init__.py:33: DeprecationWarning: the md5 module is deprecated; use hashlib instead
  import md5
Traceback (most recent call last):
  File "../planet-2.0/planet.py", line 23, in <module>
    import planet
  File "/home/timl/src/planet-2.0/planet/__init__.py", line 35, in <module>
    import dbhash
  File "/opt/ts/python/2.6/lib/python2.6/dbhash.py", line 8, in <module>
    import bsddb
  File "/opt/ts/python/2.6/lib/python2.6/bsddb/__init__.py", line 64, in <module>
    import _bsddb
ImportError: No module named _bsddb
timl@wolfsbane:~/src/planet-en-data$ python --version
Python 2.6.5
timl@wolfsbane:~/src/planet-en-data$

My local linux box (Python 2.6) shows the warning about md5, but runs with no problems (except for the 404/500s).

Both nightshade and my local runs have correct links to <URI:http://zikoblog.wordpress.com/2010/06/07/dont-give-me-a-link-give-me-an-explanation/>.
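
The wolfsbane failure above is an import problem rather than a feed problem. A small check like the following could be run on each host before starting Planet to see which modules fail to import. This is only a sketch: `missing_modules` is a made-up helper, and the module names in the usage note are simply the ones that appear in the traceback.

```python
def missing_modules(names):
    """Return the subset of module names that cannot be imported here."""
    missing = []
    for name in names:
        try:
            # Actually import the module, so that a module whose own
            # dependencies are missing is also reported.
            __import__(name)
        except ImportError:
            missing.append(name)
    return missing
```

For example, `missing_modules(["md5", "dbhash", "bsddb"])` lists whichever of those names the current interpreter cannot import.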

jeluf wrote:

"singer" is using the following versions:

ii python 2.5.2-0ubuntu1 An interactive high-level object-oriented language (default v
ii python-apt 0.7.4ubuntu7.5 Python interface to libapt-pkg
ii python-central 0.6.7ubuntu0.1 register and build utility for Python packages
ii python-gdbm 2.5.2-0ubuntu2 GNU dbm database support for Python
ii python-gnupginterface 0.3.2-9ubuntu1 Python interface to GnuPG (GPG)
ii python-minimal 2.5.2-0ubuntu1 A minimal subset of the Python language (default version)
ii python-support 0.7.5ubuntu1 automated rebuilding support for python modules
ii python2.5 2.5.2-2ubuntu6.1 An interactive high-level object-oriented language (version 2
ii python2.5-minimal 2.5.2-2ubuntu6.1 A minimal subset of the Python language (version 2.5)

jeluf wrote:

Judging from the error messages, the tests in comment 6 were made using the English config, while this ticket is about problems with two blogs from the German planet.

lars wrote:

Only the most recent entry (problem 2 in the original post) from
Wikimediasverige.wordpress.com is shown on the Scandinavian
branch of Wikimedia Planet,
http://gmq.planet.wikimedia.org/index.sv.html

We had another kind of problem on another blog aggregator, which was
solved by adding /feed/atom/ to the URL instead of just /feed/
Maybe that is worth trying?

jeluf wrote:

planet apparently doesn't like wordpress.com's /feed/ format. /feed/atom works fine for arnomane, so I'm going to change all other configs to use /feed/atom for wordpress.com blogs.
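
The config change described above amounts to appending `atom/` to wordpress.com feed URLs that end in `/feed/`. A minimal sketch of that rewrite in Python 3 follows. `to_atom_feed` is a hypothetical helper, not part of Planet or of the Wikimedia configuration, and it only recognises blogs by their `wordpress.com` hostname.

```python
from urllib.parse import urlsplit, urlunsplit


def to_atom_feed(url):
    """Rewrite a wordpress.com '/feed/' URL to '/feed/atom/'; leave others alone."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    if not (host == "wordpress.com" or host.endswith(".wordpress.com")):
        return url  # not recognisably hosted on wordpress.com
    path = parts.path
    if path.rstrip("/").endswith("/feed"):
        # '/category/wiki-de/feed/' becomes '/category/wiki-de/feed/atom/'
        path = path.rstrip("/") + "/atom/"
    return urlunsplit(parts._replace(path=path))
```

URLs that already end in `/feed/atom/` are returned unchanged. Blogs hosted on wordpress.com under their own domain are not detected by the hostname test and would have to be changed by hand.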

thewub.wiki wrote:

The same problem is occurring for Sue Gardner's blog (http://suegardner.org/), which despite the domain is also hosted on wordpress.com. Could someone apply JeLuF's fix to that as well?

> Could someone apply JeLuF's fix to that as well?

Was fixed in r70445.