
Wikimedia Planet not showing any external blog posts
Closed, Resolved | Public

Event Timeline

On https://en.planet.wikimedia.org, "Sponsored Phabricator Improvements" appears as the 7th item with the timestamp "15:46, Saturday, 05 2021 June UTC", although it was actually published in July 2016. That makes me wonder whether something about indexing changed recently. (The techblog feed itself should be valid.)

I looked a bit more, and the only posts showing up on planet come from either Phabricator or a MediaWiki RSS feed (aka Tech News). Maybe planet isn't able to talk to anything outside the cluster?

According to the logs and state files, the expected feed is configured but gets ignored on each run.

I deleted the "state" files from the "en" feed directory and manually ran a fresh update for all en feeds.
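The state reset described here can be sketched roughly as follows. This is only an illustration: the directory layout (a top-level `state` file plus `feeds/*.state` files) is inferred from the log output later in this task, and the function names and the `dry_run` switch are hypothetical, not part of planet itself.

```python
from pathlib import Path
from typing import List


def find_state_files(feed_dir: Path) -> List[Path]:
    """Return the top-level 'state' file and any per-feed 'feeds/*.state' files."""
    found = []
    top_level = feed_dir / "state"
    if top_level.is_file():
        found.append(top_level)
    # Per-feed state files, e.g. feeds/86b3899b.state (layout assumed from the log).
    found.extend(sorted(feed_dir.glob("feeds/*.state")))
    return found


def delete_state_files(feed_dir: Path, dry_run: bool = True) -> List[Path]:
    """List the state files under feed_dir; delete them only if dry_run is False."""
    targets = find_state_files(feed_dir)
    if not dry_run:
        for path in targets:
            path.unlink()
    return targets
```

Running it with `dry_run=True` first shows which files would be removed before anything is actually deleted.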

Mentioned in SAL (#wikimedia-operations) [2021-07-12T10:05:18Z] <mutante> planet - deleting state files, manually running update for all 161 en feeds - T285251

Hmm, I still can't find the feed in the output, even though everything looks normal and it appears to have written new state after fetching all the feeds.

The only notable part is "Selected 30 of 2075 articles to write; ignored 0 duplicates".

I tried disabling puppet, editing the config to keep just this one feed in it, deleting the state files, and running the update again. It actually times out, but with curl I can get the feed. That makes me think WordPress VIP started blocking us. Will keep looking tomorrow.

I also get the same "timeout while fetching feed" for feeds outside of WordPress VIP; I tested other random ones and everything times out. It seems to be related to setting the right HTTPS_PROXY (though this worked until recently).
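One way to check the proxy theory outside of planet is to fetch a feed with an explicit proxy and an explicit timeout, then repeat without the proxy. The sketch below uses only the Python standard library; it does not reproduce how planet fetches feeds internally, and the proxy URL in the usage note is a placeholder, not the real proxy host.

```python
import urllib.request


def build_feed_opener(https_proxy=None):
    """Build an opener that uses only the given HTTPS proxy, ignoring env settings."""
    # An empty mapping disables proxies entirely instead of reading the environment.
    proxies = {"https": https_proxy} if https_proxy else {}
    return urllib.request.build_opener(urllib.request.ProxyHandler(proxies))


def fetch_feed(url, https_proxy=None, timeout=20.0):
    """Fetch a feed and return the raw bytes; raises on timeout or network error."""
    opener = build_feed_opener(https_proxy)
    with opener.open(url, timeout=timeout) as response:
        return response.read()
```

For example, `fetch_feed("https://techblog.wikimedia.org/feed/", https_proxy="http://proxy.example.org:8080")` and the same call without `https_proxy` should behave differently if the host can only reach external sites through the proxy.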

Just to be clear, I'm 99% confident that this is an issue with planet, not the tech blog.

Legoktm renamed this task from techblog posts not appearing on Wikimedia Planet to Wikimedia Planet not showing any external blog posts. Sep 7 2021, 11:59 AM

Loading state file: state
Adding new feed: https://techblog.wikimedia.org/feed/
Changed feed period: https://techblog.wikimedia.org/feed/
Starting update
Will update 1 feeds
Fetching 1 feeds using 1 threads
[0] Fetching feed: https://techblog.wikimedia.org/feed/
Fetch complete
Updating feed 1 of 1: https://techblog.wikimedia.org/feed/
Loading state file: feeds/86b3899b.state
Feed:        https://techblog.wikimedia.org/feed/
Timeout while reading feed.

...

Selected 30 of 197 articles to write; ignored 0 duplicates
Writing output file: /var/www/planet/en/index.html
Finished write
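With 161 feeds per language, spotting the failing ones by eye is tedious. A small sketch like the following could pull the timed-out feed URLs out of update output shaped like the log above; it assumes the `Feed:` and `Timeout while reading feed.` lines appear exactly as shown there, and the function name is made up for this example.

```python
import re

# Matches the per-feed header line, e.g. "Feed:        https://example.org/feed/"
FEED_LINE = re.compile(r"Feed:\s+(\S+)")
TIMEOUT_LINE = "Timeout while reading feed."


def feeds_with_timeouts(log_text):
    """Return the URLs of feeds whose 'Feed:' line is followed by a timeout line."""
    timed_out = []
    current = None
    for raw_line in log_text.splitlines():
        line = raw_line.strip()
        match = FEED_LINE.match(line)
        if match:
            current = match.group(1)
        elif line == TIMEOUT_LINE and current is not None:
            timed_out.append(current)
            current = None
    return timed_out
```

An empty result after a full run would suggest that the fetches are no longer timing out.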

I did some tests: removing and re-adding the techblog feed, manually setting HTTPS_PROXY just for the update command, and so on. It is indeed related to the proxy setting. I could now fetch the missing content, but this still needs to be fixed in puppet.
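Setting HTTPS_PROXY "just for the update command" means the variable goes into the child process's environment only and is not exported for the whole shell. A minimal sketch of that idea, with a placeholder proxy URL and a stand-in child command because the real update command is not quoted in this task:

```python
import os
import subprocess
import sys


def run_with_https_proxy(cmd, proxy_url):
    """Run cmd with HTTPS_PROXY set only in the child's environment."""
    env = dict(os.environ)  # copy, so the parent's environment stays untouched
    env["HTTPS_PROXY"] = proxy_url
    return subprocess.run(cmd, env=env, capture_output=True, text=True, check=False)


if __name__ == "__main__":
    # Stand-in for the real update command: print what the child process sees.
    child = [sys.executable, "-c", "import os; print(os.environ.get('HTTPS_PROXY'))"]
    result = run_with_https_proxy(child, "http://proxy.example.org:8080")
    print(result.stdout.strip())
```

This mirrors the direction of the later puppet patches, which pass the proxy through the job's environment rather than putting it directly into the command string.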

Change 720234 had a related patch set uploaded (by Dzahn; author: Dzahn):

[operations/puppet@production] planet: replace http_proxy with https_proxy and add it to the update command

https://gerrit.wikimedia.org/r/720234

Change 720234 merged by Dzahn:

[operations/puppet@production] planet: replace http_proxy with https_proxy and add it to the update command

https://gerrit.wikimedia.org/r/720234

Change 720236 had a related patch set uploaded (by Dzahn; author: Dzahn):

[operations/puppet@production] planet: fix updatejob parameters, languages_keys isn't one

https://gerrit.wikimedia.org/r/720236

Change 720236 merged by Dzahn:

[operations/puppet@production] planet: fix updatejob parameters, languages_keys isn't one

https://gerrit.wikimedia.org/r/720236

Change 720239 had a related patch set uploaded (by Dzahn; author: Dzahn):

[operations/puppet@production] planet: add HTTPS_PROXY with environment parameter, not directly to cmd

https://gerrit.wikimedia.org/r/720239

Change 720239 merged by Dzahn:

[operations/puppet@production] planet: add HTTPS_PROXY with environment parameter, not directly to cmd

https://gerrit.wikimedia.org/r/720239

Change 720240 had a related patch set uploaded (by Dzahn; author: Dzahn):

[operations/puppet@production] planet: parameter 'environment' expects a Hash value, got String

https://gerrit.wikimedia.org/r/720240

Change 720240 merged by Dzahn:

[operations/puppet@production] planet: parameter 'environment' expects a Hash value, got String

https://gerrit.wikimedia.org/r/720240

Mentioned in SAL (#wikimedia-operations) [2021-09-10T09:07:53Z] <mutante> planet - deleted all state files for all languages, running fresh update via systemctl start for all languages after proxy changes (T285251)

Things should be fixed now. As per above, I deleted all the state files for all language versions and started the services, and things are looking good.

The relevant mailman post is unfortunately outdated by now, but you can see all the new feed updates. Sorry for the delay.