
Rename wikipedia:be-x-old to be-tarask in the Monuments Database
Closed, ResolvedPublic

Description

Working on countrycode "by" in language "be-x-old"
WARNING: Http response status 301
WARNING: Non-JSON response received from server wikipedia:be-x-old; the server may be down.
Set geilimit = ['250']
WARNING: Waiting 5 seconds before retrying.
WARNING: Http response status 301
WARNING: Non-JSON response received from server wikipedia:be-x-old; the server may be down.
Set geilimit = ['125']
WARNING: Waiting 10 seconds before retrying.
WARNING: Http response status 301
WARNING: Non-JSON response received from server wikipedia:be-x-old; the server may be down.
Set geilimit = ['62']
WARNING: Waiting 20 seconds before retrying.
WARNING: Http response status 301
WARNING: Non-JSON response received from server wikipedia:be-x-old; the server may be down.
Set geilimit = ['31']
WARNING: Waiting 40 seconds before retrying.
WARNING: Http response status 301
WARNING: Non-JSON response received from server wikipedia:be-x-old; the server may be down.
Set geilimit = ['15']
WARNING: Waiting 80 seconds before retrying.
WARNING: Http response status 301
WARNING: Non-JSON response received from server wikipedia:be-x-old; the server may be down.
Set geilimit = ['7']
WARNING: Waiting 120 seconds before retrying.
WARNING: Http response status 301
WARNING: Non-JSON response received from server wikipedia:be-x-old; the server may be down.
Set geilimit = ['3']
WARNING: Waiting 120 seconds before retrying.
WARNING: Http response status 301
WARNING: Non-JSON response received from server wikipedia:be-x-old; the server may be down.
Set geilimit = ['1']
WARNING: Waiting 120 seconds before retrying.
WARNING: Http response status 301
WARNING: Non-JSON response received from server wikipedia:be-x-old; the server may be down.
Set geilimit = ['0']
WARNING: Http response status 301
WARNING: Non-JSON response received from server wikipedia:be-x-old; the server may be down.
Set geilimit = ['0']
Unknown error occurred when processing country by in lang be-x-old
Maximum retries attempted without success.
Working on countrycode "aq" in language "en"
Unknown error occurred when processing country aq in lang en
(2006, 'MySQL server has gone away')
Working on countrycode "it-88" in language "ca"
Unknown error occurred when processing country it-88 in lang ca
(2006, 'MySQL server has gone away')
Working on countrycode "de-nrw-k" in language "de"
Unknown error occurred when processing country de-nrw-k in lang de
(2006, 'MySQL server has gone away')
…

Event Timeline

JeanFred claimed this task.
JeanFred raised the priority of this task from to Needs Triage.
JeanFred updated the task description.
JeanFred added a subscriber: JeanFred.

This is quite bad because nothing gets harvested except the few countries that come before be-x-old in the config. I think I am just going to comment out this config and figure it out later.

JeanFred set Security to None.

I removed be-x-old from the config as a stop-gap measure.

Oh, okay, this is due to T11823. I can update monuments_config.py. Do we also need to rename the database table?

JeanFred renamed this task from "ErfgoedBot harvesting job crashes because “MySQL server has gone away”, possibly after failing to communicate with wikipedia:be-x-old" to "Rename wikipedia:be-x-old to be-tarask in the Monuments Database". Sep 16 2015, 7:49 PM

Change 241255 had a related patch set uploaded (by Jean-Frédéric):
Commenting out be-x-old configuraiton

https://gerrit.wikimedia.org/r/241255

Change 241255 merged by jenkins-bot:
Commenting out be-x-old configuraiton

https://gerrit.wikimedia.org/r/241255

Change 281220 had a related patch set uploaded (by Lokal Profil):
Replace be-x-old by be-tarask throughout

https://gerrit.wikimedia.org/r/281220

Change 281220 merged by jenkins-bot:
Replace be-x-old by be-tarask throughout

https://gerrit.wikimedia.org/r/281220

I've also changed user-config to be-tarask. Once the ongoing monuments run is done, we could try deploying.

But I guess the by_be-tarask table must be created manually before it is ingested into monuments_all at the next run?

JeanFred changed the task status from Open to Stalled. Apr 3 2016, 9:35 PM

Waiting to see if this is fixed.

I don't think this one worked. Or did we never deploy it (as in recreating the table)?

Ok. So the table exists and the config looks right.

I'll try running a country specific database update manually for this one once https://gerrit.wikimedia.org/r/#/c/287202/ has been deployed and this is possible again. Hopefully any error message thrown will tell me more.

Ok. So the problem is that we are running a pywikibot install from before the rename. Updating to a later one.

Keep an eye on this link to see if it worked, post harvest.

edit: List harvest worked, should get merged into monuments_all tomorrow and then be accessible on the above link.

The table now gets populated, but matchWikiprojectLink() fails since "-" is not a word character and so is not caught by the \w+ regexp. (T134728)
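
To illustrate the regexp issue, here is a minimal sketch. The patterns, the helper name lang_code and the example URL are hypothetical and are not the actual matchWikiprojectLink() code; they only show why \w+ stops at the hyphen in be-tarask and how a character class that includes "-" would behave.

```python
import re

# Hypothetical patterns for illustration only; NOT the real ErfgoedBot regexp.
WORD_ONLY = re.compile(r'^https?://(\w+)\.wikipedia\.org/')
WITH_HYPHEN = re.compile(r'^https?://([\w-]+)\.wikipedia\.org/')


def lang_code(pattern, url):
    """Return the language code captured by pattern, or None if no match."""
    match = pattern.match(url)
    return match.group(1) if match else None


# Example URLs are assumptions made up for this sketch.
hyphenated = 'https://be-tarask.wikipedia.org/wiki/Example'
plain = 'https://de.wikipedia.org/wiki/Example'

print(lang_code(WORD_ONLY, hyphenated))    # \w+ cannot cross the "-"
print(lang_code(WITH_HYPHEN, hyphenated))  # [\w-]+ accepts the hyphen
print(lang_code(WORD_ONLY, plain))         # codes without a hyphen still match
```

Widening the character class is only one possible fix; the actual change is tracked in T134728.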

There might also be an issue with the source field still being too long, but that is hard to verify until the above is fixed.

Looks like the source field still isn't long enough. At this point we can either increase the size further or give up on storing the urlencoded text (I could suggest a patch for pywikibot) and instead do urlencoding on the fly where necessary. Maybe a different task though.
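
As a rough illustration of why urlencoded text outgrows the field, here is a small sketch using Python's standard urllib.parse. The sample title and the helper name encoded_length are made up for this example, and nothing here reflects the actual column size or the harvester's code. Each Cyrillic letter is two bytes in UTF-8, and each byte becomes a three-character %XX escape, so the stored text is six times longer than the title itself.

```python
from urllib.parse import quote, unquote


def encoded_length(title):
    """Return (raw length, urlencoded length) for a page title."""
    return len(title), len(quote(title))


# Sample title chosen for illustration only.
title = 'Мінск'
raw_len, enc_len = encoded_length(title)
print(raw_len, enc_len)

# Encoding on the fly: store the raw title, quote it only when a URL is built.
stored = title
url_part = quote(stored)
print(unquote(url_part) == stored)
```

Storing the unencoded title and quoting only when a URL is needed keeps the stored value short, which is the alternative suggested above.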

Closing this one and reopening T112460 to deal with the source field issue.