
New page created, db lag, result in double entries in categories
Closed, Resolved (Public)

Description

Author: anthony

Description:
After creating the page

http://en.wikipedia.org/wiki/Singles_93-03

I received a "page does not exist, please create one" message, so I tried
saving it again, which resulted in another "page does not exist". Probably
because of db lag, the page had been saved anyway while I was looking for what
went wrong. So that didn't seem to be buggy.

However, in the meantime the page -- which had been saved twice because of db
lag -- had put itself into its categories twice, see e.g.

http://en.wikipedia.org/wiki/Category:The_Chemical_Brothers_albums
http://en.wikipedia.org/wiki/Category:2003_albums
...

Removing the categories from the page didn't help, as only one of the two
entries was removed from each category. Bringing the categories back to the
article also brought the number of entries in the categories back to two.

Thanks in advance,

[[User:Aliekens]]


Version: unspecified
Severity: major

Details

Reference
bz1239

Revisions and Commits

Event Timeline

bzimport raised the priority of this task to Medium. Nov 21 2014, 8:08 PM
bzimport set Reference to bz1239.
bzimport added a subscriber: Unknown Object (MLST).

plugwash wrote:

So we have an orphaned category entry and no easy way to remove it (the only
way I can think of is probably for a developer to use SQL directly).

I don't know how Wikipedia stores pages, so I don't know if we also have two
page records in there for the same page name.

Maybe the database could be made to enforce no duplicates on entries in a
category.

I've marked this as major because it's database corruption that isn't easy to
undo.

jeluf wrote:

I removed the duplicate entry from the database.

zigger wrote:

*** Bug 1285 has been marked as a duplicate of this bug. ***

gal86 wrote:

Category "Логика" in ru.wikipedia.org
(http://ru.wikipedia.org/wiki/Category:%D0%9B%D0%BE%D0%B3%D0%B8%D0%BA%D0%B0)
contains two identical entries "Парадокс"
(http://ru.wikipedia.org/wiki/%D0%9F%D0%B0%D1%80%D0%B0%D0%B4%D0%BE%D0%BA%D1%81).

Only one should be there.

gal86 wrote:

Category "X86" in ru.wikipedia.org
(http://ru.wikipedia.org/wiki/Category:X86)
contains two identical entries "Am486 SX2"
(http://ru.wikipedia.org/wiki/Am486_SX2).
Only one should be there.

bugzilla_wikipedia_org.to.jamesd wrote:

Do not save again when you see the page not there after creating it. If you
didn't get a database error, the save worked. Wait 30 seconds, reload, and
you'll probably see the new page.

The database setup is being changed to use a unique index which will make it
impossible to create duplicates. You'll get a database error on the second and
later saves instead. Part of the process of creating that will remove all of the
existing duplicates.
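
The unique-index approach described above can be sketched roughly as follows. This is only an illustration under stated assumptions: it uses an in-memory SQLite database rather than the production database, and the table name `categorylinks`, the columns `cl_from` and `cl_to`, and the index name `cl_from_to` are chosen for the example and are not taken from this report. It first removes the existing duplicates, then creates the unique index, and finally shows that a repeated save is rejected with a database error instead of adding a second category entry.

```python
import sqlite3

# Assumption: SQLite stands in for the real database; table, column and
# index names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE categorylinks (cl_from INTEGER NOT NULL, cl_to TEXT NOT NULL)"
)

# Simulate the double save: the same page (id 1) lands in each category twice.
rows = [
    (1, "2003_albums"),
    (1, "2003_albums"),
    (1, "The_Chemical_Brothers_albums"),
    (1, "The_Chemical_Brothers_albums"),
]
conn.executemany("INSERT INTO categorylinks VALUES (?, ?)", rows)

# Step 1: remove existing duplicates, keeping one row per (page, category).
conn.execute(
    """
    DELETE FROM categorylinks
    WHERE rowid NOT IN (
        SELECT MIN(rowid) FROM categorylinks GROUP BY cl_from, cl_to
    )
    """
)

# Step 2: a unique index makes further duplicates impossible.
conn.execute("CREATE UNIQUE INDEX cl_from_to ON categorylinks (cl_from, cl_to)")


def save_again() -> bool:
    """Try to insert the same category link again; report whether it was accepted."""
    try:
        conn.execute("INSERT INTO categorylinks VALUES (?, ?)", (1, "2003_albums"))
        return True
    except sqlite3.IntegrityError:
        # The second save now fails with a database error.
        return False


remaining = conn.execute("SELECT COUNT(*) FROM categorylinks").fetchone()[0]
second_save_accepted = save_again()
```

Creating the index would fail while duplicates are still present, which is why the cleanup has to happen as part of the same process.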

zigger wrote:

*** Bug 1320 has been marked as a duplicate of this bug. ***

*** Bug 2080 has been marked as a duplicate of this bug. ***

zigger wrote:

*** Bug 2179 has been marked as a duplicate of this bug. ***

gangleri wrote:

see http://bugzilla.wikimedia.org/show_bug.cgi?id=2382#c1
bug 2382: "existence of duplicate records as a result of bug 1202"
bug 2388: "handling: add a "purge" link to [[MediaWiki:Noarticletext]]"

1.5 quite firmly fixes this index on the conversion.
We've already fixed most of the Wikimedia sites manually as well.

Resolving FIXED again.

epriestley added a commit: Unknown Object (Diffusion Commit). Mar 4 2015, 8:21 AM