Post-page move redirects contain two newlines below redirect code
Closed, Resolved (Public)

Description

When a page is moved, the redirect left behind at the old title ends with two newlines.

For example, after moving the page "PSY (rapper)" to "PSY (entertainer)", the resulting redirect page contains:

"#REDIRECT [[PSY (entertainer)]]

"


"#REDIRECT [[PSY (entertainer)]]

<- 2 newlines

"


Version: 1.21.x
Severity: normal
See Also:
https://bugzilla.wikimedia.org/show_bug.cgi?id=43281

Details

Reference
bz42616

Event Timeline

bzimport raised the priority of this task to High. Nov 22 2014, 1:10 AM
bzimport set Reference to bz42616.

By "2 newlines" did you mean 2 whitespaces in one line?
I do not see two new lines there... :)

Which exact wiki is this about? English Wikipedia?

I've confirmed this bug. It's kind of nasty, as any subsequent save of the page will lead to diffs such as this: https://test.wikipedia.org/w/index.php?title=Some_page_title&diff=prev&oldid=152897 (MediaWiki typically strips trailing newlines in the edit area).

When a page is moved using Special:MovePage, the resulting redirect looks as described above:

"#REDIRECT [[Possible bug below redirect code]]

"

This is "#REDIRECT [[Possible bug below redirect code]]\n\n" when it should be "#REDIRECT [[Possible bug below redirect code]]\n".

Somehow the move page form is bypassing MediaWiki's usual rule of not allowing trailing newlines. Bumping the priority of this, as it's kind of corrupting any redirect that's created as the result of a page move (probably thousands of redirects by now).
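
To make the difference concrete, here is a minimal Python sketch of the normalization described above: any run of trailing newlines is collapsed to exactly one. The function names are hypothetical and this is not MediaWiki's actual code (MediaWiki is written in PHP); it only contrasts the observed redirect text with the expected one.

```python
def normalize_trailing_newlines(text: str) -> str:
    """Return text with any run of trailing newlines collapsed to exactly one."""
    return text.rstrip("\n") + "\n"


def count_trailing_newlines(text: str) -> int:
    """Count how many newline characters end the text."""
    return len(text) - len(text.rstrip("\n"))


# Redirect text as created by the page move (two trailing newlines).
observed = "#REDIRECT [[Possible bug below redirect code]]\n\n"
# Redirect text as it should have been stored (one trailing newline).
expected = "#REDIRECT [[Possible bug below redirect code]]\n"

assert count_trailing_newlines(observed) == 2
assert normalize_trailing_newlines(observed) == expected
```

Running the same function on already-correct text leaves it unchanged, so it would be safe to apply to every redirect under this assumption.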

(In reply to comment #3)

Gerrit change #36705

Daniel Kinzler merged this on December 12. I suppose this bug can now be marked as resolved/fixed.

(In reply to comment #2)

I've confirmed this bug. It's kind of nasty, as any subsequent save of the
page will lead to diffs such as this:
https://test.wikipedia.org/w/index.php?title=Some_page_title&diff=prev&oldid=152897
(MediaWiki typically strips trailing newlines in the edit area).

I filed a separate bug for the diff issue: bug 42669.

It seems from testing on the English Wikipedia that while a large number of these redirects containing trailing newlines came from this recent ContentHandler bug (solved by Gerrit changeset 36705), there have been older bugs in the code that have caused similarly goofy redirects. I'm seeing some from 2005 and some from 2002 (related to the Conversion script, perhaps).

I'm also noticing byte count irregularities. I believe the maintenance script that populated byte counts for each revision miscalculated in some cases. This probably needs further investigation, as it may have ripple effects on data integrity checks, particularly the SHA1 hashes.
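
As a rough illustration of why a stray trailing newline matters for stored lengths and hashes, the Python sketch below compares the UTF-8 byte length and SHA-1 digest of the two redirect texts quoted earlier in this task. The helper names are hypothetical, and the hex digest is used only for illustration; how MediaWiki actually computes and encodes its stored values is not shown here.

```python
import hashlib


def byte_length(text: str) -> int:
    """Length of the text in bytes when encoded as UTF-8."""
    return len(text.encode("utf-8"))


def sha1_hex(text: str) -> str:
    """Hex-encoded SHA-1 digest of the UTF-8 encoded text."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


with_extra_newline = "#REDIRECT [[Possible bug below redirect code]]\n\n"
normalized = "#REDIRECT [[Possible bug below redirect code]]\n"

# Removing one trailing newline shrinks the text by exactly one byte.
length_diff = byte_length(with_extra_newline) - byte_length(normalized)
# Any change to the text, even a single newline, yields a different digest.
hashes_differ = sha1_hex(with_extra_newline) != sha1_hex(normalized)

print(length_diff, hashes_differ)
```

A cleanup edit that only removes surplus newlines should therefore change the recorded length by a small, predictable amount, which is presumably why the queries pasted in this task look for edits whose length difference is not 1, 2, or 3.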

Just pasting this here so I don't lose it later:

mysql> select rc_title, rc_old_len-rc_new_len as diff from recentchanges where rc_user_text = 'MZMcBride' and rc_comment = '[[bugzilla:42616]]' and rc_old_len-rc_new_len not in (1,2,3) order by rc_timestamp desc limit 100;
+--------------------------+------+
| rc_title                 | diff |
+--------------------------+------+
| Biopolitical             |   21 |
| Mounted_skill-at-arms    | -564 |
| Mounted_skill_at_arms    | -564 |
| Equestrian_skill_at_arms | -564 |
| Taimur_bin_Faisal        |    0 |
| G.U._Pope                |   -1 |
| Cross_examination        |    0 |
| Useless_languages        |    4 |
| Great_Lake               |    6 |
| Norrie-Warburg_syndrome  |    0 |
| Peter_F._Allgeier        |   -3 |
| Astropetrology           |   -2 |
| DSM-III                  |    4 |
| TinaArena                |    4 |
| Langobardi               |    6 |
| Anselm_of_Bec            |    9 |
| Ben_Stiller_Show         |    4 |
| Saint_Adalbert_of_Prague |    4 |
| Imaginary_numbers        |    6 |
+--------------------------+------+
19 rows in set (8.37 sec)

mysql> select rc_title, rc_old_len-rc_new_len as diff from recentchanges where rc_user_text = 'MZMcBride' and rc_comment = '[[bugzilla:42616]]' and rc_old_len-rc_new_len not in (1,2,3) order by rc_timestamp desc limit 100;
+-----------------------------------------+--------+
| rc_title                                | diff   |
+-----------------------------------------+--------+
| Handley-Page_Halifax                    |      0 |
| Jackie_Joyner-Kersey                    |   -417 |
| 1_John                                  |      6 |
| 2_Peter                                 |      4 |
| Serotonin-specific_reuptake_inhibitors  |      4 |
| Victoria_park_hong_kong                 |     11 |
| Eight_Painters_of_Nanjing               |     -1 |
| Republican_Party_of_Hawaii              |     38 |
| Photo_electric_effect                   |      4 |
| GC&SU                                   |     19 |
| U.S._Northern_Command                   |      9 |
| Vnr                                     |      0 |
| French_Guinea/Economy                   |      6 |
| Matt_Le_Blanc                           |   -135 |
| Uralskiy_Khrebet                        |      4 |
| Nawaf_Al_Hazmi                          |      4 |
| ManiacMansion                           |      4 |
| AlexanderDugin                          |     -1 |
| Telematic                               |  -2114 |
| Noronic                                 |     24 |
| TotalOrderedSet                         |      4 |
| SetTopBox                               |      4 |
| LieGroup                                |      6 |
| Fundamtenal                             |      4 |
| Piemonte                                |      4 |
| Verlan_language                         |      6 |
| Newspeak_language                       |      4 |
| Canis_latrans                           |      6 |
| Hippy                                   |      4 |
| Lycopersicum_lycopersicum               |      4 |
| Loglan_language                         |      4 |
| Lojban_language                         |      4 |
| Oxidized_assault                        |  -1150 |
| Electronic_configuration                |      4 |
| 1_Peter                                 |      4 |
| Gaseous_phase                           |      4 |
| Gaseous_state                           |      4 |
| D-Von_Dudley                            |      4 |
| Dead_or_Alive:_Extreme_Beach_Volleyball |      0 |
| Julianna_Mauriello                      |      5 |
| Suprasegmental_feature                  |     31 |
| Cosmothiesm                             | -10191 |
| Quantuum_chemistry                      |      4 |
| Mononoke_Hime                           |      4 |
| Mood_stabilisers                        |      6 |
| JapaneseLanguage                        |      4 |
| Sahara_(movie)                          |      4 |
| Sample_(music)                          |     29 |
| Richard_Matthew_Stallman                |      0 |
| R.A.Wilson                              |      4 |
| Ubbi_dubbi_language                     |      6 |
| Tom_Paine                               |      4 |
| Niedersachsen                           |      4 |
| Koenigsberger_klopse                    |      0 |
| Brooklyn_Trolley_Museum                 |      4 |
| Westminster_chime                       |      0 |
| Olmsted_Falls                           |     -1 |
| PLAUTIA_URGANILLA                       |     -2 |
| Dermabond                               |      4 |
| SherlockHolmes                          |      4 |
| City_of_New_York                        |     23 |
| Jean_Gray                               |      4 |
| Bunnyhopping                            |     -9 |
| University_of_Algarve                   |      0 |
| Cuon_alpinus                            |      4 |
| Ludwig_von_Koechel                      |     -4 |
| Christian_escathology                   |      4 |
| National_Wrestling_Alliance_UK          |      4 |
| Rhincodon_typus                         |      0 |
| Paul_W._Bryant                          |     10 |
| Biopolitical                            |     21 |
| Mounted_skill-at-arms                   |   -564 |
| Mounted_skill_at_arms                   |   -564 |
| Equestrian_skill_at_arms                |   -564 |
| Taimur_bin_Faisal                       |      0 |
| G.U._Pope                               |     -1 |
| Cross_examination                       |      0 |
| Useless_languages                       |      4 |
| Great_Lake                              |      6 |
| Peter_F._Allgeier                       |     -3 |
| DSM-III                                 |      4 |
| TinaArena                               |      4 |
| Langobardi                              |      6 |
| Anselm_of_Bec                           |      9 |
| Ben_Stiller_Show                        |      4 |
| Saint_Adalbert_of_Prague                |      4 |
| Imaginary_numbers                       |      6 |
+-----------------------------------------+--------+
87 rows in set (42.77 sec)