
Anomalies when an AWB Find+Replace rule contains \r\n
Open, Needs Triage, Public

Description

Odd things happen when a Find+Replace rule contains the string "\r\n". Try these steps:

(1) Reset to original default settings
(2) Add an F+R rule: find "foo\r\n", replace "bar", regex ticked; click OK to dismiss the F+R dialog
(3) Redisplay the F+R dialog and click "Cancel" to dismiss it
(4) Redisplay the F+R dialog and carefully step the text cursor through the "find" string.

The "Find" string has been damaged; there is now an invisible control character between the "foo" and the "\r\n".

(5) Click "OK" to dismiss the F+R dialog
(6) Save the settings
(7) Load the settings
(8) Redisplay the F+R dialog

The "Find" string is further damaged; it now reads "foo\r\n\r\n".

What actually happened is that a couple of weeks back I edited an F+R regex using the regex tester and must have accidentally included a newline when pasting in part of a regex. Thanks to this doubling effect, by the time I realised what was wrong, my settings file was padded with 20 megabytes of extra newlines!

AWB 5.8.6.0 SVN 12004
Internet Explorer version: 11.0.10586.494
.NET version: 2.0.50727.8689
Windows version: 6.2

Event Timeline

Rev 12075 stops the newline duplication when the F&R dialog is closed in cancel mode (the Cancel button, Escape, etc.).

Getting exactly the right behaviour for newlines is problematic. First, the article page text we receive from the MediaWiki API uses \n as the newline, so we need to use that when applying the find. Second, when we save settings as XML, the file is in Windows format and so uses \r\n for newlines, so we have to convert back. Third, non-regex find & replace entries are actually run as regexes with appropriate escaping of characters; however, that escaping causes issues when the user enters \n to mean newline, so we have to undo that bit of escaping. I also need to bear in mind that changing the behaviour of loading or saving F&R settings could break users' existing settings files (12075 only changed the Cancel behaviour, not the loading or saving of settings or how the F&R rules are run).
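
To make the three constraints above concrete, here is a minimal sketch in Python. It is not AWB's actual code (AWB is a .NET application), and the function names are hypothetical, chosen only for illustration. It assumes that newline conversion should be idempotent, so that repeated save/load cycles cannot multiply newlines, and that a literal find string is escaped first and then has the escaping of a typed \n undone.

```python
import re


def to_page_newlines(text: str) -> str:
    # Hypothetical helper: normalise to "\n", the form used by page text
    # from the MediaWiki API. Safe to apply more than once.
    return text.replace("\r\n", "\n").replace("\r", "\n")


def to_settings_newlines(text: str) -> str:
    # Hypothetical helper: produce Windows-style "\r\n" for the settings
    # file. Normalising first keeps the conversion idempotent, so an
    # existing "\r\n" never becomes "\r\r\n" or "\r\n\r\n".
    return to_page_newlines(text).replace("\n", "\r\n")


def literal_find_to_regex(find: str) -> str:
    # Hypothetical helper: run a non-regex find string as a regex.
    # re.escape turns the typed backslash in "\n" into two backslashes,
    # which would match a literal backslash followed by "n". Undo that one
    # piece of escaping so the typed "\n" still means a newline.
    escaped = re.escape(find)
    return escaped.replace("\\\\n", "\\n")


if __name__ == "__main__":
    once = to_settings_newlines("foo\nbar")
    twice = to_settings_newlines(once)
    print(once == twice)  # the conversion does not grow on repetition

    pattern = literal_find_to_regex(r"foo.\n")
    print(re.search(pattern, "foo.\nbar") is not None)
```

One known limitation of this sketch: a user who types two backslashes followed by n, intending a literal backslash and the letter n, is also affected by the un-escaping step, so a real implementation would need a rule for that case.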

That's a good enough fix for me, as my newline was an accident.