The Word class in Wikidiff2 has a "body" part, which is supposed to contain the letters of the word, and a "suffix" part, which is supposed to contain the spaces following the word. However, a bug in TextUtil::explodeWords(), dating from 2010, causes each space to always be a separate word. As a result, the suffix part is only populated when the body part is empty.
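To make the difference concrete, here is a minimal sketch of the two splitting behaviours. It is not the Wikidiff2 code: the `Word` struct and the functions `explodeSeparateSpaces()` and `explodeJoinedSpaces()` are hypothetical names made up for illustration, and the sketch only treats the ASCII space as a separator, which is presumably much simpler than what TextUtil::explodeWords() really does.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for the Word class: body holds the letters,
// suffix holds the spaces that follow them.
struct Word {
    std::string body;
    std::string suffix;
};

// Behaviour described above as current: every space becomes its own Word
// with an empty body, so a suffix is only ever set when the body is empty.
std::vector<Word> explodeSeparateSpaces(const std::string& text) {
    std::vector<Word> words;
    std::size_t i = 0;
    while (i < text.size()) {
        if (text[i] == ' ') {
            words.push_back({"", " "});
            ++i;
        } else {
            std::size_t start = i;
            while (i < text.size() && text[i] != ' ') ++i;
            words.push_back({text.substr(start, i - start), ""});
        }
    }
    return words;
}

// Behaviour the body/suffix design suggests: the spaces that follow a word
// are stored in that word's suffix instead of becoming separate words.
std::vector<Word> explodeJoinedSpaces(const std::string& text) {
    std::vector<Word> words;
    std::size_t i = 0;
    while (i < text.size()) {
        std::size_t bodyStart = i;
        while (i < text.size() && text[i] != ' ') ++i;
        std::size_t suffixStart = i;
        while (i < text.size() && text[i] == ' ') ++i;
        words.push_back({text.substr(bodyStart, suffixStart - bodyStart),
                         text.substr(suffixStart, i - suffixStart)});
    }
    return words;
}
```

In this sketch, "foo bar" yields three words under the first function (the middle one with an empty body) and two under the second. Fewer words for the diff algorithm to compare is one plausible reason for the speedup, although that is an assumption rather than something measured here.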
Fixing the bug improves performance on English text by 27% according to bench.php. However, it will have user-visible impacts, on which I would like to seek input.
Space insertion currently:
Space insertion with spaces joined to preceding words:
So the questions are:
- What should space insertion look like?
- Does it matter enough to justify giving up the 27% performance improvement?
If the answer is that it should continue to work as it has since 2010, then we should get rid of the "suffix" concept in the Word class and clean up TextUtil::explodeWords() accordingly.