Author: alefzet
Description:
[[:en:Karakalpak language]] uses digraphs with apostrophes like A', N', O', U'
[[:en:Uzbek language]] uses digraphs with a grave accent (`), such as G` and O`, and treats the apostrophe (') as a separate letter.
See [[:en:Alphabets derived from the Latin]]
r36253 introduced $linkTrail = '/^(\'?\p{L&}+)(.*)$/usD'; which works well for many languages, but not for Karakalpak and Uzbek.
[[a'bc]]de becomes <u>a'bcde</u>
[[abc]]'de => <u>abc'de</u>
[[abc]]d'e => <u>abcd</u>'e rather than <u>abcd'e</u>
[[a`bc]]de becomes <u>a`bcde</u>
[[abc]]`de => <u>abc</u>`de rather than <u>abc`de</u>
[[abc]]d`e => <u>abcd</u>`e rather than <u>abcd`e</u>
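The behaviour in these examples can be reproduced outside MediaWiki with a small Python sketch. This is only an approximation: Python's re module has no \p{L&}, so [^\W\d_] (roughly "any letter") is assumed as a stand-in, and the names LINK_TRAIL and render are illustrative, not MediaWiki code.

```python
import re

# Approximation of the PHP pattern '/^(\'?\p{L&}+)(.*)$/usD'.
# Assumption: [^\W\d_] stands in for \p{L&}, which Python's re lacks.
LINK_TRAIL = re.compile(r"^('?[^\W\d_]+)(.*)$", re.DOTALL)

def render(link_text, following):
    """Return (linked text, unlinked remainder) for [[link_text]]following."""
    m = LINK_TRAIL.match(following)
    if m:
        # The matched trail is pulled into the link.
        return link_text + m.group(1), m.group(2)
    # No trail matched: the link ends at the closing brackets.
    return link_text, following

for text, rest in [("abc", "'de"), ("abc", "d'e"), ("abc", "`de"), ("abc", "d`e")]:
    linked, tail = render(text, rest)
    print(f"[[{text}]]{rest} -> <u>{linked}</u>{tail}")
# [[abc]]'de -> <u>abc'de</u>
# [[abc]]d'e -> <u>abcd</u>'e
# [[abc]]`de -> <u>abc</u>`de
# [[abc]]d`e -> <u>abcd</u>`e
```

The apostrophe is only accepted as the very first character of the trail, and the grave accent is never accepted, which is why the last three cases stop early.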
I am not an expert on regular expressions. What would the correct regex be?
Perhaps a new variable is needed that specifies language-dependent punctuation characters to be treated as letters or letter elements?
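One possible shape for such a variable, sketched in Python rather than PHP so it can be run standalone. Everything here is an assumption for illustration: the table EXTRA_LETTER_CHARS, the helpers build_link_trail and split_trail, and the use of [^\W\d_] in place of \p{L&} are hypothetical and not part of MediaWiki.

```python
import re

LETTER = r"[^\W\d_]"  # assumed stand-in for \p{L&}

# Hypothetical per-language table of punctuation treated as letter elements.
EXTRA_LETTER_CHARS = {"kaa": "'", "uz": "'`"}

def build_link_trail(lang):
    """Build a link-trail pattern, widening it for languages in the table."""
    extra = EXTRA_LETTER_CHARS.get(lang, "")
    if not extra:
        # Default behaviour: optional leading apostrophe, then letters.
        return re.compile(r"^('?" + LETTER + r"+)(.*)$", re.DOTALL)
    # Letters and the extra characters may appear in any order.
    unit = "(?:" + LETTER + "|[" + re.escape(extra) + "])"
    return re.compile("^(" + unit + "+)(.*)$", re.DOTALL)

def split_trail(lang, following):
    """Return (trail pulled into the link, remaining text)."""
    m = build_link_trail(lang).match(following)
    return (m.group(1), m.group(2)) if m else ("", following)

print(split_trail("kaa", "d'e f"))  # ("d'e", ' f')
print(split_trail("uz", "d`e."))    # ('d`e', '.')
print(split_trail("en", "d'e"))     # ('d', "'e")
```

A known trade-off of this sketch: for the listed languages a lone apostrophe or grave accent directly after a link would also be pulled into it, so a real fix might need to require at least one letter in the trail.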
Version: 1.13.x
Severity: enhancement