Insufficient stripping of span tags from copy-and-paste in Safari
From @TrevorParscal's report on T78540#1157939:

Reproduced with Safari 8.0.4 on Mac OS X 10.10.2.

  1. Select an internal link, a space and some plain text
  2. Copy
  3. Paste
  4. Click save
  5. Click preview changes
  6. Notice that there's an extra span around the space and plain text in the pasted content


Wow that's weird. I wonder if this is related to copy/paste in any way?

This has suddenly started happening all over the place. It's also adding language codes. It might be related to copying; I've definitely seen this in Safari 6.2 on Mac OS 10.8.5. One diff is a fairly clean example, another adds a left-to-right code, and an earlier one adds many span tags. Based on the content, it might be adding them to copy-pasted content. This bit in particular:

<span lang="FR"><span lang="FR">[1]</span></span>

looks rather like the editor copied a citation from the en.wp article and pasted it into the fr.wp article (and then translated the text).

When text is copied between Wikipedias in different languages, it seems perfectly fine to me that it is wrapped in a <span> that states the original language. If the user copy/pastes and then erases the text, they should, theoretically (and practically, since it's marked), erase the language annotation too.

I can't manage to reproduce the overlapping span tags (the double spans) that used to appear back in August. The current ones are more or less what we want to see, or are the result of the user copying and pasting in pieces.

The potential bugs I see here are:

  1. If the user did not see an indication that these copy/paste language spans are language annotations, that's a bug
  2. This line seems to be a bug, since it's a doubly wrapped language span that shouldn't happen even if it resulted from a copy/paste from another language:
<span lang="FR"><span lang="FR">[1]</span></span>
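
For illustration only, here is a minimal Python sketch of how a doubly wrapped span like the one above could be collapsed. This is not VisualEditor's or Parsoid's code: the function name `collapse_duplicate_spans` is hypothetical, and the sketch assumes the pasted fragment is well-formed enough to parse as XML, which real clipboard HTML often is not.

```python
import xml.etree.ElementTree as ET


def _is_redundant_wrapper(outer):
    """True if `outer` is a span whose only content is a span with identical attributes."""
    if outer.tag != "span" or len(outer) != 1:
        return False
    inner = outer[0]
    return (
        inner.tag == "span"
        and inner.attrib == outer.attrib
        and not outer.text   # no text before the inner span
        and not inner.tail   # no text after the inner span
    )


def _collapse(parent):
    """Replace redundant wrapper spans under `parent` with their inner span."""
    for i in range(len(parent)):
        child = parent[i]
        while _is_redundant_wrapper(child):
            inner = child[0]
            inner.tail = child.tail  # keep any text that followed the wrapper
            parent[i] = inner
            child = inner
        _collapse(child)


def collapse_duplicate_spans(fragment):
    """Hypothetical helper: collapse nested spans that repeat the same attributes.

    Assumes `fragment` is well-formed XML once wrapped in a single root element.
    """
    root = ET.fromstring(f"<root>{fragment}</root>")
    _collapse(root)
    return (root.text or "") + "".join(
        ET.tostring(child, encoding="unicode") for child in root
    )


print(collapse_duplicate_spans('<span lang="FR"><span lang="FR">[1]</span></span>'))
# prints: <span lang="FR">[1]</span>
```

Spans whose attributes differ, such as a lang="EN" span inside a lang="FR" span, are left alone, since the inner annotation may be meaningful.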

By the way, it also makes perfect sense to add directionality to a language block, especially if that language block is being edited. That's the point of language annotations, and it seems very convenient that this happens automatically on copy/paste. It helps not only the editor, but also the page in read mode, as well as indexing, accessibility, etc. That part I wouldn't call a bug unless there's something I'm completely missing here.

@Mooeypoo What you seem to miss in your two comments is that the language code put in the lang attributes doesn't seem to be the original language, but the language of the current wiki...
All the examples above show lang="EN" added to enwiki and lang="FR" added to frwiki: this is totally useless, and if it's due to a copy from a wiki in another language, it's just plain wrong.

Same for the directionality: default directionality on frwiki is "ltr", so adding a dir="ltr" is useless.
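
To make the point about redundant annotations concrete, here is a rough Python sketch of a cleanup pass that drops lang and dir attributes matching the wiki's own defaults and unwraps spans left with no attributes. It is only an illustration, not how VisualEditor or Parsoid handle this: `strip_default_annotations`, `wiki_lang` and `wiki_dir` are hypothetical names, and the input is assumed to be well-formed XML. It does not detect wrong annotations (a lang value that differs from the wiki language but does not match the text).

```python
import xml.etree.ElementTree as ET


def _append_text(parent, index, text):
    """Attach `text` just before position `index` among parent's children."""
    if not text:
        return
    if index == 0:
        parent.text = (parent.text or "") + text
    else:
        prev = parent[index - 1]
        prev.tail = (prev.tail or "") + text


def _unwrap(parent, index):
    """Replace the child at `index` with its own text, children and tail."""
    child = parent[index]
    grandchildren = list(child)
    del parent[index]
    _append_text(parent, index, child.text)
    for offset, grandchild in enumerate(grandchildren):
        parent.insert(index + offset, grandchild)
    _append_text(parent, index + len(grandchildren), child.tail)


def _clean(parent, wiki_lang, wiki_dir):
    i = 0
    while i < len(parent):
        child = parent[i]
        _clean(child, wiki_lang, wiki_dir)  # clean descendants first
        removed = False
        if child.tag == "span":
            if child.get("lang", "").lower() == wiki_lang:
                del child.attrib["lang"]
                removed = True
            if child.get("dir", "").lower() == wiki_dir:
                del child.attrib["dir"]
                removed = True
        if removed and not child.attrib:
            promoted = len(child)
            _unwrap(parent, i)
            i += promoted  # promoted children were already cleaned
        else:
            i += 1


def strip_default_annotations(fragment, wiki_lang, wiki_dir="ltr"):
    """Hypothetical helper: remove span annotations that only restate wiki defaults."""
    root = ET.fromstring(f"<root>{fragment}</root>")
    _clean(root, wiki_lang.lower(), wiki_dir.lower())
    return (root.text or "") + "".join(
        ET.tostring(child, encoding="unicode") for child in root
    )


print(strip_default_annotations('Voir <span lang="FR">la source</span> ici', "fr"))
# prints: Voir la source ici
```

A span such as lang="EN-US" pasted into frwiki is kept by this sketch, because the annotation differs from the wiki default and may be legitimate.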

@NicoV, you're right. Apologies, I missed that. The <span> languages shouldn't be added from the same language.

Here's another example, with no language tags:

@ssastry thinks this is related to cut-and-paste, and that there used to be bogus ID attributes in the <span>s which were removed by Parsoid (see ).

In case it is useful to VE to debug, open and search for "html2wt" -- you will find logged warnings (1 warning per span found => multiple warnings per page in some cases).

Does this ticket cover all insertions of span tags? Is it useful to provide more diffs? (for example )

Change 200299 had a related patch set uploaded (by Esanders):
Simplify getClipboardHash

Change 200299 merged by jenkins-bot:
Simplify getClipboardHash

Still seeing those span tags in the wild...

Maybe I should look at the Version before commenting though?

Doesn't seem to be entirely fixed, we still get <span lang="EN-US"> on fr wikipedia, for text that is obviously not in English.

Reopening because this doesn't look fixed. On cywiki there are still lang=CY span tags. The user shouldn't have copy/pasted wikitext, but span tags indicating that the content is in the same language as the wiki it's being pasted into do not seem useful.

And not only is it almost always useless, but it can also be totally wrong...
In this edit, span tags were added with lang="FR" when the text is clearly not in French.

Restoring previous priority "high" - Maintainers will take a look at this soon and are aware of this problem, but it is up to them to judge priority in comparison with other open urgent tasks (plus this got reopened on Thursday and there's been a weekend since then).
Sorry for the inconvenience caused by this. :-/

Examples of this not working have been posted both here and at enwiki since the 14th, with no answer or acknowledgement in either place since then (Tuesday last week; reopening it on Friday was already a consequence of no one answering).
I raised the priority so that someone will do something instead of ignoring the problem.

Moved new bug reports to T96589: More <span> corruption (unknown source). The bug here was fixed, this appears to be a different source.