Copy-pasting headings from Google Doc to VisualEditor messes up the hierarchy of the headings
Open, Needs Triage · Public · BUG REPORT

Description

Steps to replicate the issue (include links if applicable):

I made a Google Doc with reproducible steps; you can view it (and copy/paste from it) here: https://docs.google.com/document/d/1nU2rYurrkOQ9HhEmJW_rEwXpDjPxmJIwVwoYkefJEHM/edit#heading=h.418mr829onl0 (Problem #2)

The steps are:

  • Create a Google Doc with headings
  • Copy the entire document and paste it into a VisualEditor page

What happens?:
All headings are now a level up, starting with "Page Title" (which shouldn't really be used in articles multiple times).

What should have happened instead?:
The transformation should map "Heading 1" in Google Docs to "Heading" in VE, "Heading 2" to "Sub-heading 1", etc. It seems like an off-by-one error ;)

Event Timeline

In Google Docs, Heading 1 is an <h1>, which in VE is =Page title= level. For external pastes the best we can do is use the HTML tags we are provided.

It may be possible to detect if the paste came from Google Docs and provide a custom set of transformations (<hN> -> <hN+1>), but that would rely on there being some consistent markup in Google Docs pastes.
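To make the proposed <hN> -> <hN+1> transformation concrete, here is a minimal sketch. `demoteHeadings` is a hypothetical name, not an existing VE function, and the regex approach is an assumption chosen for brevity; real paste handling would more likely operate on the parsed DOM.

```typescript
// Hypothetical helper (not part of VisualEditor): demote every <hN> tag in an
// HTML string to <hN+1>, clamping at <h6>. Attributes on the tag are preserved.
function demoteHeadings(html: string): string {
  // Matches "<h1".."<h6" and "</h1".."</h6" only when followed by whitespace
  // or ">", so tags such as <hr> or <header> are left alone.
  return html.replace(
    /<(\/?)h([1-6])(?=[\s>])/gi,
    (_match: string, slash: string, level: string): string => {
      const next = Math.min(Number(level) + 1, 6);
      return `<${slash}h${next}`;
    }
  );
}

const pasted = '<h1 dir="ltr">Intro</h1><p>Text</p><h2>Details</h2>';
const demoted = demoteHeadings(pasted);
// demoted === '<h2 dir="ltr">Intro</h2><p>Text</p><h3>Details</h3>'
```

Because the level is clamped, an existing <h6> stays an <h6> rather than becoming an invalid <h7>.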

> It may be possible to detect if the paste came from Google Docs...

It looks like Google Docs pastes are always wrapped in this span:
<span id="docs-internal-guid-<some guid>">...</span>

I doubt this is a guaranteed public API, but it will probably work for a while.

There's also a custom clipboard key in clipboardData (application/x-vnd.google-docs-document-slice-clip+wrapped) which seemingly contains no actual data.

This or the span tag could be used to detect GDocs pastes. Hard to say which is going to be more reliable, but we could check for either.
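A check for either marker could look roughly like the sketch below. `looksLikeGoogleDocsPaste` and the two constants are hypothetical names; the marker values are taken from the observations above and are not a documented Google Docs API. Matching the id prefix on any element, rather than only on a span, is an assumption made to keep the check tolerant.

```typescript
// Clipboard type observed in Google Docs pastes (undocumented, may change).
const GDOCS_CLIPBOARD_TYPE =
  "application/x-vnd.google-docs-document-slice-clip+wrapped";

// id prefix observed on the wrapper element of pasted HTML (undocumented).
const GDOCS_WRAPPER_ID = /\bid=["']docs-internal-guid-/i;

// Hypothetical helper: true if either marker is present.
function looksLikeGoogleDocsPaste(
  html: string,
  clipboardTypes: readonly string[]
): boolean {
  return (
    clipboardTypes.indexOf(GDOCS_CLIPBOARD_TYPE) !== -1 ||
    GDOCS_WRAPPER_ID.test(html)
  );
}
```

In a browser paste handler, the arguments would presumably come from `event.clipboardData.getData('text/html')` and `event.clipboardData.types`.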

The downside of this is that if you keep copying and pasting back and forth between VE and GDocs, your headings would keep decreasing in level until they were all <h6>.

> The downside of this is that if you keep copying and pasting back and forth between VE and GDocs, your headings would keep decreasing in level until they were all <h6>.

Is that a bug, or a feature?

There is, in theory, a way to fix this, but I think it's too convoluted to be worth it.

Still, sharing for posterity -- we could look at the content of the copied data, and automatically make the "top most" heading be our 'heading', restarting the numbering that way.

That will prevent the back-and-forth-slimming.

... but it's probably overly complicated, and not worth it. I don't think there are many people who would copy-paste back and forth between GDocs and VE.
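For posterity, the "top-most heading becomes our heading" idea can be sketched on a plain list of heading levels. `normalizeHeadingLevels` and its `target` parameter are hypothetical; a target of 2 is assumed here because <h2> is "Heading" in VE-MediaWiki.

```typescript
// Hypothetical helper: renumber heading levels so that the top-most level
// found in the pasted content becomes `target`, keeping relative depth and
// clamping the result to the valid range 1..6.
function normalizeHeadingLevels(
  levels: readonly number[],
  target: number = 2
): number[] {
  if (levels.length === 0) {
    return [];
  }
  const offset = target - Math.min(...levels);
  return levels.map((level) => Math.min(Math.max(level + offset, 1), 6));
}

const fromGDocs = normalizeHeadingLevels([1, 2, 2, 3]); // [2, 3, 3, 4]
const roundTrip = normalizeHeadingLevels(fromGDocs);    // [2, 3, 3, 4], unchanged
const fragment = normalizeHeadingLevels([3, 4]);        // [2, 3]
```

Running it twice gives the same result, which is what prevents the back-and-forth slimming. The last call shows the cost: a fragment whose top-most heading was level 3 is promoted to level 2.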

> Is that a bug, or a feature?

Someone will be upset when we fix it: https://xkcd.com/1172/

For reference, GDocs provides the following formats:

| GDocs | HTML | VE |
| --- | --- | --- |
| Title | span with inline CSS | Plain paragraph |
| Subtitle | span with inline CSS | Plain paragraph |
| Heading 1 | <h1> | heading 1 ("Page title" in VE-MediaWiki) |
| Heading 2 | <h2> | heading 2 ("Heading" in VE-MediaWiki) |
| Heading 3 | <h3> | heading 3 ("Sub-heading 1" in VE-MediaWiki) |

> Still, sharing for posterity -- we could look at the content of the copied data, and automatically make the "top most" heading be our 'heading', restarting the numbering that way.

This works well when copying an entire document, but not so well when copying individual sections.

> There is, in theory, a way to fix this, but I think it's too convoluted to be worth it.

"Patches welcome"