We need a general way to associate information with DOM nodes without storing that information inline. The current idea is to set a UID on each DOM node that has associated information, and to use that UID as the key into externally stored metadata. This would let us strip private information such as data-parsoid from the DOM we send to the client.
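As a rough illustration of the idea, the sketch below moves private attributes out of a DOM and into a separate store keyed by UID. It uses Python's xml.dom.minidom only to stay self-contained. The attribute name data-uid, the function name externalize_private_attrs and the n0, n1, ... UID format are assumptions made for this example, not part of the proposal.

```python
from xml.dom import minidom


def externalize_private_attrs(doc, private_attrs=("data-parsoid",)):
    """Move private attributes out of the DOM into a dict keyed by UID.

    Assumptions for illustration: the UID is stored in a hypothetical
    "data-uid" attribute and is a simple per-document counter.
    """
    store = {}
    next_id = 0
    # getElementsByTagName("*") returns every element in document order.
    for el in doc.getElementsByTagName("*"):
        private = {}
        for name in private_attrs:
            if el.hasAttribute(name):
                private[name] = el.getAttribute(name)
                el.removeAttribute(name)
        # Only nodes that actually carry private data receive a UID.
        if private:
            uid = f"n{next_id}"
            next_id += 1
            el.setAttribute("data-uid", uid)
            store[uid] = private
    return store


if __name__ == "__main__":
    doc = minidom.parseString(
        '<p data-parsoid="a">Hi <b data-parsoid="b">x</b><i>y</i></p>'
    )
    store = externalize_private_attrs(doc)
    print(doc.documentElement.toxml())  # DOM as it would be sent to the client
    print(store)                        # metadata kept on the server
```

Elements without private data are left untouched, so the client-side DOM only grows by one short attribute per annotated node.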
One issue to consider is copying and pasting between pages of the same wiki, or even between different wikis.
A simple and safe solution would be to discard all associated private information for modified (copied and pasted) content. This means that we would have to leave all semantic information (primarily data-mw) in the DOM, even on page views. It also means that blame map information, for example, would be lost when a paragraph is moved around.
An alternative would be to make UIDs unique within a wiki, or even across wikis, for example <wiki id>:<revision id>:<node id>. The UID 1000:40233066:100000, for example, can be encoded as Po:CZehq:Yag by writing each component in base 64, with digits taken in base64-alphabet order (A-Z, a-z, 0-9, ...). This would allow us to move data-mw out of the view DOM as well, and would open up interesting ways to preserve associated metadata such as blame maps across copy and paste. The wiki id would need to be unique, though, and there would need to be a public API for retrieving the associated metadata. When the wiki id is not recognized or data retrieval fails, we might lose the associated data-mw as well.
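The compact form in the example above can be reproduced by writing each component as a base-64 number with digits in base64-alphabet order. The sketch below shows one way to do this. The function names are made up for illustration, and the use of "+" and "/" for digits 62 and 63 is an assumption, since the example does not exercise those digits (a URL-safe variant might be preferable in practice).

```python
import string

# Digits 0..63 in standard base64 order. The last two characters are an
# assumption; the example UID only uses letters.
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"


def encode_number(n: int) -> str:
    """Write a non-negative integer as a base-64 positional number."""
    if n < 0:
        raise ValueError("only non-negative integers can be encoded")
    if n == 0:
        return ALPHABET[0]
    digits = []
    while n:
        n, rem = divmod(n, 64)
        digits.append(ALPHABET[rem])
    return "".join(reversed(digits))


def decode_number(s: str) -> int:
    """Inverse of encode_number."""
    n = 0
    for ch in s:
        n = n * 64 + ALPHABET.index(ch)
    return n


def encode_uid(wiki_id: int, revision_id: int, node_id: int) -> str:
    """Build the <wiki id>:<revision id>:<node id> form, each part encoded."""
    return ":".join(encode_number(p) for p in (wiki_id, revision_id, node_id))


def decode_uid(uid: str) -> tuple:
    """Split an encoded UID back into its three integer components."""
    return tuple(decode_number(p) for p in uid.split(":"))


print(encode_uid(1000, 40233066, 100000))  # Po:CZehq:Yag
print(decode_uid("Po:CZehq:Yag"))          # (1000, 40233066, 100000)
```

Because ":" is not part of the digit alphabet, the three components can always be split apart unambiguously, which lets a consumer check the wiki id before attempting any metadata lookup.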