
Improve DOM Diff and selective serialization on pages with lots of (bot) edits
Open, Low, Public

Description

See https://delinterbot.toolforge.org/demo/10585754_diff.html, which shows some table attributes being reordered, and https://delinterbot.toolforge.org/demo/10353696_diff.html, which has a number of unrelated dirty diffs. Both pages were edited by a bot to fix a number of lint errors (font -> span, div with class instead of align, etc.).

My hunch is that the volume of edits on the page is hitting the limits of what the DOM diff code can handle when it comes to marking small portions of the DOM as modified.

So, it might be useful to look into these at some point and tweak the DOM diff algorithm to better detect and mark modified portions of a DOM.
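
To make the goal concrete, here is a minimal sketch of what "marking small portions of the DOM as modified" could look like. It is not Parsoid's actual DOM diff algorithm: the function names `same_node` and `mark_modified` are hypothetical, the markup is made up for illustration, and the sketch assumes both trees are well-formed XML with children in the same positions. It compares attributes as a dictionary, so reordered attributes do not count as a change, and it marks only the smallest subtree that differs.

```python
import xml.etree.ElementTree as ET


def same_node(a, b):
    """Shallow comparison of two elements (hypothetical helper).

    Attributes are compared as dicts, so attribute order is ignored.
    """
    return (
        a.tag == b.tag
        and (a.text or "").strip() == (b.text or "").strip()
        and (a.tail or "").strip() == (b.tail or "").strip()
        and dict(a.attrib) == dict(b.attrib)
    )


def mark_modified(old, new, path="0", out=None):
    """Collect paths of the smallest differing subtrees.

    A path is the root ("0") followed by child indexes, e.g. "0/0/1".
    """
    if out is None:
        out = []
    old_children, new_children = list(old), list(new)
    if not same_node(old, new) or len(old_children) != len(new_children):
        # Mark this subtree as a whole and do not descend further.
        out.append(path)
        return out
    for i, (o, n) in enumerate(zip(old_children, new_children)):
        mark_modified(o, n, f"{path}/{i}", out)
    return out


# Made-up example: attributes reordered on <table>, <font> replaced by <span>.
old = ET.fromstring(
    '<table class="wikitable" style="width:100%"><tr>'
    '<td><font color="red">x</font></td><td>y</td></tr></table>'
)
new = ET.fromstring(
    '<table style="width:100%" class="wikitable"><tr>'
    '<td><span style="color:red">x</span></td><td>y</td></tr></table>'
)
print(mark_modified(old, new))  # ['0/0/0/0']
```

In this sketch only the replaced `font` element is marked, so a selective serializer could in principle reuse the original source for the `table` and the untouched cell. A real implementation would also need to handle inserted, deleted, and moved nodes, which this positional comparison does not attempt.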

Event Timeline

I am going to mark this as low priority for now, since there are a number of higher-priority tasks on our plate, and large-scale bot edits to a page are probably limited to fixing lint errors on pages with lots of lints. Dirty diffs may be more acceptable in that context. Plus, one option available to the bot is to split the single edit into multiple smaller edits.

Thanks for filing.

Plus, one option available to the bot is to split the single edit into multiple smaller edits.

Just to clarify, people (including myself) tend to get annoyed when bots take multiple edits to do what could be done in one. My goal with the delinterbot is to fix things in a single edit, rather than the 5-10 edits other bots take. So I wouldn't consider that an option.