If the backlog for the Improve Tone Suggested Edit is generated by the ML team, the backlog items will contain the plain text of each paragraph as produced by mwparserfromhell and mwedittypes. Given that plain text, we need to find the ve.Range corresponding to the paragraph in the Visual Editor session.
This task is about coming up with an initial heuristic for doing that.
Example:
Plain text of the first paragraph of the article about Wisconsin as used in the model:
Wisconsin ( ) is a state in the Upper Midwest and Great Lakes regions of the United States. It borders Minnesota to the west, Iowa to the southwest, Illinois to the south, Lake Michigan to the east, Michigan to the northeast, and Lake Superior to the north. With a population of about 6 million and an area of about 65,500 square miles, Wisconsin is the 20th-largest state by population and the 23rd-largest by area. It has 72 counties. The state's most populous city is Milwaukee. Its capital and second-most populous city is Madison; other urban areas include Green Bay and the Fox Cities.
and as available in VE:
Wisconsin (/wɪˈskɒnsɪn/ ⓘ wih-SKON-sin)[12] is a state in the Upper Midwest and Great Lakes regions of the United States. It borders Minnesota to the west, Iowa to the southwest, Illinois to the south, Lake Michigan to the east, Michigan to the northeast, and Lake Superior to the north. With a population of about 6 million[9] and an area of about 65,500 square miles, Wisconsin is the 20th-largest state by population and the 23rd-largest by area. It has 72 counties. The state's most populous city is Milwaukee. Its capital and second-most populous city is Madison; other urban areas include Green Bay and the Fox Cities.
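A possible starting point, sketched in TypeScript. In the example above, the two versions differ only in the parenthetical after "Wisconsin" (empty in the model text, a pronunciation guide in VE) and in the reference markers [12] and [9]. The sketch normalizes both sides by dropping such markers and parentheticals, then picks the VE paragraph whose words overlap most with the model text. All function names and the 0.9 threshold are illustrative assumptions, not existing VE APIs. Extracting the candidate paragraph texts and their ve.Range values from the VE document model is not shown.

```typescript
// Hypothetical helpers for illustration; none of these exist in VisualEditor.

// Strip material that tends to differ between the model text and the VE text.
function normalizeParagraph(text: string): string {
  return text
    .replace(/\[\d+\]/g, "")     // numeric reference markers such as [12]
    .replace(/\([^()]*\)/g, "")  // parentheticals, e.g. pronunciation guides
    .replace(/\s+/g, " ")        // collapse runs of whitespace
    .trim()
    .toLowerCase();
}

// Distinct whitespace-separated words of an already normalized string.
function uniqueWords(text: string): string[] {
  const result: string[] = [];
  for (const word of text.split(" ")) {
    if (word !== "" && result.indexOf(word) === -1) {
      result.push(word);
    }
  }
  return result;
}

// Dice coefficient over distinct words: 1 means identical word sets, 0 means none shared.
function wordOverlap(a: string, b: string): number {
  const wordsA = uniqueWords(a);
  const wordsB = uniqueWords(b);
  if (wordsA.length === 0 || wordsB.length === 0) {
    return 0;
  }
  let shared = 0;
  for (const word of wordsA) {
    if (wordsB.indexOf(word) !== -1) {
      shared++;
    }
  }
  return (2 * shared) / (wordsA.length + wordsB.length);
}

// Index of the best-matching VE paragraph, or -1 if none reaches the threshold.
// The default threshold of 0.9 is an assumption and would need tuning on real data.
function findBestMatch(
  modelText: string,
  veParagraphs: string[],
  threshold: number = 0.9
): number {
  const target = normalizeParagraph(modelText);
  let bestIndex = -1;
  let bestScore = 0;
  veParagraphs.forEach((paragraph, index) => {
    const score = wordOverlap(target, normalizeParagraph(paragraph));
    if (score > bestScore) {
      bestScore = score;
      bestIndex = index;
    }
  });
  return bestScore >= threshold ? bestIndex : -1;
}
```

For the two Wisconsin texts above, both sides normalize to the same string, so the VE paragraph would be selected. The caller would then map the returned index back to that paragraph's range. Removing parentheticals on both sides discards some real content, which is acceptable for matching but means the normalized string should not be used for anything else.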