Implementation of http://etherpad.wikimedia.org/p/cx-markup-alignment
# The MT backend will expose an interface named TranslateHTML.
# Create an annotation mapping module. It exposes an interface that returns plain-text word sequences for a given HTML source input: the full plain-text version of the HTML, plus the word sequences covered by the inline annotations of the HTML.
# Pass these annotation subsequences to MT. Once the translations are received, pass the input array of sequences and the plain-text MT sequences back to the annotation module.
# Implement a generic, minimal algorithm that uses only edit distance to find the ranges in the translated plain-text MT output corresponding to each tag in the source HTML. Use lineardoc to apply these annotations.
# Enhance the above implementation so that the match-finding step of the algorithm can be overridden in language-specific modules, similar to the segmentation language modules.
# Enhance the algorithm to use n-grams to support word order changes in subsequence matching.
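
The edit-distance matching step could look roughly like the sketch below. It is only an illustration under assumptions: the function names (levenshtein, find_best_range) and the distance parameter are hypothetical, not part of any existing module, and lineardoc is not involved here. Given the words of the full translated text and the separately translated words of one annotated subsequence, it slides a window over the translated text and returns the word range with the smallest edit distance. The distance parameter hints at how a language-specific module might override the match-finding step.

```python
from typing import Callable, List, Optional, Tuple


def levenshtein(a: str, b: str) -> int:
    """Character-level edit distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]


def find_best_range(
    target_words: List[str],
    sub_words: List[str],
    distance: Callable[[str, str], int] = levenshtein,  # hypothetical override hook
) -> Tuple[Optional[Tuple[int, int]], Optional[int]]:
    """Return ((start, end), dist): the half-open word range in target_words
    that is closest to the translated subsequence sub_words."""
    if not sub_words or not target_words:
        return None, None
    needle = " ".join(sub_words).lower()
    n = len(sub_words)
    best_range, best_dist = None, None
    # Assumption: the match may be one word shorter or longer than the subsequence.
    for length in (n - 1, n, n + 1):
        if length < 1 or length > len(target_words):
            continue
        for start in range(len(target_words) - length + 1):
            window = " ".join(target_words[start:start + length]).lower()
            d = distance(window, needle)
            if best_dist is None or d < best_dist:
                best_range, best_dist = (start, start + length), d
    return best_range, best_dist


if __name__ == "__main__":
    # Full MT output for "This is a <b>big red</b> house" (illustrative data).
    target = "Das ist ein großes rotes Haus".split()
    # MT output for the annotated subsequence "big red", translated on its own.
    sub = "großen roten".split()
    rng, dist = find_best_range(target, sub)
    print(rng, dist, target[rng[0]:rng[1]])
```

In this example the subsequence translated in isolation is inflected differently from the same words inside the full sentence, so an exact string search would fail, while the edit-distance match still selects the words "großes rotes" as the range to wrap in the tag. Because the window is contiguous, this sketch does not handle word order changes; that is what the n-gram enhancement in the last step is meant to address.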