In the manual evaluation (T278864#6974599), some comments suggested that the generated anchor texts for the links are wrong.
The following cases were described:
- Linking just a portion of a larger phrase, where the larger phrase itself would not be a link, such as:
  - Awards, e.g. in "The Jane Smith Award for Excellence" only "Jane Smith" might be linked.
  - Song titles, e.g. in "Un Beso Para Mi" only "Un Beso" might be linked.
  - Schools, e.g. in "Rockville High School" only "Rockville" might be linked.
- Possessive suffixes: for the text "Brazilian Navy's", the suggestion links just the "Brazilian Navy" portion to the target, whereas we would want to include the "'s" in the link.
This will require better parsing of the raw text when generating candidate anchors.
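As a starting point for discussion, the two cases above could be handled by a post-processing step on each candidate span. The following is only a minimal sketch, not the existing implementation: the function names (`extend_possessive`, `is_inside_larger_phrase`, `refine_anchor`) are hypothetical, and the "next word is capitalized" check is an assumed heuristic that will misfire in some cases (for example, languages that capitalize all nouns).

```python
import re

# Hypothetical helpers for illustration; not taken from the existing code.
# Matches a possessive suffix with a straight or curly apostrophe.
POSSESSIVE = re.compile(r"['’]s\b")
# Matches a single space followed by a word that starts with a letter.
NEXT_WORD = re.compile(r" ([^\W\d_]\w*)")


def extend_possessive(text, start, end):
    """Grow the span to include a directly following possessive suffix."""
    m = POSSESSIVE.match(text, end)
    return (start, m.end()) if m else (start, end)


def is_inside_larger_phrase(text, end):
    """Heuristic (assumption): a capitalized word right after the span
    suggests the span is only part of a longer name or title."""
    m = NEXT_WORD.match(text, end)
    return bool(m) and m.group(1)[0].isupper()


def refine_anchor(text, start, end):
    """Return the adjusted (start, end), or None to drop the candidate."""
    start, end = extend_possessive(text, start, end)
    if is_inside_larger_phrase(text, end):
        return None
    return start, end


text = "The Brazilian Navy's ships left port."
start = text.index("Brazilian Navy")
span = refine_anchor(text, start, start + len("Brazilian Navy"))
print(text[span[0]:span[1]])  # Brazilian Navy's

text = "He attended Rockville High School."
start = text.index("Rockville")
print(refine_anchor(text, start, start + len("Rockville")))  # None
```

This only looks at the text following the candidate. Checking the preceding word as well would catch more partial matches, but it would also need sentence-boundary detection to avoid treating a capitalized sentence-initial word as part of a name.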