General
- Evaluate and compare against existing solutions: Page Summary, MobileApps, TextExtracts, ActiveAbstracts
- Test right-to-left (RTL) languages
Feedback with examples
- Essential text is stripped when the text contains slashes, e.g. https://en.wikipedia.org/wiki/ISO/IEC_8859-2 (“/IEC 8859-1:1998, Information technology — 8-bit single-byte coded graphic character sets — Part 1: Latin alphabet No. 1, is part of the ISO/I”)
- Includes styling noise from pronunciation markup, e.g. https://fr.wikipedia.org/wiki/Anarchisme
- Includes styling noise from math notation, e.g. https://en.wikipedia.org/wiki/Quotient_group (“…G mod N {\displaystyle G{\bmod {N}}} , where mod {\displaystyle {\mbox{mod}}} is…”)
- Essential text between square brackets is stripped, e.g. https://en.wikipedia.org/wiki/Coset (“… is usually denoted by [G : H]”; similar in other math articles). Here the math letters are shown correctly and without styling noise. This is not up to us: the notation is displayed similarly to the Quotient group example above, but is marked up differently.
- Text between square brackets is stripped even though it is essential to the content, e.g. https://en.wikipedia.org/wiki/Distinctive_feature (similar in other articles on phonetics, music, etc.)
- Includes styling noise from the key press “icon” (rendered by Template:Key_press), e.g. https://en.wikipedia.org/wiki/Tab_key. Is there a general logic for how templates are handled?
- Abbreviations for awards/titles are included, e.g. https://en.wikipedia.org/wiki/Benjamin_Franklin
- A superscript at the end of the abstract (a typo in the article?) is displayed, e.g. https://en.wikipedia.org/wiki/Soviet_space_program
- Optional: add some logic to handle typos? E.g. when a reference is placed after a full stop and is itself followed by another full stop, the abstract ends up containing “..”, e.g. https://en.wikipedia.org/wiki/International_Conference_on_Learning_Representations
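
For the optional typo handling in the last item, a minimal sketch of a post-processing step, assuming the abstract is already available as plain text. The function name collapse_double_periods is hypothetical and not part of any existing extractor; it only collapses a run of exactly two full stops and leaves longer runs (ellipses) alone.

```python
import re

# Matches exactly two consecutive full stops that are not part of a
# longer run, so a three-dot ellipsis ("...") is left untouched.
_DOUBLE_PERIOD = re.compile(r"(?<!\.)\.{2}(?!\.)")


def collapse_double_periods(text: str) -> str:
    """Replace an isolated '..' with a single '.'.

    Hypothetical helper for illustration only: it targets the case where
    a stripped reference leaves two full stops next to each other.
    """
    return _DOUBLE_PERIOD.sub(".", text)


if __name__ == "__main__":
    # Made-up sample sentence, not taken from the article.
    sample = "The conference is held every spring.. It covers several topics."
    print(collapse_double_periods(sample))
```

This would not catch variants where whitespace remains between the two full stops; whether those occur in practice would need checking against real abstracts.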