Here's a list of issues found after comparing summary `extract_html` fields from 1.2.0 to 1.3.0 (MCS commit b0be98c). So far the wikis ar through es on the [[ http://wpsummary.surge.sh/1.2.0-b0be98c/html/es.html | comparison report ]] have been checked.
=== Issues
[x] A `scribunto-error` was selected as the first paragraph for [[ https://bg.wikipedia.org/api/rest_v1/page/html/%D0%92%D1%82%D0%BE%D1%80%D0%B0_%D1%81%D0%B2%D0%B5%D1%82%D0%BE%D0%B2%D0%BD%D0%B0_%D0%B2%D0%BE%D0%B9%D0%BD%D0%B0 | one article in bgwiki ]]. We should consider removing `span.scribunto-error` from the DOM before selecting the intro paragraph. --> fixed in rGMOA050f9c2e122a
[x] We should also consider not descending into subsections in the rare corner case where the lead section has one (the new Parsoid `section` elements can be nested): [[ https://bn.wikipedia.org/api/rest_v1/page/html/%E0%A6%B6%E0%A6%BF%E0%A6%95%E0%A7%8D%E0%A6%B7%E0%A6%BE/2758222 | bn:শিক্ষা ]] (fixed in https://bn.wikipedia.org/api/rest_v1/page/mobile-html/%E0%A6%B6%E0%A6%BF%E0%A6%95%E0%A7%8D%E0%A6%B7%E0%A6%BE/3400557)
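For illustration only, here is a minimal Python sketch of the two ideas above: drop `span.scribunto-error` nodes before picking the intro paragraph, and only look at `p` elements that are direct children of the lead `section`, so nested subsections are skipped. It is not the MCS implementation. The function name `first_lead_paragraph` and the helper `_remove_keep_tail` are hypothetical, and it assumes the lead section is available as a well-formed XML string.

```python
import xml.etree.ElementTree as ET


def _remove_keep_tail(parent, child):
    """Remove child from parent but keep the text that followed it."""
    idx = list(parent).index(child)
    tail = child.tail or ""
    if idx > 0:
        prev = parent[idx - 1]
        prev.tail = (prev.tail or "") + tail
    else:
        parent.text = (parent.text or "") + tail
    parent.remove(child)


def first_lead_paragraph(lead_section_xml):
    """Return the text of the first non-empty <p> directly inside the lead section."""
    lead = ET.fromstring(lead_section_xml)

    # Step 1: strip span.scribunto-error anywhere in the lead section.
    # Snapshot the nodes first so the tree can be modified safely.
    for parent in list(lead.iter()):
        for child in list(parent):
            classes = (child.get("class") or "").split()
            if child.tag == "span" and "scribunto-error" in classes:
                _remove_keep_tail(parent, child)

    # Step 2: consider only direct children, so paragraphs inside
    # nested <section> elements are never selected.
    for child in lead:
        if child.tag == "p":
            text = "".join(child.itertext()).strip()
            if text:
                return text
    return ""
```

With a lead section containing an error-only paragraph, a nested `section`, and then a real paragraph, this sketch returns the real paragraph's text.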
=== Should be fixed on the Parsoid level or onwiki
[] Infobox syntax shown in [[ https://bn.wikipedia.org/api/rest_v1/page/html/আলী_ইবনে_আবু_তালিব/2837813 | one article on bnwiki ]]
[] `books.google.com/books?isbn=0810864908` shown in [[ https://ar.wikipedia.org/api/rest_v1/page/html/%D8%A3%D9%85%D8%A7%D8%B2%D9%8A%D8%BA/26896137 | one arwiki article ]]
=== Recommend to be fixed onwiki
Moved most to T188134.
[-] Should be a list after the first paragraph: [[ https://ca.wikipedia.org/api/rest_v1/page/html/Anarquisme/19283010 | ca:Anarquisme ]]
(I think the lead paragraph in this article is too long, and the two bullet items too large to fit into a summary. I don't have the Catalan skills to rewrite this so I'm going to punt on this.)
[-] title missing: [[ https://de.wikipedia.org/api/rest_v1/page/html/Stuttgart/173449180 | de:Stuttgart ]] (The [[ https://de.wikipedia.org/w/index.php?title=Vorlage:Audio&action=edit | de:Audio template ]] uses the `noprint` class)
[] It would be nice to also include the paragraph immediately following the first one when the first paragraph ends with `:`. These have some extra content or formatting HTML in between, though, so it's best to fix onwiki: [[ https://cs.wikipedia.org/api/rest_v1/page/html/Ohm%C5%AFv_z%C3%A1kon/15567838 | cs:Ohmův_zákon ]], [[ https://cs.wikipedia.org/api/rest_v1/page/html/Archimédův_zákon/15693096 | cs:Archimédův_zákon ]], [[ https://cs.wikipedia.org/api/rest_v1/page/html/Příslovce/15397768 | cs:Příslovce ]], [[ https://da.wikipedia.org/api/rest_v1/page/html/Idealgasligning/8516662 | da:Idealgasligning ]]
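As a rough sketch of the "ends with `:`" heuristic mentioned in the last item, assuming the paragraphs have already been extracted as plain text in document order (the function name `summary_paragraphs` is hypothetical, and this ignores the extra content or formatting HTML that sits between the paragraphs in the linked articles):

```python
from typing import List


def summary_paragraphs(paragraphs: List[str]) -> List[str]:
    """Pick the first non-empty paragraph, plus the next one if the first ends with ':'."""
    texts = [p.strip() for p in paragraphs if p.strip()]
    if not texts:
        return []
    picked = [texts[0]]
    # A trailing colon suggests the sentence is completed by what follows.
    if texts[0].endswith(":") and len(texts) > 1:
        picked.append(texts[1])
    return picked
```

For the cases listed above, the hard part is deciding what counts as the "next paragraph" when a formula or other markup intervenes, which this sketch does not attempt.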
=== Minor issues, some of which should probably be fixed in MCS
Most of these could also be fixed onwiki, see T188134.
[] too many punctuation/whitespace characters (usually a result of stripping parentheticals or IPAs)
[-] double commas (`,,`): [[ https://da.wikipedia.org/api/rest_v1/page/html/Blasfemi/9104429 | da:Blasfemi ]] should probably be fixed onwiki
[-] comma before semicolon (`,;`): don't see it anymore with 1ee857e
[-] double spaces (`  `): [[ https://en.wikipedia.org/api/rest_v1/page/html/London/822677492 | en:London ]], fixed in https://gerrit.wikimedia.org/r/c/414023/, another fix is
[] space before comma (` ,`): [[ https://es.wikipedia.org/api/rest_v1/page/html/Grecia/105177728 | es:Grecia ]]
[x] space before semicolon (` ;`): [[ https://es.wikipedia.org/api/rest_v1/page/html/C%C3%A9lula/105100890 | es:Célula ]]
[] consider not stripping `()`
[x] from paragraphs after the first paragraph (e.g. in `<li>` elements): [[ https://en.wikipedia.org/api/rest_v1/page/html/Suit/822913926 | en:Suit ]] (the latest version of that article is tough to handle since the list comes in a new section), [[ https://da.wikipedia.org/api/rest_v1/page/html/Kulstofkredsl%C3%B8b/9351225 | da:Kulstofkredsløb ]] (this one is fixed now)
[-] with contents in bold `(<b>foo</b>)`: [[ https://el.wikipedia.org/api/rest_v1/page/html/%CE%95%CF%85%CF%81%CF%89%CE%BC%CF%80%CE%AC%CF%83%CE%BA%CE%B5%CF%84_1987/6542146 | el:Ευρωμπάσκετ ]], [[ https://es.wikipedia.org/api/rest_v1/page/html/N%C3%BAmero_%C3%A1ureo/105154269 | es:Número_áureo ]]; but [HOLD OFF FOR NOW] since we probably want to get rid of all the IPAs in [[ https://en.wikipedia.org/api/rest_v1/page/html/Azerbaijan/827080382 | Azerbaijan ]].
[] Consider including next paragraph if first paragraph ends with a `:`
[[ https://es.wikipedia.org/api/rest_v1/page/html/Resistencia_el%C3%A9ctrica/103807716 | es:Resistencia_eléctrica ]] -> T220249
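The punctuation/whitespace leftovers listed in this section (double commas, comma before semicolon, double spaces, space before comma or semicolon) could be cleaned up with a few regular expressions applied to the extracted plain text. The following Python sketch is only an illustration of that idea, not the fix that landed in MCS. The function name `tidy_punctuation` is hypothetical.

```python
import re


def tidy_punctuation(text: str) -> str:
    """Collapse punctuation/whitespace debris left after stripping parentheticals."""
    # " ," -> "," and " ;" -> ";" (run first so ", ," becomes ",,")
    text = re.sub(r"\s+([,;])", r"\1", text)
    # ",," -> ","
    text = re.sub(r",{2,}", ",", text)
    # ",;" -> ";"
    text = re.sub(r",;", ";", text)
    # runs of spaces -> single space
    text = re.sub(r" {2,}", " ", text)
    return text.strip()
```

Note that removing the space before `;` is not safe for every language: French typography, for example, puts a space before a semicolon, so a rule like this would probably need to be language-aware.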