Hovercards sometimes has contents in brackets (parentheses) appearing in excerpts, especially at ruwiki
Closed, Resolved · Public

Assigned To

Authored By

	Quiddity
	Mar 19 2015, 12:27 AM

Description

Tpimh reports:

In English wiki bulbs show the beginning of the article with the text in brackets stripped (e.g. dates of birth and death, alternative names), but in Russian wiki that text is shown. Sometimes it is the only information shown, if the text in brackets is long enough. Kind of not useful at all.

This seems to be a widespread problem at Ruwiki (screenshots from [[Main page]] links to these articles: Луций Сергий Катилина, Гай Саллюстий Крисп, and Этрурия).

Screenshot_from_2015-03-18_14:16:46.png (1×1 px, 590 KB)

Screenshot_from_2015-03-18_14:16:59.png (1×1 px, 607 KB)

Screenshot_from_2015-03-18_14:16:39.png (1×1 px, 636 KB)

I've only been able to find one example at Enwiki (out of ~100 tests), in a link to this article: https://en.wikipedia.org/w/index.php?title=Samuel_Allyne_Otis&oldid=523235627

Screenshot_from_2015-03-18_14:18:54.png (845×1 px, 220 KB)

and none at Frwiki.

Note: There is a plan to refine what content is excluded, in T91344: Review exclude all approach to parenthetical elements in summary endpoint, but for the moment no bracketed content is meant to be shown.

Details

	Subject	Repo	Branch	Lines +/-
	renderer.article: Increase exsentences to 5 in the API call	mediawiki/extensions/Popups	master	+1 -1


Related Objects

Mentioned In: rEPOP51884d39e925: renderer.article: Increase exsentences to 5 in the API call
rMEXT8adba2b9f1f3: Updated mediawiki/extensions Project: mediawiki/extensions/Popups…
T98067: Scrub parentheses and dates from text extract
T94957: Hovercards text size, leading and whitespace management
Mentioned Here: T91344: Review exclude all approach to parenthetical elements in summary endpoint

Event Timeline

Quiddity created this task. Mar 19 2015, 12:27 AM

Quiddity assigned this task to • Prtksxna.

Quiddity raised the priority of this task from to Needs Triage.

Quiddity updated the task description.

Quiddity added a project: Page-Previews.

Quiddity subscribed.

Restricted Application added a subscriber: Aklapper. Mar 19 2015, 12:27 AM

In the case of Samuel Allyne Otis, TextExtracts returns the following excerpt:

Samuel Allyne Otis (son of James Otis, Sr., father of Harrison Gray Otis and brother of prominent revolutionary James Otis, Jr.

And when Hovercards finds malformed brackets (in this case, just an opening bracket with no closing one), it doesn't do anything to the text.

The same seems to be the case with the links on Russian Wikipedia:

title	extract	api call
Кантата	Кантата (итал. cantata, от лат.	https://ru.wikipedia.org/w/api.php?action=query&format=json&prop=extracts%7Cpageimages%7Crevisions%7Cinfo&redirects=true&exintro=true&exsentences=2&explaintext=true&piprop=thumbnail&pithumbsize=300&rvprop=timestamp&inprop=watched&indexpageids=true&titles=%D0%9A%D0%B0%D0%BD%D1%82%D0%B0%D1%82%D0%B0
Орган (музыкальный инструмент)	Орга́н (лат. organum из др.-греч.	https://ru.wikipedia.org/w/api.php?action=query&format=json&prop=extracts%7Cpageimages%7Crevisions%7Cinfo&redirects=true&exintro=true&exsentences=2&explaintext=true&piprop=thumbnail&pithumbsize=300&rvprop=timestamp&inprop=watched&indexpageids=true&titles=%D0%9E%D1%80%D0%B3%D0%B0%D0%BD+(%D0%BC%D1%83%D0%B7%D1%8B%D0%BA%D0%B0%D0%BB%D1%8C%D0%BD%D1%8B%D0%B9+%D0%B8%D0%BD%D1%81%D1%82%D1%80%D1%83%D0%BC%D0%B5%D0%BD%D1%82)
Концерт (произведение)	Конце́рт (нем. Konzert от итал.	https://ru.wikipedia.org/w/api.php?action=query&format=json&prop=extracts%7Cpageimages%7Crevisions%7Cinfo&redirects=true&exintro=true&exsentences=2&explaintext=true&piprop=thumbnail&pithumbsize=300&rvprop=timestamp&inprop=watched&indexpageids=true&titles=%D0%9A%D0%BE%D0%BD%D1%86%D0%B5%D1%80%D1%82+(%D0%BF%D1%80%D0%BE%D0%B8%D0%B7%D0%B2%D0%B5%D0%B4%D0%B5%D0%BD%D0%B8%D0%B5)

So, as long as there are malformed brackets in the extract, Hovercards will show them as they are.

This seems like a TextExtracts issue for now.
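A minimal sketch of the behaviour described in this comment, assuming the stripping logic removes only balanced parentheticals and leaves the text alone otherwise. This is not the Popups source; `is_balanced` and `strip_parentheticals` are hypothetical names used for illustration only.

```python
import re


def is_balanced(text: str) -> bool:
    """Return True if every '(' in text has a matching ')' after it."""
    depth = 0
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:  # a ')' appeared before its '('
                return False
    return depth == 0


def strip_parentheticals(text: str) -> str:
    """Drop balanced parenthetical groups; return malformed text unchanged.

    Assumption: this mirrors the reported behaviour, where an extract with
    only an opening bracket is shown as is.
    """
    if not is_balanced(text):
        return text
    kept = []
    depth = 0
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif depth == 0:
            kept.append(ch)
    # Collapse the double space left behind by a removed group.
    return re.sub(r"\s{2,}", " ", "".join(kept)).strip()
```

With this sketch, a truncated extract such as `Кантата (итал. cantata, от лат.` passes through untouched, which matches what the screenshots show.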

• Prtksxna added a project: TextExtracts. Mar 30 2015, 5:29 AM

• Prtksxna set Security to None.

In our API call, if we increase exsentences to 5 and get rid of exintro, we'll be able to get a better TextExtract. We are clipping the extra content on other cards anyway, so this won't cause a problem there.

For example, we could get:

Кантата (итал. cantata, от лат. саntare — петь) — вокально-инструментальное произведение, созданное для солистов и хора.

…instead of just:

Кантата (итал. cantata, от лат.

@MaxSem Would this be alright for our use case?
@ori, would this have any performance implications?
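The proposed request could be sketched as follows. The parameter names come from the API calls quoted earlier in this task; `build_extract_url` and `query_params` are hypothetical helpers, and the sketch only builds the URL without contacting any wiki.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_extract_url(host: str, title: str, sentences: int = 5) -> str:
    """Build a TextExtracts query that omits exintro entirely."""
    params = {
        "action": "query",
        "format": "json",
        "prop": "extracts",
        "redirects": "true",
        "exsentences": str(sentences),  # proposal: 5 instead of 2
        "explaintext": "true",
        "titles": title,
        # exintro is deliberately left out, per the proposal in this comment
    }
    return f"https://{host}/w/api.php?" + urlencode(params)


def query_params(url: str) -> dict:
    """Parse the query string back into a dict, for inspection."""
    return parse_qs(urlsplit(url).query)
```

Calling `build_extract_url("ru.wikipedia.org", "Кантата")` yields a URL whose query string carries `exsentences=5` and no `exintro` key.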

• Prtksxna mentioned this in T94957: Hovercards text size, leading and whitespace management. Apr 3 2015, 10:28 AM

@MaxSem Would this be alright for our use case?

Yes.

@ori, would this have any performance implications?

No.

Change 202001 had a related patch set uploaded (by Prtksxna):
renderer.article: Remove exintro and increase exsentences to 5 in the API call

https://gerrit.wikimedia.org/r/202001

gerritbot added a project: Patch-For-Review. Apr 6 2015, 6:05 AM

@MaxSem, I noticed something strange with the use of exintro.

exintro	result	link
`false`	Кантата (итал. cantata, от лат.	API Call
`true`	Кантата (итал. cantata, от лат.	API Call
not set	Кантата (итал. cantata, от лат. саntare — петь) — вокально-инструментальное произведение, созданное для солистов и хора.	API Call

This is why I had removed exintro in resources/ext.popups.renderer.article.js.

@Prtksxna, it seems that exintro is considered true whenever it's set, regardless of its value. Just leave it out to get the default (false) behaviour.

In T93160#1183913, @Ricordisamoa wrote:

@Prtksxna, it seems that exintro is considered true whenever it's set. Just leave it out to get the default value (false) to appear.

Right. That is what I am doing in the patch. @MaxSem points out that we might end up getting the first section heading in Hovercards in case the intro isn't long enough.
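Since a boolean parameter like exintro counts as true whenever it is present, a client has to omit it rather than send `exintro=false`. A minimal sketch of that rule, assuming a hypothetical `encode_flags` helper (not part of Popups or MediaWiki):

```python
from urllib.parse import urlencode


def encode_flags(params: dict) -> str:
    """Encode query parameters, omitting boolean flags that are off.

    True  -> the key is sent (its presence is what matters)
    False -> the key is dropped, so the server default applies
    """
    cleaned = {}
    for key, value in params.items():
        if value is False or value is None:
            continue  # sending "exintro=false" would still count as set
        cleaned[key] = "1" if value is True else value
    return urlencode(cleaned)
```

For instance, `encode_flags({"exintro": False, "exsentences": 5})` produces a query string with no `exintro` key at all.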

• Spage mentioned this in T98067: Scrub parentheses and dates from text extract. May 4 2015, 8:38 PM

Change 202001 merged by jenkins-bot:
renderer.article: Increase exsentences to 5 in the API call

https://gerrit.wikimedia.org/r/202001

• Prtksxna mentioned this in rMEXT8adba2b9f1f3: Updated mediawiki/extensions Project: mediawiki/extensions/Popups…. Jul 8 2015, 5:21 PM

• Prtksxna mentioned this in rEPOP51884d39e925: renderer.article: Increase exsentences to 5 in the API call.

• Prtksxna closed this task as Resolved. Jul 8 2015, 5:22 PM

• Forrestbot added a project: WMF-deploy-2015-07-14_(1.26wmf14). Jul 8 2015, 6:00 PM

Quiddity moved this task from Backlog to Done on the Page-Previews board. Aug 4 2015, 3:56 PM

