Increase usage of noexcerpt class in IPA and pronunciation related templates
Closed, Declined (Public)

Description

To derive an introduction to an article, the PCS summary endpoint tries to exclude some content that otherwise appears on the page. One such kind of content is IPA pronunciation in its various forms, sometimes accompanied by audio pronunciations (and sometimes a loudspeaker icon and/or a "listen" link).

The summary code already excludes elements with the class noexcerpt. TextExtracts recommends using the noexcerpt class, too.
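
For illustration only, here is a minimal sketch of what excluding noexcerpt elements can look like. It is not the actual PCS or TextExtracts code; the class name `NoExcerptStripper`, the function `extract_text`, and the sample HTML are hypothetical, and the sketch assumes well-formed, balanced markup and returns plain text only.

```python
from html.parser import HTMLParser

# Void elements never get a closing tag, so they must not change nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class NoExcerptStripper(HTMLParser):
    """Collects text, skipping any element whose class list contains 'noexcerpt'."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self.skip_depth = 0  # > 0 while inside an excluded element

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self.skip_depth:
            # Nested element inside an excluded one: track depth only.
            self.skip_depth += 1
            return
        classes = (dict(attrs).get("class") or "").split()
        if "noexcerpt" in classes:
            self.skip_depth = 1

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        if not self.skip_depth:
            self.parts.append(data)


def extract_text(html):
    """Return the text of `html` without the content of noexcerpt elements."""
    parser = NoExcerptStripper()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts)


# Hypothetical template output: the pronunciation span carries the class.
sample = 'Foo <span class="IPA noexcerpt">/fuː/ <small>listen</small></span>is a word.'
print(extract_text(sample))
```

With a single shared class like this, the extraction side needs no template-specific selectors; any template that wraps its output in a noexcerpt element is handled the same way.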

How can we get more IPA related templates to use the noexcerpt class?

Example: https://en.wikipedia.org/wiki/Template:IPAc-en and of course for many more wikis.

Increasing the usage of this would improve the summary extract results and allow PCS to eventually remove some of the code which deals with stripping IPAs for specific cases.

Example pages

Event Timeline

I'm confused.
Editors are encouraged to add noexcerpt class.
Why would we encourage them to add another?

bearND added a comment. Edited Feb 23 2018, 9:49 PM

Well, that answers the question then. :)
noexcerpt it is. I also found that in the TextExtracts docs.

How can we get more IPA related templates to use it? I doubt template editors care to look at extension configuration documentation. I basically want to remove the hacky CSS selectors used currently to strip IPAs.
Example: https://en.wikipedia.org/wiki/Template:IPAc-en and of course for many more wikis.

bearND renamed this task from Let editors markup parts of page content to be excluded from summaries to Increase usage of noexcerpt class in IPA and pronunciation related templates. Feb 23 2018, 10:00 PM
bearND updated the task description.
bearND updated the task description. Feb 23 2018, 10:05 PM

How can we get more IPA related templates to use it?

Best thing is to find a partner in community who agrees and is happy to help with this.

Personally, I'd just expose them and hope that editors who care about this will fix these issues themselves. See https://phabricator.wikimedia.org/T91344#1096762 - lots of discussion around this.

@CKoerner_WMF - any thoughts on the best place to reach out to interested editors?

This is becoming more important since we are now less aggressive about removing parenthetical content, because we do not want to remove portions of chemical formulas (T183833).

bearND updated the task description. Feb 26 2018, 6:56 PM
Vachovec1 added a subscriber: Vachovec1. Edited Mar 1 2018, 7:20 PM

The examples in the second paragraph are not about IPA/pronunciation; these are citation/source related templates. They all say something like "dead link" or "source required". Obviously this should be stripped. On cs-wiki these are displayed via templates which all share a common span class, namely "doplnte-zdroj". But I am not able to find the definition of this class in a local MediaWiki CSS or JS file.

The example from cs-wiki in the first paragraph is a very poor way to express pronunciation. Something like this should preferably be fixed directly in the article and is probably not worth fixing elsewhere.

@ovasileva Finding folks in the community that would be interested in helping sounds like the direct way forward. Identifying templates (as has been done in the task summary) to point folks toward is helpful. Let me ask around on the best way to approach folks. I want folks to feel encouraged to fix these templates (as their content will render better in different contexts like Page Previews).

bearND removed bearND as the assignee of this task. Jun 18 2018, 4:43 PM

Thanks @CKoerner_WMF - this sounds like a good way forward.

Jhernandez triaged this task as Low priority. Jun 29 2018, 6:25 PM
Jhernandez added a subscriber: Jhernandez.

There is some information here: https://www.mediawiki.org/wiki/Extension:TextExtracts#How_can_I_remove_content_from_a_page_preview/extract?

What actionable thing could we do to help this happen?

How can we get more IPA related templates to use the noexcerpt class?

The problem here is IPA templates only handle the content inside. They don't have anything to do with the brackets
e.g.

Barack Hussein Obama II''' ({{IPAc-en|audio=En-u
                           ^

This is a big change to how editors would edit content and there's no technical solution here - just community consulting with specific projects.

I'm not sure what actions we can take here, apart from recommending it to editors in a big document and continuing to educate on why summaries are broken in certain cases.
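
To illustrate the bracket problem, here is a small sketch under stated assumptions: the rendered sentence is hypothetical, and the regex is a deliberate simplification that only handles a non-nested span whose class attribute is exactly "noexcerpt". Even if the template marks its own output, the parentheses written in the article text survive.

```python
import re

# Hypothetical rendered lead sentence: the parentheses come from the article
# wikitext, while only the span is produced by the pronunciation template.
rendered = 'Example Name (<span class="noexcerpt">/ɪɡˈzæmpəl/</span>) is a person.'


def strip_noexcerpt(html: str) -> str:
    """Remove non-nested <span class="noexcerpt"> elements (simplified on purpose)."""
    return re.sub(r'<span class="noexcerpt">.*?</span>', "", html)


summary = strip_noexcerpt(rendered)
print(summary)  # Example Name () is a person.
```

The empty "()" is left behind because it lives outside the template, which is why a template-only change cannot fully fix the summary.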

@ovasileva @CKoerner_WMF Do we have any existing canonical place where we line up recommendations to editors to better use the software capabilities? (See ☝️)

@Jhernandez Yes. No. Well, it's complicated.

Template editors don't usually all hang out in the same place, which is why we often use a rather "shotgun" approach to communication: mailing lists, tech news, and a translatable message to various Village Pumps to try and get as many eyes on changes as possible.

However, if we know what templates would benefit from these changes we can use the talk pages of those templates to reach editors that are interested. Some templates are reused across wikis and we can use Wikidata to find those pages (example). This doesn't work in all cases.

As for a single place for all software capabilities - MediaWiki.org. I'm only half kidding here. After asking on local wiki Village Pumps for tech help, folks turn to MediaWiki.org, which is where we have most of our documentation. That wiki is still rather large and information is spread across it. :/

Aside: One thing I think our movement lacks is someone who actively evangelizes the features of the software and advancements in the tools. Unfortunately a lot of learning in our communities is an act of individual self-discovery - searching and finding information over a long period of time to build up skills. Each person learning a little at a different pace.

Jhernandez closed this task as Declined. Jul 12 2018, 11:25 AM

Right, I imagined as much. Thanks for the answer Chris.

@bearND I think we should follow Chris's advice and if you see a template that you would like improved we should reach out through the talk page of the template and post a recommendation to add the noexcerpt class explaining the benefit and linking to https://www.mediawiki.org/wiki/Extension:TextExtracts#How_can_I_remove_content_from_a_page_preview/extract.

I'm going to decline this for now as we are doing other things, and we can do this outreach as needed on a case-by-case basis.

I'll keep this task bookmarked to refer to if Readers Web sees similar requests in future too. Thanks, both!