Outreach-17 Project: Add a new Linter Category: Links-in-Links
Closed, Resolved (Public)

Description

Brief summary

MediaWiki provides a Linter extension that exposes markup issues to editors to fix. These markup issues are classified into different Linter categories. These issues are identified by Parsoid during its wikitext parsing process.

As part of this project, at the very minimum, you would implement two things: (a) code in Parsoid to detect links-in-links, that is, a link nested inside another link's text, which is not semantically meaningful and cannot be represented in valid HTML; and (b) code in the PHP Linter extension to add this new category.

Example wikitext with this markup error:

[http://google.com This is [[Google]]'s search page]

In the above example, [[Google]] is a wikilink inside the link text (This is [[Google]]'s search page) of the external link to http://google.com. This is invalid and should be flagged by the Linter code in Parsoid.
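To make the target pattern concrete, here is a minimal sketch of a text-level check for wikilinks inside external link text. It is not how Parsoid does it (Parsoid works on the parsed document, and the real category also has to cope with templates); the function name, the regular expressions, and the rule that an external link ends at the first unpaired ] on the same line are all assumptions of this sketch.

```python
import re

# Hypothetical helper for illustration only; not part of Parsoid or the
# Linter extension.
# A single "[" (not preceded by another "[") followed by a URL.
EXTLINK_START = re.compile(r"(?<!\[)\[((?:https?:)?//[^\s\[\]]+)")
# A simple wikilink: [[Target]] or [[Target|label]].
WIKILINK = re.compile(r"\[\[([^\[\]|]+)(?:\|[^\[\]]*)?\]\]")


def find_wikilinks_in_extlinks(wikitext):
    """Return (url, wikilink_target) pairs for wikilinks nested in extlink text."""
    findings = []
    pos = 0
    while True:
        match = EXTLINK_START.search(wikitext, pos)
        if match is None:
            break
        # Walk forward to the single "]" that closes the external link,
        # skipping over any complete [[...]] pairs on the way.
        i = match.end()
        end = None
        while i < len(wikitext):
            if wikitext.startswith("[[", i):
                close = wikitext.find("]]", i + 2)
                if close == -1:
                    break
                i = close + 2
            elif wikitext[i] == "]":
                end = i
                break
            elif wikitext[i] == "\n":
                # Assumption of this sketch: link text stays on one line.
                break
            else:
                i += 1
        if end is None:
            pos = match.end()
            continue
        link_text = wikitext[match.end():end]
        for inner in WIKILINK.finditer(link_text):
            findings.append((match.group(1), inner.group(1)))
        pos = end + 1
    return findings


example = "[http://google.com This is [[Google]]'s search page]"
print(find_wikilinks_in_extlinks(example))
```

Run on the example above, this reports the pair (http://google.com, Google); an external link whose text contains no wikilink produces no findings.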

Skills required

Both Node.js and PHP skills would be ideal. At least one of them would be good. Familiarity with wikitext and/or DOM manipulation would be a bonus, but not required for this project. You will be picking up the necessary skills.

Mentor(s)

@ssastry

Get started

Microtasks

Event Timeline

https://gerrit.wikimedia.org/r/396049 could be used to "fix" the behavior in the core parser, once we've wikilinted the problem away.

Is this specifically about link syntax in external links, or in any links? If the latter, keep in mind that descriptions in file embeds may contain links, e.g. [[File:Example.png|an [[example]] image]] or [[File:Example.png|an [https://en.wikipedia.org example] image]]; the linter category should not pick these up.

This is about invalid HTML output, not so much about wikitext. So, yes, image captions can contain links because the captions aren't embedded in other links. But, yes, wikitext syntax makes it confusing because in some cases, you can embed links in link syntax and in other cases, you cannot.
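Since the concern is the HTML output, a rough way to see the distinction is to look for an a element opened while another a element is still open. The sketch below uses Python's standard html.parser, which reports tags as written and does not apply browser-style fix-ups. The class name, the function name, and both HTML snippets are made up for illustration; they are not actual Parsoid output.

```python
from html.parser import HTMLParser


class NestedAnchorFinder(HTMLParser):
    """Hypothetical helper: records <a> elements opened inside another <a>."""

    def __init__(self):
        super().__init__()
        self.open_anchors = []  # href values of <a> elements currently open
        self.nested = []        # (outer_href, inner_href) pairs

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if self.open_anchors:
            self.nested.append((self.open_anchors[-1], href))
        self.open_anchors.append(href)

    def handle_endtag(self, tag):
        if tag == "a" and self.open_anchors:
            self.open_anchors.pop()


def find_nested_anchors(html):
    finder = NestedAnchorFinder()
    finder.feed(html)
    finder.close()
    return finder.nested


# Illustrative HTML only: a link nested in a link, and an image caption
# whose link sits next to the image link rather than inside it.
link_in_link = (
    '<a href="http://google.com">This is '
    '<a href="/wiki/Google">Google</a>\'s search page</a>'
)
image_caption = (
    '<figure><a href="/wiki/File:Example.png"><img src="Example.png"></a>'
    '<figcaption>an <a href="/wiki/Example">example</a> image</figcaption></figure>'
)

print(find_nested_anchors(link_in_link))
print(find_nested_anchors(image_caption))
```

The first snippet yields one nested pair and the second yields none, which matches the point that captions can contain links because they are not embedded in another link.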

From the perspective of the wikitext preprocessor, [ and ] are not currently "seen" by the preprocessor. So any [... [[ ... ]] ... ] construct is an invalid link, but the preprocessor can't tell that currently. So that's the specific case which https://gerrit.wikimedia.org/r/396049 would help with.

Agreed with @Dinoguy1000 and @ssastry that [[ ... [[ ... ]] ... ]] is a little more subtle, since some of those are valid. But we should try to make the behavior consistent for the invalid cases, instead of emitting broken HTML and letting tidy fix it up arbitrarily. The "wikitext way" is probably to emit literal [[ characters in the output for the inner link, which will make it obvious to editors that there's a problem that needs to be fixed.
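As a rough illustration of the "emit literal [[ characters" idea, the sketch below renders an external link and leaves any inner wikilink syntax as plain text instead of producing a nested link. This only illustrates the proposal; it is not the behavior of the core parser, of Parsoid, or of the Gerrit change mentioned above, and the function name is hypothetical.

```python
from html import escape


def render_extlink_with_literal_inner_links(url, link_text):
    """Hypothetical renderer: the link text is emitted verbatim (HTML-escaped),
    so an inner [[...]] shows up as literal brackets inside a single <a>."""
    return '<a href="{}">{}</a>'.format(
        escape(url, quote=True),
        escape(link_text, quote=False),
    )


html_out = render_extlink_with_literal_inner_links(
    "http://google.com", "This is [[Google]]'s search page"
)
print(html_out)
```

The output contains exactly one a element, with [[Google]] visible in the link text, which is what would make the problem obvious to an editor.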

This could be a good Outreachy task...

@cscott Could you modify the task description to meet the requirements in this template, add some microtasks, and then upload the project to Outreachy's website? See steps 2 and 3: https://www.mediawiki.org/wiki/Outreachy/Mentors#_Before_the_program. Thanks!

ssastry renamed this task from New Linter Category: Links-in-Links to Outreach-17 Project: Add a new Linter Category: Links-in-Links. Sep 12 2018, 9:59 PM
ssastry updated the task description.

Does the updated description look good?

@ssastry Yes :) You could upload it to the Outreachy website now and ping me when you are done so that I could approve the request!

Done. Let me know if it needs any changes.

This message is for all candidates interested in working on this project for Outreachy. Before you start working on this project, please make sure you've filled out an initial application so that Outreachy organizers can verify whether you are eligible to participate in the program: https://www.outreachy.org/eligibility/. It should only take 5 to 30 minutes to complete.

Once you've submitted your initial application, it may take up to a week for Outreachy organizers to review it and make a decision. Once you are approved, you can start working on the microtasks. In the meantime, read our participants guide https://www.mediawiki.org/wiki/Outreachy/Participants and learn about the Wikimedia movement https://www.wikimedia.org/.

Hello, I'm interested in working on this Outreach-17 Project

I am Farida

Welcome to Phabricator, Farida. For the benefit of others watching this ticket: Farida and I have already been in email communication.

Change 489620 had a related patch set uploaded (by Farida; owner: Farida):
[mediawiki/extensions/Linter@master] Add new Linter category wikilink-in-extlink

https://gerrit.wikimedia.org/r/489620

Change 489620 merged by jenkins-bot:
[mediawiki/extensions/Linter@master] Add new Linter category wikilink-in-extlink

https://gerrit.wikimedia.org/r/489620

Change 494148 had a related patch set uploaded (by Farida; owner: Farida):
[mediawiki/services/parsoid@master] Fix if statement in logWikilinksInExtlinks

https://gerrit.wikimedia.org/r/494148

Change 494189 had a related patch set uploaded (by Farida; owner: Farida):
[mediawiki/services/parsoid@master] Fix new linter category to enable code work with templates

https://gerrit.wikimedia.org/r/494189

Change 494148 abandoned by Subramanya Sastry:
Fix if statement in logWikilinksInExtlinks

Reason:
Duplicate

https://gerrit.wikimedia.org/r/494148

This category is now being populated, but there doesn't appear to be good dsr info to act on,
https://en.wikipedia.org/wiki/Special:LintErrors/wikilink-in-extlink

Never mind, that's probably something to do with the 2017 wikitext editor.

We should remove the column that says "Nested wikilinks in external links that need to be fixed" since it doesn't seem to be used.

Change 494189 merged by jenkins-bot:
[mediawiki/services/parsoid@master] Fix new linter category to enable code work with templates

https://gerrit.wikimedia.org/r/494189

ssastry claimed this task.