
GSoC 2018 Proposal: Pay off technical debt and increase usability of Navigation Popups gadget
Closed, Declined · Public

Description

Profile Information

Name: --
IRC nickname on Freenode: enterprisey
Web Profile: https://en.wikipedia.org/wiki/User:Enterprisey & https://github.com/enterprisey
Resume (optional): Will be appended to final proposal PDF
Location (country or state): USA
Typical working hours (include your timezone): I am in the EDT timezone, and will work from 10 AM to 6 PM or later on weekdays and most weekends

Synopsis

  • Short summary: The Navigation Popups gadget is the most-used non-default gadget on the English Wikipedia (nearly 49,000 users) and is widely used on other language editions. The original code was written in 2007 or earlier, and most of it is still used today, including a naive wikitext-to-HTML parser and summarizer module. Both are rather buggy; the first aim of this project is to replace both with simple API calls (T138803 and its parent tasks). This would resolve almost every preview rendering bug in the gadget, make future extensions to the preview function easier, and create a much improved user experience. Second, while working on the API calls, I will rewrite the functions that the gadget uses to handle API results (T178670) to make them more efficient and extensible. Finally, as a stretch goal if I have time left over, I will explore ways to add Navigation Popups functionality to the Page Previews feature.
  • Possible Mentor(s): I have reached out to multiple WMF teams, including the Reading and Citoid teams. For code review, my patches will be handled by the existing system of edit requests on the gadget talk page, so that's already provided for.
  • Have you contacted your mentors already? I've been in touch with TheDJ, who'll probably be reviewing most of the code, and I plan to get in touch with my mentors before April starts.

Deliverables

When | What
April 23 - May 14 | Community bonding period. Contacting current users of the gadget, and identifying some of the biggest issues with the gadget from the perspective of the users. I've already read the archives of the gadget talk page back to ~2008 for the purpose of making them into Phab tasks, so this would add to my existing knowledge about the gadget.
May 14 - May 20 | Rereading the source code and becoming more familiar with the architecture of the gadget. Finishing my existing work of dividing the gadget into a number of source files, to reduce its ~7K lines to something more manageable. Writing up some of the most convoluted function call chains, such as the process of building API calls and parsing their results into HTML.
May 21 - May 27 | Determining which test framework to use for Popups. Starting to write test cases to ensure the architecture changes don't break the gadget, achieving at least 50% coverage of functions by the end of the week. Setting up Travis CI or some other online test runner to ensure no regressions.
May 28 - June 3 | Finishing code-based test cases. Achieving full coverage of the functionality that I'll be working on. Community discussions to figure out common workflows, so that I don't accidentally mess with those. Planning out removal of the caching layer, which is the first blocker for removing the core custom parser. Beginning to remove the caching layer, if time permits.
June 4 - June 10 | Finishing removal of the caching layer, and evaluating the resulting impact on performance. Planning out removal of the custom parser.
June 11 - June 15 | Phase 1 evaluation. Deliverables: full test suite for the gadget; finished (incl. fully tested) version of the gadget without the caching layer, which will make progress for the rest of the project faster.
June 16 - June 24 | Removing the custom parser, week 1: writing a replacement. Evaluating how to best transform the action=parse output from the API to produce a maximally readable summary. Beginning to create an alternate build of the gadget that uses the new API call.
June 25 - July 2 | Removing the custom parser, week 2: verifying the replacement. Finishing the alternate build. Testing that the new summarizer performs as well as, or better than, the existing summarizer.
July 3 - July 9 | Removing the custom parser, week 3: doing the actual replacement. Receiving community feedback, and modifying the gadget's behavior accordingly. Dealing with the inevitable massive number of edge cases.
July 9 - July 13 | Phase 2 evaluation. Deliverables: a finished (fully tested) version of the gadget with a new action=parse based summary view.
July 14 - July 22 | API calls/etc fixing, week 1: coding. Planning and coding more usable utility functions for API calls. More technical debt work, including function-level documentation and call-path documentation if I have time.
July 23 - July 29 | API calls/etc fixing, week 2: testing. In addition to making sure that all new changes to the gadget are fully tested, making sure that all community feedback is reasonably resolved.
July 30 - August 6 | Feature freeze. Further documentation and community discussion, including an announcement (to the broader community - of course I'll have been testing with interested community members for the entire project). All remaining bug fixes.
August 6 - August 14 | Final week - evaluations and final product. Deliverables: a finished and deployed gadget with as much technical debt paid down as possible.

Participation

I will work on this project in a GitHub repository. It will be frequently deployed to a "dev" version on testwiki and the English Wikipedia, so that it can be tested in the environment where people will use it. I will be online on IRC while I'm working on the project, which will be during the working hours I mentioned above (10 AM to 6 PM EDT, and later). I will keep using the Phabricator project that I started to manage bugs and improvements. I will also be available through the email feature on my Wikipedia account and the talk page of that account. I will post detailed weekly progress reports, probably to my Wikipedia userspace.

About Me

I am a current university student on the east coast of the US. I plan to finish my undergraduate degree (computer science) in spring 2020. This will be the first time I'm participating in GSoC, and I'm very excited about the opportunity to focus intensely on an important tool used by the community. I heard about GSoC through a list of open-source projects a few years ago, but I wasn't eligible at that point. I have no other summer work or jobs. I may have a family vacation at the very end of the summer and I'll update everyone if that happens, but I fully expect to be able to finish up everything before then. I am not eligible for Outreachy.

I've been a Popups user for five years, and I think it's absolutely one of the most useful gadgets we have. I've contributed some patches to this gadget (described below), but it's been too complicated for me to implement fixes that would take longer than a weekend. I've heard this sentiment from another editor who works on the gadget a lot too - the gadget is simply too complicated for the architectural changes to be made without some dedicated block of time. That's why I'm looking forward to being able to commit a lot of effort to understanding and fixing this gadget. If I can make this project happen, it would make this wonderful gadget even more useful.

Past Experience

I have submitted 16 patches (as edit requests) to this gadget, all of which were accepted. The first one was in April 2016, and the requests and discussion for all of them can be seen on the talk page for the gadget code. My patches have ranged from implementing feature requests from users (my favorite: an icon to show the gender a user has set in their preferences, which I got a technical barnstar for), to cleaning up the code (over the course of a month, I went through the entire codebase to make it pass a JavaScript linter). I also wrote a brief technical writeup last year to help future developers. On the bug-tracking side, I started a Phabricator project, Navigation-Popups-Gadget, to track progress on responding to feature requests and bug reports, then went back through talk page archives to add an initial backlog. Since then, I've opened 21 tasks in the project.

I have extensive experience with other user scripts and gadgets. The second biggest gadget I work on is the main helper gadget for the Articles for Creation WikiProject; I'm its current maintainer, and I've been using it for about as long as I've been using the Popups gadget. I've also written 10 user scripts of varying complexity, listed here. I'm most proud of reply-link, which adds a link after each signature on a talk page to allow users to reply to each comment using a form without going through the edit page. It's the most complex script I've written, and although it's still in beta, I got very positive feedback when I demonstrated it at the last Wikipedia Day meetup. My most popular user script is delsort, which is used by 105 editors to quickly categorize Articles for Deletion discussions by topic.

Beyond my JavaScript work, I have been editing Wikipedia for nearly six years. In that time, I've made almost 25,000 edits, written seven non-stub articles, built 12 Toolforge tools (which I've put just as much effort into as the user scripts), and run nine BAG-approved bot tasks. I've also gotten seven technical barnstars; one, as I mentioned above, was for my work on this gadget.

Other open-source projects I've contributed to (even though the vast majority of my work is on Wikipedia) include http://childrenofur.com/, a community-driven spinoff of Glitch (a browser-based MMO that its developers shut down), and the Pywikibot library.

Any Other Info

Let me know if you have any questions! I've included an overview of both the existing code and my replacement plan so you can better understand the project, as well as a description of the stretch goal I'll attempt if I have time left over.

Overview of existing code

At a high level, the gadget executes the following steps when the user hovers over a link (condensed from the technical summary I wrote last year):

  1. Figure out what sort of page the link points to: article? user page? history page?
  2. Look up which API calls need to be made based on the page type: for example, for any page, we need the history; for a user, we additionally need the edit count; and for an image, we need its metadata.
  3. Perform the API calls and hand off the results to dedicated parsing functions that turn each result into HTML.
  4. Assemble the HTML fragments into a visible widget, and display it.
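
The four steps above can be pictured with a short sketch. This is only an illustration under simplifying assumptions: every name in it (classifyTitle, QUERIES_BY_TYPE, RENDERERS, buildPopupHtml, and the result fields) is hypothetical and not taken from the gadget's source, and the page-type check is reduced to a namespace prefix test.

```javascript
// Hypothetical sketch; none of these names come from the gadget's source.

// Escape text before placing it into HTML fragments.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Step 1: classify the link target (simplified to a namespace-prefix check).
function classifyTitle(title) {
  if (/^User:/i.test(title)) return "user";
  if (/^(File|Image):/i.test(title)) return "image";
  return "article";
}

// Step 2: which queries each page type needs. Every type needs the history.
const QUERIES_BY_TYPE = {
  article: ["history"],
  user: ["history", "editcount"],
  image: ["history", "metadata"],
};

// Step 3: one renderer per query turns a raw result into an HTML fragment.
// The result shapes (lastEditor, count, width, height) are assumptions.
const RENDERERS = {
  history: (r) => `<p>Last edited by ${escapeHtml(r.lastEditor)}</p>`,
  editcount: (r) => `<p>${escapeHtml(r.count)} edits</p>`,
  metadata: (r) => `<p>${escapeHtml(r.width)}x${escapeHtml(r.height)} px</p>`,
};

// Steps 3 and 4: fetch every result, render each, and join into one widget body.
// `fetchQuery(name, title)` is an injected function returning a promise.
async function buildPopupHtml(title, fetchQuery) {
  const names = QUERIES_BY_TYPE[classifyTitle(title)];
  const results = await Promise.all(names.map((name) => fetchQuery(name, title)));
  return names.map((name, i) => RENDERERS[name](results[i])).join("");
}
```

Keeping the type-to-query mapping in a plain table, separate from the renderers, is one way to make it easy to add a new page type or a new API call without touching the rest of the pipeline.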

The most visible part of the popups UI is the article preview. It's generated using two modules named Previewmaker and Insta, responsible for generating previews and parsing page wikitext into HTML respectively. Previewmaker receives the page text from an API call, removes certain markup chunks using fragile regular expressions (for the adventurous, Previewmaker.prototype.makePreview() at around line 3420), and hands the result off to Insta to be parsed into HTML, again using a number of regular expressions. According to comments in the gadget, Insta (properly "InstaView") is actually a library written between 2005 and 2006 that was copy-and-pasted into the gadget codebase.

Although this architecture works well in many easy cases, it is defeated by even slightly tricky markup, such as an apostrophe inside italic text (T139902; ''Hybrids''' makes the rest of the preview italicized). This results in frequent visible rendering bugs that degrade the user experience. The Previewmaker module also has many bugs: its regular expressions are often too permissive or too restrictive, leaving unexplained gaps in the preview or including text that should have been removed. The functions involved are hard to maintain, which makes fixes for bugs like T187064 (ignoring the bad image list) difficult. Worse, none of the functions have error reporting or tests; as I was writing this, for example, I noticed that the gadget refused to show a preview for the "United Kingdom" article without logging any error or other message to the console.

Another area of the code that could be improved is the design of internal interfaces, specifically the internal API call functions. They're structured using a module called Downloader, whose abstraction needs to be updated, made more usable, and brought in line with modern gadget-writing practices. I came to this conclusion while trying to add additional API calls to the gadget, which will definitely be necessary in the future. Furthermore, it's important to accomplish tasks using standard architectures, since the ultimate end goal is to integrate this gadget with the Hovercards extension.
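
As a rough illustration of what a more usable internal interface could look like, here is a minimal sketch of a promise-based helper in which one API call depends on the result of another. It does not depict the Downloader module. The function name editCountOfLastEditor and the injected apiGet(params) function are hypothetical; in a gadget, apiGet could plausibly wrap mw.Api, but that is an assumption. The request parameters are standard action=query parameters, and the response shapes assume formatversion=2.

```javascript
// Hypothetical helper; `apiGet(params)` is an injected function that performs
// one API request and returns a promise for the decoded JSON response.
async function editCountOfLastEditor(apiGet, title) {
  // First call: who made the most recent revision of the page?
  const revs = await apiGet({
    action: "query",
    prop: "revisions",
    titles: title,
    rvprop: "user",
    rvlimit: 1,
    formatversion: 2,
  });
  const user = revs.query.pages[0].revisions[0].user;

  // Second call depends on the first result: that user's edit count.
  const users = await apiGet({
    action: "query",
    list: "users",
    ususers: user,
    usprop: "editcount",
    formatversion: 2,
  });
  return { user, editcount: users.query.users[0].editcount };
}
```

Because the transport is injected, the same function can be exercised in tests with a fake apiGet that returns canned responses.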

Detailed replacement plan

The Previewmaker and Insta modules together duplicate work that is already done by any of the several API calls that return the HTML of a page. Furthermore, both are tightly integrated into the gadget's code. Therefore, my first step (after writing tests) will be to try to create new preview and parser modules that can be swapped in to replace the existing ones. Since we will now be using an API call that returns HTML, the parser module will be trivial. The preview module will depend on which API call I use. If I use the Page Content Service, which is what Page Previews uses, I don't have to do anything, as we would already be receiving summaries in HTML form. However, if I use an endpoint that returns the full page HTML, I will first need to trim that HTML down to a summary, adapting the algorithm that currently works on wikitext; porting the algorithm to HTML might be difficult. Alternatively, if I keep the existing summarizing code and use Parsoid to turn the summarized wikitext into HTML, the resulting solution wouldn't perform as well because we would need two API calls. To summarize the options in a table:

Strategy | Pros | Cons
Use the Page Content Service | Gadget code becomes trivial; no processing of API results necessary | API is not open to the public at this time
Use action=parse and a new summarizer | HTML guaranteed correct; straightforward solution | Building a summarizer algorithm for HTML might be difficult
Use Parsoid and the existing summarizer | Gadget code will not require much modification | Bugs from existing summarizer will need to be fixed; latency of two API calls with some processing in between; architecture of API call dispatch must be modified to support API calls depending on other API results

I will decide on which option to pick based on consultation with my mentor. I'm leaning towards the middle option because it's architecturally simple, although if the Page Content Service becomes open I would strongly prefer that option.
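
For the middle option, the request and response handling might look roughly like the sketch below. The function names buildParseParams and extractParsedHtml are hypothetical. Limiting the request to section 0 (the lead section) is an assumption about how the output could be kept small, not a decision made in this proposal. The response shapes assume formatversion=2, where the rendered HTML arrives as a string in parse.text. The summarizing step itself is not shown.

```javascript
// Parameters for an action=parse request covering only the lead section.
// Restricting to section 0 is an assumption made for this sketch.
function buildParseParams(title) {
  return new URLSearchParams({
    action: "parse",
    page: title,
    prop: "text",
    section: "0",
    format: "json",
    formatversion: "2",
  });
}

// Pull the rendered HTML out of a decoded response, failing loudly
// instead of silently showing no preview.
function extractParsedHtml(response) {
  if (response && response.error) {
    throw new Error(`API error ${response.error.code}: ${response.error.info}`);
  }
  if (!response || !response.parse || typeof response.parse.text !== "string") {
    throw new Error("Unexpected action=parse response shape");
  }
  return response.parse.text;
}
```

Throwing a descriptive error here means that a failed preview would at least leave a message in the console.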

Stretch goal: Page Previews

It would be very exciting if I had enough time left over to make progress towards T109796. There is already a lot of foundational work that must happen on the Navigation Popups side first: specifically, functions must be written to use modern (asynchronous) programming patterns. Furthermore, we must make the menu configuration system in Popups expressible within the Page Previews codebase. This will require me to learn more about Page Previews internals, but I have confidence that I will be able to find a mentor for hacking on that extension before I start. In addition to the foundational work, I might be able to make progress on the "page metadata" subsystem that displays context-sensitive information about a page at the bottom of the widget, such as edit count for a link to a user page. The Popups code already provides the mapping from page type to API calls, as I mentioned earlier. One possibility is to implement both page metadata and action menus separately, and then make them separate options in the Page Previews preferences pane:

[Image: Screenshot-2018-3-27 Wikipedia-edit.png]

Either way, I may not be able to dedicate that much time to the goal of feature parity between Page Previews and Navigation Popups during the project, but I would definitely continue working on it after the conclusion of the project.

Event Timeline

Enterprisey renamed this task from "Replace outdated wikitext parser and pay off technical debt in Navigation Popups" to "GSoC Proposal: Replace outdated wikitext parser and pay off technical debt in Navigation Popups". (Mar 12 2018, 10:27 AM)
Aklapper renamed this task from "GSoC Proposal: Replace outdated wikitext parser and pay off technical debt in Navigation Popups" to "GSoC Proposal: Replace outdated wikitext parser and pay off technical debt in Navigation Popups gadget". (Mar 12 2018, 11:37 AM)

@Aklapper, yeah, sadly there's no overlap between the gadget's edit history and the 2018 GSoC mentor list. If @TheDJ were interested, he would be the best mentor for this.

There were some older discussions about integrating the features that are currently only in Navpopups, into an "Advanced Hovercards" mode of that extension, so that more wikis could benefit and more code could be reused (and probably other benefits I'm forgetting).
Is that a direction you might be interested in going in?
The architecture of the extension was originally designed to enable that kind of expansion, though it has been overhauled since then so someone would need to check on the current situation.
Also ping @Nihiltres who was previously interested in this topic.

I'm not as experienced with extensions as I am with gadgets, so I don't think I would be the right person to do that. I was thinking about learning about extensions in time for next year's proposal round, though.

Actually, I'd like to revise my previous comment - @Quiddity, I (incorrectly) thought that most of the code for Hovercards was PHP, which I have relatively little experience in. Now that I see that it's in JavaScript, I would definitely be open to working on that Hovercards enhancement.

If the WMF is more likely to accept a proposal about Hovercards (new, current engineering practices, in-house) than Navigation Popups (old, not very reliable, community-built over a decade ago) - for the great reason that the resulting work is portable to more wikis - then I would definitely be able to rework the existing proposal so that it would center around that. I see that there are a bunch more mentors listed for extensions, so I think that would be a good idea. Thoughts?

In T189456#4042337, @APerson wrote:

Possible mentor pings: @Bawolff @siebrand @Zppix

In T189456#4044225, @APerson wrote:

@Aklapper, yeah, sadly there's no overlap between the gadget's edit history and the 2018 GSoC mentor list. If @TheDJ were interested, he would be the best mentor for this.

Wait, does that mean im on a mentor list somewhere?

Wait, does that mean im on a mentor list somewhere?

You're actually first on the list, lol - it's the page recommended for finding mentors for Outreachy and GSoC

Enterprisey renamed this task from "GSoC Proposal: Replace outdated wikitext parser and pay off technical debt in Navigation Popups gadget" to "GSoC 2018 Proposal: Replace outdated wikitext parser and pay off technical debt in Navigation Popups gadget". (Mar 13 2018, 3:08 AM)

Yeah, so i actually have very conflicting thoughts about moving all the navigation popups functionality into an advanced mode of Hovercards.

  1. It's awesome and we could rewrite everything
  2. It would lead to huge discussions about what functionality to drop and keep (because it is so much) and I fear in the end most editors would disagree with whatever we decide and keep to navpopups.

I mean disambiguation and redirect fixing with auto edits, popups on links in the editor... is that really something we would be fine with adding to Page-Previews ???? I wouldn't.

My other fear is.. It took 4 years to get Page-Previews to production quality... And now we are going to add 20-30 times the functionality in a GSoC project ?? Seems stretching it to me...
Anyway, I've not looked into it deep enough to be sure which way to go. But my gut feeling says, patch up the gadget first to fix the renderer and to keep it alive for another 3 years or so and then work on followup items for Page-Previews

Note: While I would love to support this project as a Mentor, I can simply make 0 promises as to my availability.

@TheDJ Awesome - if there were other people willing to mentor on the community outreach/integration side of this, would you be okay with mentoring the parts strictly limited to the gadget code (assuming you're going to be available occasionally to check in on the code work)?

Enterprisey renamed this task from "GSoC 2018 Proposal: Replace outdated wikitext parser and pay off technical debt in Navigation Popups gadget" to "GSoC 2018 Proposal: Pay off technical debt and increase usability for Navigation Popups gadget". (Mar 26 2018, 4:11 PM)
Enterprisey updated the task description.
Enterprisey renamed this task from "GSoC 2018 Proposal: Pay off technical debt and increase usability for Navigation Popups gadget" to "GSoC 2018 Proposal: Pay off technical debt and increase usability of Navigation Popups gadget". (Mar 26 2018, 10:39 PM)

Leaving a note here about a demo that @Nirzar and @Jhernandez presented at the recent Hackathon, which uses the Page Previews APIs to build navigation popups outside MediaWiki and jQuery. Might be insightful for this project.

https://chimeces.com/context-cards/