Name: Daniel Glus
IRC nickname on Freenode: enterprisey
Web Profile: https://en.wikipedia.org/wiki/User:Enterprisey & https://github.com/enterprisey
Resume (optional): Will be appended to final proposal PDF
Location (country or state): USA
Typical working hours (include your timezone): I am in the EDT timezone, and will work from 10 AM to 6 PM or later on weekdays and most weekends
- Short summary: The Navigation Popups gadget is the most-used non-default gadget on the English Wikipedia (nearly 49,000 users) and is widely used on other language editions. The original code was written in 2007 or earlier, and most of it is still used today, including a naive wikitext-to-HTML parser and summarizer module. Both are rather buggy; the first aim of this project is to replace both with simple API calls (T138803 and its parent tasks). This would resolve almost every preview rendering bug in the gadget, make future extensions to the preview function easier, and create a much improved user experience. Second, while working on the API calls, I will rewrite the functions that the gadget uses to handle API results (T178670) to make them more efficient and extensible. Finally, as a stretch goal if I have time left over, I will explore ways to add Navigation Popups functionality to the Page Previews feature.
- Possible Mentor(s): I have reached out to multiple WMF teams, including the Reading and Citoid teams. For code review, my patches will be handled by the existing system of edit requests on the gadget talk page, so that's already provided for.
- Have you contacted your mentors already? I've been in touch with TheDJ, who'll probably be reviewing most of the code, and I plan to get in touch with my mentors before April starts.
|April 23 - May 14||Community bonding period. Contacting current users of the gadget, and identifying some of the biggest issues with the gadget from the perspective of the users. I've already read the archives of the gadget talk page back to ~2008 for the purpose of making them into Phab tasks, so this would add to my existing knowledge about the gadget.|
|May 14 - May 20||Rereading the source code and becoming more familiar with the architecture of the gadget. Finishing my existing work of dividing the gadget into a number of source files, to reduce its ~7K lines to something more manageable. Writing up some of the most convoluted function call chains, such as the process of building API calls and parsing their results into HTML.|
|May 21 - May 27||Starting to write test cases to ensure the architecture changes don't break the gadget, achieving at least 50% coverage of functions at the end of the week. Determining which test framework to use for Popups. Setting up Travis CI or some other online test runner to ensure no regressions.|
|May 28 - June 3||Finishing code-based test cases. Achieving full coverage of the functionality that I'll be working on. Community discussions to figure out common workflows, so that I don't accidentally mess with those. Planning out removal of the caching layer, which is the first blocker for removing the core custom parser. Beginning to remove the caching layer, if time permits.|
|June 4 - June 10||Finishing removal of the caching layer, and evaluating the resulting impact on performance. Planning out removal of the custom parser.|
|June 11 - June 15||Phase 1 evaluation. Deliverables: full test suite for the gadget; finished (incl. fully tested) version of the gadget without the caching layer, which will make progress for the rest of the project faster.|
|June 16 - June 24||Removing the custom parser, week 1: writing a replacement. Evaluating how to best transform the action=parse output from the API to produce a maximally readable summary. Beginning to create an alternate build of the gadget that uses the new API call.|
|June 25 - July 2||Removing the custom parser, week 2: verifying the replacement. Finishing the alternate build. Testing that the new summarizer performs as well as, or better than, the existing summarizer.|
|July 3 - July 9||Removing the custom parser, week 3: doing the actual replacement. Receiving community feedback, and modifying the gadget's behavior accordingly. Dealing with the inevitably large number of edge cases.|
|July 9 - July 13||Phase 2 evaluation. Deliverables: a finished (fully tested) version of the gadget with a new action=parse based summary view.|
|July 14 - July 22||API calls/etc fixing, week 1: coding. Planning and coding more usable utility functions for API calls. More technical debt work, including function-level documentation and call-path documentation if I have time.|
|July 23 - July 29||API calls/etc fixing, week 2: testing. In addition to making sure that all new changes to the gadget are fully tested, making sure that all community feedback is reasonably resolved.|
|July 30 - August 6||Feature freeze. Further documentation and community discussion, including an announcement (to the broader community - of course I'll have been testing with interested community members for the entire project). All remaining bug fixes.|
|August 6 - August 14||Final week - evaluations and final product. Deliverables: a finished and deployed gadget with as much technical debt paid down as possible.|
I will work on this project in a GitHub repository. It will be frequently deployed to a "dev" version on testwiki and the English Wikipedia, so that it can be tested in the environment where people will use it. I will be online on IRC while I'm working on the project, which will be during the working hours I mentioned above (10 AM to 6 PM or later, EDT). I will keep using the Phabricator project that I started to manage bugs and improvements. I will also be available through the email feature on my Wikipedia account and the talk page of that account. I will post detailed weekly progress reports, probably to my Wikipedia userspace.
I am a current university student on the east coast of the US. I plan to finish my undergraduate degree (computer science) in spring 2020. This will be the first time I'm participating in GSoC, and I'm very excited about the opportunity to focus intensely on an important tool used by the community. I heard about GSoC through a list of open-source projects a few years ago, but I wasn't eligible at that point. I have no other summer work or jobs. I may have a family vacation at the very end of the summer and I'll update everyone if that happens, but I fully expect to be able to finish up everything before then. I am not eligible for Outreachy.
I've been a Popups user for five years, and I think it's absolutely one of the most useful gadgets we have. I've contributed some patches to this gadget (described below), but it's been too complicated for me to implement fixes that would take longer than a weekend. I've heard this sentiment from another editor who works on the gadget a lot too - the gadget is simply too complicated for the architectural changes to be made without some dedicated block of time. That's why I'm looking forward to being able to commit a lot of effort to understanding and fixing this gadget. If I can make this project happen, it would make this wonderful gadget even more useful.
I have extensive experience with other user scripts and gadgets. The second biggest gadget I work on is the main helper gadget for the Articles for Creation WikiProject; I'm its current maintainer, and I've been using it for about as long as I've been using the Popups gadget. I've also written 10 user scripts of varying complexity, listed here. I'm most proud of reply-link, which adds a link after each signature on a talk page to allow users to reply to each comment using a form without going through the edit page. It's the most complex script I've written, and although it's still in beta, I got very positive feedback when I demonstrated it at the last Wikipedia Day meetup. My most popular user script is delsort, which is used by 105 editors to quickly categorize Articles for Deletion discussions by topic.
Other open-source projects I've contributed to (even though the vast majority of my work is on Wikipedia) include http://childrenofur.com/, a community-driven spinoff of Glitch (a browser-based MMO that its developers shut down), and the Pywikibot library.
Any Other Info
Let me know if you have any questions! I've included an overview of both the existing code and my replacement plan so you can better understand the project, as well as a description of the stretch goal I'll attempt if I have time left over.
Overview of existing code
At a high level, the gadget executes the following steps when the user hovers over a link (condensed from the technical summary I wrote last year):
- Figure out what sort of page the link points to: article? user page? history page?
- Look up which API calls need to be made based on the page type: for example, for any page, we need the history; for a user, we additionally need the edit count; and for an image, we need its metadata.
- Perform the API calls and hand off the results to dedicated parsing functions that turn each result into HTML.
- Assemble the HTML fragments into a visible widget, and display it.
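The four steps above can be sketched roughly as follows. This is only an illustration of the flow: every name here (`classifyLink`, `CALLS_BY_TYPE`, `RENDERERS`, `buildPopup`) and every result shape is hypothetical and not taken from the gadget's source, and the API results are passed in directly rather than fetched.

```javascript
// Hypothetical names throughout; none of these come from the gadget's source.

// Step 1: classify the link target from its title.
function classifyLink(title) {
  if (/^User:/.test(title)) return "user";
  if (/^(File|Image):/.test(title)) return "image";
  return "article";
}

// Step 2: which data each page type needs (every type needs the history).
const CALLS_BY_TYPE = {
  article: ["history"],
  user: ["history", "editcount"],
  image: ["history", "metadata"],
};

function planCalls(title) {
  return CALLS_BY_TYPE[classifyLink(title)];
}

// Step 3: one renderer per kind of result, each returning an HTML fragment.
// (A real implementation would escape these values before building HTML.)
const RENDERERS = {
  history: (r) => "<p>Last edited by " + r.lastEditor + "</p>",
  editcount: (r) => "<p>" + r.count + " edits</p>",
  metadata: (r) => "<p>" + r.width + "x" + r.height + " px</p>",
};

// Step 4: assemble the fragments into one widget.
function buildPopup(title, results) {
  const fragments = planCalls(title).map((kind) => RENDERERS[kind](results[kind]));
  return '<div class="popup">' + fragments.join("") + "</div>";
}
```

For a user page, `planCalls("User:Example")` yields both the history and the edit count, so `buildPopup` emits two fragments inside the wrapper.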
The most visible part of the popups UI is the article preview. It's generated using two modules named Previewmaker and Insta, responsible for generating previews and parsing page wikitext into HTML respectively. Previewmaker receives the page text from an API call, removes certain markup chunks using fragile regular expressions (for the adventurous, Previewmaker.prototype.makePreview() at around line 3420), and hands the result off to Insta to be parsed into HTML, again using a number of regular expressions. According to comments in the gadget, Insta (properly "InstaView") is actually a library written between 2005 and 2006 that was copy-and-pasted into the gadget codebase.
Although this architecture works well in many easy cases, it is defeated by even slightly tricky markup, such as an apostrophe inside italic text (T139902; ''Hybrids''' makes the rest of the preview italicized). This results in frequent visible rendering bugs that degrade the user experience. The Previewmaker module also has many bugs: the regular expressions are often too permissive or too restrictive, leading to random gaps in the preview or text that should not be included. The functions that do this are hard to maintain, making fixes for bugs like T187064 (ignoring the bad image list) difficult. Worse, none of the functions have error reporting or tests; as I was writing this, for example, I noticed that the gadget refused to show a preview for the "United Kingdom" article while giving no console error messages or other logging messages.
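To show how this class of bug can arise, here is a deliberately naive quote handler. It is not Insta's actual code, just a hypothetical two-pass regex approach of the same general kind: every `'''` toggles bold, then every remaining `''` toggles italics.

```javascript
// Hypothetical naive parser, written only to reproduce the symptom.
function naiveInline(wikitext) {
  let boldOpen = false;
  let italicOpen = false;
  // Pass 1: every ''' toggles bold.
  let html = wikitext.replace(/'''/g, () => {
    boldOpen = !boldOpen;
    return boldOpen ? "<b>" : "</b>";
  });
  // Pass 2: every remaining '' toggles italics.
  html = html.replace(/''/g, () => {
    italicOpen = !italicOpen;
    return italicOpen ? "<i>" : "</i>";
  });
  return html;
}

console.log(naiveInline("''Hybrids''' plot"));
// prints: <i>Hybrids<b> plot
```

The trailing `'''` is consumed as a bold marker, so the opening `<i>` is never closed and everything after it stays italic. As far as I understand it, MediaWiki's own parser avoids this by treating one of the three apostrophes as literal text, which is the kind of special case a server-side parse provides automatically.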
Another area of the code that could be improved is the design of internal interfaces, specifically the internal API call functions. They're structured using a module called Downloader, whose abstraction needs to be updated, made more usable, and brought in line with modern gadget-writing practices. I came to this conclusion while trying to add additional API calls to the gadget, which will definitely be necessary in the future. Furthermore, it's important to accomplish tasks using standard architectures, since the ultimate end goal is to integrate this gadget with the Hovercards extension.
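As a rough idea of what a more usable API utility might look like, here is a minimal promise-based sketch. The name `apiGet`, the `transport` parameter, and the default `apiPath` are assumptions made for illustration; in the gadget itself, MediaWiki's `mw.Api` would probably fill this role. The transport is injected so the helper can be tested without a network.

```javascript
// Hypothetical helper. `transport` is any function that takes a URL and
// returns a promise for the decoded JSON body (for example, a fetch wrapper).
function apiGet(params, transport, apiPath = "/w/api.php") {
  // Defaults come first; caller-supplied params override them.
  const query = new URLSearchParams({ format: "json", formatversion: "2", ...params });
  return transport(apiPath + "?" + query.toString()).then((body) => {
    // The Action API reports failures as { error: { code, info } }.
    if (body && body.error) {
      throw new Error(body.error.code + ": " + body.error.info);
    }
    return body;
  });
}

// Usage with a fake transport that never touches the network:
const fakeTransport = (url) => Promise.resolve({ query: { requested: url } });
apiGet({ action: "query", titles: "Foo" }, fakeTransport).then((body) => {
  console.log(body.query.requested);
});
```

Because the helper returns a promise and turns API-level errors into rejections, callers can chain dependent requests and handle failures in one place instead of threading callbacks through each call site.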
Detailed replacement plan
The Previewmaker and Insta modules are together redundant with one of the many API calls that return the HTML of a page. Furthermore, both are tightly integrated into the gadget's code. Therefore, my first step (after writing tests) will be to try to create new preview and parser modules that can be swapped in to replace the existing ones. Since we will now be using an API call that returns HTML, the parser module will be trivial. The preview module will depend on which API call I use. If I use the Page Content Service, which is what Page Previews uses, I don't have to do anything, as we would already be receiving summaries in HTML form. However, if we get an endpoint that returns just HTML, I will first need to trim down the HTML using the same algorithm that worked on wikitext. Adapting that algorithm to HTML might be difficult. If we use the existing summarizing code together with Parsoid to turn the summarized wikitext into HTML, the resulting solution wouldn't perform as well because we would need two API calls. To summarize the options in a table:
|Option||Advantages||Disadvantages|
|Use the Page Content Service||Gadget code becomes trivial; no processing of API results necessary||API is not open to the public at this time|
|Use action=parse and a new summarizer||HTML guaranteed correct; straightforward solution||Building a summarizer algorithm for HTML might be difficult|
|Use Parsoid and the existing summarizer||Gadget code will not require much modification||Bugs from existing summarizer will need to be fixed; latency of two API calls with some processing in between; architecture of API call dispatch must be modified to support API calls depending on other API results|
I will decide on which option to pick based on consultation with my mentor. I'm leaning towards the middle option because it's architecturally simple, although if the Page Content Service becomes open I would strongly prefer that option.
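For the middle option, the request itself might look something like the sketch below. The function names and the default `apiBase` are hypothetical; the query parameters are standard `action=parse` options, with `section=0` restricting the output to the lead section. The summarizer that would trim the returned HTML is not shown.

```javascript
// Hypothetical helper names; the parameters are standard action=parse options.
function buildParseUrl(title, apiBase = "https://en.wikipedia.org/w/api.php") {
  const params = new URLSearchParams({
    action: "parse",
    page: title,
    prop: "text",            // only the rendered HTML
    section: "0",            // lead section only
    disableeditsection: "1", // omit [edit] links
    disablelimitreport: "1", // omit the parser's limit-report comment
    format: "json",
    formatversion: "2",
    origin: "*",             // permit anonymous cross-origin requests
  });
  return apiBase + "?" + params.toString();
}

// With formatversion=2, the rendered HTML is a plain string at parse.text.
function extractHtml(response) {
  if (!response || !response.parse || typeof response.parse.text !== "string") {
    throw new Error("Unexpected action=parse response");
  }
  return response.parse.text;
}
```

A summarizer would then operate on the string returned by `extractHtml`, for example by keeping only the first few paragraphs and dropping infoboxes and hatnotes.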
Stretch goal: Page Previews
It would be very exciting if I had enough time left over to make progress towards T109796. There is already a lot of foundational work that must happen on the Navigation Popups side first: specifically, functions must be written to use modern (asynchronous) programming patterns. Furthermore, we must make the menu configuration system in Popups expressible within the Page Previews codebase. This will require me to learn more about Page Previews internals, but I have confidence that I will be able to find a mentor for hacking on that extension before I start. In addition to the foundational work, I might be able to make progress on the "page metadata" subsystem that displays context-sensitive information about a page at the bottom of the widget, such as edit count for a link to a user page. The Popups code already provides the mapping from page type to API calls, as I mentioned earlier. One possibility is to implement both page metadata and action menus separately, and then make them separate options in the Page Previews preferences pane.
Either way, I may not be able to dedicate that much time to the goal of feature parity between Page Previews and Navigation Popups during the project, but I would definitely continue working on it after the conclusion of the project.