[epic] Wikipedia.org Portal: automating updates for the portal
Closed, Resolved · Public

Description

Email sent on July 11, 2016 to gather thoughts:

Within the Discovery Portal team we have a large, and potentially difficult, goal of being able to automatically update the statistics (number of articles and pageviews for each language) on the www.wikipedia.org portal page.

This email showcases what we've brainstormed so far. We'd like your feedback and, if you have time and want to help out, your participation. If you know of someone who isn't on this email and could or would want to help, please forward it to them.

Background of existing manual process:
The wikipedia.org portal is a static HTML page that is compiled during development using several build scripts. The current process for updating these stats starts with a developer, working on their local machine, running the following about every two weeks:

  • a script to pull down new pageview/article-count stats from various API endpoints
  • a script to merge those stats with text translations and feed the combined data into Handlebars templates
  • a script to compile the templates to create the final HTML page
  • git commit and merge to get the new HTML page (and updated support files) into the repo
  • then, a deployment script is run during SWAT to push it all to production
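
As a rough illustration of the first three steps above (pull stats, merge with translations, compile the template), here is a minimal sketch in Python. It is not the portal's actual build code: the real scripts use Handlebars, whereas this uses `string.Template` as a stand-in, and `fetch_stats`, `merge`, `render`, `build_page`, the sample numbers and the HTML shape are all hypothetical names and values chosen for the example.

```python
from string import Template

# Hypothetical stand-ins for the Handlebars templates; the real portal
# templates are not shown in this task.
ROW_TEMPLATE = Template('<li lang="$code">$name: $articles articles</li>')
PAGE_TEMPLATE = Template("<ul>\n$rows\n</ul>")


def fetch_stats():
    """Step 1: stand-in for the API calls; returns made-up sample numbers."""
    return {
        "en": {"articles": 5000000, "pageviews": 900},
        "de": {"articles": 2000000, "pageviews": 300},
    }


def merge(stats, translations):
    """Step 2: join per-language stats with translated language names."""
    merged = []
    for code, values in stats.items():
        # Fall back to the language code when no translation is available.
        merged.append({"code": code, "name": translations.get(code, code), **values})
    # Sorting by language code is an arbitrary choice to keep output stable.
    return sorted(merged, key=lambda row: row["code"])


def render(rows):
    """Step 3: compile the merged data into the final static HTML."""
    items = [
        ROW_TEMPLATE.substitute(
            code=row["code"], name=row["name"], articles=f"{row['articles']:,}"
        )
        for row in rows
    ]
    return PAGE_TEMPLATE.substitute(rows="\n".join(items))


def build_page(translations):
    """Run the three steps in order and return the HTML string."""
    return render(merge(fetch_stats(), translations))
```

Calling `build_page({"en": "English", "de": "Deutsch"})` returns the HTML as a string. The commit and deploy steps are left out on purpose, since they depend on repository and SWAT details not described here.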

Brainstorm ideas on how to automate this process:

  1. could the portal page use dynamic code to pull stats directly from some static data page that volunteers can freely edit?
    • maybe PHP, JavaScript, or Lua
    • would need to have caching set up to avoid having code executed with every request
  2. could the portal page make a REST call to a service that would return stats info?
    • the service could parse a volunteer-edited static data page
  3. could a background server process
    • 1) detect changes to the git repo, or to static stats data files/pages, and
    • 2) compile the template with new data, and
    • 3) deploy the resulting static HTML?
    • ideally we could automate all 3 parts (detect/compile/deploy), but automating just steps 1 or 2 would also be helpful
  4. could the deployment script merge the page code (from git) with static stats data?
  5. if all coding happened on a separate (non-master) branch, could an automated script push the stats into the master branch, and
    • then could the master branch be deployed automatically by a cron job?
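
The detect/compile part of the background-process idea above could look roughly like the following Python sketch. It assumes the stats are available as a JSON-serializable dictionary and detects changes by comparing a SHA-256 fingerprint against the one from the previous run. The names `fingerprint`, `rebuild_if_changed` and `compile_page` are hypothetical, and the deploy step is deliberately left out.

```python
import hashlib
import json


def fingerprint(stats):
    """Return a stable hash of the stats data.

    Keys are sorted before hashing, so key order does not affect the result.
    """
    canonical = json.dumps(stats, sort_keys=True).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()


def rebuild_if_changed(stats, last_fingerprint, compile_page):
    """Recompile the page only when the stats differ from the last run.

    Returns (new_fingerprint, html); html is None when nothing changed.
    compile_page is a caller-supplied function (hypothetical here) that
    turns the stats into the final HTML string.
    """
    current = fingerprint(stats)
    if current == last_fingerprint:
        return current, None
    return current, compile_page(stats)
```

A cron job or similar scheduler could call `rebuild_if_changed` periodically, store the returned fingerprint between runs, and hand any non-`None` HTML to whatever commit or deploy step is chosen.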

Note: while it would be nice to automagically merge/deploy new text translations, that is not the focus for this particular email thread.

We'd love to hear your thoughts!

Related Objects

Status    Assigned    Task
Resolved  debt
Resolved  debt
Resolved  Jdrewniak
Resolved  debt
Resolved  debt
Resolved  JGirault
Resolved  JGirault
Resolved  debt
Resolved  JGirault
Declined  None
Resolved  debt
Resolved  debt
Declined  None
Resolved  debt
Resolved  debt
Declined  Jdrewniak
Resolved  debt
Resolved  debt
Resolved  debt
Declined  Jdrewniak
Resolved  debt
Resolved  debt
Resolved  Jdrewniak
Resolved  hashar
Resolved  RobH
Resolved  RobH
Open      None
Resolved  debt
Open      Jdrewniak
Resolved  Jdrewniak
debt created this task. · Jul 12 2016, 8:46 PM
Restricted Application added subscribers: Zppix, Aklapper. · Jul 12 2016, 8:46 PM
bd808 added a comment. · Jun 2 2017, 12:42 AM

I haven't seen the scripts, but it sounds like the steps up to and including the Gerrit submission could be scripted and automated with either a Jenkins job or from a Labs/Tool Labs account. Automating the actual deploy is much more complicated and probably best to ignore. If everything were prepped, it would probably be pretty easy to convince the Release Engineering team to add merging and deploying the mw-config submodule bump to their weekly deploy train process.

debt triaged this task as High priority. · Oct 3 2017, 3:41 PM
debt edited projects, added Discovery-Portal-Sprint; removed Discovery-Portal-Backlog.
debt renamed this task from Wikipedia.org Portal: automating the statistics on the portal to [epic] Wikipedia.org Portal: automating the statistics on the portal.
debt added a project: Epic.
debt renamed this task from [epic] Wikipedia.org Portal: automating the statistics on the portal to [epic] Wikipedia.org Portal: automating updates for the portal. · Oct 3 2017, 10:56 PM
Krinkle added a subscriber: Krinkle.
Deskana removed a subscriber: Deskana. · Mon, Nov 20, 1:52 PM
debt moved this task from Backlog to Done on the Discovery-Portal-Sprint board. · Tue, Dec 12, 4:20 PM
debt closed this task as Resolved. · Tue, Dec 12, 5:59 PM
debt claimed this task.

This task is effectively done and closed - there are two minor tasks that need to be finished up: T180777 and T181799.

Going forward, as part of the automation process, two tickets will continue to be updated for the weekly stats (T128546) and translations (T142582) updates.

More information on how the automation is done can be found in the documentation on GitHub.

🎁 🎉