
Wikimedia Technical Conference 2019 Session: What have we learned when deploying a standalone server side rendering service for the new mobile Wikidata termbox
Closed, ResolvedPublic

Description

Session

  • Track: Deploying and Hosting
  • Topic: What have we learned when deploying a standalone server side rendering service for the new mobile Wikidata termbox

Description

In this session, WMDE will present how we created and successfully deployed the Server-Side Rendering service to the WMF production infrastructure using the Deployment Pipeline: what issues we faced, what was missing for us, and what else we can share with others.

Questions to answer and discuss

Question:
Significance:

Question:
Significance:

Related Issues

  • ...
  • ...

Pre-reading for all Participants

  • [add links here]

Notes document(s)

https://etherpad.wikimedia.org/p/WMTC19-T234642

Notes and Facilitation guidance

https://www.mediawiki.org/wiki/Wikimedia_Technical_Conference/2019/NotesandFacilitation


Session Leader(s)

Session Scribes

  • Chris

Session Facilitator

  • Aubrey

Session Style / Format

  • [what type of format will this session be?]

Session Leaders please:

  • Add more details to this task description.
  • Coordinate any pre-event discussions (here on Phab, IRC, email, hangout, etc).
  • Outline the plan for discussing this topic at the event.
  • Optionally, include what this session will not try to solve.
  • Update this task with summaries of any pre-event discussions.
  • Include ways for people not attending to be involved in discussions before the event and afterwards.

Post-event summary:

  • ...

Post-event action items:

  • ...

Event Timeline

debt triaged this task as Medium priority. Oct 22 2019, 6:57 PM

(Programming note)

This session was accepted and will be scheduled.

Notes to the session leader

  • Please continue to scope this session and post the session's goals and main questions into the task description.
    • If your topic is too big for one session, work with your Program Committee contact to break it down even further.
    • Session descriptions need to be completely finalized by November 1, 2019.
  • Please build your session collaboratively!
    • You should consider breakout groups with report-backs, using posters / post-its to visualize thoughts and themes, or any other collaborative meeting method you like.
    • If you need to have any large group discussions they must be planned out, specific, and focused.
    • A brief summary of your session format will need to go in the associated Phabricator task.
    • Some ideas from the old WMF Team Practices Group.
  • If you have any pre-session suggested reading or any specific ideas that you would like your attendees to think about in advance of your session, please state that explicitly in your session’s task.
    • Please put this at the top of your Phabricator task under the label “Pre-reading for all Participants.”

Notes to those interested in attending this session

(or those wanting to engage before the event because they are not attending)

  • If the session leader is asking for feedback, please engage!
  • Please do any pre-session reading that the leader would like you to do.

Who should attend this session? Is it for developers to learn how to do it? Is it a discussion whether this practice should be adopted more widely?

Etherpad notes from the session

Wikimedia Technical Conference
Atlanta, GA USA
November 12 - 15, 2019

Session Name / Topic
What have we learned when deploying a standalone server side rendering
service for the new mobile Wikidata termbox
Session Leader: Leszek; Facilitator: Aubrey; Scribe: Chris
https://phabricator.wikimedia.org/T234642

Session Attendees
Birgit, Nick, Kaldari, Aubrey, Amir, Tyler, Stephen, Franziska, Monica, Daniel, Antoine, Pablo, Tobi

Notes:

  • Introduction:
    • Hello, I'm the engineering manager at WMDE
    • This is not meant to say this is the best way to do this. Happy to hear challenges to this approach. Finally, "mobile" is in parentheses because the product requirement during development was to have it on mobile. We consciously made it responsive.
  • Termbox
    • A concept on Wikidata. What we call termbox is the part where things are called and labeled and potentially edited.
    • The situation we were facing in 2018 was that the termbox on mobile was very limited in its view
    • when you click on the edit pen you edit somewhere else, on a Special page - a limited editing experience
    • wanted to make this better on mobile
    • technology has changed over time - editing on mobile is something that is not uncommon
  • Problem statement
    • The tension we had between product and tech teams
      • should work without JS enabled
      • developer and engineering happiness was important too
        • didn't want to implement things twice
        • didn't want to couple things in a monolith
        • wanted to use standard front-end tools
        • for the manager - know how to hire new engineers who could get started quickly (those who know standard tools like jquery UI)
      • Decided to use Vue.js - already used in other parts of the code base. Decided against using our own custom framework
      • Disclaimer: This is not a Vue pitch. Just want to show a clear picture of the library that we used.
  • Termbox v2
    • Editing interface is in-line
    • Shows values in other languages (not previously possible)
  • Architecture
    • [diagram showing Wikibase Termbox Server Side Rendering]
      • When a user goes to a page, the API is hit and a request is made to the SSR (server-side rendering) service. The request says what to render (which item, revision, user language, info on the user (logged in / user name), permissions); the SSR service then calls the API to gather the data to be rendered.
        • API is separate from MediaWiki
          • Disclaimer: Wikibase has this concept of a "fallback language". If you're browsing in Austrian German but no definition is available, while there is one in German German, then Wikibase knows to fall back to German German; if nothing is available there either, it falls back to English. Rendering is independent from i18n (internationalization) concerns and relies on MediaWiki for them.
        • SSR is written in Node.js (just an implementation detail; conceptually the same if written in any other language)
        • MediaWiki asks SSR, which in turn asks MediaWiki again. This is not a problem, but a conscious decision/feature. This is because of how pages are composed in MediaWiki. If this changes, the service is still usable.
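The request flow and the language fallback described above can be sketched roughly as follows. This is a minimal illustration, not the actual termbox service code: every name in it (RenderRequest, EntityLookup, pickLabel, handleRenderRequest) is hypothetical, the call back to the API is replaced by an injected synchronous function, and the fallback chain is assumed to be handed in by the caller, since the notes say that i18n concerns stay with MediaWiki.

```typescript
// Hypothetical names throughout; this is not the actual termbox service code.

interface RenderRequest {
  entityId: string;        // which item, e.g. "Q64"
  revision: number;        // which revision of that item
  language: string;        // the user's language
  fallbackChain: string[]; // assumed to be supplied by the caller, not computed here
  userCanEdit: boolean;    // outcome of the permission check
}

interface Entity {
  labels: { [language: string]: string };
}

// Stand-in for the call back to the API. A real service would do this over
// HTTP and asynchronously; it is synchronous here to keep the sketch short.
type EntityLookup = (entityId: string, revision: number) => Entity;

function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Return the first language in the chain that has a label, or null if none does.
function pickLabel(
  entity: Entity,
  chain: string[],
): { language: string; value: string } | null {
  for (let i = 0; i < chain.length; i++) {
    const language = chain[i];
    if (Object.prototype.hasOwnProperty.call(entity.labels, language)) {
      return { language: language, value: entity.labels[language] };
    }
  }
  return null;
}

// Handle one render request: fetch the entity data, resolve the label through
// the fallback chain, and return the HTML fragment.
function handleRenderRequest(request: RenderRequest, lookup: EntityLookup): string {
  const entity = lookup(request.entityId, request.revision);
  const chain = [request.language].concat(request.fallbackChain);
  const label = pickLabel(entity, chain);
  // With no label in any language of the chain, show the entity ID instead.
  const text = label === null ? request.entityId : label.value;
  const lang = label === null ? request.language : label.language;
  const editButton = request.userCanEdit ? '<button class="edit">edit</button>' : "";
  return (
    '<div class="termbox"><h2 lang="' + escapeHtml(lang) + '">' +
    escapeHtml(text) + "</h2>" + editButton + "</div>"
  );
}

// Browsing in Austrian German, with labels only in German and English.
const html = handleRenderRequest(
  { entityId: "Q64", revision: 1, language: "de-at", fallbackChain: ["de", "en"], userCanEdit: false },
  () => ({ labels: { de: "Berlin", en: "Berlin" } }),
);
console.log(html);
```

Injecting the lookup keeps the rendering logic independent of how the data is fetched, which matches the point in the notes that the service stays usable if page composition in MediaWiki changes.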
  • Termbox in Client & Server
    • We didn't want to implement things twice, so the rendering of the termbox is written in TypeScript (.ts files), which is then compiled to JavaScript. It uses webpack and standard tooling.
    • We also have a dev environment so we can test in a local browser. In our experience this simplified development.
    • SSR part in blue box
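The idea of implementing the rendering once and using it on both the server and the client can be sketched as below. This is only an illustration under stated assumptions: the real termbox uses Vue components rather than string concatenation, and the names TermboxState, renderTermbox, renderOnServer and matchesServerMarkup are hypothetical.

```typescript
// Hypothetical sketch; the real termbox uses Vue components, not string templates.

interface TermboxState {
  label: string;
  description: string;
  aliases: string[];
}

function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// The single rendering implementation shared by both environments.
function renderTermbox(state: TermboxState): string {
  const aliases = state.aliases
    .map((alias) => "<li>" + escapeHtml(alias) + "</li>")
    .join("");
  return (
    '<section class="termbox">' +
    "<h2>" + escapeHtml(state.label) + "</h2>" +
    "<p>" + escapeHtml(state.description) + "</p>" +
    "<ul>" + aliases + "</ul>" +
    "</section>"
  );
}

// Server entry: produce the markup plus the serialized state the client starts from.
function renderOnServer(state: TermboxState): { html: string; serializedState: string } {
  return { html: renderTermbox(state), serializedState: JSON.stringify(state) };
}

// Client entry: rebuild the markup from the serialized state and report whether
// it matches what the server sent.
function matchesServerMarkup(serverHtml: string, serializedState: string): boolean {
  const state = JSON.parse(serializedState) as TermboxState;
  return renderTermbox(state) === serverHtml;
}

const page = renderOnServer({
  label: "Berlin",
  description: "capital of Germany",
  aliases: ["Berlin, Germany"],
});
console.log(matchesServerMarkup(page.html, page.serializedState));
```

Because both entry points call the same function, a change to the markup only has to be made in one place, which is the "don't implement things twice" goal from the problem statement.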
  • From dev to production
    • How we deploy it. 
    • We're the first non-WMF service deployed.
    • CI -> Docker -> testing (helm, kubernetes) -> production
  • Test
    • When we push a change to Gerrit (this [image in slide] is one of the patches we pushed recently), there is a comment from the pipeline bot that there is a build. This is the CI part of the pipeline.
  • Deliver
    • Merging a patch is similar: the pipeline bot runs the tests and builds the image and, when successful, publishes it to the Docker registry
    • When everything goes well it's published
  • Deploy
    • we use standard tooling - Helm, tags
    • an additional step we ended up doing: when the server-side part changes, we have to adjust the client-side code so that the HTML structure is kept the same. Make sure SSR-side things are in the parser cache.
      • requires a manual step: the git submodule version of the client-side JS is updated every time we update. Could be done better (non-manually).
      • We didn't suffer enough from this manual step to investigate a smarter way
      • Client side is in MW extension code - goes with regular MW train.
  • What we've learned
    • Integration with Wikibase is unnecessarily complex
      • we spend most of the time not on termbox changes but wrestling with MW
      • When MW is using the pipeline things should be easier
    • No checklists on how to get a service deployed
      • expected since we're the first, but relied on asking individuals to assist
    • the deployment pipeline didn't initially allow deployments to test.wikidata.org
      • this was solved quickly, but was not possible at first
    • Beta cluster not in the scope of the deployment pipeline
      • we couldn't use the same tooling so we had to come up with our own home-brew solution.
      • a little annoying that another step was needed to get rolling
    • WMF service folks and the service templates were rather rigid. 
      • would have preferred to have a library/set of express middlewares instead of template
    • Q: Is the service run using https://github.com/wikimedia/service-runner and template https://github.com/wikimedia/service-template-node meant here?
      • A: Yes
        • Followup - it's rigid on purpose to make sure things are consistent
      • A: Our service was a little different and maybe didn't fit the pattern of past services the template was developed for
    • TypeScript required a steep learning curve
      • Some restrictions, but also some cool things, initial pace was slow
    • developing a cross-product component library is hard
      • social challenges are a major factor - needs persistence and a driver
        • Engineering was pushing one way to do it. other roles did not have the same thing in mind. 
        • Huge endeavor, needs alignment and someone driving
    • Integration-level testing is not done on the termbox application itself, but only when it is integrated with Wikibase.
      • so testing the application is faster; end-to-end level tests are not there, but somewhere else instead
    • Things assumed to be hard were easy because of standard tooling
      • things that are less standard like integration with MW were harder
    • making things like responsiveness an official product requirement is possible by taking the time to find compelling arguments
    • Estimating usage and server load was hard, and we constantly underestimated the load
      • we were looking at averages and not peaks, and underestimated multiple times
    • Micro-frontend approach
      • what we like is that it creates a strict boundary, and we only work on what is inside. This creates an isolated place where we can use standard tools and tooling we decide on. It allows us to build new functions and products efficiently, and to create new things without overhauling everything at once (like the UI and other big changes). This makes it agile.
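The lesson about looking at averages instead of peaks can be made concrete with a small calculation. The sketch below is illustrative only: the request-rate samples and the per-instance capacity are invented numbers, not measurements from the termbox service, and the function names are hypothetical. It compares sizing by the mean with sizing by a high percentile (nearest-rank method).

```typescript
// Illustrative only: the samples and the capacity figure below are made up.

function mean(values: number[]): number {
  if (values.length === 0) {
    throw new Error("no samples");
  }
  const sum = values.reduce((total, value) => total + value, 0);
  return sum / values.length;
}

// Nearest-rank percentile: the smallest sample such that at least p percent
// of all samples are less than or equal to it.
function percentile(values: number[], p: number): number {
  if (values.length === 0) {
    throw new Error("no samples");
  }
  if (p <= 0 || p > 100) {
    throw new Error("p must be in (0, 100]");
  }
  const sorted = values.slice().sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[rank - 1];
}

// Number of instances needed to serve a given load.
function instancesNeeded(requestsPerSecond: number, capacityPerInstance: number): number {
  return Math.ceil(requestsPerSecond / capacityPerInstance);
}

// Hypothetical requests-per-second samples with one short burst.
const samples = [20, 22, 19, 21, 20, 180, 23, 20, 21, 24];
const capacityPerInstance = 50; // hypothetical capacity of one service instance

const averageLoad = mean(samples);
const peakLoad = percentile(samples, 95);

console.log("sized by mean:", instancesNeeded(averageLoad, capacityPerInstance));
console.log("sized by p95:", instancesNeeded(peakLoad, capacityPerInstance));
```

With these invented numbers, sizing by the mean suggests a single instance while sizing by the 95th percentile suggests four, which is the kind of gap that leads to repeated underestimation.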
  • Links
    •  
  • Want to say thanks to the WMF teams - Release Engineering and Ops - for helping with this.
  • Would love to hear feedback and questions. 
  • If there is room for collaboration, conceptual or low level component library things I'm happy to discuss
  • Q&A
  • Q: I was going to complain that you make a service and a core request when you could have done this in Wikibase as a single PHP process. When you look at all the steps. Interacting with many people. Why did you start that service and when you started deployment, which step took too much time and when did you find yourself blocked? Meetings, people, etc.
    • A: Regarding blockers: there was never a situation where we had weeks or months of no response. As for the process, there was no clear process for deploying a service. We knew we wanted to do it, but not how to do it. We had to figure it out as we were going. I'm not being artificially kind here, the WMF teams were responsive. With my manager hat on, I want to avoid having to ask what I need to do. If you have a clear list of steps this would be easier. For example, asking the pipeline team to help get on test.wikidata.org was an extra request not on their roadmap.
      • Q: followup: you basically had to write the process as you went then.
  • Q: You said one of the biggest problems was integration with MW. So how did you do it and how could it be improved? I'd like to get clarity on one detail. The rendered output from your service. Does that go into the parser cache?
    • A: It does. The whole page is cached including a placeholder mark in it. The rendered termbox content is also cached. Both are combined (i.e. placeholder is replaced with the actual termbox content) before sending back HTML to the browser.
      • Q: follow up: Why?
        • A: it's already in the cache. It's conceptually useful to keep the way the item page is composed using the version 1 and version 2 codebase.
    • A: [pablo] Basically the moment in time when the request against SSR happens is a bit peculiar: when trying to save a fresh revision, this happens before the rev ID is generated. There is no ID to ask the API for the matching entity. We have to trick the parser cache to not do rendering on the initial save.
    • A: [Leszek] We hack into MW to create structured data. The parser does nothing, but then we generate the item page content HTML. Article first injection is a problematic hack. AbuseFilter causes some of these problems, because it needs metadata because the actual parsing.
    • A: Coupling at the top level is challenging. 
    • Q: In a decoupled architecture how would you redo this?
      • A: A fair question. This is the first service deployed, so maybe not the best to analyze, but a great topic for an unconference session
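The parser cache answer above (the whole page is cached with a placeholder, the rendered termbox is cached separately, and the two are combined before the HTML is sent) can be sketched as a small substitution step. This is a conceptual illustration in TypeScript, not the actual MediaWiki or Wikibase code: the marker string and the function name composePage are hypothetical.

```typescript
// Hypothetical marker; the actual placeholder used by Wikibase is not given in the notes.
const TERMBOX_PLACEHOLDER = "<!--termbox-placeholder-->";

// Combine the cached page HTML with the cached termbox HTML by replacing the
// first occurrence of the placeholder. The string is spliced by index instead of
// using String.prototype.replace, so that sequences like "$&" in the termbox
// HTML are inserted literally rather than interpreted as replacement patterns.
function composePage(cachedPageHtml: string, cachedTermboxHtml: string): string {
  const index = cachedPageHtml.indexOf(TERMBOX_PLACEHOLDER);
  if (index === -1) {
    // No placeholder in the page: return the page unchanged.
    return cachedPageHtml;
  }
  return (
    cachedPageHtml.slice(0, index) +
    cachedTermboxHtml +
    cachedPageHtml.slice(index + TERMBOX_PLACEHOLDER.length)
  );
}

const cachedPage = "<h1>Q64</h1>" + TERMBOX_PLACEHOLDER + "<p>statements</p>";
const cachedTermbox = '<div class="termbox">Berlin</div>';
console.log(composePage(cachedPage, cachedTermbox));
```

Keeping the two cached pieces separate means either one can be regenerated without invalidating the other, at the cost of this extra composition step on every response.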
  • Q: Do you plan on adding more services to improve Wikibase, Wikidata and add more and more features?
    • A: There is nothing particular I can tell you now. But the service-oriented architecture seems better than a monolith.
      • Q: followup: was this good enough for you, if next year you had a new service would you adopt this model again?
        • A: It is a decent workflow. What's the alternative to that? Pipeline is great, but the alternative is to stick it into the monolith.
          • Q: will you use it again? 
            • A: Yes, rather than stick it into the monolith
  • Q: This is a great product. Responsive termbox. For content and design, have you worked with the Wikimedia style guide when developing this? The living style guide. Were there any challenges?
    • A: One of the challenges is getting a consistent style between Wikibase and MediaWiki as set up, and the different skins it ships with. We could not quite bridge this gap. We looked at what the rest of Wikibase looks like and at mimicking OOUI, but we would be incompatible with components in Wikibase on the same page. It pains us as developers as the code is there, but not tangible. I'd love to know how to increase collaboration there.
    • A: we tried to follow when applicable, but there are challenges. If we followed styleguide for one component, but not another it would look weird.
    • A [Leszek]: The old interface is definitely not in line with the style guide.
    • A: [Franziska] plans for WMDE design to improve this over the next year. We have our own design team, and we do our best to coordinate with the WMF design team. Conversations have started. Have people and are working on it!
  • Q: As developers, we see a presentation and then we scrutinize points. This development experience is better in many ways and there are some very obvious collaboration points. This is fantastic and I wish we were doing more of it.
    • A: Much appreciated. One of the reasons we are showing is not to brag but to inspire other developers to consider and think about similar issues. 
  • Q: I'm not a JS guy. Not familiar with how these frameworks work. I learned recently that the template language that Vue.js uses is pretty close to Mustache, or maybe this even is Mustache. Did you consider using Mustache on the PHP side? Why Node.js?
    • A: The short answer to the Mustache question: they are logic-less templates, and this is more of an application. There is a lot of logic. The seemingly static side and client side. Different levels of abstraction. Then ?? would not be needed on client but on the server.
    • A: you have to synchronize across front and back end. We tried that - implementing a PHP library that understands a subset of the Vue template "language" that we need for our purposes. We maintain a library, but don't officially release it to remain consistent with a language that was not meant to run client side code [php]. The heart of the problem: to avoid double implementation, write in a language that compiles to JavaScript so that the browser can understand it.
  • Q: [Action Item] The checklist that is not there is still not there, right? Should we use some time at this event to go over that?
    • A: Yes, unconference or breakout
  • Q: A lot of things occurred to me that did not before. Different teams and different docs. We need to put that together. A service moving through this process needs.
  • Q: Great presentation, appreciate lessons learned!
  • - - - Thank you

Thanks for making this a good session at TechConf this year. Follow-up actions are recorded in a central planning spreadsheet (owned by me) and I'll begin farming them out to responsible parties in January 2020.