
Wikimedia Technical Conference 2019 Session: System level testing: patterns and anti-patterns with Selenium
Closed, Resolved, Public

Description

Session

  • Track: Testing
  • Topic: System level testing: patterns and anti-patterns with Selenium
  • Scheduled for Day 1 - Tuesday November 12th 14:45-15:30 (45 minutes)

Description

Testing at the system level is fraught with unforeseen issues. This session is an overview of the current system level testing framework with a focus on anti-patterns and patterns to follow.

Questions to answer and discuss

Question: Should a repository have as little as possible or as much as possible Selenium tests?
Significance: Not enough tests means critical bugs might be missed. Many tests require a lot of maintenance.

Question: Is page object pattern useful?
Significance: It's another layer of complexity. Does it have a positive impact, or does it just complicate things?

Question: Should the API be used to speed up tests and make them more robust?
Significance: It's both faster and more robust.

Related Issues

Pre-reading for all Participants


Session Leader(s)

Session Scribes

Session Facilitator

Session Style / Format

  • Discussion and code review.

Notes

Post-event summary, from T234635#5658096, photo by @Florian:

Our notes from the session:

20191112_153811.jpg (1×4 px, 2 MB)

Detailed notes
Timo: Raise your hand if you have not written a Node/Selenium test? (3 no, 3 yes)
Today we'll cover a couple of examples of how tests are written, how they work, and what makes a good or bad one. Best practices and gotchas.
Session will be in two parts. Some slides, and then looking at some real code to discuss as example.

Zeljko: Story - I was on a plane, reviewed all Selenium code ever written for MW. I was looking for good and bad examples, and found three such examples, plus a strange one! 

(Slide quote): "All happy families are alike; each unhappy family is unhappy in its own way"
Zeljko: From Tolstoy, replace "family" with "tests".

1.  A good test is as small as possible. (while still providing value)
Kaldari: It's a balance: as small as possible, while also providing as much value as possible. You could write a profusion of small tests, or a few high-value tests.
Z: If more complexity provides value, then do it.
T: A small test is often good, a big test can sometimes be bad - with the exception that if small tests duplicate too much, that violates the rule.
Kosta: Helppanel uses a JavaScript widget to [?] which shows up in RecentChanges. The thing that Selenium can provide (that others can't) is the full user interaction of seeing it appear on the page. IIUC, you're advocating for the opposite, where many small discrete parts are tested?
Z: Not necessarily.
T: (How small is small?)
Z: Just opening a page and checking for an element might be too small for a Selenium test - trivial level.
T: If the test is really small, it might be something that shouldn't be an integration test - maybe it should just be a unit test. The same value could be gotten from a unit test.
Cormac: Hypothesis: if each test incorporates the previous test, then your test is too small.

2. A good test is as simple as possible.
Z: it is more important for tests to be simple than DRY. (don't repeat yourself) XP/Clean code principle. 
T: Don't have an abstract utility function that you [???]. It might be more beneficial to have a test be independent and easy to read, if the alternative just saves a few characters.

3. A good test tests as little as possible - when it breaks, it should be easy to find out why
Z: ties into being small and simple. For complex tests, when it breaks, any of the many things could be buggy. 
James F: I think the narrowness of the scope is relative to the scale of the type of the test. Unit tests are small, integration tests are potentially very big. E.g. login.
Z: In this case, small means relative to the entire scale of everything.

4. A good test uses API for setup - as much as possible.  It's both faster and more robust.
Z: Use whatever helpers you can; setting up rows in databases via the APIs can be faster and more reliable.
Z: An abuse of this rule would be creating an account with the API when you're trying to test whether the signup form works.
T: What also ties into this: in an earlier version of our current node.js stack, we exposed a page object as something extensions could call into. Which meant a page object could be used by an extension, which would then instantiate that page to create a user. If your UI is tightly coupled to assuming the user has just signed up, that's legit and could be hardcoded in your extension test. But more commonly, the test will likely not need that. E.g. RecentChanges tests should not create all the edits to be displayed there as the current viewer, but should use the API instead.
Kosta: Related to this - this ties into the theme of could we use a maintenance script in core to do some of this setup stuff?  Maybe useful also for local development.  Might be faster than using Node.js bot  or API...
Z: Whatever makes sense.  You should speed up the setup as much as you can, but ???

5. A good test uses page objects - exceptions are possible but not probable.
Z: But I've yet to see an exception that is justified.
Z: The disadvantage of page objects is that there's yet another layer of abstraction - but it makes the test code simpler. Test code can read like user actions. Complexity is hidden away in page objects. The thing that changes the most is the UI.
Z: Anybody not like page objects?
T: How do page objects tie in with the earlier point of avoiding DRY code in tests ("don't repeat yourself")?
Z: There's one repo here that has one simple test - it's fine.  Do it as simple as possible.
T: Multiple feature files that cover the same page - abstract them into a separate file?
Z: If two tests use the same element...
Florian: It's beneficial to have a page object because it abstracts the technical details and explains the intent. I don't want to "click div 123", I want to "click the sign-up button". This becomes explicit by using a function with such a name. Makes tests more readable.
T: Is there an intermediary between a page object and inline selectors?
Z: Not really, except "just a function in a file". The advantage of page objects is that it's where things are located and easily findable.

6. A good test doesn't depend on others - test suite should pass when tests are running in random order or parallel
Z: If anything works when you run serially, but fails when you run in parallel, having independent tests makes it easier to debug failures.
E.g. the testing framework changes the order in which tests run, from alphabetical to something else.

7. A good test sets up its environment - Users, pages, files
Z: Your test can't assume things are there. It does not depend on the environment being in a certain state.
Z: E.g. a branching condition in a test file is unexpected.
Z: Especially branching on conditions based on the state of the environment. Bad.

8. A good test is at a lower level. -  What could be tested at a lower level should not be tested at a higher level
Z: There should be as few high level tests as possible. If you have more than a handful, you might be following an anti-pattern.
Z: If you can do it in a unit test, etc., then do so.
Z: This list is incomplete.  Please help me complete it.  If you've got ideas, please share them.

9. "Never trust a test you did not see fail." - (quote from unknown source)

https://phabricator.wikimedia.org/T234635 description contains post-event action items:
Z: Software testing antipatterns blog by Kostis Kapelonis http://blog.codepipes.com/testing/software-testing-antipatterns.html
Z: Pyramid / ice cream cone https://watirmelon.blog/testing-pyramids/
Z: Selenium tests don't require you to change your code, unlike unit and integration tests.  Don't do that.
T: Selenium is the path of least resistance if your code isn't in a very good state.
Z: Might be a good transition state. e.g. "My code is spaghetti and i need a safety net", a good temporary solution is many selenium tests.
T: The search suggestions system can give multiple items from different namespaces, alphabetizations, etc. All of that is unit testable. User tests should cover "did everything interact that should have".
Z: started with ???
Elena: Another reason for having so many Selenium tests could be that manual testing (repetitive tasks) can be automated that way without changing a lot in the unit test/integration test layers (as an intermediate step)
Z: I didn't even want to mention manual regression tests here! But yes, similar use-cases.
https://www.mediawiki.org/wiki/Selenium has an examples section with an up-to-date list of all repos that have Selenium tests. You could use this to find examples.

Looking at real code (Zeljko projecting):
Z: A few repos moved from Ruby and cucumber to Node and mocha - they're just out of scope for today
Z: Please reach out to me for pairing if you'd like me to look at other repos.

Good example (MediaWiki core, page.js)
Z: A beautiful new way to use the API. A before hook runs for each test suite. Use Util.getTestString. It has sometimes had problems with encoding (UTF etc.)
Kosta: Main frustration with Selenium - it breaks not because something changed in core, but because the assertion runs before that first line has finished, basically. How can we promote better practices around that? My first reaction when something fails is to think it's a flaky test, not that something has changed.
Florian: Is that about reloading a page? If there's JS involved, you might need to add "waitFor" patterns. Similar in Angular.

Bad example (TemplateWizard)
Z: It scored all the anti-pattern points! L28-L98 is "basic use of templatewizard". It uses raw node selectors, one huge file, all details inside the test instead of using page objects. On the positive side, it waits for elements to be present.

Strange example (MobileFrontend)
Z: Tried to reuse existing cucumber test suite. It worked for some definition of "worked", but the resulting test code is strange. Slightly too complicated. Assertions should be exclusively at the end. Not horrible, but strange.
T: The function names used here are stringifications meant to relate to the former cucumber tests.
Z: Do whatever you need, but don't count on it being your endgame.

Originally from https://etherpad.wikimedia.org/p/WMTC19-T234635

Post-event action items:

Event Timeline

What is the expected outcome of this session? Is it mainly focussed on giving an overview to people who are not familiar with our "system test level", so that they can provide valuable feedback, ideas and concerns in the session building on top of that?
Or is this session focussed on finding patterns and anti-patterns we're using or utilizing at the moment, in order to have detailed focus areas where we can invest time to improve and hopefully get rid of the anti-patterns in use?

Sorry if this question is confusing, but I kind of do not get this info from the description :]

Good questions.

@zeljkofilipin I'm looking at you for leading this session, what would be the most productive use of time from your perspective? I was assuming a best practices/anti-patterns plus time for discussion on improvements.

zeljkofilipin updated the task description.

What is the expected outcome of this session? Is it mainly focussed on giving an overview to people who are not familiar with our "system test level", so that they can provide valuable feedback, ideas and concerns in the session building on top of that?

No, but that would be an interesting separate session.

Or is this session focussed on finding patterns and anti-patterns we're using or utilizing at the moment, in order to have detailed focus areas where we can invest time to improve and hopefully get rid of the anti-patterns in use?

That was what I had in mind.

@Krinkle you've done a lot of Selenium framework updates recently. Would you be interested in co-leading this session?

(Programming note)

This session was accepted and will be scheduled.

Notes to the session leader

  • Please continue to scope this session and post the session's goals and main questions into the task description.
    • If your topic is too big for one session, work with your Program Committee contact to break it down even further.
    • Session descriptions need to be completely finalized by November 1, 2019.
  • Please build your session collaboratively!
    • You should consider breakout groups with report-backs, using posters / post-its to visualize thoughts and themes, or any other collaborative meeting method you like.
    • If you need to have any large group discussions they must be planned out, specific, and focused.
    • A brief summary of your session format will need to go in the associated Phabricator task.
    • Some ideas from the old WMF Team Practices Group.
  • If you have any pre-session suggested reading or any specific ideas that you would like your attendees to think about in advance of your session, please state that explicitly in your session’s task.
    • Please put this at the top of your Phabricator task under the label “Pre-reading for all Participants.”

Notes to those interested in attending this session

(or those wanting to engage before the event because they are not attending)

  • If the session leader is asking for feedback, please engage!
  • Please do any pre-session reading that the leader would like you to do.

My notes after reviewing the code.

Legend:

  • ✅ simple specs and page objects
  • ⚠️ warning
  • 🤷 only sample code
  • 🥒 using cucumber testing framework instead of mocha, out of scope
  • 😱 not using page objects(?)

Repositories:

  • mediawiki/core ✅
  • mediawiki/extensions/AbuseFilter ✅
  • mediawiki/extensions/AdvancedSearch ⚠️
    • pageobjects/search.page.js - too complicated for my taste
    • specs too complicated
  • mediawiki/extensions/CirrusSearch:
    • tests/integration 🥒
    • tests/selenium ✅
  • mediawiki/extensions/Cite ✅
  • mediawiki/extensions/ContentTranslation 🤷
  • mediawiki/extensions/Echo ✅
  • mediawiki/extensions/ElectronPdfService ✅
  • mediawiki/extensions/FileImporter
    • tests/selenium/pageobjects ⚠️ files have a page extension instead of page.js
    • tests/selenium/specs ⚠️ one test has no assertions
  • mediawiki/extensions/GrowthExperiments ⚠️ tests too complicated
  • mediawiki/extensions/Math ✅
  • mediawiki/extensions/MobileFrontend ⚠️
    • tests/selenium/features 🥒
    • tests/selenium/specs
      • switch_views.js ⚠️ too complicated
      • user_page.js 😱
  • mediawiki/extensions/Newsletter ✅
  • mediawiki/extensions/ORES ⚠️ simple, but the only test is disabled
  • mediawiki/extensions/Popups ⚠️ branching in test code
  • mediawiki/extensions/RelatedArticles ✅ page objects a bit complicated but tests look ok
  • mediawiki/extensions/RevisionSlider ✅ page objects a bit complicated but tests look ok
  • mediawiki/extensions/TemplateWizard 😱 + everything in one big test
  • mediawiki/extensions/TwoColConflict ⚠️ some tests are fine, some have too much detail
  • mediawiki/extensions/Wikibase
    • repo/tests/selenium ⚠️ too much detail in tests
  • mediawiki/extensions/WikibaseCirrusSearch 🤷
  • mediawiki/extensions/WikibaseLexeme ⚠️ some tests are fine, some have too much information, too many tests?
  • mediawiki/extensions/WikibaseMediaInfo ⚠️ browser.pause(), too much detail in tests
  • mediawiki/skins/MinervaNeue
    • tests/selenium/features 🥒
    • tests/selenium/specs 😱
  • phab-deployment ✅
  • wikibase/termbox ⚠️ some tests are fine, some have too much information
  • wikidata/query/gui 😱 + but it's just one simple test so in this case it might be fine
  • wikimedia/portals/deploy 😱 + but it's just opening a lot of static pages and checking them, so it might be ok

Thanks for making this a good session at TechConf this year. Follow-up actions are recorded in a central planning spreadsheet (owned by me) and I'll begin farming them out to responsible parties in January 2020.