
It takes about 20 seconds just to start a Sauce Labs browser
Closed, Resolved, Public

Description

Our browser tests drive a browser hosted at SauceLabs, so whenever WebDriver needs to ask the browser for information, that requires a network round trip. There are a few cases where a single statement ends up generating a hundred or more such queries, which take several seconds to complete.

There is an extensive example run with annotations as a comment {T92613#1116580}.

Event Timeline

See P396 for the full Selenium log. I have looked at the log and identified the following sections before the browser actually shows any content:

11:25:17.840 INFO - Launching a standalone server
...
11:25:27.334 INFO - Done: [get: http://en.wikipedia.beta.wmflabs.org/wiki/]

It takes roughly 10 seconds from the start of the VM until the GET of http://en.wikipedia.beta.wmflabs.org/wiki/ is completed, which is fine.

From there it gets a bunch of elements by id from 11:25:27.944 to 11:25:41.068 or roughly 13 seconds to get just 6 elements:

11:25:29.353 INFO - Executing: [find element: By.id: n-recentchanges])
11:25:31.412 INFO - Executing: [find element: By.id: n-randompage])
11:25:34.138 INFO - Executing: [find element: By.id: n-help])
11:25:36.038 INFO - Executing: [find element: By.id: t-whatlinkshere])
11:25:37.968 INFO - Executing: [find element: By.id: t-recentchangeslinked])
11:25:39.951 INFO - Executing: [find element: By.id: t-specialpages])

Note how it takes about 2 seconds per element, probably due to the network round trip between SauceLabs and the Selenium FirefoxDriver that occurs on each iteration. Ideally these lookups would be pipelined in one go, with an array of ids to retrieve; I have no clue whether that is possible.
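
Just to illustrate the idea, here is a minimal sketch of such a pipelined lookup, assuming a bare Selenium::WebDriver driver rather than the page-object stack (this is not something the framework does today):

require 'selenium-webdriver'

driver = Selenium::WebDriver.for :firefox
driver.get 'http://en.wikipedia.beta.wmflabs.org/wiki/'

# Hypothetical: resolve several ids in one execute_script round trip
# instead of one find_element call per id.
ids = %w[n-recentchanges n-randompage n-help t-whatlinkshere
         t-recentchangeslinked t-specialpages]
elements = driver.execute_script(
  'return arguments[0].map(function (id) { return document.getElementById(id); });',
  ids
)
# => array of elements in the same order as ids (nil for ids not on the page)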

Then for the scary stuff:

11:25:41.865 INFO - Executing: [find elements: By.xpath: .//a])
11:25:41.994 INFO - Done: [find elements: By.xpath: .//a]
11:25:42.154 INFO - Executing: [get element attribute: 6 [[FirefoxDriver: firefox on LINUX] -> xpath: .//a], href])
11:25:42.167 INFO - Done: [get element attribute: 6 [[FirefoxDriver: firefox on LINUX] -> xpath: .//a], href]
...
11:26:05.797 INFO - Executing: [tag name: 124 [[FirefoxDriver: firefox on LINUX] -> xpath: .//a]])
11:26:05.807 INFO - Done: [tag name: 124 [[FirefoxDriver: firefox on LINUX] -> xpath: .//a]]

25 seconds to fetch the href attribute of 118 a elements. We might not even use them at that point; as above, this should be pipelined as well. I haven't done xpath in a while, but it should surely be possible to retrieve all a elements together with their href attribute in a single xpath request, maybe something like //a/@href.

The log corresponds to the mediawiki/core scenario:

11:25:15   Scenario: Main Page Sidebar Links               # features/main_page_links.feature:10
11:26:11     Then I should see a link for Recent changes   # features/step_definitions/main_page_links_steps.rb:13
11:26:11     And I should see a link for Random page       # features/step_definitions/main_page_links_steps.rb:17
11:26:11     And I should see a link for Help              # features/step_definitions/main_page_links_steps.rb:21
11:26:11     And I should see a link for What links here   # features/step_definitions/main_page_links_steps.rb:25
11:26:11     And I should see a link for Related changes   # features/step_definitions/main_page_links_steps.rb:29
11:26:11     And I should see a link for Special pages     # features/step_definitions/main_page_links_steps.rb:33
11:26:11     And I should see a link for Printable version # features/step_definitions/main_page_links_steps.rb:37
11:26:11     And I should see a link for Permanent link    # features/step_definitions/main_page_links_steps.rb:41
11:26:11     And I should see a link for Page information  # features/step_definitions/main_page_links_steps.rb:45

In conclusion, from 11:25:17.840 to 11:26:06.390 (~50 seconds), the instance boots in about 10 seconds and the remaining ~40 seconds are spent lamely fetching elements and attributes from the page. Only then does the actual work start.

The background for the scenario just opens the main wiki URL, from mediawiki/core's features/step_definitions/main_page_links_steps.rb; it really is just visit(MainPage). I highly suspect PageObject is the cause of all those requests.

Another thing I noticed is that each scenario creates a new SauceLabs VM, but that is probably working as intended.

So I reused the proposed page-object-selenium-local.rb script and inserted $DEBUG = true at the top, which causes a wide range of modules to emit useful informational messages.

As I tracked it last week, the issue is in page-object. When we create an element such as:

a(:some_name, href: /foo/)

Page object retrieves all A elements as Selenium elements, then for each of them asks the Selenium driver whether it matches the regex. The execution is roughly:

find_selenium_element( identifier={:href=>/action=history/},
    type=PageObject::Elements::Link,
    tag=a,
    other=)
elements = browser.find_elements(how=tag_name, what=a)
# Then each element needs to do a round trip with the remote Selenium to get the href attribute :(
element = elements.find { |ele| matches_selector?(ele, regexp[0], regexp[1]) }

So the root cause is us looking for elements using regex:

a(:view_history, href: /action=history/)

A working fix that still looks for a link containing action=history is to use an xpath selector:

a(:view_history, xpath: "//a[contains(@href, 'action=history')]")

But Zeljkof originally told me that was just a workaround because the a element lacks an ID. So we can use a css selector instead:

a(:view_history, css: '#ca-history a')

Which fixes the issue!

Poking @dduvall to make sure he knows about the above issue and the fix :-)

Change 200767 had a related patch set uploaded (by Hashar):
Browser test: avoid loading elements over network

https://gerrit.wikimedia.org/r/200767

I tweaked the script to use Watir::Browser, which is what MW-Selenium does, and profiled it with just the stdlib profiler.

require "profiler"
Profiler__.stop_profile

require "page-object"

class HomePage
  include PageObject
  page_url "http://en.wikipedia.beta.wmflabs.org/wiki/Main_Page"
  #page_url "http://127.0.0.1:8080/wiki/Main_Page"
  #a(:view_history, href: "/w/index.php?title=Main_Page&action=history")
  a(:view_history, href: /action=history/)
end

require "watir-webdriver"
@browser = Watir::Browser.new(:firefox)
home_page = HomePage.new(@browser)

require 'page-object/page_factory'
include PageObject::PageFactory
visit HomePage

Profiler__.start_profile
on(HomePage).view_history_element.exists?
Profiler__.print_profile(STDERR)

@browser.close

/Users/.../page-object-minimal $ head -n 2 ~/Desktop/profile.log; tail -n+3 ~/Desktop/profile.log | sort -rnk 6 | head -n 15
  %   cumulative   self              self     total
 time   seconds   seconds    calls  ms/call  ms/call  name
  0.00     1.43      0.00        1     0.00  1430.00  #toplevel
  0.00     1.43      0.00        1     0.00  1420.00  Watir::Element#locate
  0.00     1.43      0.00        1     0.00  1420.00  Watir::Element#exists?
  0.00     1.43      0.00        1     0.00  1420.00  Watir::Element#assert_exists
  0.00     1.43      0.00        1     0.00  1420.00  PageObject::Platforms::WatirWebDriver::Element#exists?
  0.00     1.43      0.00        1     0.00  1390.00  Watir::ElementLocator#locate
  0.00     1.43      0.00        1     0.00  1380.00  Watir::ElementLocator#find_first_by_multiple
  0.00     1.43      0.00        1     0.00  1350.00  Enumerable#find
  0.00     1.43      0.00        1     0.00    30.00  Watir::Browser#assert_exists
  0.00     1.43      0.00        1     0.00    30.00  Selenium::WebDriver::SearchContext#find_elements
  0.00     1.43      0.00       92     0.00    29.67  Watir::ElementLocator#wd_find_by_regexp_selector
  0.70     1.30      0.01      182     0.05    14.84  Watir::ElementLocator#matches_selector?
  0.00     1.43      0.00       91     0.00    14.84  Enumerable#all?
  0.00     1.43      0.00       91     0.00    14.73  Watir::ElementLocator#fetch_value
  0.00     1.43      0.00       91     0.00    14.73  Selenium::WebDriver::Remote::Bridge#getElementAttribute

It seems the regex matching is a function of Watir WebDriver in this case, not PageObject. (PageObject may implement its own additional matching when used with bare Selenium WebDriver.)

After looking through the Watir source and the Selenium docs, I'm not sure there are any straightforward and 'clean' optimizations to be made.

https://github.com/watir/watir-webdriver/blob/97b54739dedbdb55c69571c5d7ca35b4bc6a91a3/lib/watir-webdriver/locators/element_locator.rb#L151

Watir begins to iterate over the elements found by the simplified selector here. I thought WebDriver might have an operation for including attributes in the results of findElements, but it doesn't. One other possibility would be to use execute_script to perform a batch retrieval of attributes for a given set of elements, but the resulting attribute values may not be identical to those returned by WebDriver: WD returns the values as they were when the page originally loaded, not their current values.
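
A rough sketch of what that batch retrieval could look like, assuming a plain Selenium::WebDriver driver (and keeping in mind the caveat above that live DOM values may differ from what WebDriver reports):

# driver: a Selenium::WebDriver::Driver instance
# Hypothetical: one execute_script round trip to read the href of every
# link, instead of one getElementAttribute call per element.
links = driver.find_elements(:tag_name, 'a')
hrefs = driver.execute_script(
  'return arguments[0].map(function (el) { return el.getAttribute("href"); });',
  links
)
# => one href (or nil) per element, in the same order as links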

https://github.com/watir/watir-webdriver/blob/97b54739dedbdb55c69571c5d7ca35b4bc6a91a3/lib/watir-webdriver/locators/element_locator.rb#L176
https://github.com/watir/watir-webdriver/blob/97b54739dedbdb55c69571c5d7ca35b4bc6a91a3/lib/watir-webdriver/locators/element_locator.rb#L191

This is where Watir calls out to getElementAttribute for each element found by the simplified selector. Again, I thought WebDriver might have an operation to retrieve element attributes in bulk, but it doesn't.

https://github.com/watir/watir-webdriver/blob/97b54739dedbdb55c69571c5d7ca35b4bc6a91a3/lib/watir-webdriver/locators/element_locator.rb#L91

This is where Watir originally gives up using a pure WebDriver selector. One possibility at this point might be to detect trivial regexps (containing only literal characters) and convert them to xpath expressions. It's the least icky optimization I can think of and it would probably cover most cases.
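
Roughly the kind of conversion I have in mind, as a sketch with made-up names (not actual Watir code):

# Rewrite a regexp consisting only of literal characters into an xpath
# contains() predicate, so the browser does the matching in a single round trip.
def trivial_regexp_to_xpath(tag, attribute, regexp)
  literal = regexp.source
  return nil unless literal == Regexp.escape(literal) # bail out on metacharacters
  ".//#{tag}[contains(@#{attribute}, '#{literal}')]"
end

trivial_regexp_to_xpath('a', 'href', /action=history/)
# => ".//a[contains(@href, 'action=history')]"

Patterns that do use metacharacters would still have to fall back to the current element-by-element matching.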

And now I am totally confused by my page-object gem installation embedding both the selenium and watir drivers. Too many layers for my little brain :-(

When using a regex on an href attribute:

a(:view_history, href: /action=history/)

One of page-object / watir / selenium should know it is going to need the A elements together with their href attribute. So maybe it could craft a selector such as:

browser.find_elements(how=tag_name, what=a['href'])
                                           ^^^^^^

I speculate the elements would then have the href pre-populated, so that when invoking:

matches_selector?(a, :href)

fetch_value(a, :href)

element.attribute(:href)

The element would yield the value directly instead of having to query the web browser. But don't quote me on that, I have no real clue how the code works :-\


With xpath, page-object emits a single query to the browser. I guess the xpath selector is handled by the browser and yields the only matching element. All good.

a(:view_history, xpath: "//a[contains(@href, 'action=history')]")

I don't think that including the attribute in the xpath predicate will cause WebDriver to include the attribute in the response, but I could be wrong. It's worth an experiment.

> https://github.com/watir/watir-webdriver/blob/97b54739dedbdb55c69571c5d7ca35b4bc6a91a3/lib/watir-webdriver/locators/element_locator.rb#L91
>
> This is where Watir originally gives up using a pure WebDriver selector. One possibility at this point might be to detect trivial regexps (containing only literal characters) and convert them to xpath expressions. It's the least icky optimization I can think of and it would probably cover most cases.

I experimented with this a bit and came up with https://github.com/watir/watir-webdriver/pull/331

hashar removed hashar as the assignee of this task. Apr 8 2015, 8:17 AM

I am not actively working on this since I lack even the most basic ruby skills :(

The PR for optimization of regexp selectors has been merged.

Updating to the latest version of watir-webdriver should solve the problem.
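
For each repository that would presumably just be a Gemfile constraint bump along these lines (0.8.0 being a release that contains Dan's patch, per the check further down; the exact constraint may differ per repo):

source 'https://rubygems.org'

gem 'watir-webdriver', '>= 0.8.0'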

Make sure to check the status of the related patch https://gerrit.wikimedia.org/r/#/c/200767/

That change replaces a bunch of href: /regex/ selectors with id: foo. That is much faster, since in watir-webdriver that is a single network round trip and the browser yields the single matching element, whereas with href: /regex/ (the subject of this bug) it ends up doing a network round trip for each element.

So the patch should be rebased, amended to fix all remaining href: /regex/ selectors, and probably merged.

Running only one scenario:

bundle exec cucumber features/preferences.feature:4
00:00:08.098   Scenario: Preferences Appearance           # features/preferences.feature:4
00:01:38.944     Given I am logged in                     # mediawiki_selenium-1.2.1/lib/mediawiki_selenium/step_definitions/login_steps.rb:1
...
00:01:39.134 Finished: SUCCESS
00:00:08.404   Scenario: Preferences Appearance           # features/preferences.feature:4
00:01:12.949     Given I am logged in                     # mediawiki_selenium-1.2.1/lib/mediawiki_selenium/step_definitions/login_steps.rb:1
...
00:01:13.113 Finished: SUCCESS

Yup, that saves a bunch of round trips. Once rebased, there are probably more href: /regex/ selectors that need to be removed. Note how your run still takes 72 seconds to get through the 'Given I am logged in' step :-(((

Change 223556 had a related patch set uploaded (by Zfilipin):
WIP Upgrade to the latest released version of watir-webdriver Ruby gem

https://gerrit.wikimedia.org/r/223556

00:00:10.335   Scenario: Preferences Appearance           # features/preferences.feature:4
00:01:07.752     Given I am logged in                     # mediawiki_selenium-1.2.1/lib/mediawiki_selenium/step_definitions/login_steps.rb:1
...
00:01:07.930 Finished: SUCCESS

Well, I do not know what the conclusion is. Both patches do speed up the scenario, but it still takes a minute to start the browser. I think both patches should be merged, and we could create a separate task to inspect all repositories and remove regular expressions where possible.

Did you check to see if the SauceLabs session is being queued? IIRC, we're allowed 10 concurrent sessions, after which the request to SL will block.

As far as I remember, I did check, and at the time that was the only SauceLabs session.

Change 223556 merged by jenkins-bot:
browsertests: upgrade watir-webdriver

https://gerrit.wikimedia.org/r/223556

This is a problem for wikidata jobs, which take hours to run in CI.

Things to do:

  • check the timestamp when the Sauce Labs job is created against the first entry in the Selenium log: looks like there is no delay, see the next comment
  • check if wikidata is using the version of watir-webdriver with Dan's patch: yes, they are using watir-webdriver 0.8.0, the latest released version, which contains Dan's patch
  • check if wikidata is using regular expressions: could not find any

Recent job example:

https://saucelabs.com/beta/tests/e34ad733751441979a53ad382e834eab

automator.log

Xlib:  extension "RANDR" missing on display ":95".
1441989599447	addons.manager	DEBUG	Application has been upgraded

1441989599447 is GMT: Fri, 11 Sep 2015 16:39:59 GMT
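
That addons.manager timestamp is milliseconds since the Unix epoch, so the conversion can be double-checked with:

Time.at(1441989599447 / 1000).utc
# => 2015-09-11 16:39:59 UTC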

selenium-server.log

09:39:57.388 INFO - Launching a standalone Selenium Server

09:39:57.388 is probably PDT, so 16:39:57.388 GMT

On https://integration.wikimedia.org/ci/job/browsertests-Core-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce/743/consoleFull

00:04:48.057 Scenario: Main Page Sidebar Links # features/main_page_links.feature:10
00:05:10.005 Then I should see a link for Recent changes #

So that went from about a minute to 22 seconds, which is way better.

Seems we did all we could, and the MediaWiki core browser test job is faster now. Wikidata should be as well, or will be soon.

Gotta update watir-webdriver everywhere, can be done in other tasks.

Change 200767 merged by jenkins-bot:
Browser test: avoid loading elements over network

https://gerrit.wikimedia.org/r/200767