Have you always wanted to automate minesweeper?

I am running a contest as part of the Test Automation Bazaar in Austin, Texas, on March 23-24.

The challenge is to write a robot that plays (and wins) expert games of minesweeper. You can use whatever language and tool you like. My colleague Mark Ryall and I have written a web version of minesweeper from scratch using CoffeeScript. It’s available to play at minesweeper.github.com.

Full details of the contest are on the conference web site hosted on watir.com.

Good luck! It sounds easier than it is.

A tale of three ruby automated testing APIs (redux)

Redux Note: I originally wrote an article similar to this one before going on parental leave about six weeks ago. Whilst I didn’t intend to offend, it seemed that a few people took my article the wrong way. I understand that a lot of effort goes into creating a web testing API, but that doesn’t mean that everyone will agree with what you’ve made.

Sadly, an anonymous coward attacked me and the company I work for (even though I don’t mention that company on this blog), so for the first time in this blog’s history, I have had to turn comment moderation on. I am sorry to the other genuine commenters whose comments were lost in the transition, and who now have to wait for their new comments to be approved.

Since then I have received numerous emails asking where my article went, and commenting that people found it interesting and worthwhile. So I have decided to repost this article, hopefully with a little less contention this time around, making it clear, this is my opinion and experience: YMMV.

Intro

As a consultant I get to see and work on a lot of automated testing solutions using different automated web testing APIs. Lately I’ve been thinking about how these APIs are different and what makes them so.

My main interest is in ruby, and fortunately ruby has three solid examples of three different kinds of web testing APIs, two of which extend the lowest level API: selenium-webdriver.

I’ll (try to) explain here what I consider to be three kinds of automated web testing APIs, where I consider the sweet spot to be, and why.

A meaty example

As a carnivore, I thought I would explain my concept in terms I can relate to. If you’re a beef eater, there are many different kinds of beef that you can use to make some tasty food to eat. I’ll use three different kinds of beef for my example. The first (rawest) kind would involve getting a beef carcass and filleting it yourself to eventually make some edible food. The second kind of beef you could use is beef that is already in a slightly usable form, but you can then use yourself to make some edible food. For example, you can buy minced beef at a butcher, and then make your own hamburger patties, taco fillings etc from it. The final type of beef you could use is beef that has already been prepared so you can directly consume it, for example, sausages which can be cooked and consumed as is.

I consider these three kinds of beef to roughly correlate to the three kinds of automated web testing APIs.

The first is a Web Driver API, which is the rawest form of an API: its job is to drive a browser by issuing it commands. It provides a high level of user control, but like filleting a beef carcass it’s more ‘work’. An example in ruby of this API is the selenium-webdriver API, which controls the browser using the webdriver drivers.

The second kind of automated web testing API is the Browser API, which is a higher level API but still provides user control. This is the minced beef of APIs: whilst it’s in a more usable form than a carcass, you still have a lot of control over (and potential in) what you can do with it. An example in ruby of this API is the watir-webdriver API, which uses the underlying selenium-webdriver carcass to control the browser.

The final kind of automated web testing API is the Web Form DSL (Domain Specific Language) which is a very high level API that provides users with specific methods to automate web forms and their elements. This is the beef sausages of APIs as sometimes you feel like eating something else besides sausages, but it’s difficult to make anything else edible but sausages from sausages. An example in ruby of this Web Form DSL is the Capybara DSL.

Visually, this looks something like this:

Show me the code™

So exactly what do these APIs look like?

I knew you’d ask, that’s why I came prepared.

Say I want to accomplish a fairly basic scenario on my example Google Doc form:

  • Start a browser
  • Navigate to the watir-webdriver-demo form
  • Check whether text field with id ‘entry_0’ exists (this should exist)
  • Check whether text field with id ‘entry_99’ exists (this shouldn’t exist)
  • Set a text field with id ‘entry_0’ to ‘1’
  • Set a text field with id ‘entry_0’ to ‘2’
  • Select ‘Ruby’ from select list with id ‘entry_1’
  • Click the Submit button

This is how I would do it in the three different APIs:

# * Start browser
# * Navigate to watir-webdriver-demo form
# * Check whether text field with id 'entry_0' exists
# * Check whether text field with id 'entry_99' exists
# * Set text field with id 'entry_0' to '1'
# * Set text field with id 'entry_0' to '2'
# * Select 'Ruby' from select list with id 'entry_1'
# * Click the Submit button

require 'bench'

benchmark 'selenium-webdriver' do
  require 'selenium-webdriver'

  driver = Selenium::WebDriver.for :firefox
  driver.navigate.to 'http://bit.ly/watir-webdriver-demo'
  begin
    driver.find_element(:id, 'entry_0')
  rescue Selenium::WebDriver::Error::NoSuchElementError
    # doesn't exist
  end
  begin
    driver.find_element(:id, 'entry_99').displayed?
  rescue Selenium::WebDriver::Error::NoSuchElementError
    # doesn't exist
  end
  driver.find_element(:id, 'entry_0').clear
  driver.find_element(:id, 'entry_0').send_keys '1'
  driver.find_element(:id, 'entry_0').clear
  driver.find_element(:id, 'entry_0').send_keys '2'
  # find_element takes a single locator, so match the option by its visible text
  driver.find_element(:id, 'entry_1').find_element(:xpath, ".//option[normalize-space(.)='Ruby']").click
  driver.find_element(:name, 'submit').click
  driver.quit
end

benchmark 'watir-webdriver' do
  require 'watir-webdriver'
  b = Watir::Browser.start 'bit.ly/watir-webdriver-demo', :firefox
  b.text_field(:id => 'entry_0').exists?
  b.text_field(:id => 'entry_99').exists?
  b.text_field(:id => 'entry_0').set '1'
  b.text_field(:id => 'entry_0').set '2'
  b.select_list(:id => 'entry_1').select 'Ruby'
  b.button(:name => 'submit').click
  b.close
end

benchmark 'capybara' do
  require 'capybara'
  session = Capybara::Session.new(:selenium)
  session.visit('http://bit.ly/watir-webdriver-demo')
  session.has_field?('entry_0') # => true
  session.has_no_field?('entry_99') # => true
  session.fill_in('entry_0', :with => '1')
  session.fill_in('entry_0', :with => '2')
  session.select('Ruby', :from => 'entry_1')
  session.click_button 'Submit'
  session.driver.quit
end

run 10

This is how long they took for me to run:

                        user     system      total        real
selenium-webdriver  1.810000   0.840000  22.130000 ( 73.123340)
watir-webdriver     1.940000   0.870000  24.380000 ( 79.388494)
capybara            1.950000   0.890000  24.080000 ( 79.920051)
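To put those numbers in proportion, here is a small sketch that works out how much slower each higher level API was relative to selenium-webdriver, using only the ‘real’ column above. The constant and method names are my own, made up for this illustration.

```ruby
# Wall-clock ("real") seconds copied from the results table above.
REAL_SECONDS = {
  'selenium-webdriver' => 73.123340,
  'watir-webdriver'    => 79.388494,
  'capybara'           => 79.920051
}

# Percentage slowdown of each API relative to the baseline API.
def overhead_percent(results, baseline)
  base = results.fetch(baseline)
  results.each_with_object({}) do |(name, seconds), overheads|
    overheads[name] = ((seconds / base - 1) * 100).round(1)
  end
end

overhead_percent(REAL_SECONDS, 'selenium-webdriver').each do |name, percent|
  puts format('%-20s %5.1f%% slower', name, percent)
end
```

On these numbers, both higher level APIs come out less than 10% slower than raw selenium-webdriver.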

Note: Capybara doesn’t always require a ‘session’; it’s only needed for applications that aren’t ruby rack applications. Since my example (Google) is not a rack application, like most of the applications I test, my example must use the session.

When using ruby, why Watir-WebDriver is my sweet spot

I personally find Watir-WebDriver to be the most elegant solution, as the API is high enough for me to be highly readable/usable, but low enough to be powerful and for me to feel like I’m in control.

For example, being able to select an element by an explicit identifier (name, class name, id, anything) is a huge deal to me. I personally don’t like relying on the API to determine which selector to use: for example, Capybara only supports name, id and label, but you can’t tell fill_in which specific one to choose. It appears to try each selector one by one until it finds a match.

I have found that Watir-WebDriver also provides lots of flexibility/neatness. For example: it’s the only API shown here that allows URLs without a ‘http://’ prefix (how many people do you know who type http:// into a browser?).
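If you are stuck with an API that insists on the prefix, this kind of convenience is easy to add yourself. Here is a minimal sketch; with_default_scheme is a hypothetical helper of mine, and I’m not claiming this is how Watir-WebDriver does it internally.

```ruby
# Hypothetical helper: prepend a default scheme when the URL has none.
def with_default_scheme(url, scheme = 'http')
  # A URL "has a scheme" here if it starts with something like "http://".
  url =~ %r{\A[a-z][a-z0-9+.\-]*://}i ? url : "#{scheme}://#{url}"
end

puts with_default_scheme('bit.ly/watir-webdriver-demo') # prints "http://bit.ly/watir-webdriver-demo"
puts with_default_scheme('https://example.com')         # prints "https://example.com"
```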

In my opinion, high level APIs like Capybara don’t provide enough control (for example, being able to specify the explicit selector), while low level APIs like webdriver don’t provide enough functionality. This is evident when I use a language other than ruby (like C#): I find myself writing a large number of web element extension methods because webdriver doesn’t provide any of them. A .set method is a classic example. Even Simon Stewart, who wrote webdriver, writes a clearAndType method in his examples because webdriver sadly lacks one (you must call .clear, then .send_keys).
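To show how small the gap is, here is a rough ruby sketch of such a helper. clear_and_type is a hypothetical name of mine (not part of selenium-webdriver), and FakeTextField is only a stand-in so the example runs without a browser.

```ruby
# Hypothetical helper mirroring a Watir-style `set`:
# clear the field first, then type the new value.
def clear_and_type(element, text)
  element.clear
  element.send_keys(text)
  element
end

# Stand-in for a WebDriver element, so the sketch runs without a browser.
class FakeTextField
  attr_reader :value

  def initialize(value = '')
    @value = value.dup
  end

  def clear
    @value = ''
  end

  def send_keys(text)
    @value += text
  end
end

field = FakeTextField.new('1')
clear_and_type(field, '2')
puts field.value # prints "2"
```

With a real driver, the same helper would be called as clear_and_type(driver.find_element(:id, 'entry_0'), '2').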

My biggest concern about high level field APIs

But my biggest issue with the high level APIs is that I’ve frequently seen them used to write test scripts that are step by step interactions with a web form. Instead of thinking of a business application as that, people see it as a series of forms that you ‘fill in’. This leads to scenarios like the one Aslak Hellesøy included in his recent post about cucumber web steps (which use Capybara) and the problems they have created.

Scenario: Successful login
  Given a user "Aslak" with password "xyz"
  And I am on the login page
  And I fill in "User name" with "Aslak"
  And I fill in "Password" with "xyz"
  When I press "Log in"
  Then I should see "Welcome, Aslak"

I’m not saying it’s impossible to end up with something as ugly as the above using other APIs, but the web form DSL style naturally leads to it: the API looks so similar to these steps because that’s what the DSL was designed for, filling in forms. I’ve frequently seen people write generic, reusable cucumber steps that mirror the web form DSL, like:

When /^I fill in "(.+)" with "(.+)"$/ do |field, value|
  fill_in field, :with => value
end

But this means you end up with less readable, less maintainable test scripts rather than business readable executable specifications.
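One alternative is to hide the field by field interaction behind a single intent-revealing method, and call that from a business-level step such as ‘When I log in as "Aslak"’. Here is a minimal sketch of the idea. LoginPage, log_in_as, the field names and the RecordingBrowser stand-in are all hypothetical; the stand-in only exists so the example runs without a real browser.

```ruby
# Stand-in for a real browser: it just records each interaction.
class RecordingBrowser
  attr_reader :actions

  def initialize
    @actions = []
  end

  def set_text(field, value)
    @actions << [:set_text, field, value]
  end

  def click(button)
    @actions << [:click, button]
  end
end

# Hypothetical page object: one business-level method wraps the form details.
class LoginPage
  def initialize(browser)
    @browser = browser
  end

  def log_in_as(user_name, password)
    @browser.set_text('User name', user_name)
    @browser.set_text('Password', password)
    @browser.click('Log in')
  end
end

browser = RecordingBrowser.new
LoginPage.new(browser).log_in_as('Aslak', 'xyz')
browser.actions.each { |action| p action }
```

The scenario then talks about logging in, and only the page object knows which fields get filled in.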

Summary

Ultimately what I am looking for in an automated web testing API is simplicity and full control. I personally find browser APIs like Watir-WebDriver and Watir give me this, and this is why I love them so. Your mileage may vary, and you may prefer a different style of API. But I’ve seen other APIs badly abused by people who weren’t thinking about how they used them, so it pays to think about what you’re trying to achieve and whether your approach is the right way to get there.

Cukepatch: rich editing of feature files on Github

I’ve been a strong advocate of using the built in Github web text editor for editing feature files for some time, as it means that non-technical business users don’t need to worry about having git clients installed and pulling/pushing changes.

The benefit of this method over a publishing system such as Relish is that you can send a subject matter expert a URL to a Github feature file, and if they recognize that the specification is incorrect, they can immediately update it, unlike Relish where they need to go to the source code and push a change.

The downside is that the editor is pretty basic, meaning no syntax highlighting, step completion etc. Until now, that is…

Aslak Hellesøy and Julien Biezemans recently announced Cukepatch: rich editing of feature files on Github. There’s some detail on the Cukes Google Group about it, but essentially it provides rich editing (syntax highlighting/validation and step completion) using a Google Chrome user script that reads a cucumber file you create in your public cucumber repository.

I did this for both WatirMelonCucumber and EtsyWatirWebDriver, and the results look like this:

This looks very promising indeed. There are a few caveats at the moment, including the requirement for a backend server, only working with public repositories, having to manually install the user script and being Google Chrome only. As these are overcome, I can see this becoming the de facto way for business users to write and edit specifications. Well done guys.

WatirMelon Spinach

Hello Spinach!

“Spinach is a new awesome BDD framework that features encapsulation and modularity of your step definitions.”

Seems like a good idea, so I thought I’d give it a try and convert my existing simple Watirmelon Cucumber tests to use Spinach instead (link to my source code). It was very easy to do. Here are some observations:

  • It’s easy to get existing Cukes running using Spinach, but I imagine if you were starting out using Spinach you’d design things a lot differently
  • Steps belong in their own class, but you can include mixins to reuse steps – the ruby way
  • Steps are in a steps directory under features – shorter than step_definitions
  • Goodbye regular expressions in step definitions, which is a bit of a shame, as you can no longer capture values from the step name
  • As you can’t have regular expressions in step names, I find myself repeating steps that are similar but slightly different, this means my example steps have gone from 41 to 57 lines of code
  • Scenario Outlines aren’t supported at the moment, but I have raised this as a feature request
  • Hooks are dramatically improved, so I found them very easy to use and understand
  • There is no cucumber world, so you do a normal include instead, and env.rb is still supported
  • It displays really cool ticks on the command line when you’re running your spins

Well done to Codegram for releasing Spinach. If anything, it creates some great innovations that I imagine may find their way into Cucumber in the future.

Watir-WebDriver tests on Firefox 7: getting rid of the send data to Mozilla message

Update 6 October 2011: The send data to Mozilla question will be turned off by default in the next release (2.8.0) of the selenium-webdriver gem which watir-webdriver uses.

I’ve been running Watir-WebDriver tests against Firefox 7, which works superbly. The biggest change is that Firefox 7 now supports performance metrics, which means you can use the watir-webdriver-performance gem: yay! It also means my EtsyWatirWebDriver project now collects page metrics using Firefox.

The only slight annoyance is the presence of the ‘send data to Mozilla?’ dialog bar. Never fear, it’s easily dismissed.

require 'watir-webdriver'
profile = Selenium::WebDriver::Firefox::Profile.new
profile['toolkit.telemetry.prompted'] = true
b = Watir::Browser.new :firefox, :profile => profile

Enjoy.


C#: Avoiding the WebDriverException: No response from server for url

When it comes to automated testing, there’s not much worse than intermittent failures, especially when they stem from the driver itself. The current version of the C# WebDriver bindings has such a failure, but I worked out a reasonable way to avoid it happening. Basically it involves creating a WebDriver extension method that I use instead of Driver.FindElement, which tries a number of times to find the element, ignoring the exception that is intermittently raised.

I hope you find this useful if you’re consuming WebDriver in C#.

using System; // needed for Exception and Console
using OpenQA.Selenium;
using OpenQA.Selenium.Support.UI;
namespace Extensions
{
    public static class WebDriverExtensions
    {
        public static SelectElement GetSelectElement(this IWebDriver driver, By by)
        {
            return new SelectElement(driver.GetElement(by));
        }
        public static IWebElement GetElement(this IWebDriver driver, By by)
        {
            for (int i = 1; i <= 5; i++ )
            {
                try
                {
                    return driver.FindElement(by);
                }
                catch (Exception e)
                {
                    Console.WriteLine("Exception was raised on locating element: " + e.Message);
                }
            }
            throw new ElementNotVisibleException(by.ToString());
        }
    }
}
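For comparison, the same retry idea can be sketched in ruby. with_retries is a hypothetical helper of mine, not something the selenium-webdriver gem provides, and the flaky lookup below is simulated so the example runs without a browser.

```ruby
# Hypothetical helper: run the block up to `attempts` times, retrying when it
# raises one of `errors`. The last failure is re-raised to the caller.
def with_retries(attempts: 5, errors: [StandardError])
  tries = 0
  begin
    tries += 1
    yield tries
  rescue *errors => e
    raise if tries >= attempts
    warn "Attempt #{tries} failed: #{e.message}"
    retry
  end
end

# Simulated flaky lookup: fails twice, then succeeds.
calls = 0
element = with_retries(attempts: 5) do
  calls += 1
  raise 'No response from server for url' if calls < 3
  :element
end
puts "Found #{element.inspect} after #{calls} calls" # prints "Found :element after 3 calls"
```

With a real driver the block would wrap driver.find_element, and you would probably want to narrow errors to the driver’s own error classes rather than rescuing every StandardError.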

Watir-Page-Helper 0.3.0: now with added frames

I’ve just released version 0.3.0 of my watir-page-helper gem, with support for frames.

To use a frame, you define it as you would any other element:

class PageIFrame < BasePageClass
  direct_url TEST_URL
  frame :iframe, :id => "myiframe"
  link(:ilink) { |page|  page.iframe.link(:text => 'Link in an iFrame') }
end

and then you can use the frame, or any elements within that frame:

it "should support elements within a iframe" do
  page = PageIFrame.new @browser, true
  page.iframe.exist?.should be_true
  page.ilink_link.exist?.should be_true
  page.ilink
end

I hope you find this update useful.