
19 Aug 2012, 12:17
Dean Cornish (5 posts)

Hi All, love your work. You have a great recipe book and I can’t wait to see it in its completed state. Some feedback…

I find the inclusion of Sikuli as a testing tool an intriguing one. As a former ThoughtWorks Test Consultant, I found very few situations where Sikuli was actually a good fit. I frequently wired it up with Cucumber; that wasn’t the problem, that was fairly easy.

The challenge I see is that just because Sikuli can do something doesn’t mean it should be used to do that thing. Even though every word is valuable in your book, I’d recommend putting some wording around this section advising readers to exhaust all options that are native to the UI being hit and only use Sikuli when they’re absolutely stuck. The horror I see here is that this book could just be renamed the Cucumber and Sikuli book, because one could argue “why would I need any other framework when Sikuli works?”

From what I read about the MIT work on this, it was never intended as a testing tool; it was just intended to automate simple UI operations. The fact that the JARs were there seems to have led to it becoming a de facto testing tool, but perhaps (I would argue) in error.

While it’s relatively easy to write the scripts, deploying the dev environment onto a CI box and subsequent stages in the pipeline is very frequently a cow. The need for an interactive user session also adds to the complexity.

I would also argue, if the Application Under Test is so complex that Sikuli seems like a good fit, then they need to simplify the app, or solve the blockers that stop native libraries from working before Sikuli turns their test automation stack on its head.



21 Aug 2012, 04:44
Ian Dees (212 posts)

Hi, Dean.

Totally agreed that a native toolkit is the easiest and most reliable thing to get started with. I like your idea of adding a note to this effect. How about something like this?

bq. Why Not Use Sikuli All the Time?

bq. If Sikuli is cross-platform, then why did we include platform-specific recipes at all in this book? In our experience, and in that of the readers who have shared their stories with us, a native testing library brings more reliability and better integration with GUI controls than Sikuli does.

bq. Our advice is to prefer a native approach, and consider Sikuli for apps that are cross-platform (or running on a platform with poor automation support) after you’ve tried other approaches.

Would something like that work?

27 Aug 2012, 02:52
Dean Cornish (5 posts)

Very good- thanks Ian!- you’re saving me severe heartburn down the line when someone brings me in and says “We need you to help fix our Sikuli testing suite” ;-)

28 Aug 2012, 21:34
Matt Wynne (92 posts)

How about if we get a horror story from Dean included in the book?

29 Aug 2012, 23:32
Ian Dees (212 posts)

Ooh, I like that. Dean, would you be willing to write a couple of paragraphs (with the characters sufficiently anonymized) about a rough experience with Sikuli? We could include it as a story inside the recipe with your name on the byline.

31 Aug 2012, 05:12
Ian Dees (212 posts)

Hi, gb.

The @.png@ files are in the downloaded zipfile, under @code/sikuli/*.png@. Does the script run to completion if you copy the three @.png@ files into your project directory (one level above the @features@ subdirectory)?

Good find on the Sikuli gems. I’ll give them a shot and see how they compare to the roll-our-own approach.



03 Sep 2012, 05:18
Dean Cornish (5 posts)

Happy to write up something- just might need some help from Ian and co to turn it into something fit for consumption.. ;)

03 Sep 2012, 07:35
Dean Cornish (5 posts)

Hi Ian, I wrote way, way too much; please feel free to edit/reword/chop to your heart’s content. If the example isn’t strong enough I’m happy to pick another.

PS. turns out about 9 months ago I added a hybrid example using Watir-WebDriver and Sikuli onto github-

Some contexts where I’ve used Sikuli and found it to be helpful:

  • Native windows invoked from a web browser, e.g. print dialogs or “Save As”.
  • PDF file contents opened in a 3rd party application.
  • Image content testing, e.g. search for a particular image on a search engine and ensure you actually got the right image.
  • Flash.
  • Overly complex UI implementation, e.g. nested JavaScript with obscure calls, presented in multiple layers.
  • Mixed technology applications where no one tool can drive through the whole scenario.
  • Win32 (C++/Delphi)/.NET UI with lots of custom controls.
  • Citrix hosted apps (it’s all one great big image anyway).

One particular horror story was a Delphi app that was in a bit of a mess. It implemented win32 controls as well as custom ones that had been bought from a vendor; both had been abstracted many times, with the parent control behaviours concealed by the abstraction process. Interrogating the UI via COM returned next to nothing, and using .NET to walk the UI would cause .NET to throw an exception and give up! This eliminated .NET’s UI Automation, and therefore Project White, as options. I was quickly finding that native tools weren’t working, and I was under pressure to turn something around.

My first step was to communicate to the team that building automation for an app that was clearly never intended to be automated wasn’t working. They needed to take responsibility for it and start refactoring the code, fixing the lack of testability and adding unit tests as they went, because the test automation wouldn’t be sustainable even with Sikuli as a workaround. Meanwhile I’d build a small test suite using Sikuli just to get them past the blocker and to give them some coverage while they refactored. Shortly after this, some code changes were made that unblocked COM access, after which I ended up pairing with another dev and implementing automation using COM as our entry point. It wasn’t smooth sailing even with this approach, but years later it was still being used, even though that option was intended just as a stop-gap measure until the app was rewritten (which back then was going to be a near-term event). The entire time we depended on Sikuli was around 2 weeks. The lessons learned about Sikuli in that time were:

Pros:

  • Very good at matching images.
  • Quick once it’s running, in some cases faster than native calls without caching.
  • Can drive UI that would otherwise take you weeks to figure out how to call.
  • Reliable once the app you’re testing has stabilised.
  • Seldom crashes.
  • Can be used to build scenarios from screenshots before the actual software is delivered, allowing some degree of ATDD to be undertaken and then adjusted accordingly.
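To make the image-matching point a little more concrete, here is a toy sketch of threshold-based matching. It is not Sikuli's implementation (Sikuli delegates the real work to OpenCV); small grids of integers stand in for pixels, and the names `similarity`, `find`, `SCREEN`, `BUTTON` and `RESTYLED` are made up for illustration. The 0.7 default mirrors Sikuli's usual minimum similarity, but treat the scoring itself as an assumption.

```python
# Toy stand-in for image matching: grids of ints play the role of pixels.
# This is NOT how Sikuli is implemented; it only illustrates the idea of
# accepting a match when a similarity score clears a threshold.

def similarity(screen, pattern, top, left):
    """Fraction of pattern pixels equal to the screen pixels at (top, left)."""
    total = len(pattern) * len(pattern[0])
    same = sum(
        1
        for r, row in enumerate(pattern)
        for c, pixel in enumerate(row)
        if screen[top + r][left + c] == pixel
    )
    return same / total


def find(screen, pattern, min_similarity=0.7):
    """Return (top, left, score) for the best match at or above the threshold, else None."""
    best = None
    for top in range(len(screen) - len(pattern) + 1):
        for left in range(len(screen[0]) - len(pattern[0]) + 1):
            score = similarity(screen, pattern, top, left)
            if score >= min_similarity and (best is None or score > best[2]):
                best = (top, left, score)
    return best


# A 4x6 "screenshot" with a 2x3 "button" drawn at row 1, column 2.
SCREEN = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 7, 7, 7, 0],
    [0, 0, 7, 9, 7, 0],
    [0, 0, 0, 0, 0, 0],
]
BUTTON = [
    [7, 7, 7],
    [7, 9, 7],
]
# A stale capture of the same button: three of its six pixels no longer
# agree with what is drawn on the screen.
RESTYLED = [
    [7, 5, 5],
    [7, 5, 7],
]

print(find(SCREEN, BUTTON))    # exact capture: found at row 1, column 2
print(find(SCREEN, RESTYLED))  # stale capture: nothing clears the threshold
```

The same property cuts both ways: a capture that still looks like the screen is found quickly, while a capture that has drifted from the screen simply stops matching.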

Cons:

  • It can type and click, but can’t read from the target application. It compares one image with the current one, so dynamic values are a major problem. Consider OCR or other potential workarounds, but none of them is attractive.
  • Other applications and dialogs that sit on top of your app, and therefore obscure it, will break the automation. Adding code that finds the window and brings it to the front on every interaction is one workaround, but it’s annoying to have to do it.
  • Compiling the required libraries for OpenCV and other dependencies can be painful, especially as they get older.
  • Complex interactions, e.g. scrolling up/down or across, quickly blow out the complexity of the scripts.
  • Small UI changes can and often do cause mass re-recording of captured images. If not done carefully, this can completely invalidate the test scenario.
  • Difficult and time-consuming to debug, as it’s a “change”, then “re-run” and “check” model.
  • Can’t be run headlessly unless you’re running a session in a framebuffer (Xvfb is your friend) or another similarly hidden-away interactive session. This creates challenges when executing tests on CI boxes that have no window manager.
  • Difficult to execute tests in parallel, because Sikuli needs absolute control of pointing and clicking.
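The bring-to-front workaround from the second con can be kept out of the test steps with a thin wrapper. What follows is a minimal sketch under stated assumptions: `FocusFirst` and `FakeDriver` are hypothetical names, the fake driver only records calls, and the `bring_to_front` callback stands in for whatever window-activation call your platform or tool actually provides.

```python
class FocusFirst:
    """Wraps a GUI driver so every action first brings the app window to the front."""

    def __init__(self, driver, bring_to_front):
        self._driver = driver
        self._bring_to_front = bring_to_front

    def __getattr__(self, name):
        # Only called for attributes not found on the wrapper itself,
        # so lookups of _driver and _bring_to_front never recurse.
        target = getattr(self._driver, name)
        if not callable(target):
            return target

        def focused(*args, **kwargs):
            self._bring_to_front()          # refocus before every interaction
            return target(*args, **kwargs)  # then delegate to the real driver

        return focused


class FakeDriver:
    """Hypothetical stand-in for an image-based driver; it only records calls."""

    def __init__(self, log):
        self.log = log

    def click(self, image):
        self.log.append("click " + image)

    def type(self, text):
        self.log.append("type " + text)


log = []
driver = FocusFirst(FakeDriver(log), lambda: log.append("focus"))
driver.click("save_button.png")
driver.type("report.pdf")
print(log)  # ['focus', 'click save_button.png', 'focus', 'type report.pdf']
```

This keeps the refocusing in one place, though it does nothing about the underlying fragility: a dialog that pops up between the focus call and the click will still break the step.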

05 Sep 2012, 21:11
Ian Dees (212 posts)

Dean, this is fantastic, thank you! Can you write to me off-list at, so that we can send you a permission form to use your story?

