Review: Automate tasks visually with Sikuli X

Project Sikuli promises visual scripting that can automate almost anything in Windows by way of screenshots.

Scripted task automation can be a real time-saver for repetitive, mundane tasks in Windows. Some of us, though, don't want to hunker down and learn a conventional scripting language, because the learning curve can be too steep for the impatient. Some folks, like myself, are very graphically oriented, and scripting with visual building blocks can make the job not only less of a chore but engaging as well.

Students at MIT have an interesting development venture called Project Sikuli that promises to deliver graphically aided scripting. The learning curve is very low, and anyone can pick it up with relative ease. In fact, if you remember the classic Desktop Application Director software (affectionately known as DAD) that was bundled with WordPerfect 6.0 and later for Windows, Sikuli X is very similar in functionality.

  • Title: Sikuli X
  • Company: MIT's Project Sikuli Team
  • Product URL: http://sikuli.csail.mit.edu/
  • Supported OS: Windows XP, Vista, 7, and 8
  • Price: Free
  • Rating: 4 out of 5
  • Bottom Line: Despite some initial quirks and growing pains, Project Sikuli offers real promise for "visual" scripting in order to automate anything in Windows and its applications via screenshots. It's ingenious and definitely worth checking out.

Macro recorder

To better explain what this means, Sikuli is, for all intents and purposes, a graphical macro recorder. Essentially, you have a scripting app that provides two things: a very basic written language that is easy to understand, and the ability to use screenshots. The latter is what makes Sikuli interesting.

When you are ready to script an event for automation, you take snapshots of the user interface elements within an application as you perform each step by hand. As you do this, you click on functions in the menu on the left-hand side; each function is an operation that takes part of a screenshot as its target.

For example, if I choose the "waitVanish()" option and screen-grab an element that is supposed to disappear, Sikuli will wait until that element disappears before proceeding to the next step. You can also automatically click buttons, type in text boxes and even drag and drop items from one place to another. Once the steps have been recorded and you are satisfied with your automated script, you can save it as a Sikuli executable file, which can then be run and distributed to others as necessary.
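The wait-until-gone behavior can be pictured as a simple polling loop. The following is a minimal sketch in plain Python, not Sikuli's actual implementation: `is_visible` is a hypothetical callable standing in for the screenshot match, and the timeout and interval values are assumptions chosen for illustration.

```python
import time


def wait_vanish(is_visible, timeout=3.0, interval=0.1):
    """Poll until is_visible() returns False or the timeout elapses.

    Returns True if the element vanished in time, False otherwise.
    is_visible is a hypothetical stand-in for a screenshot match.
    """
    deadline = time.monotonic() + timeout
    while True:
        if not is_visible():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Simulated element that is seen three times and then disappears.
checks = iter([True, True, True, False])
vanished = wait_vanish(lambda: next(checks), timeout=2.0, interval=0.01)
```

Returning a flag rather than raising an error lets the calling script decide whether a missed disappearance should stop the run or simply be logged.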

An excellent use case for Sikuli is remote technical support. Say you have a friend who doesn't know how to properly configure a static IP in Windows 7 and you need to help him out. Rather than talking him through the steps over the phone, just create a Sikuli script and send it to your friend. When he runs it, the script leaps to life as the mouse and keyboard work through each step automatically.

Another example is a script I made that takes multiple Mediafire download links off the Windows clipboard and enters them, one at a time, into a web browser for downloading. This is an excellent time-saver, since I no longer have to sit and click "Download" on each new link after the previous file finishes.
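The first half of a script like that, turning a blob of clipboard text into a queue of individual links, might look something like the sketch below. This is plain Python written for illustration, not the script described above; the sample URLs are invented placeholders, and reading the clipboard and driving the browser are deliberately left out.

```python
import re

# Anything that starts with http:// or https:// and runs to the next whitespace.
URL_PATTERN = re.compile(r"https?://\S+")


def extract_links(clipboard_text):
    """Return the unique URLs found in the text, in first-seen order."""
    seen = set()
    links = []
    for url in URL_PATTERN.findall(clipboard_text):
        if url not in seen:
            seen.add(url)
            links.append(url)
    return links


# Placeholder text standing in for whatever was copied to the clipboard.
sample = """http://example.com/file1
http://example.com/file2
some stray note
http://example.com/file1
"""
queue = extract_links(sample)
```

Dropping duplicates up front means the visual part of the script never has to deal with the same download twice.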

Some aspects to consider

That said, visual scripting has its disadvantages. For one, if user interface elements change from time to time, such as a dynamic, clickable list of items like a list of VMs, Sikuli can get confused when the visuals on screen don't match what is contained in the script 100%.

Changing a filename of a file that is supposed to be clicked and opened in a Sikuli script will break the action midway and bark an error message out. As is the case with screenshots, everything needs to be pretty much pixel perfect or else the script might fail.
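One way to picture why a small visual change breaks a step is to treat matching as a similarity score compared against a threshold. The sketch below is a toy illustration in plain Python, not Sikuli's matching algorithm: the tiny pixel grids, the `similarity` and `matches` helpers, and the threshold values are all assumptions made for the example.

```python
def similarity(expected, actual):
    """Fraction of pixels that are identical in two equally sized grids."""
    if len(expected) != len(actual) or any(
        len(row_e) != len(row_a) for row_e, row_a in zip(expected, actual)
    ):
        raise ValueError("grids must have the same dimensions")
    total = sum(len(row) for row in expected)
    same = sum(
        p == q
        for row_e, row_a in zip(expected, actual)
        for p, q in zip(row_e, row_a)
    )
    return same / total


def matches(expected, actual, threshold=1.0):
    """True when the similarity score reaches the threshold."""
    return similarity(expected, actual) >= threshold


# Hypothetical 2x4 "screenshots"; the second differs by a single pixel.
captured = [[0, 0, 1, 1], [0, 1, 1, 0]]
on_screen = [[0, 0, 1, 1], [0, 1, 0, 0]]

strict = matches(captured, on_screen)         # pixel-perfect requirement
relaxed = matches(captured, on_screen, 0.8)   # tolerates small differences
```

With a threshold of 1.0, a single changed pixel is enough to fail the match, which is the situation a renamed file or reshuffled list creates; a looser threshold tolerates small differences at the cost of more false positives.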

Another issue I noticed is that you must have the 32-bit (x86) Java JRE 6 installed for Sikuli X to run. Java is not bad in and of itself; the problem is that version 6 is required. With version 7 of Oracle's Java becoming the new standard, version 6 is likely to have its support cut soon, and it would be nice for Sikuli to be compatible with the newer Java.

Bottom line

Overall, Project Sikuli is a very interesting way to automate processes using screenshots and easy-to-understand commands. Now you can bring macros, as you think of them in Microsoft Office, to the rest of the desktop and all of its applications. It's easy to learn, it's constantly improving and it's available free of charge. If you are okay with using a previous version of the JRE, I would recommend this software.

About

An avid technology writer and an IT guru, Matthew is here to help bring the best in software, hardware and the web to the collective consciousness of TechRepublic's readership. In addition to writing for TechRepublic, Matthew currently works as a Cus...

1 comments
Mark W. Kaelin

This is an interesting idea. If you give Sikuli X a try, let us know how it works for you? Is there real potential here?