Windows UI Automation Fundamentals

This section introduces the principles behind UI structures and explains how the UI Automation (UIA) plugin has been designed to let TestOptimal interact with them. It is recommended reading for anyone new to UI Automation or the UIA plugin.

First Principle

The first principle is to think of UI (or GUI, Graphical User Interface) elements as “objects”. The following diagram illustrates a generic UI element. For now, try not to visualize each element by its type (button, text field, panel, etc.), but simply as a generic element.

  • A UI element has generic properties that define its size, position, and UI element type.
  • Accessibility tags are also part of the properties, providing the accessibility name and automation ID.
  • Type-specific information is also stored in the properties, identifying the actions and additional values that apply to the element’s type.

Some of the above properties are generated automatically at runtime; others are defined by the developer when the code is written.

The underlying OS identifies each UI element by a hWnd (window handle). It serves like a memory address that is unique to every UI element currently in existence (whether visible or not), regardless of element type. No two UI elements that exist at the same time ever share the same hWnd.
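As a rough illustration of the "element as object" idea, the sketch below models a generic UI element in Python. It is a toy model only: the `UIElement` class, its field names, and the handle counter are assumptions made for this example, not the data structures used by Windows or by the UIA plugin.

```python
from dataclasses import dataclass, field
from itertools import count

# Hypothetical stand-in for the OS handle allocator; real hWnd values
# are assigned by Windows, not by application code.
_next_handle = count(start=0x1000)


@dataclass
class UIElement:
    """Toy model of a generic UI element (illustrative names only)."""
    control_type: str                  # generic property: the UI element type
    bounds: tuple = (0, 0, 0, 0)       # generic properties: x, y, width, height
    name: str = ""                     # accessibility name
    automation_id: str = ""            # automation ID, defined by the developer
    type_specific: dict = field(default_factory=dict)  # actions/values for this type
    # Generated at creation time, independent of the element type.
    hwnd: int = field(default_factory=lambda: next(_next_handle))


ok_button = UIElement("Button", (10, 10, 80, 24), name="OK",
                      automation_id="btnOk",
                      type_specific={"actions": ["Invoke"]})
result_label = UIElement("Text", (10, 40, 80, 16), name="Result")

# Two elements of different types still receive distinct handles.
print(hex(ok_button.hwnd), hex(result_label.hwnd))
```

The point of the model is that the type is just one property among many, while the handle identifies the element itself.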

Second Principle

The second principle is to remember that the underlying OS maps UI elements in a tree. UI elements are always nested inside containers. For example, the following diagram illustrates the tree for the default Microsoft Calculator application.

Every UI application in Windows has a main ‘Frame’ or ‘Pane’ UI element that serves as the primary container within which all other UI elements of that application are constrained. In this case, the Frame hosts three core UI elements: a Title Bar, a Menu Bar, and another Pane.

The Title Bar hosts the frame controls and the application drop-down menu. Among its properties is a text value that presents the application name to the user.

The Menu Bar hosts the drop-down menus.

The Pane hosts a collection of text fields and buttons.

Every UI element in the above tree is assigned a unique hWnd when it is created, and keeps that hWnd until it is destroyed. If several instances of the same application are open, for example several instances of the Microsoft Calculator, then even though they display the same tree, the hWnd of each element is still unique.
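The tree described above can be sketched with a small Python model. This is an illustration under stated assumptions: the `Node` class, the handle counter, and the specific children placed inside the Pane are hypothetical and only mirror the Frame / Title Bar / Menu Bar / Pane layout described in the text.

```python
from itertools import count

_handles = count(1)  # hypothetical handle source; Windows assigns real hWnds


class Node:
    """Toy UI element that can contain child elements."""

    def __init__(self, control_type, name="", children=()):
        self.hwnd = next(_handles)       # fixed for the lifetime of the element
        self.control_type = control_type
        self.name = name
        self.children = list(children)

    def walk(self):
        """Yield this element and all of its descendants, depth first."""
        yield self
        for child in self.children:
            yield from child.walk()


def build_calculator():
    """Build one instance of the tree: Frame -> Title Bar, Menu Bar, Pane."""
    return Node("Frame", "Calculator", [
        Node("TitleBar", "Calculator"),
        Node("MenuBar"),
        Node("Pane", children=[            # illustrative contents only
            Node("Edit", "Result"),
            Node("Button", "1"),
            Node("Button", "+"),
        ]),
    ])


def shape(root):
    """Describe a tree by element type and name, ignoring handles."""
    return [(node.control_type, node.name) for node in root.walk()]


first = build_calculator()
second = build_calculator()
handles_first = {node.hwnd for node in first.walk()}
handles_second = {node.hwnd for node in second.walk()}

print(shape(first) == shape(second))              # same tree structure
print(handles_first.isdisjoint(handles_second))   # no handle is shared
```

Both instances report the same structure, yet no handle appears in both trees, which is why a handle rather than a name or type is what ultimately identifies an element.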

Third Principle

Thirdly, and lastly, interaction and control via UI Automation consist of obtaining the hWnd of the UI element we are interested in. With that handle, we can send action requests via the OS, or retrieve the properties associated with the UI element to verify its state.

So how do the UIA plugin and TestOptimal interact with UI elements? The Windows environment we operate in will have a number of applications open at any given time. A UI element can be searched for using a few criteria:

  • The accessibility name tag text,
  • The UI element type,
  • The help text, or
  • The automation ID (applicable only to Windows native UI elements).

Some of these criteria can be combined in a single search. Because a number of applications can be open at any one time, and each application can contain several, if not hundreds, of UI elements, recursively searching through every tree to find a single element can be quite processor- and time-intensive. To speed up the search, the following steps are performed:

  1. First, search for the root window, i.e. the frame / pane that forms the application’s primary UI container.
  2. With the root window located, we can then search only within that window’s UI tree for the UI element we are interested in.
  3. Sometimes the UI element we are interested in may be ambiguous (e.g. it was not given an automation ID, does not have a unique accessibility name, and is surrounded by similarly ambiguous UI elements). An alternative is to search for a uniquely identified reference point, which narrows the field of search, and then search relative to that reference point.
  4. With a handle obtained for the UI element of interest, we can send queries to the OS UI engine to retrieve the value of any given property. As TestOptimal MScript operations can only deal with strings, reads have to be done one property code at a time. The values returned can then be tested against expected values to verify the application UI state.
  5. Additionally, actions can be sent to the UI element, if they are applicable to its type and current state. The actions that can be sent are the events that directly result from system and user level operations. If the UI element does not support a given action, or its state does not permit it (e.g. it is currently disabled), the action is discarded and a failure result is returned.
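The five steps above can be sketched against a toy element tree in Python. Everything here is hypothetical and for illustration only: the `Element` class, the `find`, `read_property`, and `send_action` helpers, the element names, and the automation ID are assumptions of this example, not the UIA plugin's MScript functions or the Windows UI Automation API.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional


@dataclass
class Element:
    """Toy UI element; attribute names are assumptions for this sketch."""
    control_type: str
    name: str = ""
    automation_id: str = ""
    enabled: bool = True
    actions: tuple = ()
    children: List["Element"] = field(default_factory=list)

    def walk(self) -> Iterator["Element"]:
        yield self
        for child in self.children:
            yield from child.walk()


def find(root: Element, **criteria) -> Optional[Element]:
    """Return the first element under root that matches every criterion."""
    for element in root.walk():
        if all(getattr(element, k) == v for k, v in criteria.items()):
            return element
    return None


def read_property(element: Element, prop: str) -> str:
    """Read one property at a time and return it as a string."""
    return str(getattr(element, prop))


def send_action(element: Element, action: str) -> bool:
    """Return False when the action is unsupported or the element is disabled."""
    return element.enabled and action in element.actions


calculator = Element("Frame", "Calculator", children=[
    Element("TitleBar", "Calculator"),
    Element("MenuBar"),
    Element("Pane", children=[
        Element("Edit", "Result"),
        Element("Button", "Add", actions=("Invoke",)),
        Element("Pane", "Memory", automation_id="memoryPane", children=[
            Element("Button", enabled=False, actions=("Invoke",)),  # unnamed
        ]),
    ]),
])
desktop = [calculator, Element("Frame", "Untitled - Notepad")]

# Step 1: locate the root window among the top-level windows only.
root = next(window for window in desktop if window.name == "Calculator")
# Step 2: search inside that window's tree, combining two criteria.
add_button = find(root, control_type="Button", name="Add")
# Step 3: find a uniquely identified reference point, then search relative to it.
anchor = find(root, automation_id="memoryPane")
unnamed_button = find(anchor, control_type="Button")
# Step 4: read a single property as a string and compare it to an expectation.
enabled_text = read_property(unnamed_button, "enabled")
# Step 5: send actions; a disabled element rejects the request.
add_result = send_action(add_button, "Invoke")
unnamed_result = send_action(unnamed_button, "Invoke")
print(enabled_text, add_result, unnamed_result)
```

Restricting the search to one root window, and then to a reference point inside it, keeps each traversal small compared with walking every open application's tree.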

More Info

You may find more information on UI Application test automation at UI App Automation Discussion/Forum.
