An eye on usability

Editor’s note: Sandra Marshall is president and CEO, and Tim Drapeau is vice president of sales and business development, at EyeTracking, Inc., a San Diego research firm. Maritza DiSciullo is director of market intelligence at AT&T Broadband. This paper was first presented at the 2000 ARF Week of Workshops.

This article tells the story of an integration of qualitative and quantitative approaches to Web site usability. Working collaboratively, usability specialists from San Diego-based EyeTracking, Inc. (ETI) and AT&T investigated how two groups of users interacted with AT&T’s Customer Service Home Page.

The quantitative approach described here comes from an eye-tracking methodology developed by EyeTracking, Inc. ETI records eye movements in two ways: 1) a video that shows the point-of-gaze superimposed on the display seen by the user and 2) a precise record of horizontal and vertical pixel coordinates on the screen. The latter are recorded at 250 Hz, yielding 15,000 observations per minute for each eye. ETI’s contribution lies in the analytic techniques that synthesize this large body of data to reveal aspects of a user’s performance as he or she traverses a Web site.
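The sampling arithmetic behind these figures can be sketched as follows (an illustrative sketch; the function names are ours, not ETI's):

```python
# Illustrative sketch of the gaze-data volumes described above;
# names and layout are ours, not ETI's actual data format.
SAMPLE_RATE_HZ = 250  # one (x, y) observation every 4 ms

def observations_per_minute(rate_hz: int) -> int:
    """Point-of-gaze observations recorded per minute, per eye."""
    return rate_hz * 60

def observations_per_session(rate_hz: int, minutes: float, eyes: int = 2) -> int:
    """Total observations over a tracking session, both eyes combined."""
    return int(rate_hz * 60 * minutes * eyes)

per_minute = observations_per_minute(SAMPLE_RATE_HZ)    # 15,000 per eye
session = observations_per_session(SAMPLE_RATE_HZ, 30)  # a 30-minute session
```

At 250 Hz, a 30-minute session yields 900,000 coordinate pairs across both eyes, which is why the synthesis techniques mentioned above matter.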

The article is organized in three parts. The first part provides details of the study. The second part summarizes the results from the two approaches, both separately and in combination. Finally, the third part discusses the value added by this integrated approach.

The Web site usability study

In a one-day study, ETI tracked the gaze of 12 participants who interacted with the AT&T Customer Service Home Page. Two groups of users were recruited: those who were already users of the online customer service and those who were AT&T customers but not yet online users. The project was carried out in the usability lab at EyeTracking, Inc. Subjects were recruited and screened by an outside recruiting firm using customer information provided by AT&T.

Each user responded to a set of 13 tasks. Nine of the tasks were common to both groups of participants, and four tasks were unique to each group. Most of the tasks required a user to make a menu selection from the AT&T Customer Service Home Page and then to follow appropriate links to complete the task. All subjects completed the study in approximately one hour.

Technical details

The eyes of each participant were tracked for about 30 minutes. During this time, he or she wore the eye tracker shown in Figure 1 while interacting with the AT&T Web site, displayed on a 17” monitor at 800x600 resolution in the Internet Explorer browser. The user was free to use the mouse and type in normal fashion.

Figure 1

Project objectives

Prior to a redesign of its site, AT&T wanted to study the usability of the current site. This study was intended to provide baselines against which the newly designed site could be compared. Of primary interest were measures of overall effectiveness of the site, including ease of use and appeal to the customer. Also of immediate interest was the detection of any technical problems currently existing on the site.

Eye tracking is particularly useful for identifying problem areas as individuals work through tasks on a site. In the eye-tracking paradigm, tasks are presented one at a time to a participant, who is free to ask questions if she or he does not understand what to do. No other verbalization is required, although participants are free to make any comments they wish. They carry out the tasks independently, without interruption from the experimenter and without being asked either to describe what they are thinking or to explain why they are making particular choices. As a result, their natural performance can be observed without possible influence from an experimenter’s comments or questions.

In-depth interviewing brings additional information. It is here that the participant is asked to describe his or her reaction to a site, to point out likes and dislikes, and to explain why certain actions were taken.

Results of the study

Eye-tracking results
The primary foci of the eye-tracking analyses were the functionality and usability of the home page, because the home page was the critical starting point for every task. Failure to move from the home page to the appropriate next pages ensured task failure for a participant regardless of the structure of the subsequent pages.

Three essential characteristics of a Web site are usability, visibility, and optimization. The eye-tracking analyses focused on measures of these three features.

  • Usability. Three measures of usability were determined: success on the task, time to succeed, and degree of confusion shown by the participants. The first of these is a common measure and needs no further elaboration. Success in this study was defined by reaching a pre-determined page in the Web site, depending upon the specific task being undertaken. The success rates for experienced and new users were 88 percent and 81 percent respectively.

The second measure of usability is important because it reveals whether a high success rate masks important problems. If individuals can succeed but require an inappropriate amount of time to do so, the site is not truly easy to use. The measure was based on the time participants required to reach the criterion page for each task. The average times to complete tasks successfully were 40 seconds for experienced users and 42 seconds for new users. These averages are misleading, however, because of the large range across tasks. Figure 2 shows the average times by task for the two groups.

Figure 2

The third essential measure of usability is based on the degree of confusion or hesitation that users display in trying to navigate around a Web site. If users find the site understandable and easy to use, they tend to navigate quickly and easily. If they have difficulty understanding the logic of the site, they tend to have specific eye patterns that reveal this confusion.

Figure 3

Figure 3 shows an example of a user who is confused. On this task, the user is trying to decide which menu option is best. In the figure, each dot represents one observation of the individual’s point of gaze. Adjacent dots were recorded 4 msec apart. The colors indicate the sequence in which the observations were made. They occur in the following order: red, blue, yellow, green, purple, aqua, with each color representing 4 seconds of observations (i.e., there are 1,000 observations of each color).
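The mapping from observation index to color band follows directly from the 250 Hz sampling rate and can be sketched as follows (the color sequence comes from the figure; the helper itself is illustrative):

```python
# Illustrative sketch of the color coding described above: at 250 Hz,
# each 4-second band of 1,000 observations is drawn in one color.
COLORS = ["red", "blue", "yellow", "green", "purple", "aqua"]
SAMPLES_PER_BAND = 1000  # 4 s x 250 samples/s

def color_for_sample(index: int) -> str:
    """Color assigned to the gaze observation at a given 0-based index."""
    return COLORS[(index // SAMPLES_PER_BAND) % len(COLORS)]
```

The first 1,000 observations are red, the next 1,000 blue, and so on, so a six-color plot like Figure 3 covers 24 seconds of gaze data.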

In Figure 3, it is evident that the participant was in doubt about the menu selection: a very large number of observations fell repeatedly in the same area of the menu. He took more than 20 seconds to make his decision.

In contrast, the eye movements of Figure 4 show little or no confusion. The participant moved immediately to the menu and selected the appropriate option in less than 5 seconds.

Figure 4

Confusion about the menu is revealed only in the eye movements. The two participants shown in Figures 3 and 4 made exactly the same number of mouse clicks for menu selection. However, the participant shown in Figure 3 took about 20 seconds longer than the participant in Figure 4 to decide which menu item to select, and he was focused on the menu for most of that time (the red and blue dots were early observations, before he looked at the menu). The participant’s uncertainty is revealed in the multiple back-and-forth movements across the various menu options.

In these data, confusion was considered absent if there were few back-and-forth movements and the response time fell between 2 and 8 seconds. Confusion was judged to be present if there were repeated back-and-forth movements in one location and the response time fell between 8 and 24 seconds.

Confusions were tallied by subject across all tasks and averaged over groups. The overall average level of confusion was .63 for experienced users and .52 for new users. A value of 1.0 would indicate that every user experienced confusion on every task; a value of 0 would indicate no confusion at all. The observed values indicate that across all common tasks, more than half of the users in both groups showed high levels of confusion in using the menu.
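A minimal sketch of this scoring rule and index follows. The back-and-forth threshold is a hypothetical stand-in, since the article does not state how many repeated movements count as "repeated":

```python
# Minimal sketch of the confusion scoring described above; the
# move_threshold is a hypothetical stand-in for ETI's actual criterion.
def is_confused(back_and_forth_moves: int, response_time_s: float,
                move_threshold: int = 5) -> bool:
    """Classify one task attempt per the two rules in the text."""
    if back_and_forth_moves < move_threshold and 2 <= response_time_s <= 8:
        return False   # quick, direct navigation: no confusion
    if back_and_forth_moves >= move_threshold and 8 <= response_time_s <= 24:
        return True    # repeated scanning in one area: confusion
    return False       # cases outside both rules left unclassified here

def confusion_index(task_flags: list) -> float:
    """Fraction of (user, task) pairs flagged as confused, from 0.0 to 1.0."""
    return sum(task_flags) / len(task_flags)
```

Averaging the per-task flags in this way yields an index between 0 and 1, directly comparable to the .63 and .52 values reported above.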

It is important to consider the confusion index in addition to traditional measures such as overall success or time on task because it is entirely possible to perform a task successfully on a very unusable site. A site that is easy to use should have high rates of success, low times to completion, and very low rates of confusion.

  • Visibility. A second measure derived from patterns of eye movements indicates the extent to which all relevant regions of a display are noticed. When individuals are given time to explore a display, they tend to notice the outstanding features and to pick up its general organization by scanning around the page. It is useful to determine if some regions are attracting too little attention while others are attracting too much. The two GazeStats in Figures 5 and 6 illustrate this point.

Figure 5

Figure 5 shows the average percentage of time spent by experienced users in a free scan of the Customer Service Home Page, and Figure 6 shows the same thing for the new users. Notice the large differences between these two groups of users in their free scan of the home page. The experienced users spent significantly more time on the menu items to the left (which are the key navigation tools on this page) and on the text box in the lower center. The attention of new users was captured by the AT&T logo and the picture in the center. The differences between the two groups on all regions are statistically significant.

Figure 6
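A dwell-percentage computation of this kind can be sketched as follows. The region rectangles and hit-testing here are hypothetical stand-ins; the article does not describe how ETI's GazeStats actually partitions the page:

```python
# Sketch of a dwell-percentage computation like the GazeStats shown in
# Figures 5 and 6; region names and rectangles are hypothetical.
from collections import Counter

def dwell_percentages(samples, regions):
    """samples: iterable of (x, y) gaze points.
    regions: dict mapping region name -> (x0, y0, x1, y1) rectangle.
    Returns the percent of samples falling inside each region."""
    counts = Counter()
    for x, y in samples:
        for name, (x0, y0, x1, y1) in regions.items():
            if x0 <= x < x1 and y0 <= y < y1:
                counts[name] += 1
                break  # assign each sample to at most one region
    total = max(len(samples), 1)
    return {name: 100.0 * counts[name] / total for name in regions}
```

Because observations arrive at a fixed 250 Hz, the share of samples in a region equals the share of time spent there, which is what the figures report.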

  • Optimization. A simple measure reveals how efficiently users select items and complete transactions: the number of extraneous, non-page-changing menu items that are inspected (and then rejected). With optimal performance, a user will make few or no extra clicks on the menu items.

On an unfamiliar site, new users would be expected to make more extraneous clicks on menu items than experienced users, and this expectation was realized in the study. New users required an average of 22.8 clicks and experienced users registered an average of 17.7 total clicks for the nine common tasks. The optimal number of clicks for the set of nine common tasks is 10, indicating that both groups are well above the optimal number. (Note: This difference approaches but does not reach statistical significance because of the great range of clicks in each group and also because of the small sample size.)
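The arithmetic behind this comparison can be sketched as follows (the click counts are the study's reported averages; the helper function is illustrative):

```python
# Sketch of the extraneous-click measure described above; the click
# averages come from the study, the helper function is illustrative.
OPTIMAL_CLICKS = 10  # minimum clicks needed for the nine common tasks

def extraneous_clicks(total_clicks: float, optimal: int = OPTIMAL_CLICKS) -> float:
    """Clicks beyond the minimum needed to complete the task set."""
    return max(total_clicks - optimal, 0.0)

new_user_extra = extraneous_clicks(22.8)      # new users' average surplus
experienced_extra = extraneous_clicks(17.7)   # experienced users' average surplus
```

On these averages, new users made roughly 12.8 clicks more than necessary and experienced users roughly 7.7, so both groups were well above the optimum of 10.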

It should be noted that these extraneous clicks provide information above and beyond the confusion data reported above and shown in Figures 3 and 4. The confusion data suggest that participants were looking again and again at the different menu titles, perhaps trying to understand what they were and how to differentiate among them. The extraneous links indicate a searching strategy and suggest that users have not correctly remembered what the various category and subcategory menu items contain.

Interview results

The primary focus of the interviews was to ascertain why participants had trouble completing specific tasks, with special attention to features that the participants misunderstood.

During the eye-tracking session, an interviewer observed the participant through a one-way mirror. While monitoring the participant’s eye movements, this interviewer also listened to any verbalizations and observed the participant’s body language throughout the exercises. Notes were taken regarding tasks that appeared to be met with difficulty, so that these could be addressed in a debriefing interview with the participant.

In the debriefing interview, the participant was asked for general impressions of the Web site and was then probed about his or her impressions of certain tasks. The interviewer was able to follow up by asking the participants to review their thought processes during certain tasks and to explain their decision processes.

Having difficulty

Both success rate and confusion rate are high for experienced users, indicating that they can succeed in doing simple tasks but are having difficulty using the site. The high confusion rates suggest that wording or organization of the menu is unclear. One would expect confusion to drop with increased familiarity, but this is not the case. For the experienced users, confusion was highest toward the end of the study.

All regions of both home pages were noticed by most users. The text of the Customer Service Home Page captured quite a bit of attention, suggesting that important information placed here would be seen by both new and experienced users.

The large number of extra menu clicks and extraneous page links suggest that visitors are not using the site in optimal ways. The menu clicks may be caused by a number of factors, including failure to understand the categories, the absence of desirable categories, or the hierarchical structure of the categories.

The traversing of extra pages is probably related to misunderstandings about menu items. Users tended to click on a menu item, open a page, and then return immediately to the menu in order to select another item.

Corroborative results

The combination of eye tracking with in-depth interviewing produced strong corroborative results. Through this process we were able to verify that the aspects we interpreted as confusing were in fact confusing and we were able to understand why. This integration provided the research team with a richer interpretation than either technique alone. The result is a set of strong, detailed recommendations for site improvement that have both statistical and qualitative bases.