I just finished reading a very interesting article (“The Lumiére Project: Bayesian User Modeling for Inferring the Goals and Needs of Software Users”) from Microsoft Research. Lumiére was the name of the project that developed mathematical models for assessing the needs and goals of software users in real time, in order to present users with accurate, relevant assistance as they worked. Does this sound like something that users would be interested in? I certainly thought so!
However, after reading the Lumiére paper, I may be changing my mind a bit. To understand why, we need to take a brief trip through time.
The Lumiére project began in 1993 and was first shown to another team at Microsoft in 1994 (telling you which team would ruin the surprise!). That other team liked what they saw, and built a special version of one of their products that allowed the Lumiére group to hook into it and begin showing how the Lumiére techniques could improve the product.
During this period of time, teams of experts (primarily psychologists, from what I can tell) were set up to observe users in a sort of “Chinese Room” type experiment. Users were told that they were using an experimental system to help them work. The experts observed the users, but were not told what the users were trying to do. The experts tried to identify the tasks that the users were accomplishing, and then provide the users with relevant assistance. The way the experiment was set up, users were not sure if it was a computer or humans presenting the help, because they were isolated from the experts. This was done in order to understand what types of tasks were easily identified and for which of those tasks assistance would be helpful to the user.
Lumiére uses Bayesian networks of probabilities (Bayesian techniques are also commonly used in spam filters) to identify what the user is trying to do and what they need help with. It bases this on their past tasks, and takes into account the users’ demonstrated abilities, levels of expertise, and historical “pain points.” It determines the users’ needs and goals from a variety of events (and non-events, such as idling the mouse over a menu option for a period of time), using a set of probabilities that age and become less relevant over time. Lumiére then makes suggestions based upon what the user is most likely stuck on, or offers a shortcut to accomplish their goals faster, even if they already know what they are doing.
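To make the idea of "probabilities that age" a little more concrete, here is a minimal sketch in Python. It is not Lumiére's actual model: it is a simplified naive-Bayes style update rather than a full Bayesian network, and the goals, event names, likelihood numbers, half-life, and threshold are all invented for illustration. Each observed event nudges the estimate of what the user is trying to do, and older events count for less.

```python
import math

# Hypothetical goals and likelihoods P(event | goal). These numbers are
# invented for illustration and are not taken from the Lumiére paper.
LIKELIHOODS = {
    "write_letter": {"typed_dear": 0.60, "hovered_chart_menu": 0.05},
    "make_chart": {"typed_dear": 0.05, "hovered_chart_menu": 0.60},
}
PRIOR = {"write_letter": 0.5, "make_chart": 0.5}
HALF_LIFE = 30.0  # seconds until an event counts half as much (assumed value)


def infer_goal(events, now, prior=PRIOR, likelihoods=LIKELIHOODS,
               half_life=HALF_LIFE):
    """Return P(goal | events), down-weighting older events.

    `events` is a list of (event_name, timestamp_in_seconds) pairs.
    """
    log_scores = {goal: math.log(p) for goal, p in prior.items()}
    for name, timestamp in events:
        # Evidence weight halves every `half_life` seconds.
        weight = 0.5 ** ((now - timestamp) / half_life)
        for goal in log_scores:
            log_scores[goal] += weight * math.log(likelihoods[goal][name])
    # Normalize in a numerically stable way so the results sum to 1.
    peak = max(log_scores.values())
    unnormalized = {g: math.exp(s - peak) for g, s in log_scores.items()}
    total = sum(unnormalized.values())
    return {g: v / total for g, v in unnormalized.items()}


def should_offer_help(posterior, threshold=0.8):
    """Suggest help only when one goal is clearly the most likely."""
    goal = max(posterior, key=posterior.get)
    return goal if posterior[goal] >= threshold else None


# The user typed "Dear" 100 seconds ago, then just hovered over the chart menu.
events = [("typed_dear", 0.0), ("hovered_chart_menu", 100.0)]
posterior = infer_goal(events, now=100.0)
print(posterior, should_offer_help(posterior))
```

In this toy version, the recent hover outweighs the stale typing event, so the system leans toward the chart goal. The threshold is the knob that matters for the rest of this story: set it too low and the assistant interrupts constantly, set it too high and it never speaks up.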
Does this sound familiar yet? It should. Let us finish the history of Lumiére first, though.
The application development team was so impressed with the results of the Lumiére/application integration that Lumiére was adapted to that particular application (and related applications) in a less advanced form. I am not sure if the full Lumiére project was stripped down to meet shipping deadlines, reduce the system resource needs, or meet the needs of a broad user base, but much of its advanced functionality (including tracking and profiling users to determine individual skill levels) did not make it to the final product. Lumiére powered what was probably the most hated feature in software history.
The application? Microsoft Office 97. Lumiére’s contribution? Clippy.
Maybe if Lumiére had been shipped in its entirety, and users had known that it could be “trained,” they would have been willing to put up with it long enough for it to actually be helpful. I do not know, and trying to re-predict the past based on “what-if scenarios” and “just-so stories” is not very helpful. What is known is that Clippy was so universally hated that, as far as I can recall, it is probably the only feature (definitely the only high-profile feature) that Microsoft has ever removed from a product. In fact, when Clippy was left out of Office XP (aka Office 2002), it was actually a selling point in advertisements!
Before I read the Lumiére paper, I had assumed that Clippy was some second-rate hack, or that not much work had gone into it. In fact, Lumiére had been worked on for four years by a wide variety of psychologists, usability experts, statisticians, and many other highly qualified experts. It was developed by the best of the best. From what I can tell, users in the Lumiére testing group must have been very impressed by it for it to end up in the Microsoft Office suite. But real-world users despised it. It was annoying. It impeded workflow. It was grossly inaccurate.
Personally, I wanted to love Clippy. I tried a number of times to leave him on. I always felt sad, after the first use of an Office application after installation, to see him blink his eyes and commit feature suicide when I turned him off. Then again, I am also a touch attached to the little puppy in Windows file search. I tried letting Clippy help me do my job. Clippy always made me miserable, and I think he made 99% of users miserable.
And this is why I am reconsidering my stance towards applications that anticipate users’ needs. If all of the effort that Microsoft put into Lumiére could not produce a system that was usable and workable in the real world, it is pretty hard to imagine being able to do it well at all. Granted, Microsoft Office is an incredibly complex suite of applications that get used for a wide variety of tasks, many of which are outside their intended purposes (Excel as a database system, Access for client/server applications, etc.). But I can imagine the difficulty and frustration of using a system which attempts to adjust itself to the users’ needs on the fly. If it is less than 100% accurate, users will get mad and frustrated, and their lives will become painful, instead of the anticipation system making them more productive. That is the real lesson to be learned from Lumiére and Clippy. User anticipation is a great idea, but only if it is perfect in its execution.
J.Ja