ARTICLE
A Web Site User Model Should at Least Model Something About Users

A press release from WebCriteria¹ was passed on to me because it made strong claims about a tool able to predict users' Web browsing behavior, which is an area in which I am conducting research (Chi, Pirolli, & Pitkow, in press; Pirolli & Card, 1999). As presented in the press release, the predictions were based on a browsing model called Max:

"Max, the intelligent agent, is programmed with perceptual, cognitive and motor behaviors gleaned from fundamental computer human interaction research and original research of web user browsing behavior. This browsing behavior model ensures true apples-to-apples comparison and provides a high degree of accuracy, objectivity and repeatability. Max's measurements, contained in SiteProfile reports, have been shown to match users' perceptions of ease of use and speed, http://zing.ncsl.nist.gov/hfweb/proceedings/lynch/index.html."

Upon reading the cited report (Lynch, Palmiter, & Tilt, 1999), which was presented at a peer-reviewed conference (Human Factors and the Web), I cannot find any evidence to support any of the above claims. In fact, the main results of the report show no correlation between Max's predictions and user behavior.

Max's Psychological Model is not Psychologically Real

Let us first address the claim that the Max Model has psychological validity. Lynch et al. (1999) claim that the Max Model is based on GOMS and the Model Human Processor (MHP; Card, Moran, & Newell, 1983; Pirolli, 1999) and therefore constitutes a psychologically valid model. Although Lynch et al. provide a paragraph summarizing GOMS and MHP, the remainder of the article does not provide any GOMS or MHP analysis. Nor does the article give any kind of user task analysis or perceptual-cognitive-motor timing analysis that would be generally consistent with the GOMS/MHP approach. Indeed, the description of the Max model structure and operation is quite unlike any GOMS model.
That would be fine if Max were some psychologically plausible alternative, but it is not. The following are some of the "Psychological Characteristics" of Max, which seem to fly in the face of even casual observation:
If any of these claims are true of real users, there is no evidence presented to support them. Max is not a psychologically real model of users.

Max's Accessibility Values are Uncorrelated with Observed User Times

The main point of Lynch et al. (1999) is to "validate the Accessibility metric, [which is] an abstraction of the Max Model." To do this, Lynch et al. aim to "correlate Accessibility Values for the current abstraction of the Max Model with observed behaviors and subjective ratings." Figure 1 is a scatterplot of the data presented in Lynch et al., Figure 5. To the naked eye the data appear uncorrelated. The statistical correlation (Pearson product-moment) is r = .28, which is not significant. Figure 2 presents the same data, without the extreme value. The Pearson product-moment correlation based on this smaller set of data is r = .32, which is not significant. One can only conclude that the Lynch et al. article fails to validate the Accessibility metric. There are no grounds for the WebCriteria press release claiming "true apples-to-apples comparisons" with "high accuracy," because, in fact, Max predicts nothing.

Figure 1. Observed User Time as a function of Max's predicted Accessibility Value. Data are replotted from Lynch et al. (1999). There is no correlation between Max's predictions and User Time, r = .28.
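For readers who want to check a correlation claim of this kind themselves, the following is a minimal Python sketch of the arithmetic. It is not the analysis code used for the figures. The function names (`pearson_r`, `t_statistic`, `variance_explained`) are hypothetical and introduced only for illustration. The sketch contains no data from Lynch et al.; only the two reported values, r = .28 and r = .32, are taken from the text. Testing significance additionally requires the number of data points n, which must be read off the original figure and is not assumed here.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 3:
        raise ValueError("need two equal-length sequences with at least 3 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant data")
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for the null hypothesis of zero correlation.

    Compare the result against a t distribution with n - 2 degrees of freedom.
    """
    return r * math.sqrt((n - 2) / (1 - r * r))


def variance_explained(r):
    """Proportion of variance in one variable accounted for by the other."""
    return r * r


# Only the reported correlations are used here; n is deliberately not assumed.
for r in (0.28, 0.32):
    print(f"r = {r:.2f} explains {variance_explained(r):.1%} of the variance")
```

Even taken at face value, correlations of this size leave roughly nine-tenths of the variance in observed user time unaccounted for (r squared is about .08 and .10, respectively).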
Figure 2. Same data as presented in Figure 1, without the extreme point. Again, there is no correlation, r = .32.

Conclusion

It is the nature of marketing to print press releases that paint a product in the best light possible using whatever evidence is available. These press releases then find their way into national and international publications. When claims are published that are simply prima facie invalid, it serves no one: not the business (if it wants to remain in the market), not the consumer, and not the field. Although I have targeted the Lynch et al. article for criticism, the fault also lies with the review process for the article. The main conclusion of the article is contradicted by its contents. The paper claims that it will show a correlation between a model and users, and the data simply show no such correlation. The pernicious effect of publishing unsubstantiated claims is that they can be, and often are, cited as "proven" by virtue of their publication. This can lull consumers and users into wasting time and money. It can only serve to weaken the standing of the specific conference and the field in general.

¹ Industry's first web site behavior agent gets face-lift and more muscle with enhance SiteProfile reports [WebCriteria press release] (2000, February 8). PRNewswire.

References
© Internet Technical Group. Last update: April 15, 2000. URL: http://www.sandia.gov/itg/newsletter/mar00/critique_max.html (hosted by Sandia National Labs)