Tuesday, January 15, 2008

Predicting satisfaction with search

Googlers Scott Huffman and Michael Hochster have a SIGIR 2007 paper, "How Well does Result Relevance Predict Session Satisfaction?" (ACM), that attempts to find a simple metric for people's satisfaction when they search.

The paper looks at "session satisfaction", not query satisfaction, because the goal is to determine how satisfied people are when trying to accomplish a task using the search engine. In some cases, people make many queries before getting the answer they need or before they give up.

The metric they end up with combines the relevance of the first three results for the first query in the session. It treats navigational and non-navigational queries differently, placing much more value on the first result for navigational queries. They also got a small boost by considering how many pages (which they call "events") were viewed in the session.
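To make the shape of that metric concrete, here is a minimal sketch of how such a score might be put together. It is not the paper's model. The weights, the 0-to-1 relevance scale, the `Session` type, and the assumption that more events mean lower satisfaction are all hypothetical choices of mine for illustration.

```python
from dataclasses import dataclass
from typing import Sequence

# Hypothetical weights, chosen only for illustration; not taken from the paper.
# Navigational queries put most of the weight on the first result.
NAV_WEIGHTS = (0.8, 0.1, 0.1)
NON_NAV_WEIGHTS = (0.5, 0.3, 0.2)
# Assumption: each extra event (page viewed) slightly lowers the score.
EVENT_PENALTY = 0.02


@dataclass
class Session:
    # Relevance labels in [0, 1] for the first query's results, in rank order.
    first_query_relevance: Sequence[float]
    navigational: bool
    num_events: int


def predict_satisfaction(session: Session) -> float:
    """Return a rough satisfaction score in [0, 1] for a search session."""
    weights = NAV_WEIGHTS if session.navigational else NON_NAV_WEIGHTS
    # Only the top three results of the first query are used.
    top3 = list(session.first_query_relevance[:3])
    top3 += [0.0] * (3 - len(top3))  # pad if fewer than three results
    score = sum(w * r for w, r in zip(weights, top3))
    score -= EVENT_PENALTY * session.num_events
    return max(0.0, min(1.0, score))


nav = Session([1.0, 0.0, 0.0], navigational=True, num_events=0)
non_nav = Session([1.0, 0.0, 0.0], navigational=False, num_events=0)
print(predict_satisfaction(nav), predict_satisfaction(non_nav))
```

With these made-up weights, a session whose only relevant result is the first one scores higher when the query is navigational than when it is not, which is the asymmetry the paper describes.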

Don't miss the discussion in the paper of other things they tried (Section 4). Two points are particularly interesting: only navigational queries appear to require special handling, and the authors never tried using the last query in the session rather than the first (though they mention they would like to).

One thing I would have liked to see more of was a discussion of why they only consider the first three results. It was my understanding that a common behavior for searchers is to look at the top 1-3 results, then quickly skim the remainder. If true, I wonder if this suggests that the relevance of the top couple of results is most important, but that the other results should still be considered in some form.

Finally, I would love to see a version of this that uses clickstream data instead of manually labeled relevance for the search results. Clickstream data is easily available, so it would be tremendously useful to have a good proxy for session satisfaction that relies on clicks rather than manual labels.
