Tuesday, December 13, 2005

Google Doesn't Work

Google's search doesn't work. Not yet, in my opinion. Not very well. It works in the sense that if you're interested in J. Christopher Westland, you can type that into Google and get links to my sites. But for anything where there isn't a right answer but only a best answer (as in most of life), search engines don’t get the job done. They complicate matters rather than simplify them.

Back when I was completing my PhD at Michigan, the World Wide Web had yet to be invented, and search by hypermedia was still largely discounted as a pipe dream of the incredibly impractical (and hopelessly dyslexic) Ted Nelson. Michigan did have some of the first computers capable of hypermedia search – the Xerox Dandelion. I played a bit with them, but since there wasn’t much information stored in the Notes databases that were on them, hypertext search just wasn’t interesting in practice. Fortunately my thesis chair, Manfred Kochen, had a personal project – the ‘World Brain’ he called it – that had intrigued him for most of his academic life. Kochen became interested in this when he was John von Neumann’s research assistant at Princeton’s Institute for Advanced Study in the 1950s, and later at IBM. Kochen had finished his dissertation under fuzzy sets inventor Lotfi Zadeh, and perceived search in terms of fuzzy associations with user profiles. He was surprisingly prescient, as most search engines now include a measured degree of serendipity. My own dissertation was on semantic networks applied to information retrieval, an approach which Tim Berners-Lee has suggested for future evolution of the Internet.

Why fiddle with success? I think that the Internet’s hypermedia search model doesn't work that well for a lot of people, particularly those who don't possess the patience of a true believer. Nor does it work on some types of focused or highly structured tasks. The problems start when we get too many responses that match up poorly to our questions and context. Some services, such as Chemistry from Match.com, eHarmony, and other profiling tools for job hunting or dating, attempt to improve on the situation through extensive questioning and profiling. Certainly more information can reduce the number of responses – but it also decreases the likelihood of a serendipitous discovery.

The problem is Internet search’s counterpart to type I and type II error in statistics – what are called the Precision and Recall of a search. Precision is the percentage of returned records that answer the search query; Recall is the percentage of all the records on the entire Internet that could answer the query which are actually returned. If you improve Precision by narrowing the search, you usually lower the number of records retrieved, making it easier to read through and use them. But you also leave more records unretrieved – some of these might contain nuggets of information that you would really like to have retrieved.
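To make the trade-off concrete, here is a minimal Python sketch of the two measures as defined above. The function name `precision_recall` and the toy record IDs are my own illustrative assumptions, not the workings of any real search engine; the point is only to show how a narrower result list can raise Precision while lowering Recall.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one search, both as fractions in [0, 1]."""
    retrieved = set(retrieved)
    relevant = set(relevant)
    hits = retrieved & relevant  # returned records that actually answer the query
    # Precision: share of what came back that is useful.
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    # Recall: share of everything useful that came back.
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical toy data: four records in the whole collection answer the query.
relevant = {"a", "b", "c", "d"}

# A broad search returns many records, most of them noise.
broad = ["a", "b", "c", "x", "y", "z", "w", "v"]
# A narrow search returns only two records, both useful.
narrow = ["a", "b"]

broad_p, broad_r = precision_recall(broad, relevant)
narrow_p, narrow_r = precision_recall(narrow, relevant)

print(f"broad:  precision={broad_p:.3f} recall={broad_r:.3f}")
print(f"narrow: precision={narrow_p:.3f} recall={narrow_r:.3f}")
```

In this made-up case the broad search finds three of the four useful records but buries them among five irrelevant ones, while the narrow search returns nothing but useful records and misses half of what is out there.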

Internet search frustrates us with erratic performance when we try to apply it to distinctly human problems. Social networking, matchmaking, job hunting, thought sharing: all are liable today to get caught up in social cliques and terminate in meaningless cul-de-sacs because of the inhuman and even misanthropic mechanisms of Web linkage. Evolution of the Internet will have to give these uniquely human tasks first priority in search and context.

The problem grows as the Internet grows. Of course, if there were not so much information on the Internet to begin with, the problem wouldn’t really have been that interesting. It’s our Catch-22 for the 21st century.


