February 16, 2004

Search For Tomorrow (Wash. Post):
' To achieve common sense, the Web needs to go through the infantile process of self-discovery. The Web doesn't really understand itself. There's lots of information on the Web, but not much 'information about information,' also known as 'metadata.'
If you're a robotic search engine, you look for words in the text of a page, but ideally the page would have all manner of encoded labels that describe who wrote the material, and why, and when, and for what purpose, and in what context.
Hendler explains the problem this way: If you type into Google the words 'how many cows in Texas,' Google will rummage through sites with the words 'cow' and 'many' and 'Texas,' and so forth, but you may have trouble finding out how many cows there are in Texas. The typical Web page involving cows and Texas doesn't have anything to do with the larger concept of bovine demographics. (The first Google result that comes up is an article titled 'Mineral Supplementation of Beef Cows in Texas' by the unbelievably named Dennis Herd.)
Hendler, along with World Wide Web inventor Tim Berners-Lee, is working on the Semantic Web, a project to implant the background tags, the metadata, on Web sites. The dream is to make it easier not only for humans, but also machines, to search the Web. Moreover, searches will go beyond text and look at music, films, and anything else that's digitized. 'We're trying to make the Web a little smarter,' Hendler says.
But Peter Norvig, director of search quality at Google, points out that the current keyword-driven searching system, clumsy though it may be and so heavily reliant on serendipity, still works well for most situations.
"Part of the problem is that keywords are so good," he says. "Most of the time the words do what you want them to do." '

