If you read the inside-the-Beltway publication Washington Technology, you may have noticed that the August issue has an article on five “disruptive technologies.” Search, lumped together with the Semantic Web, is discussed as one of the five. No argument here on search being disruptive, even though it has been around quite a while. And I wouldn’t argue about the important role that the Semantic Web will play in our future, or the decision to combine search and the Semantic Web in the same discussion.
However, I think that search, and related capabilities such as clustering, is a disruptive technology with or without the Semantic Web.
The author gives two examples of the promise of Semantic Web technologies, first mentioning that “the National Institutes of Health could use it [semantic technology] to break down jargon and help with site exploration.” He then mentions the Cleveland Clinic’s website as an example of a government website “using the vocabulary features of the Semantic Web to create search engines that reach across complex jargon and tech silos to offer a high degree of automation with external systems and various terminologies, in addition to the ability to accurately answer users’ questions.”
The National Library of Medicine (NLM), which is part of the National Institutes of Health (NIH), happens to be a Vivisimo customer, and I decided to do a quick side-by-side comparison of the search and discovery capabilities on the Cleveland Clinic’s website with the Vivisimo-powered NLM website.
Starting at www.clevelandclinic.org, I first typed “atrial mixoma” into the search box. (“Mixoma” is a misspelling of “myxoma.”) The response was zero hits and no further help.
Next, I typed “cancer” into the search box and selected “Diseases/Conditions” in the pop-up to narrow my search. Separate from the 1,024 results that came back, there were two recommended items, both very general overview documents on cancer. That’s helpful for the most general investigation, but the rest of the 1,024 search results were just a list of documents containing the word “cancer,” with the ones that have the most occurrences of the word at the top. That is not very helpful for exploring different types of cancer. There was a list of links on the left side labeled “Filter Results By”; however, this list simply repeated the topics in the pop-up next to the search box. There was nothing very helpful in the list of additional qualifiers, and the experience left me wondering when the magic of semantic technologies would rescue me.
I then went to the Vivisimo-powered www.nlm.nih.gov and typed “atrial mixoma” in the search box. Instead of receiving zero results, the message from the NIH search was a question: “did you mean myxoma”? This was, of course, extremely helpful, especially for a website catering to a general audience.
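For readers curious about how a “did you mean” prompt can work in principle, here is a minimal sketch in Python. It is not how the NLM site or the Vivisimo engine does it; I am simply assuming a small, hypothetical vocabulary list and using the standard library's fuzzy string matching to find the closest known term for each unrecognized word. The names `VOCABULARY` and `did_you_mean` are made up for illustration.

```python
from difflib import get_close_matches

# Hypothetical vocabulary; a real site would draw on the terms in its index.
VOCABULARY = ["atrial", "cancer", "lymphoma", "melanoma", "myeloma", "myxoma"]


def did_you_mean(query, vocabulary=VOCABULARY, cutoff=0.8):
    """Return a corrected query string, or None if no correction applies."""
    corrected = []
    changed = False
    for word in query.lower().split():
        if word in vocabulary:
            # Known word: keep it as typed.
            corrected.append(word)
            continue
        # Unknown word: look for the single closest vocabulary entry.
        matches = get_close_matches(word, vocabulary, n=1, cutoff=cutoff)
        if matches:
            corrected.append(matches[0])
            changed = True
        else:
            corrected.append(word)
    return " ".join(corrected) if changed else None


suggestion = did_you_mean("atrial mixoma")
if suggestion:
    print(f"Did you mean {suggestion}?")
```

The cutoff of 0.8 is an arbitrary choice for this sketch: set it too low and the searcher gets nonsense suggestions, too high and real misspellings slip through.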
Next, I typed “cancer” in the NLM search box. Just as at the Cleveland Clinic site, I got a lot of results (5,000+ in this case). I also got a link to a general article on cancer separate from the main list. So far the sites would seem to have similar search capability. But looking over on the left side, the NLM site provided a list of “Clusters” generated by the Vivisimo search engine that included types of cancer such as breast, skin, lung, colon, prostate and others. It also contained topics such as treatments. Each of these had a list of sub-topics below it. Clicking on “prostate cancer,” I immediately narrowed the list to twelve authoritative articles.
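To give a rough feel for what grouping results into clusters means, here is a toy sketch in Python. It is emphatically not the Vivisimo clustering algorithm; it only assumes that the word immediately before the query term in a result title makes a usable cluster label. The function name `cluster_by_qualifier` and the sample titles are hypothetical.

```python
from collections import defaultdict


def cluster_by_qualifier(titles, query):
    """Group titles under a label built from the word preceding the query term."""
    clusters = defaultdict(list)
    term = query.lower()
    for title in titles:
        # Normalize words so "Cancer:" and "cancer" compare equal.
        words = [w.strip(".,:;!?").lower() for w in title.split()]
        label = "other"
        for i, word in enumerate(words):
            if word == term and i > 0:
                label = f"{words[i - 1]} {term}"
                break
        clusters[label].append(title)
    return dict(clusters)


# Hypothetical result titles, for illustration only.
sample_titles = [
    "Prostate cancer screening",
    "Living with prostate cancer",
    "Breast cancer treatment options",
    "Cancer overview",
]

for label, members in cluster_by_qualifier(sample_titles, "cancer").items():
    print(label, "->", members)
```

Even this crude grouping lets a searcher jump from a flat list to the “prostate cancer” items in one click, which is the kind of help I was looking for.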
Which website would you rather use to look for information?
Admittedly, as a skilled searcher I could use the “advanced search” on the Cleveland Clinic’s site to narrow my search significantly. But remember, I went to that site in the hope that the Semantic Web would take me by the hand and lead me to the information I was looking for.
My point in providing this comparison is not to be critical of the Cleveland Clinic website, but rather to warn that labeling something as “semantic technology” isn’t always going to make it more useful. The concepts and standards embodied in the Semantic Web do indeed have the potential to be disruptive. However, there are many building blocks needed to get there.

