If Semantic Web is the medium, what is the message? May 20, 2006

Posted by dfhuynh in research, semantic web.

A colleague of mine is in the habit of speaking in URLs. Often, for a point he wishes to illustrate, he will dig a URL out of the vast Web and give us that instead. I’ve jokingly noted that ours is turning into a URL-speaking society, but there is truth in that. The Web, from one perspective, is a rudimentary global language in which the URLs are the nouns. And everything else necessary for a language is missing.

Then comes the Semantic Web.

Designed to fill in the slots for everything that the Web is missing, the Semantic Web is a hot medium. It leaves little for the reader’s imagination. Indeed, it’s supposed to be so perfectly specified that machines can reliably act on it.

Web 2.0, on the other hand, is a cool medium that calls for participation, for social interaction, for interpretation and involvement. Tagging, blogging, etc. all require heavy participation and interpretation, which makes Web 2.0 a lot more congruent with our cool electric society today. Hence, Web 2.0 has easily taken off and the Semantic Web is left in the lab cage.

Perhaps, just like many great ideas before it, the Semantic Web is here at the wrong time, or at least cast in the wrong way for our particular society in its particular state.


So then, if the Semantic Web is the medium, what is its message? That is, what message is understandable to the electric man, connected by cell phones, blogs, tags, …, who is once again becoming the tribal man, as Marshall McLuhan predicted?

In many ways, the Semantic Web is like printing technology (movable type). Whole books can be beautifully, precisely, and multiply produced by stringing together repeatable, uniform units that are carefully placed. Similarly, complex topics can now be described (supposedly) precisely by connecting (supposedly) carefully coined URIs together into graphs. The similarity between print and the Semantic Web is clear when one looks at how sophisticated Semantic Web UIs are rendered: views are recursively embedded one within another as the machine reaches out into the graph and pulls in more and more data. This gave rise to my obsession with the “fine granularity” that RDF affords when I started my research on the Haystack project. That is, just as typography is about repeatability, RDF is about recursability.
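
As a rough illustration of that recursive embedding, here is a minimal Python sketch. The toy graph, the example.org URIs, the property names, and the `render` function are all invented for this post; it is not how Haystack actually renders views, only the general shape of the idea.

```python
# Hypothetical toy graph: each URI maps to {property: value}, where a value
# is either a literal string or another URI that appears in the graph.
GRAPH = {
    "http://example.org/book/1": {
        "title": "Understanding Media",
        "author": "http://example.org/person/mcluhan",
    },
    "http://example.org/person/mcluhan": {
        "name": "Marshall McLuhan",
        "wrote": "http://example.org/book/1",  # cycle back to the book
    },
}


def render(uri, depth=2, indent=0):
    """Return the lines of a nested text view of `uri`.

    Values that are themselves URIs in the graph are rendered as embedded
    sub-views until `depth` runs out, which also keeps cycles from looping.
    """
    pad = "  " * indent
    lines = [f"{pad}<{uri}>"]
    if depth == 0:
        return lines
    for prop, value in GRAPH.get(uri, {}).items():
        if value in GRAPH:
            # The value is another node: embed its view one level deeper.
            lines.append(f"{pad}  {prop}:")
            lines.extend(render(value, depth - 1, indent + 2))
        else:
            lines.append(f"{pad}  {prop}: {value}")
    return lines


print("\n".join(render("http://example.org/book/1")))
```

The depth limit is an arbitrary choice here; a real UI would more likely pull in further data on demand as the reader expands a view.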

Just as print, according to McLuhan, turned the tribal man into the individualistic man, capable of entertaining thoughts outside communal gatherings, the Semantic Web is an individualistic technology. It is about gathering information from external sources and then interpreting it in one’s own way for one’s own individual needs. That is, roll your own semantics on someone else’s data. This is why my Piggy Bank research has been so instrumental for understanding the Semantic Web for many, and why Simile’s attempts at a screen-scraper ecology and Semantic Bank deployments have been less than successful.

Just as movable type made words more affordable, the Semantic Web will make data more affordable, as data can be more easily re-used independent of its origin and original purpose. And just as printed words are mediated by paper and ink, Semantic Web data is necessarily mediated by the machines that gather, interpret, and combine disparate Semantic Web data into a coherent soup upon each individual’s demand.

That is all to say, the Semantic Web is like typography for this primitive, global language called the Web. And we are still searching for its typesetting machine.


When the Semantic Web starts to mature, the artists will move in and explore this new medium. Expect machine-processable semantic puns. Expect graph art.

Printing and Calculus May 19, 2006

Posted by dfhuynh in research.

More quotations from Understanding Media (Marshall McLuhan):

In chapter 9, "The Written Word: An Eye for An Ear":

Prince Modupe wrote of his encounter with the written word in his West African days:

The one crowded space in Father Perry's house was his bookshelves. I gradually came to understand that the marks on the pages were trapped words. Anyone could learn to decipher the symbols and turn the trapped words loose again into speech. The ink of the print trapped the thoughts; they could no more get away than a doomboo could get out of a pit. When the full realization of what this meant flooded over me, I experienced the same thrill and amazement as when I had my first glimpse of the bright lights of Konakry. I shivered with the intensity of my desire to learn to do this wondrous thing myself.


In chapter 11, "Number: Profile of the Crowd":

For the Renaissance, it was the infinitesimal calculus that enabled arithmetic to take over mechanics, physics, and geometry. The idea of an infinite but continuous and uniform process, so basic to the Gutenberg technology of movable types, gave rise to the calculus. Banish the infinite process and mathematics, pure and applied, is reduced to the state known to the pre-Pythagoreans. This is to say, banish the new medium of print with its fragmented technology of uniform, lineal repeatability, and modern mathematics disappears. Apply, however, this infinite uniform process to finding the length of an arc, and all that need be done is to inscribe in the arc a sequence of rectilinear contours of an increasing number of sides… [p.117]

"Standing on the shoulders of giants" now means something even larger.

Understanding Media May 19, 2006

Posted by dfhuynh in research, semantic web.

Four or five years ago I could hardly make it past the first chapter of Understanding Media by Marshall McLuhan, from which came the infamous sentence, "the medium is the message." I have recently returned to it and I'm now half-way through, enjoying every page. And although I'm not sure I can re-verbalize his ideas, I do find the book very illuminating. I'd recommend it even more than I'd recommend Edward Tufte's books.

This is my effort to try to understand this new medium called the Semantic Web on which I'm doing my research.

Core to McLuhan's book, as I understand it, is the idea that technologies fundamentally shape the way humans think and behave, regardless of how the technologies are used; that each technology bears some intrinsic characteristics independent of its use; and that humans, embedded within the technologies, cannot but think and act in some particular ways. This is a generalization of the Sapir-Whorf hypothesis, which posits that one's thoughts are moulded by whichever native language one speaks.

The most primitive tribes of Australia and Africa, like the Eskimos of today, have not yet reached finger-counting, nor do they have numbers in series. Instead they have a binary system of independent numbers for one and two, with composite numbers up to six. After six, they perceive only "heap." Lacking the sense of series, they will scarcely notice when two pins have been removed from a row of seven. They become aware at once, however, if one pin is missing. [p.111]

Computer science researchers these days speak a few particular technical languages and are embedded within a few particular technologies. For example, some speak the "relational database" language, some speak the "Emacs buffer" language, some speak the "Perl" language, … And a lot of the Semantic Web researchers speak some sort of "graph" language, although many are still stuttering and thinking in the "relational database" language.

For those who speak the "relational database" language, it is next to impossible to comprehend a world of graphs wherein the cost of creating relationships is zero. So they don't speak of "relationships" although, ironically, they deal with relational databases. They only speak of foreign keys and tables. They even have tables that contain only two foreign keys… If you ask me, it's another case of premature optimization.
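
To make the contrast concrete, here is a minimal Python sketch of the same "wrote" relationship expressed both ways. Every table name, column name, and `ex:` identifier below is made up for illustration, and the set of tuples is only a stand-in for a real RDF store.

```python
import sqlite3

# Relational style: the relationship needs its own two-foreign-key table,
# declared up front. All table and column names are hypothetical.
db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE book   (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE person_book (
        person_id INTEGER REFERENCES person(id),
        book_id   INTEGER REFERENCES book(id)
    );
    INSERT INTO person VALUES (1, 'Marshall McLuhan');
    INSERT INTO book   VALUES (1, 'Understanding Media');
    INSERT INTO person_book VALUES (1, 1);
""")
rows = db.execute("""
    SELECT person.name, book.title
    FROM person_book
    JOIN person ON person.id = person_book.person_id
    JOIN book   ON book.id   = person_book.book_id
""").fetchall()

# Graph style: a relationship is just one more (subject, predicate, object)
# triple, so a new kind of relationship needs no schema change.
triples = {
    ("ex:mcluhan", "ex:name", "Marshall McLuhan"),
    ("ex:book1", "ex:title", "Understanding Media"),
    ("ex:mcluhan", "ex:wrote", "ex:book1"),
}
triples.add(("ex:book1", "ex:citedBy", "ex:thisPost"))  # no new table needed

wrote = [(s, o) for (s, p, o) in triples if p == "ex:wrote"]
print(rows)
print(wrote)
```

In the relational version, a second kind of relationship means another join table; in the triple version it is one more `add` call.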

The Semantic Web folks are further along. They have adopted the "graph" language but have then become largely incapable of talking to the rest of the world, who don't speak graphs. In vain, they take graphs and splash them on the screen, literally, creating visualizations full of nodes and arrows that are incomprehensible to everyone else.

But upon closer inspection, most of the graphs that they create are shallow, and some are just trees. It turns out, I suspect, that the Semantic Web people are so obsessed with graphs that they try to turn everything into graphs even if they can only find shallow, tabular data. And so they are back to square one, busy exploring tools for visualizing and dealing with tabular data in graph's clothing without even realizing it.

It is amusing, puzzling, and frustrating to watch the great minds, many way "smarter" than I am, getting stuck in their own mindsets.

Once men have adopted the visual dynamic of the phonetic alphabet, they begin to lose the tribal man's obsession with cosmic order and ritual as recurrent in the physical organs and their social extension. Indifference to the cosmic, however, fosters intense concentration on minute segments and specialist tasks, which is the unique strength of Western man. For the specialist is one who never makes small mistakes while moving toward the grand fallacy. [p.124]

It is so true it is terrifying. After all, what mindset am I getting stuck in?! I will never know.

Flickr April 14, 2006

Posted by dfhuynh in Photography, Play.

OK, so I finally got with the program and signed up on Flickr:

A screencast is worth a thousand words March 3, 2006

Posted by dfhuynh in firefox, research, semantic web.

I was trying to explain to my gf what my research was all about, but it was hard to explain, and so I made a screencast. But before you watch this screencast, try the following:

  1. go to http://www.amazon.com/
  2. search for "typography"
  3. tell me how many paperback books on typography published in 2004, 2005, and 2006 you have found

Was it easy for you to do that? Did you try to look for a Sort by Year feature?… Now watch the screencast:

http://people.csail.mit.edu/dfhuynh/research/media/UIST%202006/uist2006.swf (11 Mb)

[April 3rd, 2006] Updated to new version:

http://people.csail.mit.edu/dfhuynh/research/papers/uist2006-augmenting-web-sites.mov (23 Mb, 3 min)
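
For comparison, if the search results from the exercise above were available as structured records rather than as web pages, the question would reduce to a filter and a count. The sketch below assumes exactly that; the records, field names, and the `count_by_year` helper are all hypothetical and have nothing to do with how Amazon or the tool in the screencast actually works.

```python
from collections import Counter

# Hypothetical records standing in for scraped search results; the titles,
# bindings, and years are made up for illustration.
books = [
    {"title": "Book A", "binding": "paperback", "year": 2004},
    {"title": "Book B", "binding": "hardcover", "year": 2005},
    {"title": "Book C", "binding": "paperback", "year": 2005},
    {"title": "Book D", "binding": "paperback", "year": 1998},
    {"title": "Book E", "binding": "paperback", "year": 2006},
]


def count_by_year(records, binding, years):
    """Count records with the given binding, grouped by publication year."""
    return Counter(
        r["year"]
        for r in records
        if r["binding"] == binding and r["year"] in years
    )


counts = count_by_year(books, "paperback", {2004, 2005, 2006})
print(sorted(counts.items()))
```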