Clay Shirky's Writings About the Internet
The Semantic Web, Syllogism, and Worldview
First published November 7, 2003 on the "Networks, Economics, and Culture" mailing list.
The W3C's Semantic Web project has been described in many ways over the last few years: an extension of the current web in which information is given well-defined meaning, a place where machines can analyze all the data on the Web, even a Web in which machine reasoning will be ubiquitous and devastatingly powerful. The problem with descriptions this general, however, is that they don't answer the obvious question: What is the Semantic Web good for?
The simple answer is this: The Semantic Web is a machine for creating syllogisms. A syllogism is a form of logic, first described by Aristotle, where "...certain things being stated, something other than what is stated follows of necessity from their being so." [Organon]
The canonical syllogism is:
Humans are mortal
Greeks are human
Therefore, Greeks are mortal
with the third statement derived from the previous two.
The Semantic Web is made up of assertions, e.g. "The creator of shirky.com is Clay Shirky." Given the two statements
- Clay Shirky is the creator of shirky.com
- The creator of shirky.com lives in Brooklyn
you can conclude that I live in Brooklyn, something you couldn't know from either statement on its own. From there, other expressions that include Clay Shirky, shirky.com, or Brooklyn can be further coupled.
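The mechanics of that inference can be sketched in a few lines of Python. This is only an illustration: the triple layout, the `creator_of(shirky.com)` placeholder, and the `derive` helper are assumptions made for the example, not RDF syntax or any W3C API.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


# Two assertions, possibly published by different parties.
# "creator_of(shirky.com)" is a hypothetical placeholder meaning
# "whoever created shirky.com".
assertions = [
    Triple("creator_of(shirky.com)", "is", "Clay Shirky"),
    Triple("creator_of(shirky.com)", "lives_in", "Brooklyn"),
]


def derive(assertions):
    """Substitute each identity ("X is Y") into the other assertions about X."""
    identities = {t.subject: t.obj for t in assertions if t.predicate == "is"}
    derived = []
    for t in assertions:
        if t.predicate != "is" and t.subject in identities:
            derived.append(Triple(identities[t.subject], t.predicate, t.obj))
    return derived


conclusions = derive(assertions)
print(conclusions)
```

Neither input mentions Clay Shirky and Brooklyn together; the combined statement appears only after the two are joined on their shared subject.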
The Semantic Web specifies ways of exposing these kinds of assertions on the Web, so that third parties can combine them to discover things that are true but not specified directly. This is the promise of the Semantic Web -- it will improve all the areas of your life where you currently use syllogisms.
Which is to say, almost nowhere.
Syllogisms are Not Very Useful
Though the syllogism has been around since Aristotle, it reached its apotheosis in the 19th century, in the work of Charles Dodgson (better known as Lewis Carroll). Dodgson wrote two books of syllogisms and methods for representing them in graphic form, and his syllogisms often took the form of sorites, where the conclusion from one pair of linked assertions becomes a new assertion to be linked to others.
One of Dodgson's sorites goes:
- Remedies for bleeding, which fail to check it, are a mockery
- Tincture of Calendula is not to be despised
- Remedies, which will check the bleeding when you cut your finger, are useful
- All mock remedies for bleeding are despicable
which lets you conclude that Tincture of Calendula will check the bleeding when you cut your finger.
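A rough Python sketch shows how a sorites like this one chains. It rests on assumptions that are not Dodgson's: every literal is read as a statement about Tincture of Calendula, "despised" and "despicable" are treated as the same predicate, a leading `~` marks negation, and each premise is allowed to fire in its contrapositive form as well.

```python
def negate(literal):
    """Flip a literal: "x" <-> "~x"."""
    return literal[1:] if literal.startswith("~") else "~" + literal


# Each premise as an implication (if, then) between literals.
rules = [
    ("~checks_bleeding", "mockery"),  # remedies that fail to check bleeding are a mockery
    ("checks_bleeding", "useful"),    # remedies that check the bleeding are useful
    ("mockery", "despicable"),        # all mock remedies are despicable
]

# The one premise about a specific remedy: it is not to be despised.
facts = {"~despicable"}


def close(facts, rules):
    """Forward-chain over the rules and their contrapositives until nothing new appears."""
    all_rules = rules + [(negate(q), negate(p)) for p, q in rules]
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for p, q in all_rules:
            if p in known and q not in known:
                known.add(q)
                changed = True
    return known


conclusions = close(facts, rules)
print(sorted(conclusions))
```

The chain runs from "not despicable" to "not a mockery" to "checks the bleeding", with each intermediate conclusion feeding the next step.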
Despite their appealing simplicity, syllogisms don't work well in the real world, because most of the data we use is not amenable to such effortless recombination. As a result, the Semantic Web will not be very useful either.
The people working on the Semantic Web greatly overestimate the value of deductive reasoning (a persistent theme in Artificial Intelligence projects generally). The great popularizer of this error was Arthur Conan Doyle, whose Sherlock Holmes stories have done more damage to people's understanding of human intelligence than the work of anyone other than Rene Descartes. Doyle has convinced generations of readers that what seriously smart people do when they think is to arrive at inevitable conclusions by linking antecedent facts. As Holmes famously put it, "when you have eliminated the impossible, whatever remains, however improbable, must be the truth."
This sentiment is attractive precisely because it describes a world simpler than our own. In the real world, we are usually operating with partial, inconclusive or context-sensitive information. When we have to make a decision based on this information, we guess, extrapolate, intuit, we do what we did last time, we do what we think our friends would do or what Jesus or Joan Jett would have done, we do all of those things and more, but we almost never use actual deductive logic.
As a consequence, almost none of the statements we make, even seemingly obvious ones, are true in the way the Semantic Web needs them to be true. Drew McDermott, in his brilliant Critique of Pure Reason [Computational Intelligence, 3:151-237, 1987], took on the notion that you could create Artificial Intelligence by building a sufficiently detailed deductive scaffolding. He concluded that this approach was fatally flawed, noting that "It must be the case that a significant portion of the inferences we want [to make] are deductions, or it will simply be irrelevant how many theorems follow deductively from a given axiom set." Though Critique of Pure Reason predates not just the Semantic Web but the actual web as well, the criticism still holds.
Consider the following statements:
- The creator of shirky.com lives in Brooklyn
- People who live in Brooklyn speak with a Brooklyn accent
You could conclude from this pair of assertions that the creator of shirky.com pronounces it "shoiky.com." This, unlike assertions about my physical location, is false. It would be easy to shrug this error off as Garbage In, Garbage Out, but it isn't so simple. The creator of shirky.com does live in Brooklyn, and some people who live in Brooklyn do speak with a Brooklyn accent, just not all of them (us).
Each of those statements is true, in other words, but each is true in a different way. It is tempting to note that the second statement is a generalization that can only be understood in context, but that way madness lies. Any requirement that a given statement be cross-checked against a library of context-giving statements, which would have still further context, would doom the system to death by scale.
We Describe The World In Generalities
We can't disallow generalizations because we can't know which statements are generalizations by looking at them. Even if we could, it wouldn't help, because generalizations are a fundamental tool of expression. "People who live in France speak French" is structurally no different than "People who live in Brooklyn speak with a Brooklyn accent." In any human context "People who live in France speak French" is true, but it is false if universals are required, as there are French immigrants and expatriates who don't speak the language.
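The structural sameness is easy to see once the two statements are written as data. In this sketch the `Rule` class and its fields are hypothetical, chosen only for illustration; the point is that nothing in either rule marks it as a generalization, so a program applying them mechanically treats both as universals.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    place: str
    trait: str

    def apply(self, person, lives_in):
        """Return the derived (person, trait) pair, or None if the rule does not fire."""
        return (person, self.trait) if lives_in == self.place else None


# Both rules have exactly the same shape.
rules = [
    Rule("France", "speaks French"),
    Rule("Brooklyn", "speaks with a Brooklyn accent"),
]

residents = {"creator of shirky.com": "Brooklyn"}

derived = [
    result
    for rule in rules
    for person, place in residents.items()
    if (result := rule.apply(person, place)) is not None
]
print(derived)
```

The derived statement is the false one discussed above, and the program has no basis for doubting it.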
Syllogisms sound stilted in part because they traffic in absurd absolutes. Consider this gem from Dodgson:
- No interesting poems are unpopular among people of real taste
- No modern poetry is free from affectation
- All your poems are on the subject of soap-bubbles
- No affected poetry is popular among people of real taste
- No ancient poetry is on the subject of soap-bubbles
This, of course, allows you to conclude that all your poems are bad.
This 5-line syllogism is the best critique of the Semantic Web ever published, as it illustrates the kind of world we would have to live in for this form of reasoning to work, a world where language is merely math done with words. Actual human expression must take into account the ambiguities of the real world, where people, even those with real taste, disagree about what is interesting or affected, and where no poets, even the most uninteresting, write all their poems about soap bubbles.
The Semantic Web's Proposed Uses
Dodgson's syllogisms actually demonstrate the limitations of the form, a pattern that could be called "proof of no concept", where the absurdity of an illustrative example undermines the point being made. So it is with the Semantic Web. Consider the following, from the W3C's own site:
Q: How do you buy a book over the Semantic Web?
A: You browse/query until you find a suitable offer to sell the book you want. You add information to the Semantic Web saying that you accept the offer and giving details (your name, shipping address, credit card information, etc). Of course you add it (1) with access control so only you and seller can see it, and (2) you store it in a place where the seller can easily get it, perhaps the seller's own server, (3) you notify the seller about it. You wait or query for confirmation that the seller has received your acceptance, and perhaps (later) for shipping information, etc.
One doubts Jeff Bezos is losing sleep.
This example sets the pattern for descriptions of the Semantic Web. First, take some well-known problem. Next, misconstrue it so that the hard part is made to seem trivial and the trivial part hard. Finally, congratulate yourself for solving the trivial part.
All the actual complexities of matching readers with books are waved away in the first sentence: "You browse/query until you find a suitable offer to sell the book you want." Who knew it was so simple? Meanwhile, the trivial operation of paying for it gets a lavish description designed to obscure the fact that once you've found a book for sale, using a credit card is a pretty obvious next move.
Consider another description of the Semantic Web that similarly misconstrues the problem:
Merging databases simply becomes a matter of recording in RDF somewhere that "Person Name" in your database is equivalent to "Name" in my database, and then throwing all of the information together and getting a processor to think about it.
No one who has ever dealt with merging databases would use the word 'simply'. If making a thesaurus of field names were all there was to it, there would be no need for the Semantic Web; this process would work today. Contrariwise, to adopt a Lewis Carroll-ism, the use of hand-waving around the actual problem -- human names are not globally unique -- masks the triviality of linking Name and Person Name. Is your "Person Name = John Smith" the same person as my "Name = John Q. Smith"? Who knows? Not the Semantic Web. The processor could "think" about this til the silicon smokes without arriving at an answer.
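A short Python sketch makes the split between the trivial part and the hard part visible. The record layout, `FIELD_MAP`, and `normalize` are assumptions invented for this example rather than anything specified by RDF.

```python
# The "thesaurus of field names": the part that is easy today.
FIELD_MAP = {"Person Name": "Name"}


def normalize(record, field_map):
    """Rename fields according to the mapping; leave unknown fields alone."""
    return {field_map.get(key, key): value for key, value in record.items()}


yours = [{"Person Name": "John Smith"}]
mine = [{"Name": "John Q. Smith"}]

merged = [normalize(r, FIELD_MAP) for r in yours] + [normalize(r, FIELD_MAP) for r in mine]
distinct_names = {r["Name"] for r in merged}

print(merged)
print(len(distinct_names))
```

The field names now agree, but the merge still holds two strings that may or may not name one person. Even an exact string match would not settle the question, since human names are not globally unique.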
From time to time, proselytizers of the Semantic Web try to give it a human face:
For example, we may want to prove that Joe loves Mary. The way that we came across the information is that we found two documents on a trusted site, one of which said that ":Joe :loves :MJS", and another of which said that ":MJS daml:equivalentTo :Mary". We also got the checksums of the files in person from the maintainer of the site.
To check this information, we can list the checksums in a local file, and then set up some FOPL rules that say "if file 'a' contains the information Joe loves mary and has the checksum md5:0qrhf8q3hfh, then record SuccessA", "if file 'b' contains the information MJS is equivalent to Mary, and has the checksum md5:0892t925h, then record SuccessB", and "if SuccessA and SuccessB, then Joe loves Mary".
You may want to read that second paragraph again, to savor its delicious mix of minutiae and cluelessness.
Anyone who has ever been 15 years old knows that protestations of love, checksummed or no, are not to be taken at face value. And even if we wanted to take love out of this example, what would we replace it with? The universe of assertions that Joe might make about Mary is large, but the subset of those assertions that are universally interpretable and uncomplicated is tiny.
One final entry in the proof of no concept category:
Here's an example: Let's say one company decides that if someone sells more than 100 of our products, then they are a member of the Super Salesman club. A smart program can now follow this rule to make a simple deduction: "John has sold 102 things, therefore John is a member of the Super Salesman club."
This is perhaps the high water mark of presenting trivial problems as worthy of Semantic intervention: a program that can conclude that 102 is greater than 100 is labeled smart. Artificial Intelligence, here we come.
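For scale, the entire "deduction" fits in a single comparison. The function and constant names below are made up for the example.

```python
SUPER_SALESMAN_THRESHOLD = 100  # "more than 100 of our products"


def is_super_salesman(units_sold):
    """Apply the club rule exactly as the quoted example states it."""
    return units_sold > SUPER_SALESMAN_THRESHOLD


john_is_member = is_super_salesman(102)
print(john_is_member)
```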
Meta-data is Not A Panacea
The Semantic Web runs on meta-data, and much meta-data is untrustworthy, for a variety of reasons that are not amenable to easy solution. (See for example Doctorow, Pilgrim, Shirky.) Though at least some of this problem comes from people trying to game the system, the far larger problem is that even when people publish meta-data that they believe to be correct, we still run into trouble.
Consider the following assertions:
- Count Dracula is a Vampire
- Count Dracula lives in Transylvania
- Transylvania is a region of Romania
- Vampires are not real
You can draw only one non-clashing conclusion from such a set of assertions -- Romania isn't real. That's wrong, of course, but the wrongness is nowhere reflected in these statements. There is simply no way to cleanly separate fact from fiction, and this matters in surprising and subtle ways that relate to matters far more weighty than vampiric identity. Consider these assertions:
- US citizens are people
- The First Amendment covers the rights of US citizens
- Nike is protected by the First Amendment
You could conclude from this that Nike is a person, and of course you would be right. In the context of First Amendment law, corporations are treated as people. If, however, you linked this conclusion with a medical database, you could go on to reason that Nike's kidneys move poisons from Nike's bloodstream into Nike's urine.
Ontology is Not A Requirement
Though proponents of the Semantic Web gamely try to illustrate simple uses for it, the kind of explanatory failures above are baked in, because the Semantic Web is divided between two goals, one good but unnecessary, the other audacious but doomed.
The first goal is simple: get people to use more meta-data. The Semantic Web was one of the earliest efforts to rely on the idea of XML as a common interchange format for data. With such a foundation, making formal agreements about the nature of whatever was being described -- an ontology -- seemed a logical next step.
Instead, it turns out that people can share data without having to share a worldview, so we got the meta-data without needing the ontology. Exhibit A in this regard is the weblog world. In a recent paper discussing the Semantic Web and weblogs, Matt Rothenberg details the invention and rapid spread of "RSS autodiscovery", where an existing HTML tag was pressed into service as a way of automatically pointing to a weblog's syndication feed.
About this process, which went from suggestion to implementation in mere days, Rothenberg says:
Granted, RSS autodiscovery was a relatively simplistic technical standard compared to the types of standards required for the environment of pervasive meta-data stipulated by the semantic web, but its adoption demonstrates an environment in which new technical standards for publishing can go from prototype to widespread utility extremely quickly.
This, of course, is the standard Hail Mary play for anyone whose technology is caught on the wrong side of complexity. People pushing such technologies often make the "gateway drug" claim that rapid adoption of simple technologies is a precursor to later adoption of much more complex ones. Lotus claimed that simple internet email would eventually leave people clamoring for the more sophisticated features of cc:Mail (RIP), PointCast (also RIP) tried to label email a "push" technology so they would look like a next-generation tool rather than a dead-end, and so on.
Here Rothenberg follows the script to a tee, labeling RSS autodiscovery 'simplistic' without entertaining the idea that simplicity may be a requirement of rapid and broad diffusion. The real lesson of RSS autodiscovery is that developers can create valuable meta-data without needing any of the trappings of the Semantic Web. Were the whole effort to be shelved tomorrow, successes like RSS autodiscovery would not be affected in the slightest.
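How little machinery RSS autodiscovery needs can be shown in a short Python sketch. It assumes the common convention of a `<link rel="alternate" type="application/rss+xml">` tag in the page head; the `FeedFinder` class, the sample page, and the example.com URL are placeholders invented for illustration.

```python
from html.parser import HTMLParser


class FeedFinder(HTMLParser):
    """Collect the href of every <link> tag that advertises an RSS feed."""

    def __init__(self):
        super().__init__()
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        rel = (attributes.get("rel") or "").lower().split()
        mime = (attributes.get("type") or "").lower()
        href = attributes.get("href")
        if "alternate" in rel and mime == "application/rss+xml" and href:
            self.feeds.append(href)


# A made-up weblog page carrying the autodiscovery tag.
page = """<html><head>
<title>Example weblog</title>
<link rel="stylesheet" href="/style.css">
<link rel="alternate" type="application/rss+xml" title="RSS" href="https://example.com/index.xml">
</head><body></body></html>"""

finder = FeedFinder()
finder.feed(page)
print(finder.feeds)
```

No ontology is involved: publisher and reader only have to agree on what one existing tag with two attribute values means.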
Artificial Intelligence Reborn
If the sole goal of the Semantic Web were pervasive markup, it would be nothing more than a "Got meta-data?" campaign -- a generic exhortation for developers to do what they are doing anyway. The second and larger goal, however, is to take up the old Artificial Intelligence project in a new context.
After 50 years of work, the performance of machines designed to think about the world the way humans do has remained, to put it politely, sub-optimal. The Semantic Web sets out to address this by reversing the problem. Since it's hard to make machines think about the world, the new goal is to describe the world in ways that are easy for machines to think about.
Descriptions of the Semantic Web exhibit an inversion of trivial and hard issues because the core goal does as well. The Semantic Web takes for granted that many important aspects of the world can be specified in an unambiguous and universally agreed-on fashion, then spends a great deal of time talking about the ideal XML formats for those descriptions. This puts the stress on the wrong part of the problem -- if the world were easy to describe, you could do it in Sanskrit.