What is Tag Soup?
Sat, Oct 12, 2002; by Dave Winer.
On the XML-DEV list this week there was a hearty debate on the merits of evolved formats like RSS versus the formats that are debated on the XML-DEV mail list every day. One of the debaters called my style of design "tag soup," a new term I had been hearing lately. It piqued my curiosity, so I found out what it means.
I believe the term was coined by Dan Connolly of the W3C when he was talking about HTML parsers that accept anything anywhere. The example he cited is the <title> element. It really only makes sense in the <head> of a document, but apparently one or more browsers would let you set the title of a page in the body of the page! It's not like this makes the earth crumble or the sky fall. Everything can proceed normally, but it's wrong to do it there, and the world would be a (slightly) better place if browsers didn't allow it.
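To make the <title> example concrete, here is a minimal sketch of what "accepting anything anywhere" looks like, written against the lenient HTMLParser in Python's standard library. It is not how any particular browser works; the names TitleFinder and find_titles are made up for this illustration. It simply reports each <title> it sees and whether it turned up inside <head> or somewhere else.

```python
from html.parser import HTMLParser


class TitleFinder(HTMLParser):
    """Hypothetical helper: collects every <title>, wherever it appears."""

    def __init__(self):
        super().__init__()
        self.in_head = False
        self.in_title = False
        self.titles = []  # each entry is [text, location]

    def handle_starttag(self, tag, attrs):
        if tag == "head":
            self.in_head = True
        elif tag == "title":
            self.in_title = True
            location = "head" if self.in_head else "outside head"
            self.titles.append(["", location])

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False
        elif tag == "title":
            self.in_title = False

    def handle_data(self, data):
        # Only text that sits between <title> and </title> is kept.
        if self.in_title:
            self.titles[-1][0] += data


def find_titles(markup):
    """Return a list of (title text, location) pairs; never rejects the input."""
    finder = TitleFinder()
    finder.feed(markup)
    finder.close()
    return [(text.strip(), where) for text, where in finder.titles]


proper = "<html><head><title>Proper</title></head><body></body></html>"
soup = "<html><head></head><body><p>Hi</p><title>Soup</title></body></html>"

print(find_titles(proper))
print(find_titles(soup))
```

The parser never complains about the second document. A tag-soup browser would just use that title; a strict one would have to decide whether to ignore it or refuse the page.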
Another tag-soupish idea. Suppose you want to indent something underneath another thing. HTML has a convenient element, called <blockquote>, which does just this. Ahhh, but a strict HTMLer would say that's wrong: by taking advantage of the fact that some browsers (i.e., Netscape and MSIE, which have made up 90-plus percent of the browser universe since 1994) implement blockquote by indenting, I am destroying, in a small way, the semantic meaning of blockquote. To use it for any purpose other than containing a quote of several sentences or paragraphs is to break one of the cardinal rules of the Web. But people do it anyway.
They've already lost the argument. The Web is tag soup. People use blockquotes to indent. Even though the REST folk argue that it's anti-Web to do RPC, people do RPC anyway. There's a never-ending list of complaints, but they can be resolved. That's why I'm writing this little essaylet.
Sometimes you invent something thinking you know how it's going to be used, and the world surprises you and uses it for something else. It's happened to me. I designed my first outliners to be "idea processors." To make a long story short, more people used outliners to create presentations. Eventually I tired of being right and wanted to make money, so we developed the idea, people loved it and we made our shareholders rich, and everyone lived happily ever after.
How does this relate to HTTP and HTML in 2002? They became something other than what their inventor imagined them to be. It didn't do any harm to anyone that this happened, and now that it's happened, it couldn't be any other way. You can't put the genie back in the bottle. Only by making your world very small can you fail to see the enormity of getting everyone to see it your way. Better to adapt your thinking to their way, and see how you can make your vision fit into what is.
In other words, Tag Soup, as awful as it may be, and related concepts, are the way of the world.