XML? Phooey

Perhaps someone can explain to me just why in Sam's tarnation XML is so wonderful. Why? Because I'd really like to know. I feel like I'm missing the boat here, and that XML is something I've already seen and done. Except I called it "tab-delimited text" or something.

But seriously... XML? It's text. Plain freaking text. HTML was more exciting.

XML "allows for seamless sharing of information." So did plain text files. XML "allows for an agreed-upon standard to be easily followed and validated." So did a three-line Perl script run on a plain text file.

Nearly every scripting or development language handles whitespace (tabs) quite nicely. Handling something like "<comment>here is <b>something</b> i want to say</comment>" becomes a much more complex task. Whitespace parsing is so inherent to most languages and scripting environments that "wrappers" and crap don't need to be written for it. Instead, now there are XML parsers for PHP, C++, Java, etc. But they had to be written because they didn't exist.
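To make the comparison concrete, here is a minimal Python sketch (Python and its standard `xml.etree.ElementTree` module are my choice for illustration, and the sample tab line is made up). The tab-delimited line needs one built-in call; the `<comment>` example has text mixed with a nested tag, so it needs a real parser to pull apart.

```python
import xml.etree.ElementTree as ET

# Tab-delimited: one built-in string method, no library needed.
tab_line = "first field\tsecond field\tthird field"  # made-up sample
fields = tab_line.split("\t")

# The mixed-content example from the paragraph above: text with a tag inside it.
comment = ET.fromstring(
    "<comment>here is <b>something</b> i want to say</comment>"
)

# itertext() walks the element's own text, the nested tag's text, and the
# text that trails the nested tag, in document order.
plain_text = "".join(comment.itertext())

# The nested <b> tags can be pulled out separately.
bold_parts = [b.text for b in comment.iter("b")]

print(fields)
print(plain_text)
print(bold_parts)
```

The parser does the heavy lifting here, which is rather the point: somebody had to write it first.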

What has XML gotten us? If two people want to share data - like this site's XML RSS feed - why can't it be agreed upon to be in this format: article name (tab) article url (tab) article snippet (tab) article date ?
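As a rough sketch of what reading that four-field format could look like, assuming one article per line and no tabs or newlines inside a field (the `Article` type, the `parse_feed` helper, and the sample line with its URL and date are all hypothetical):

```python
from typing import List, NamedTuple


class Article(NamedTuple):
    name: str
    url: str
    snippet: str
    date: str


def parse_feed(text: str) -> List[Article]:
    """Parse lines of: name <tab> url <tab> snippet <tab> date."""
    articles = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # skip blank lines
        fields = line.split("\t")
        if len(fields) != 4:
            # The "validation" step: anything that is not four fields is rejected.
            raise ValueError(
                f"line {line_number}: expected 4 fields, got {len(fields)}"
            )
        articles.append(Article(*fields))
    return articles


# Made-up sample data; the URL and date are placeholders.
sample = "XML? Phooey\thttp://example.com/xml-phooey\tPerhaps someone can explain...\t2000-01-01\n"

articles = parse_feed(sample)
print(articles[0].name, articles[0].url)
```

The catch is already visible in the assumptions: the moment a snippet contains a tab or a line break, the format needs an escaping rule that both sides agree on.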

Okay, I've gone too far. Take a look at this site's RDF and RSS feeds and you'll see they're actually surprisingly complex.

So what am I really saying? Try this on for size: XML is not a God-send. It's a natural evolution of plain-text files and "standards" for sharing them. Would it make editing easier? I really doubt it.

There are two cases in which XML is a "nice" alternative (but again, not the massive earth-saving God-send it's often billed as): versus binary data and versus complex plain text files. The RDF and RSS feeds above are examples of the latter - those would be somewhat more difficult to work with as tab-delimited text files of some kind. Delimiters would be needed for different sections and so on. It'd be a mess to work with in a tab-delimited kind of way (though, again: most tools could do it "out of the box" without an XML parsing "add-on").
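Here is a small sketch of why the nested case is awkward for tabs. The feed below is a made-up, stripped-down RSS-style document, not this site's actual feed: one channel with its own title and link, containing items that each have a title and link as well. In a flat tab-delimited file you would need some extra marker to say which lines describe the channel and which describe items; in XML the nesting says it for you.

```python
import xml.etree.ElementTree as ET

# Made-up, minimal RSS-style feed for illustration only.
feed_xml = """<rss version="2.0">
  <channel>
    <title>Example Blog</title>
    <link>http://example.com/</link>
    <item>
      <title>First post</title>
      <link>http://example.com/first</link>
    </item>
    <item>
      <title>Second post</title>
      <link>http://example.com/second</link>
    </item>
  </channel>
</rss>"""

channel = ET.fromstring(feed_xml).find("channel")

# findtext("title") looks only at direct children, so the channel's title
# is not confused with the titles of the items nested inside it.
channel_title = channel.findtext("title")

items = [
    (item.findtext("title"), item.findtext("link"))
    for item in channel.findall("item")
]

print(channel_title)
for title, link in items:
    print(title, link)
```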

The other case, XML versus binary data, most directly affects me day-to-day. Cocoa dictionary and array objects can write themselves to XML quite nicely. They can do so locally (application preference files are "plists" or property lists - XML) or over the net (see this article at Cocoa Dev Central).
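This isn't Cocoa code, but Python's standard `plistlib` module reads and writes the same XML property list format, so it makes a handy stand-in for seeing what a dictionary looks like once it has been written out. The preference keys and values below are made up.

```python
import plistlib

# Hypothetical preferences: a dictionary holding a string, an array, and a boolean.
prefs = {
    "WindowTitle": "Untitled",
    "RecentFiles": ["notes.txt", "todo.txt"],
    "ShowToolbar": True,
}

# Serialize to the XML flavor of the property list format (returns bytes).
xml_bytes = plistlib.dumps(prefs, fmt=plistlib.FMT_XML)
print(xml_bytes.decode("utf-8"))

# Reading it back gives an equal dictionary.
restored = plistlib.loads(xml_bytes)
print(restored == prefs)
```

The printed output is ordinary text with `<dict>`, `<key>`, `<string>`, and `<array>` elements, which is why a plist can be opened and fixed in any text editor.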

On Mac OS 9, preference files were often binary globs of "crap" that you threw away when broken. In Mac OS X, even some of the system preferences are stored as XML (plists), easily edited on the command line, from within BBEdit, and so on. It's opened up a new world for development and end-user "maintenance."

Microsoft announced some time ago that the next version(s) of Office would use XML as the native file format. Unfortunately, as you can read here, it's not actually as "open" a standard as XML (again: geared towards "standards" and "sharing") set out to be. Leave it to Microsoft to fuck up something intended to help all computer users. Leave it to Microsoft to compete unfairly on their "standards."

It's like this... years ago all the antivirus companies agreed that they'd never "hide" a virus from the other companies. They agreed, essentially, to compete not based on which products detected and killed the most viruses, but on the feature sets of the antivirus applications themselves. Microsoft is afraid to compete in this area. If the ".doc" format were opened, then all sorts of cheaper competitors would jump on board, and Microsoft Word wouldn't be necessary. People would be using "DOCtor" and "Steve's Homebrew Word Processing App." And they'd drive down the necessity of "Word" as an entity.

Heaven forbid THAT ever happen. So, XML... it is what it is (unless it's Microsoft XML). And even when it is what it is, it ain't all that. <grin>