XML vs. rel DBs [was: Re: [nycphp-talk] Many pages: one script]
Elliotte Harold
elharo at metalab.unc.edu
Sat Aug 11 23:07:02 EDT 2007
csnyder wrote:
> If it's not too much trouble, could you give us some other use cases
> for an XML database? Because title and first paragraph, if that's
> something a system "routinely does" could easily be stored as
> relational data at the time of import.
>
There are only two basic approaches to storing books, web pages, and the
like in a relational database: make each document a blob or cut it into
tiny little pieces. The first eliminates search capabilities; the second
performs like a dog.

You're right that if grabbing the title and the first paragraph is all
you need to do, then these two pieces could be stored separately (and
normalization be damned). Now suppose the editor comes along and tells
you they really want to show the first two paragraphs of text to
non-logged-in users instead of just the first one. If you know the
queries in advance, you can lay out the data to optimize them, but using
the data's own structure will give you much more flexibility for
unexpected uses.
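
As a rough illustration of that flexibility, here is a minimal sketch in
Python using the standard library's ElementTree rather than an XML
database or XQuery. The markup, the element names (article, title, body,
p), and the teaser() helper are all assumptions made up for this example.
The point is only that "first paragraph" versus "first two paragraphs"
becomes a parameter of the query, not a change to how the document is
stored.

```python
import xml.etree.ElementTree as ET

# Hypothetical article markup; the element names are assumptions,
# not a schema taken from the discussion.
ARTICLE = """<article>
  <title>XML vs. relational databases</title>
  <body>
    <p>First paragraph.</p>
    <p>Second <em>paragraph</em>.</p>
    <p>Third paragraph.</p>
  </body>
</article>"""


def teaser(xml_text, paragraphs=1):
    """Return the title and the text of the first `paragraphs` <p> elements."""
    root = ET.fromstring(xml_text)
    title = root.findtext("title")
    # The document's own structure answers the query; nothing was
    # precomputed at import time.
    first = root.findall("./body/p")[:paragraphs]
    # itertext() flattens inline markup such as <em> into plain text.
    return title, ["".join(p.itertext()) for p in first]


if __name__ == "__main__":
    print(teaser(ARTICLE))      # title plus one paragraph
    print(teaser(ARTICLE, 2))   # the editor changed their mind
```

With the title and first paragraph copied into relational columns at
import time, the same change would mean re-deriving the teaser for every
stored document.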
Here's another common use case: extract the links from a web site. Build
a Google-like reverse index that finds all the pages linking to a given one.
The only way to make that happen in a relational DB is to chop the
content into so many trivially small pieces that putting them back
together again is prohibitively expensive. And even once you've done
that, the SQL to pull the result is ungodly ugly. The XQuery is a lot
simpler because it matches the natural structure of the documents rather
than treating everything as a table. Some data wants to live in tables.
Some doesn't.
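
To make the reverse index concrete, here is a small sketch, again in
Python with ElementTree standing in for a real XML database and XQuery.
The PAGES dictionary, its URLs, and the reverse_index() helper are
hypothetical, and the pages are assumed to be well-formed XHTML without
namespaces. Each page is walked in its natural tree form and every
a/@href is recorded against the page that contains it.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical three-page site, keyed by URL. Real pages would need to be
# well-formed XML (or tidied first) for ElementTree to parse them.
PAGES = {
    "/a.html": '<html><body><p>See <a href="/b.html">B</a> and '
               '<a href="/c.html">C</a>.</p></body></html>',
    "/b.html": '<html><body><p>Back to <a href="/a.html">A</a>; also '
               '<a href="/c.html">C</a>.</p></body></html>',
    "/c.html": '<html><body><p>No links here.</p></body></html>',
}


def reverse_index(pages):
    """Map each link target to the set of pages that link to it."""
    index = defaultdict(set)
    for url, markup in pages.items():
        root = ET.fromstring(markup)
        # iter("a") visits every <a> element at any depth in the document.
        for anchor in root.iter("a"):
            href = anchor.get("href")
            if href:
                index[href].add(url)
    return index


if __name__ == "__main__":
    for target, sources in sorted(reverse_index(PAGES).items()):
        print(target, "<-", sorted(sources))
```

The query follows the documents as they are; no page had to be chopped
into rows first.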
--
Elliotte Rusty Harold elharo at metalab.unc.edu
Java I/O 2nd Edition Just Published!
http://www.cafeaulait.org/books/javaio2/
http://www.amazon.com/exec/obidos/ISBN=0596527500/ref=nosim/cafeaulaitA/