[nycphp-talk] Blog Posts with Embedded Content
Hans Zaunere
lists at zaunere.com
Sun Oct 12 19:19:11 EDT 2008
Gentlemen,
> > The safest approach is probably to pass the html through tidy, and
> > then into DOM, and traverse and count the length of text nodes, but
> > that would be quite slow if you ran it on every request.
>
> Right, +1 for Tidy and DOM, it's the "real" way to do it. You won't
> need to do it on every request -- you can either store the summary
> itself as a separate text field, or store the length of the summary as
> an integer.
I tried this, working with both DOM and Tidy, and combinations of the two, with no luck. The problem is getting the differential between the two versions of the text.
> This is crying out for a web service: The Excerpter. POST markup, get
> the first X display characters back as a response, with embedded HTML
> intact.
Yeah, I agree - this has turned into a royal problem, and one that seems as though it would have had to be solved already.
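As a rough illustration of what such an excerpter would have to do, here is a minimal sketch. It is written in Python with the standard html.parser module rather than Tidy and DOM, and it assumes the markup has already been cleaned up (the Tidy step). The names Excerpter and excerpt are hypothetical. It copies tags through untouched, counts only text characters (whitespace included), stops at the limit, and closes whatever elements are still open.

```python
from html import escape
from html.parser import HTMLParser

# Elements that never take a closing tag.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class Excerpter(HTMLParser):
    """Hypothetical helper: copy markup until `limit` text characters are seen."""

    def __init__(self, limit):
        super().__init__(convert_charrefs=True)
        self.remaining = limit   # text characters still allowed
        self.out = []            # pieces of the excerpt
        self.open_tags = []      # stack of elements not yet closed
        self.done = False

    def handle_starttag(self, tag, attrs):
        if self.done:
            return
        rendered = "".join(
            f" {k}" if v is None else f' {k}="{escape(v, quote=True)}"'
            for k, v in attrs
        )
        self.out.append(f"<{tag}{rendered}>")
        if tag not in VOID_TAGS:
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if self.done or tag not in self.open_tags:
            return
        # Close any elements left open inside this one, then the element itself.
        while self.open_tags:
            top = self.open_tags.pop()
            self.out.append(f"</{top}>")
            if top == tag:
                break

    def handle_data(self, data):
        if self.done:
            return
        piece = data[: self.remaining]
        self.out.append(escape(piece, quote=False))
        self.remaining -= len(piece)
        if self.remaining <= 0:
            self.done = True


def excerpt(markup, limit):
    """Return the first `limit` text characters of `markup`, tags intact."""
    parser = Excerpter(limit)
    parser.feed(markup)
    parser.close()
    # Balance the output by closing everything that is still open.
    closing = "".join(f"</{t}>" for t in reversed(parser.open_tags))
    return "".join(parser.out) + closing


print(excerpt("<p>Hello <b>brave new</b> world</p>", 9))
```

As suggested earlier in the thread, the result (or just the chosen length) could be stored when the post is saved, so none of this runs on every request.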
At the end of the day, what would be very handy is a library: an object that stores the text in various forms and includes manipulation methods, metadata, and so on. I had written something like this for MIME, but would not look forward to doing it for HTML.
H