[nycphp-talk] fetch page script
Hans Zaunere
zaunere at yahoo.com
Wed Jan 29 16:29:29 EST 2003
Hello,
--- Carlos A Hoyos <cahoyos at us.ibm.com> wrote:
>
> Hi,
> I need to write a function that, given a URL, will open it and "rip it" to
> my local server (i.e. save the page's code to a specific directory, queue
> all of the page's assets (images, CSS, JS, etc.), save them to that
> directory as well, and fix all paths to make the page available for local
> browsing).
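For the basic fetch-and-save step, something like this rough sketch would do (assuming allow_url_fopen is enabled; the URL and the "ripped" directory name are just placeholders):

<?php
// Bare-bones starting point: assumes allow_url_fopen is on and that
// "ripped/" (an example name) is writable by the web server user.
$url = 'http://www.example.com/some/page.html';
$dir = 'ripped';

if (!is_dir($dir)) {
    mkdir($dir, 0755);
}

$html = file_get_contents($url);
if ($html === false) {
    die("Could not fetch $url\n");
}

// Save the raw markup; the assets and path fixing are the harder part.
$fp = fopen("$dir/index.html", 'w');
fwrite($fp, $html);
fclose($fp);
?>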
Connecting and ripping pages isn't a big deal, but there may be some added
complexity to consider:
-- Are cookies involved? Will the remote server give you the content you
expect?
-- Parsing through the code for all of the required assets could get ugly
(especially trying to deal with JS and the like on really complex pages), and
the same goes for fixing paths; there's a rough sketch of one approach below.
Then again, I despise client-side code, so I'm probably biased.
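Here's a rough, untested sketch of one way to do the fetch/parse/rewrite pass, just to show the shape of it. The cookie name and value, the URL, and the "ripped" directory are all made up, it only handles fully-qualified asset URLs, and the regex will certainly miss things (inline url() in CSS, anything JS writes at runtime):

<?php
// Rough sketch, not battle-tested. Assumes allow_url_fopen is enabled, a
// writable "ripped/" directory, and absolute asset URLs; relative URLs,
// CSS @import, and anything JS builds at runtime need more work.
$url = 'http://www.example.com/';
$dir = 'ripped';

// If the remote server wants a cookie before handing over the real content,
// send one along (the name/value here are made up).
$opts = array('http' => array('header' => "Cookie: session=abc123\r\n"));
$ctx  = stream_context_create($opts);

$html = file_get_contents($url, false, $ctx);
if ($html === false) {
    die("Could not fetch $url\n");
}

// Grab src/href attributes for the obvious asset types. This regex is crude
// and will miss plenty (document.write'd tags, style attributes, etc.).
preg_match_all('/(?:src|href)\s*=\s*["\']([^"\']+\.(?:gif|jpe?g|png|css|js))["\']/i',
               $html, $matches);

foreach (array_unique($matches[1]) as $asset) {
    // Only fully-qualified URLs are handled; resolving relative paths
    // against $url is left as an exercise.
    if (strncmp($asset, 'http', 4) != 0) {
        continue;
    }
    $data = file_get_contents($asset, false, $ctx);
    if ($data === false) {
        continue;
    }
    $local = basename(parse_url($asset, PHP_URL_PATH));
    $fp = fopen("$dir/$local", 'wb');
    fwrite($fp, $data);
    fclose($fp);

    // Point the page at the local copy so it browses offline.
    $html = str_replace($asset, $local, $html);
}

$fp = fopen("$dir/index.html", 'w');
fwrite($fp, $html);
fclose($fp);
?>

If the cookie or redirect handling gets involved, the curl extension gives you a lot more control than the fopen wrappers.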
> Not really a big one, but if somebody already knows about a script that
> does this, I wouldn't mind the head start.
I'm sure something exists, at least as a starting point; try
http://phpclasses.mirrors.nyphp.org or hotscripts.com.
Also, the good old wget command-line tool might come in handy.
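If shelling out from PHP is an option, wget already knows how to pull a page plus its requisites and rewrite the links for local browsing. A quick sketch, assuming wget is installed and on the PATH (the URL and directory are placeholders again):

<?php
// Let wget do the heavy lifting: fetch the page and its requisites (images,
// CSS, JS), and rewrite links so the local copy browses offline.
// Assumes wget is installed and the web server user may run it.
$url = 'http://www.example.com/some/page.html';
$dir = 'ripped';

$cmd = 'wget --page-requisites --convert-links --no-parent'
     . ' -P ' . escapeshellarg($dir)
     . ' '    . escapeshellarg($url)
     . ' 2>&1';

// wget's output is captured so you can log or display what was fetched.
echo shell_exec($cmd);
?>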
> Lastly, it was great to have met some of you last night... great meeting.
Good seeing you again, Carlos,
=====
Hans Zaunere
President, New York PHP
http://nyphp.org
hans at nyphp.org