[nycphp-talk] Practical Extraction in PHP
Brian Dailey
support at dailytechnology.net
Mon Jul 16 10:35:20 EDT 2007
Chris,
I just recently was faced with a similar problem, but I'm not sure if my
solution is 100% helpful or not. I ended up using grep to find what I
was looking for, and then indicating the column position (which was
pretty reliable) to further narrow down any false positives. It's sort
of difficult to give a lot of help since we can't see the files
themselves. I know for the project I worked on there was a lot of
tweaking involved to make it 100% accurate.
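
To make the "grep plus column position" idea concrete, here is a rough sketch in Python (it ports easily to PHP's preg_match_all with PREG_OFFSET_CAPTURE). It is not the code from my project: the pattern, the column window, and the sample report are all made up for illustration, so expect to tweak them against your real logwatch mail.

```python
import re

# Hypothetical pattern: a number followed by a size unit.
SIZE_RE = re.compile(r"(\d+(?:\.\d+)?)\s*(KB|MB|GB|TB)\b")

# Made-up sample report; a real logwatch email will look different.
SAMPLE = "\n".join([
    "Warning: 12 MB of mail queued",
    "--- Disk Space ---",
    "/dev/sda1".ljust(22) + "40 GB free",
    "/dev/sdb1".ljust(22) + "512 MB free",
])

def find_sizes(text, col_min, col_max):
    """Return (line_no, value, unit) for every size whose match starts
    at a 0-based column inside [col_min, col_max]."""
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for m in SIZE_RE.finditer(line):
            # The column check is what weeds out false positives such as
            # the "12 MB" in the warning line above.
            if col_min <= m.start() <= col_max:
                hits.append((line_no, float(m.group(1)), m.group(2)))
    return hits

if __name__ == "__main__":
    for hit in find_sizes(SAMPLE, 20, 26):
        print(hit)
```

The column window tolerates extra lines above the value, which a fixed "5th match" count does not, and capturing the unit separately lets you normalize MB and GB before storing anything.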
Would it be easier to obtain the results in another fashion? If you ran
some sort of 'df -h' on a Linux file system you could get the disk space
from that. Are the emailed reports your only option?
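
If you can run something on the servers themselves, a sketch along these lines would read the free space straight from df instead of scraping the email. I'm assuming the POSIX output format ('df -P -k') rather than 'df -h' here, because fixed columns and plain kilobyte counts are easier to parse; the function names and sample output are made up for illustration.

```python
import subprocess

# Made-up sample of 'df -P -k' output, used so the parser can be tried
# without running df.
SAMPLE_DF = "\n".join([
    "Filesystem     1024-blocks     Used Available Capacity Mounted on",
    "/dev/sda1         41152736 20576368  20576368      50% /",
    "/dev/sdb1        103081248 51540624  51540624      50% /var/log data",
])

def parse_df(output):
    """Map each mount point to its available space in kilobytes."""
    free = {}
    for line in output.splitlines()[1:]:  # skip the header row
        # Split into at most 6 fields so a mount point containing
        # spaces stays in one piece.
        fields = line.split(None, 5)
        if len(fields) == 6:
            free[fields[5]] = int(fields[3])
    return free

def read_df():
    """Run df in POSIX format and return its raw text output."""
    result = subprocess.run(["df", "-P", "-k"], capture_output=True,
                            text=True, check=True)
    return result.stdout

if __name__ == "__main__":
    for mount, kb in parse_df(SAMPLE_DF).items():
        print(mount, kb)
```

The numbers from parse_df(read_df()) could go into the round-robin database on a cron schedule. If Python is on the box, shutil.disk_usage(path) gives the same free-space figure without parsing any text.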
- Brian D.
csnyder wrote:
> I've had this idea in the back of my head to do a project that
> systematically tracks values embedded in web or email reports. For
> instance, I get logwatch emails from the servers I admin, and each
> time one of those comes in I'd like to extract the disk free space and
> put it into a round-robin database. Or I want to track the rendering
> time for various key pages in a CMS.
>
> The problem is that the value isn't always in the same place. It might
> be a few lines down because of alerts or content that precede it. Or
> it might look different some days (ending in GB rather than MB). It's
> possible that some combination of regex and recurrence number ( 5th
> instance of /[0-9]*(MB|GB)/ ) could work, but it seems messy.
>
> We all probably do a little of this on an ad hoc basis, scraping
> values out of websites and whatnot. Does anyone do it a lot? What kind
> of tools do you use? Is Perl better suited to the task? Or sed+awk?
> Does anyone know of a system (preferably php) that does this in the
> general case?
>
> TIA, y'all.
>
--
Thanks!
- Brian Dailey
Software Developer
New York, NY
www.dailytechnology.net