
[nycphp-talk] XML files

Chalu Kim chalu at egenius.com
Wed May 7 14:45:47 EDT 2003


This is the kind of thing you need a workflow system for.

I am speaking in general terms here.

For example, XML files are published. Then somebody comes along and changes 
one. That pushes the document's state to, say, "retracted" or "changed". In a 
workflow system, this event can trigger an action.

Before the state change, you could have an opportunity to execute rules.
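
For instance, the transition and its rule check might look something like the 
sketch below. All of the names are invented for illustration; this is not 
taken from any particular framework.

// a document record carries a state, and a table of allowed
// transitions acts as the rules checked before any change
$allowed = array(
    'published' => array('retracted', 'changed'),
    'changed'   => array('published'),
);

function changeState(&$doc, $newState, $allowed)
{
    $current = $doc['state'];

    // rule check runs before the transition happens
    if (!isset($allowed[$current]) || !in_array($newState, $allowed[$current])) {
        return false;                       // transition refused
    }

    $doc['state'] = $newState;              // e.g. "published" -> "retracted"

    // the state change itself is the event that can trigger an action
    if ($newState == 'retracted') {
        error_log("document " . $doc['id'] . " was retracted");
    }
    return true;
}

$doc = array('id' => 42, 'state' => 'published');
changeState($doc, 'retracted', $allowed);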

This is not a simple arena, because you need a way to define the rules. There 
have been several attempts at this, such as ZPatterns (available in Zope or 
Java).

ZPatterns lets you define behavior at the interface level, so that receiving 
data can trigger actions. You describe how incoming data should be handled in 
ZPatterns' declarative language. Unfortunately, this approach is not for the 
average developer; it is difficult to work with once you get into that 
rarefied pattern language.

None of the XML-related tooling by itself will address workflow or 
aspect-oriented programming, because the XML files themselves would need to 
remember what state they are in.

If mixing and matching is an option, you can publish the XML files into Zope. 
That gets you workflow. The XML files will be available through the web or 
FTP, and when one gets changed, Zope can do some processing.

Good luck


On Wednesday 07 May 2003 02:17 pm, Larry Chuon wrote:
> Hi Anirudh,
>
> Long time no hear.  Thank you and everyone at NYPHP for the solution.  I
> learned a great deal in the last day and a half.  As Chalu noted, I need to
> rethink the solution.  Pumping out nearly a million XML files is not the
> best practice.  Last night, I accepted Gary Malcolm's invitation to look at
> Xindice.  Xindice uses XPath and XUpdate, and it addresses some of my
> concerns (I'm not certain how many records it can support, but I can
> categorize).  Unfortunately, Xindice does not have a PHP client for its
> API.  I need to figure out something for this.
>
> Your sample code resolves my large-volume-of-files problem, should I ever
> need to spit out more XML files.
>
> Now I have a whole new set of problems.  Let's say I need to share data
> with some companies.  The XML files get changed and must be updated on my
> system or their systems, depending on who the recipients of the changes
> are.  Since I have nearly a million records, how do I keep track of
> changes?  Importing everything back into Xindice (or whatever repository)
> over and over will generate a lot of unnecessary traffic.  On the other
> hand, updating only the changes will require a lot of tracking (timestamps)
> as well.  Finally, giving the appropriate hooks to the partners will
> require an extensive set of predefined business rules (I don't know it all,
> and I don't assume to know my partners'/users' business workflows either)
> and some sort of version control for rollback and an approval process,
> similar to a CMS.  Thoughts, anybody?
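>
> (A rough sketch of the timestamp-tracking idea, purely for illustration;
> the table and column names below are invented and a MySQL connection is
> assumed to already be open:)
>
> // select only the records touched since the partner's last sync
> $since = "2003-05-06 00:00:00";              // last sync time, stored per partner
> $sql   = "SELECT id, xml_body FROM records " .
>          "WHERE last_modified > '" . mysql_escape_string($since) . "'";
> $res   = mysql_query($sql);
> while ($row = mysql_fetch_assoc($res)) {
>     // export or push only this changed record instead of re-importing everything
>     echo $row['id'] . " changed\n";
> }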
>
> Thanks for all the suggestions guys.
> Larry
>
> -----Original Message-----
> From: Anirudh Zala [mailto:anirudh at aumcomputers.com]
> Sent: Wednesday, May 07, 2003 2:17 AM
> To: talk at nyphp.org
> Cc: LarryC at indexstock.com
> Subject: Re: [nycphp-talk] XML files
>
> Larry
>
> Let's have a look at your problems one by one.
>
> 1: Problem of storing and accessing a large volume of files in one container
> directory.
>
> A: In such cases a split directory structure is the best solution, as
> "Rick Seeger" has suggested.  You can design a directory structure that
> creates a new directory whenever the total number of files exceeds a limit,
> say 100 files per directory.  The sample function below does that (it
> creates a new directory as soon as the id crosses each multiple of 100):
>
> // CREATES A DIRECTORY ACCORDING TO GIVEN ID AND PARENT DIR PATH,
> // AND RETURNS THE DIRECTORY NAME (e.g. "101_200" FOR ID 150)
> function createDir($id, $parent)
> {
>     $nd_dirlimit = 100;                     // files per directory
>
>     // which bucket of 100 this id falls into (0 for 1-100, 1 for 101-200, ...)
>     $mod = ceil($id / $nd_dirlimit) - 1;
>
>     $dirs = $mod * $nd_dirlimit + 1;        // first id in the bucket
>     $dire = $dirs + $nd_dirlimit - 1;       // last id in the bucket
>
>     $tempDir = $dirs . "_" . $dire;         // e.g. "101_200"
>
>     if (!file_exists("$parent/$tempDir"))
>         mkdir("$parent/$tempDir", 0777);
>
>     return $tempDir;
> }
> With the above method the directory structure will look like ../1_100/,
> ../101_200/, ../201_300/, and in each directory the files named by id (say
> "1.xml" to "100.xml", or any other name) will reside.  Once this directory
> structure is implemented, no script will crawl like it does now :D.
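>
> (A small usage sketch of the function above, purely illustrative; the base
> path "/data/xml" is made up:)
>
> $id   = 1234;
> $dir  = createDir($id, "/data/xml");    // creates /data/xml/1201_1300 if needed
> $path = "/data/xml/$dir/$id.xml";       // -> /data/xml/1201_1300/1234.xml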
>
> 2: Importing Data into DB by parsing XML files
>
> A: Use the PHP class "XPath.class.php"; it is a very nice OO way to parse
> XML documents at speed.  It has a lot of easy-to-use methods and properties
> such as $xml->evaluate, $xml->setAttributes, $xml->appendChild,
> $xml->removeAttribute, $xml->exportToFile, etc., with which you can take
> apart lots of XML documents as if playing jigsaw puzzles.  It is a very
> handy, class-based PHP file.  Check the examples at
> http://www.soi.city.ac.uk/~sa386/php/XPath_V2/testBench/useCases/
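>
> (A rough sketch based only on the method names mentioned above; check the
> useCases link for the exact constructor and method signatures:)
>
> require_once 'XPath.class.php';
>
> $xml    = new XPath('article_1234.xml');            // load one document
> $titles = $xml->evaluate('//article/title');        // run an XPath query
> $xml->setAttributes('/article[1]', array('status' => 'changed'));
> $xml->exportToFile('article_1234.xml');             // write it back out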
>
> Sometimes it is useful to use Perl utilities and run your Perl/CGI scripts
> from the console.  This is a very fast and reliable method IF such documents
> are to be generated, read, or written in batch mode; Perl's XML::DOM package
> helps here.  Another advantage of this approach is that you can run such
> scripts from separate servers that have a connection to your DB server,
> reducing the overhead on the main server where your PHP scripts run.
>
> Another method is XSQL, which is powerful but still new to us and to PHP:
> your SQL is written directly into the XML file, and executing the XSQL
> synchronises your XML and your DB in both directions (e.g. updating your
> XML document from the DB).  (Can't tell much more here :( )
>
> 3: Is there another way to index and query large volumes of documents?
>
> A: I am not sure I understand your problem exactly, but if you need faster
> document access from the directory or file system (searching for one file
> among a million), then the answer to the first question helps here: you only
> have to look for the particular document among the 100 files in its
> directory, which is already known from the DB.
>
> And if the question is related to the data itself, then indexes in MySQL
> (indexing commonly used fields such as primary and foreign keys) can make
> record access faster.
>
> Thanks,
>
> Anirudh Zala ( azala at lechuon.com )
>
> -----------------------------------------------------------------------------
> Anirudh Zala (Project Manager),           Tel: +91 281 2451894
> AUM Computers,                            Gsm: +91 98981 37727
> 317, Star Plaza,                          anirudh at aumcomputers.com
> Rajkot-360001, Gujarat, INDIA             http://www.aumcomputers.com
> -----------------------------------------------------------------------------
>
> ----- Original Message -----
> From: "Larry Chuon" <  <mailto:LarryC at indexstock.com>
> LarryC at indexstock.com> To: "NYPHP Talk" <  <mailto:talk at nyphp.org>
> talk at nyphp.org>
> Sent: Wednesday, 07 May, 2003 4:01 AM
> Subject: RE: [nycphp-talk] XML files
>
> > Thank you all for your quick responses.  I'll consider generating XML on
> > the fly (still new at this).  Here's another question.  Besides importing
> > the files into a DB, is there another way to index and query large volumes
> > of documents?  I'm going off on a tangent a bit here.  Let's say I'm
> > archiving tons of documents for a public library.  They want to scan and
> > digitize millions of articles.  What is the best way to index and search
> > for articles, say by keywords or captions?
> >
> >
> >
> > -----Original Message-----
> > From: Malcolm, Gary [mailto:gmalcolm at professionalcredit.com]
> > Sent: Tuesday, May 06, 2003 5:53 PM
> > To: NYPHP Talk
> > Subject: RE: [nycphp-talk] XML files
> >
> > http://www.phpwebhosting.com/
> > http://phphosts.codewalkers.com
> > http://www.oinko.net/freephp
> > http://www.free-php-hosting.com
> >
> > cheap hosting... cheap db access... i love (hearts in eyes) mysql
> >
> > > -----Original Message-----
> > > From: soazine at pop.erols.com [mailto:soazine at pop.mail.rcn.net]
> > > Sent: Tuesday, 06 May, 2003 2:41 PM
> > > To: NYPHP Talk
> > > Subject: Re: [nycphp-talk] XML files
> > >
> > >
> > > Importing the XML files into a database is an ideal solution;
> > > unfortunately, it's not always an available one, such as in my case.  I
> > > have space on a remote server where database access is very expensive
> > > (it's my own site, and db access is out of my price range), so I have to
> > > resort to XML as well.  PHP parses XML extremely fast and efficiently; I
> > > highly recommend it.
> > >
> > > I'd use PHP's available XML parsers, along with grouping the files into
> > > directories sorted by date or some other delimiter, to allow for a
> > > smaller number of files per directory.
> > >
> > > Phil
> > >
> > > Original Message:
> > > -----------------
> > > From: Analysis & Solutions <danielc at analysisandsolutions.com>
> > > Date: Tue,  6 May 2003 17:35:16 -0400
> > > To: talk at nyphp.org
> > > Subject: Re: [nycphp-talk] XML files
> > >
> > > On Tue, May 06, 2003 at 04:49:21PM -0400, Larry Chuon wrote:
> > > > I'm doing everything that you mention below.
> > >
> > > So, import the files into a database and get rid of the XML files.
> > >
> > > Here's a quick tutorial on how to parse XML in PHP:
> > >    http://www.analysisandsolutions.com/code/phpxml.htm
> > >
> > > Enjoy,
> > >
> > > --Dan
> > >
> > > --
> > >      FREE scripts that make web and database programming easier
> > >             http://www.analysisandsolutions.com/software/
> > >  T H E   A N A L Y S I S   A N D   S O L U T I O N S   C O M P A N Y
> > >  4015 7th Ave #4AJ, Brooklyn NY    v: 718-854-0335   f: 718-854-0409
> > >
> > >
> > >
> > >
> > >
> > >
> > > --------------------------------------------------------------------
> > > mail2web - Check your email from the web at
> > > http://mail2web.com/ .
> >

-- 

Chalu Kim
eGenius Inc
Technology Partner
chalu at egenius.com, (718) 858-0142



