[nycphp-talk] Graph Data Structures
Kenneth Downs
ken at secdat.com
Wed Aug 17 16:13:15 EDT 2005
My experience in db servers is limited to MS SQL Server, IBM DB/2 and
Postgres. Postgres is the one I pick when the decision is mine. The
others I really cannot comment on.
Also, as for the table design, a google search on both web pages and
newsgroups should yield some pre-existing solutions.
> Going off topic:
>
> Thinking of ways to persist the graph :
> I've used MySQL a long time and see them as only getting better. But
> their dual license deters me from some uses. - what about SQLite - also
> completely free (LGPL, MIT, or Apache License I think), small, and fast?
> Or dbm functions if I only need to store hashes?
>
> I'll feel silly if I reinvent the work of the RAP RDF datastore. I'll
> have to write tests and a couple implementations so I can know what is
> faster and more efficient.
>
>
>
>
> Kenneth Downs wrote:
>
>>Jonathan,
>>
>>Glad this helped.
>>
>>If your graphs get big then you must do this in a database, don't attempt
>>it in scripts.
>>
>>Many people recommend mySQl because it is free, but PostgreSQL is far
>> more
>>capable and is also free.
>>
>>
>>
>>>Hi Kenneth,
>>>
>>>Thanks very much. I like what you have posted!
>>>I think I would add keys to the arrays and add one more dimension so
>>>that there can be multiple graphs with the same label.
>>>
>>>$arcs['likes'][] = array('a','a');
>>>$arcs['likes'][] = array('a','b');
>>>$arcs['knows'][] = array('a','b');
>>>
>>>$arcs['knows'][] = array('b','a');
>>>$arcs['confusedby'][] = array('b','a');
>>>
>>> I can get the functionality I need from this.
>>>I wonder how scalable this is. What if there are over 5 million arcs. My
>>>traversal algorithms would have to be very efficient.
>>>
>>>Another package I am looking at for graph manipulation is bundled with
>>>the PHP RDF package RAP.
>>>
>>>http://www.wiwiss.fu-berlin.de/suhl/bizer/rdfapi
>>>http://www.wiwiss.fu-berlin.de/suhl/bizer/rdfapi/tutorial/usingNamedGraphs.htm#ng-intro
>>>
>>>They use quads - where a quad is a labeled graph. In the above data
>>>structure, a new graph would just be a new set of arcs. So it looks like
>>>nothing is missing from this array based graph.
>>>
>>>Hmm, or why not
>>>
>>>$arcs['a']['likes']['b'] = true;
>>>$arcs['b']['likes']['b'] = false;
>>>or
>>>$arcs['b']['likes']['chocolate'][] = '80%';
>>>$arcs['b']['likes']['chocolate'][] = 'quite significantly';
>>>
>>>Then I can have a "triple" with a label and a value.
>>>
>>>This could be interesting.
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>Kenneth Downs wrote:
>>>
>>>
>>>
>>>>Johathan,
>>>>
>>>>Have you considered putting the nodes into one array, and the arcs into
>>>> a
>>>>second?
>>>>
>>>>$nodes = array('a'=>array(..info..),'b'=>array(..info..));
>>>>
>>>>$arcs = array(array('a','b'),array('b','a'))
>>>>
>>>>This allows each node's info to be stored only once, and allows you to
>>>>then treat the arcs cleanly, allowing or disallowing any combo you may
>>>>choose.
>>>>
>>>>You may have to write a little of your code to walk through things, but
>>>>you'll have complete integrity and control.
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>>I'm trying to formulate a question out of this. If there isn't one
>>>>> here,
>>>>>I hope the read is interesting.
>>>>>My goal is to create the simplest efficient graph data structures -
>>>>> that
>>>>>allow for cycles.
>>>>>
>>>>>The reason one would want cycles in a graph is the following:
>>>>>a->b
>>>>>and
>>>>>b->a
>>>>>(or b->a again with another arc (also known as a hypergraph))
>>>>>or
>>>>>a->a
>>>>>
>>>>>where '->' is an arc
>>>>>Even if the arcs are labled, the data in 'a' is something I don't want
>>>>>to duplicate.
>>>>>
>>>>>I am using php version 4.3.11
>>>>>
>>>>>If I try to do this with a simple php array:
>>>>>
>>>>> $a = array();
>>>>> $b = array();
>>>>>
>>>>> $a['b'] = & $b;
>>>>> $b['a'] = & $a;
>>>>>
>>>>> print_r($a);
>>>>>
>>>>>Array
>>>>>(
>>>>> [a] => Array
>>>>> (
>>>>> [b] => Array
>>>>> (
>>>>> [a] => Array
>>>>>*RECURSION*
>>>>> )
>>>>>
>>>>> )
>>>>>
>>>>>)
>>>>>
>>>>>
>>>>>I get this recursion error. Or, perhaps this is not an error at all.
>>>>> But
>>>>>I can't seem to use this function:
>>>>>
>>>>> function recursive_print($array)
>>>>> {
>>>>> foreach($array as $key => $value)
>>>>> {
>>>>> if (is_array($value))
>>>>> {
>>>>> echo $key .' <hr /> ' .recursive_print($value);
>>>>> }
>>>>> else
>>>>> {
>>>>> echo 'end'.$value;
>>>>> }
>>>>> }
>>>>> }
>>>>>
>>>>>So I went to the PEAR site -
>>>>>http://pear.php.net/package/Structures_Graph
>>>>>This pear package doesn't throw any errors but it also seems to balk
>>>>> -
>>>>>although I am not sure the *RECURSION* will affect functionality
>>>>>
>>>>>
>>>>> include 'Structures/Graph.php';
>>>>> $directedGraph =& new Structures_Graph(true);
>>>>> $nodeOne =& new Structures_Graph_Node();
>>>>> $nodeTwo =& new Structures_Graph_Node();
>>>>>
>>>>>
>>>>> $directedGraph->addNode(&$nodeOne);
>>>>> $directedGraph->addNode(&$nodeTwo);
>>>>>
>>>>>
>>>>> $nodeOne->connectTo($nodeTwo);
>>>>> $nodeTwo->connectTo($nodeOne);
>>>>>
>>>>>
>>>>>Inside the code I found a comment about the Zend engine before the
>>>>> data
>>>>>structure procedes to iteratively loop through the the nodes to see if
>>>>>there are duplicates.
>>>>> /*
>>>>> ZE1 equality operators choke on the recursive cycle
>>>>>introduced by the _graph field in the Node object.
>>>>> So, we'll check references the hard way
>>>>> */
>>>>>
>>>>>Even so, print_r produces many recursion warnings.
>>>>>
>>>>>Maybe I am just trying to use a hammer for a screwdriver. But can
>>>>> anyone
>>>>>offer any insight here?
>>>>>
>>>>>Thanks,
>>>>>
>>>>>- Jonathan Hendler
>>>>>
>>>>>
>>>>>
>>>>>_______________________________________________
>>>>>New York PHP Talk Mailing List
>>>>>AMP Technology
>>>>>Supporting Apache, MySQL and PHP
>>>>>http://lists.nyphp.org/mailman/listinfo/talk
>>>>>http://www.nyphp.org
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>>
>>>_______________________________________________
>>>New York PHP Talk Mailing List
>>>AMP Technology
>>>Supporting Apache, MySQL and PHP
>>>http://lists.nyphp.org/mailman/listinfo/talk
>>>http://www.nyphp.org
>>>
>>>
>>>
>>
>>
>>
>>
>
> _______________________________________________
> New York PHP Talk Mailing List
> AMP Technology
> Supporting Apache, MySQL and PHP
> http://lists.nyphp.org/mailman/listinfo/talk
> http://www.nyphp.org
>
--
Kenneth Downs
Secure Data Software
631-379-0010
ken at secdat.com
PO Box 708
East Setauket, NY 11733
More information about the talk
mailing list