[nycphp-talk] [OT] number of files in a directory?
Steve Manes
smanes at magpie.com
Mon Jan 2 20:26:51 EST 2006
max goldberg wrote:
> The downside is that you have to make sure your code really keeps track
> of your file system and you aren't accessing it by hand. Another thing
> you might worry about using md5 is collisions. If this is a mission
> critical system, you may want to avoid md5 as it is possible (but
> somewhat unlikely) you will encounter collisions. I've read anyone with
> a decent computer can create an md5 collision in about an hour, so
> that's something to keep in mind.
Yeah, this is probably the best solution. To avoid collisions, assign a
unique database ID to every asset, use that ID to create the MD5 hash,
then store the asset under a filename containing that unique ID. The hash
only picks the directory; the ID keeps the filename unique, so a hash
collision can't overwrite anything. The worst that can happen is that two
different files land in the same directory with different filenames,
which is cool.
A function like this could be used to both plant the file in the MD5
filesystem and extract its path later on based on that unique ID:
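function get_upload_target($file_id) {
    // Hash the unique database ID, not the file contents or name.
    $hash_id = md5($file_id);
    // First three hex chars pick the top-level directory,
    // the next three pick the subdirectory beneath it.
    $subdir = substr($hash_id, 0, 3) .
              '/' .
              substr($hash_id, 3, 3);
    return $subdir;
}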
Use case: someone uploads the file "mykitty.jpg" and it's inserted into
the database as id=1234. get_upload_target(1234) returns:
81d/c9b
The file is then written as $ASSET_DIR/81d/c9b/1234
Or 1234.jpg, or 1234.mykitty.jpg, whatever. I like to give the file a
recognizable file type extension.
To extract that file later, just run the ID through get_upload_target()
again to build the filesystem path.
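
As a rough sketch of how the store and lookup steps might fit together:
asset_path() and store_asset() below are hypothetical helper names, not
anything from the code above, and copy() stands in for whatever you
actually use to move the upload into place (move_uploaded_file() in a
real upload handler). It assumes the asset directory is writable.

```php
<?php
// Same hashing scheme as get_upload_target() in the post.
function get_upload_target($file_id) {
    $hash_id = md5((string) $file_id);
    return substr($hash_id, 0, 3) . '/' . substr($hash_id, 3, 3);
}

// Hypothetical helper: full path for an asset ID, with an optional
// file type extension appended to the ID.
function asset_path($asset_dir, $file_id, $ext = '') {
    $name = ($ext === '') ? (string) $file_id : $file_id . '.' . $ext;
    return rtrim($asset_dir, '/') . '/' . get_upload_target($file_id) . '/' . $name;
}

// Hypothetical helper: copy a source file into its hashed location,
// creating the two directory levels if they don't exist yet.
// Returns the stored path, or false on failure.
function store_asset($asset_dir, $file_id, $source, $ext = '') {
    $target = asset_path($asset_dir, $file_id, $ext);
    $dir = dirname($target);
    if (!is_dir($dir) && !mkdir($dir, 0755, true) && !is_dir($dir)) {
        return false;
    }
    return copy($source, $target) ? $target : false;
}
```

With the example above, asset_path($ASSET_DIR, 1234, 'jpg') builds
$ASSET_DIR/81d/c9b/1234.jpg, and the same call finds the file again later.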