[Voiceglue] Hashing filenames
emiliano esposito
emiespo at tiscali.it
Wed Jan 14 05:34:12 EST 2009
Andrew Cumming wrote:
>
> I found that some long text (and some not so long) would create very
> long filenames. I think this also caused some problems when the
> filenames got exceedingly long
>
I was working on something similar, though I tried it with MD5. The
problem arises with collisions: you would need a second hash to
disambiguate when two texts produce the same SHA-1 digest.
Since the quickest way to do that was to append a counter to the name,
I dropped the MD5 part altogether and simply used a single Perl hash
(tied to a file on disk to keep memory usage down), with the text as
the key and a counter as the value, which also serves as the filename.
This way there won't be duplicates (i.e. collisions), and the hashing
logic is handled by Perl itself. The only limit is the size of the
counter, but even with "just" 32 bits you can render over 4 billion
texts... I think that's more than enough :-)
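
For illustration only, here is a minimal sketch of the counter scheme
described above. It is written in Python rather than Perl, with the
standard shelve module standing in for a tied, disk-backed hash. The
names filename_for, COUNTER_KEY and TEXT_PREFIX, the ".wav" extension
and the 8-hex-digit name format are all assumptions made for the
example; none of them come from Voiceglue or from the original code.

```python
import os
import shelve
import tempfile

# Hypothetical names for this sketch. Text keys carry a prefix so that
# no prompt text can ever clash with the reserved counter key.
COUNTER_KEY = "next_id"
TEXT_PREFIX = "text:"


def filename_for(db, text, ext=".wav"):
    """Return the short filename for `text`, allocating a new one if needed.

    `db` is any dict-like object; a shelve.Shelf keeps it on disk, so the
    mapping survives restarts and does not have to live in memory.
    """
    key = TEXT_PREFIX + text
    if key in db:
        # Same text seen before: reuse its filename.
        return db[key]
    # New text: bump the counter and derive the name from it.
    n = db.get(COUNTER_KEY, 0) + 1
    db[COUNTER_KEY] = n
    # 8 hex digits is exactly enough for a 32-bit counter.
    name = f"{n:08x}{ext}"
    db[key] = name
    return name


# Usage: a fresh database in a temporary directory.
db_path = os.path.join(tempfile.mkdtemp(), "tts_names")
with shelve.open(db_path) as db:
    long_prompt = "Welcome to the voice menu. " * 10
    print(filename_for(db, long_prompt))             # 00000001.wav
    print(filename_for(db, "Press one for sales."))  # 00000002.wav
    print(filename_for(db, long_prompt))             # 00000001.wav again
```

The filename length stays fixed no matter how long the text is, and
because each new text gets the next counter value, two different texts
can never end up with the same name.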
I hope I was clear...