A Geek Raised by Wolves - Fuzzy Hashing for Truncated files [entries|archive|friends|userinfo]
jessekornblum

[ website | My Website ]
[ userinfo | livejournal userinfo ]
[ archive | journal archive ]

Links
[Links:| Browse by Tag LiveJournal Portal Update Journal Logout ]

Fuzzy Hashing for Truncated files [Oct. 6th, 2006|06:49 am]
Previous Entry Share Next Entry
[Tags|, ]

Along with source code reuse, you can also use fuzzy hashing to find truncated files. Here's a sample using a fake filename. We'll compute the fuzzy hash for the file, make a copy that contains only the first 29% of the original, and then try to match the truncated version back to the original.

$ ls -lsh
-rwxr-xr-x 1 jvalenti users 699M Sep 29 2006 all-the-kings-men.avi

$ ssdeep -b all-the-kings-men.avi > sig.txt

$ cat sig.txt
ssdeep,1.0--blocksize:hash:hash,filename
12582912:fgQl/nUjQAbaBQvHf8yLr5CHJu3dyh YJ27TuXyphJs3wHC6 rEfAV wDrw6C/AT:fPl8cdAUyLr5CHJu3dyh8uzwHC6 reAS,"all-the-kings-men.avi"

$ dd if=all-the-kings-men.avi of=partial.avi bs=1m count=200
200 0 records in
200 0 records out
209715200 bytes transferred in 14.510224 secs (14452926 bytes/sec)

$ ls -lsh partial.avi
-rw-r--r-- 1 jvalenti users 200M Oct 6 06:40 partial.avi

$ ssdeep -bm sig.txt partial.avi
partial.avi matches all-the-kings-men.avi (57)


TA DA!
LinkReply