jessekornblum (jessekornblum) wrote,

Fuzzy Hashing for Truncated Files

Besides detecting source code reuse, fuzzy hashing can find truncated files. Here's a sample using a fake filename. We'll compute the fuzzy hash for the file, make a copy that contains only the first 29% of the original (200M of 699M), and then try to match the truncated version back to the original.

$ ls -lsh
-rwxr-xr-x 1 jvalenti users 699M Sep 29 2006 all-the-kings-men.avi

$ ssdeep -b all-the-kings-men.avi > sig.txt

$ cat sig.txt
12582912:fgQl/nUjQAbaBQvHf8yLr5CHJu3dyh+YJ27TuXyphJs3wHC6+rEfAV+wDrw6C/AT:fPl8cdAUyLr5CHJu3dyh8uzwHC6+reAS,"all-the-kings-men.avi"

$ dd if=all-the-kings-men.avi of=partial.avi bs=1m count=200
200+0 records in
200+0 records out
209715200 bytes transferred in 14.510224 secs (14452926 bytes/sec)

$ ls -lsh partial.avi
-rw-r--r-- 1 jvalenti users 200M Oct 6 06:40 partial.avi

$ ssdeep -bm sig.txt partial.avi
partial.avi matches all-the-kings-men.avi (57)
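
To try the same experiment at other truncation points, the dd step can be written in terms of a percentage instead of a block count. This is only a sketch: the function name `truncate_pct` is made up for illustration, and it assumes your `head` supports the `-c` byte-count option (the GNU and BSD versions do).

```shell
# Hypothetical helper: copy the first PCT percent of SRC into DST.
# Usage: truncate_pct SRC DST PCT
truncate_pct() {
    src=$1
    dst=$2
    pct=$3
    # Total size in bytes; the arithmetic expansion strips any padding
    # that wc adds on some systems.
    size=$(( $(wc -c < "$src") ))
    # Number of leading bytes to keep (integer division rounds down).
    keep=$(( size * pct / 100 ))
    head -c "$keep" "$src" > "$dst"
}
```

With that defined, `truncate_pct all-the-kings-men.avi partial.avi 29` produces roughly the same partial file as the dd command above, and `ssdeep -bm sig.txt partial.avi` can then be run against it as before. The match score you get will depend on the file and on how much of it is kept.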

Tags: forensics, hashing