jessekornblum (jessekornblum) wrote,

Changing Tiger Hashes for Large Files in Hashdeep 4.2

Version 4.2 of Hashdeep and Tigerdeep, released today, generates different Tiger hashes for large files than previous versions did. The Tiger code used in every version up to and including 4.1.1 had a bug. Version 4.2 generates the correct hashes.

The problem was pointed out by alert user Fred-Markus Stober, who wrote to me with details of the problem and a potential solution. (Thank you!) The bug is in how the file size is included in the padding of each input file, and it only affects files larger than 4GB. The Tiger code used in the programs originally came from libgcrypt [1], which fixed this problem some time ago. With Mr. Stober's help, we adapted the newer libgcrypt code for the Hashdeep project and tested it.
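
To show where the file size sits in the padding, here is a minimal Python sketch of Tiger's final padding: a 0x01 byte, zero bytes up to 56 mod 64, then the message length in bits as a 64-bit little-endian integer. The `truncate_byte_count` option is purely hypothetical. It is not taken from the Hashdeep or libgcrypt sources and does not claim to reproduce the actual defect. It only illustrates how a length that loses its high-order bits would leave small files untouched while changing the last block of a file larger than 4GB.

```python
import struct


def tiger_padding(message_len_bytes, truncate_byte_count=False):
    """Return the bytes Tiger appends to a message of the given length.

    truncate_byte_count is a HYPOTHETICAL switch for illustration only:
    it keeps just the low 32 bits of the byte count before the length
    field is written. It is an assumption, not the real bug.
    """
    byte_count = message_len_bytes
    if truncate_byte_count:
        byte_count &= 0xFFFFFFFF  # assumed, illustrative loss of high bits

    # One 0x01 byte, then zeros until the total length is 56 mod 64.
    zeros = (55 - message_len_bytes) % 64

    # The length field is the message length in bits, 64-bit little-endian.
    bit_count = (byte_count * 8) & 0xFFFFFFFFFFFFFFFF
    return b"\x01" + b"\x00" * zeros + struct.pack("<Q", bit_count)


# Sizes of small.txt and large.txt from this post.
for size in (3, 4586445864):
    same = tiger_padding(size) == tiger_padding(size, truncate_byte_count=True)
    print(size, "unchanged" if same else "changed")
```

Under this assumption only the 4586445864-byte input gets different padding, which matches the pattern described above: small files hash the same, large files do not.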

Unfortunately there are no test vectors for large files in Tiger [2], so I've made my own. I made two files, a small one and a large one. The small file is the string 'abc' without a newline. The large file is the character 'A' repeated many, many times. You can download these two files, in a 4.3MB zip file, at http://jessekornblum.com/tools/tiger/tiger-test-vectors.zip. (Yes, the large.txt file is more than 4GB. But because it's just the character 'A' repeated over and over, it compresses well.)
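
If you would rather recreate the files than download them, something like the following Python sketch should work. The helper names `write_repeated` and `make_test_files` are made up for this example, and the default size is the byte count that `ls -l` reports for large.txt in the listing further down. The zip file remains the authoritative copy of the test vectors.

```python
from pathlib import Path

# Size of large.txt in bytes, as reported by `ls -l` in this post.
LARGE_SIZE = 4586445864


def write_repeated(path, byte, total, chunk_size=1 << 20):
    """Write `total` copies of the single byte `byte` to `path` in chunks."""
    chunk = byte * chunk_size
    remaining = total
    with open(path, "wb") as f:
        while remaining > 0:
            n = min(remaining, chunk_size)
            f.write(chunk[:n])
            remaining -= n


def make_test_files(directory, large_size=LARGE_SIZE):
    """Create small.txt ('abc', no newline) and large.txt ('A' repeated)."""
    directory = Path(directory)
    (directory / "small.txt").write_bytes(b"abc")
    write_repeated(directory / "large.txt", b"A", large_size)
```

Calling `make_test_files(".")` writes both files into the current directory. Note that this needs more than 4GB of free disk space.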

Here's the tigerdeep problem in action. These are our input files:
$ xxd small.txt
0000000: 6162 63                                  abc

$ xxd large.txt | head -n 5
0000000: 4141 4141 4141 4141 4141 4141 4141 4141  AAAAAAAAAAAAAAAA
0000010: 4141 4141 4141 4141 4141 4141 4141 4141  AAAAAAAAAAAAAAAA
0000020: 4141 4141 4141 4141 4141 4141 4141 4141  AAAAAAAAAAAAAAAA
0000030: 4141 4141 4141 4141 4141 4141 4141 4141  AAAAAAAAAAAAAAAA
0000040: 4141 4141 4141 4141 4141 4141 4141 4141  AAAAAAAAAAAAAAAA

$ ls -l *.txt
-rw-r--r--  1 jessek  staff  4586445864 Jun 11 07:30 large.txt
-rw-r--r--  1 jessek  staff           3 Jun 11 07:19 small.txt

First we run the older version of tigerdeep, version 4.1.1, on these files:
$ tigerdeep -v
4.1.1

$ tigerdeep -b *.txt
2aab1484e8c158f2bfb8c5ff41b57a525129131c957b5f93  small.txt
af7950d627a6b38a51697a4f658343c2f619b4cad6e5cf4b  large.txt

Then we repeat with the new version of tigerdeep, version 4.2. Notice how the hash of the small file is the same, but the hash of the large file is different.
$ ./src/tigerdeep -v
4.2

$ ./src/tigerdeep -b *.txt
2aab1484e8c158f2bfb8c5ff41b57a525129131c957b5f93  small.txt
0514ecc7a5e83fa25572e1acd56de1c3a2280e4f819c4d0d  large.txt

You can now download the Windows binary or *nix source code for these programs.
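
If you have saved Tiger hashes from an older version and want to find which ones changed, you can compare the two listings mechanically. This is a minimal Python sketch that assumes each line has the form shown above, a hash followed by whitespace and a file name. The function names are invented for this example and are not part of Hashdeep.

```python
def parse_listing(text):
    """Map file name -> hash for lines of the form '<hash>  <filename>'."""
    hashes = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        digest, name = line.split(None, 1)
        hashes[name] = digest.lower()
    return hashes


def changed_files(old_text, new_text):
    """Return the sorted names present in both listings whose hashes differ."""
    old = parse_listing(old_text)
    new = parse_listing(new_text)
    return sorted(n for n in old.keys() & new.keys() if old[n] != new[n])


# Output of version 4.1.1 and version 4.2 from this post.
OLD = """\
2aab1484e8c158f2bfb8c5ff41b57a525129131c957b5f93  small.txt
af7950d627a6b38a51697a4f658343c2f619b4cad6e5cf4b  large.txt
"""
NEW = """\
2aab1484e8c158f2bfb8c5ff41b57a525129131c957b5f93  small.txt
0514ecc7a5e83fa25572e1acd56de1c3a2280e4f819c4d0d  large.txt
"""

print(changed_files(OLD, NEW))  # ['large.txt']
```

Any file that shows up in the result and is larger than 4GB should be rehashed with version 4.2.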



[1] libgcrypt, http://www.gnupg.org/documentation/manuals/gcrypt/
[2] Tiger algorithm test vectors, http://www.cs.technion.ac.il/~biham/Reports/Tiger/test-vectors-nessie-format.dat
Tags: forensics, hashdeep, md5deep