The problem was pointed out by alert user Fred-Markus Stober, who wrote to me with the details and a potential solution. (Thank you!) The issue lies in how the file size is included in the padding of each input file, and it only affects files larger than 4GB. The Tiger code used in the programs, originally from libgcrypt [1], was updated upstream some time ago to fix this problem. With Mr. Stober's help, we adapted the newer libgcrypt code for the Hashdeep project and tested it.
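To see why only very large inputs are affected, it may help to look at the length field in the padding. Tiger ends each message with a marker byte, zero bytes, and the message length in bits stored as a 64-bit little-endian integer. The sketch below is illustrative Python, not the libgcrypt or Hashdeep code. The name `length_field` and its `byte_counter_bits` parameter are hypothetical, and the 32-bit wraparound is an assumption about the kind of error involved, not a reproduction of the actual bug.

```python
import struct

SMALL_SIZE = 3             # size of small.txt in bytes
LARGE_SIZE = 4586445864    # size of large.txt in bytes, larger than 2**32


def length_field(total_bytes, byte_counter_bits=64):
    """Return the 8-byte length field that ends the padding.

    byte_counter_bits is a hypothetical knob: 64 models a counter wide
    enough for any file, 32 models a counter that wraps at 4 GiB.
    """
    counted = total_bytes & ((1 << byte_counter_bits) - 1)
    bit_count = (counted * 8) & 0xFFFFFFFFFFFFFFFF
    # The bit count is stored as a 64-bit little-endian integer.
    return struct.pack("<Q", bit_count)


for name, size in (("small.txt", SMALL_SIZE), ("large.txt", LARGE_SIZE)):
    full = length_field(size)
    wrapped = length_field(size, byte_counter_bits=32)
    print(name, full.hex(), wrapped.hex(), "same" if full == wrapped else "DIFFERENT")
```

Under this assumption the two length fields agree for the small file and disagree for the large one. Because the length field is part of the final block fed to the compression function, any error in it changes the resulting hash.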
Unfortunately there are no test vectors for large files in Tiger [2], so I've made my own. I made two files, a small one and a large one. The small file is the string 'abc' without a newline. The large file is the character 'A' repeated 4,586,445,864 times, which puts it past the 4GB mark. You can download these two files, in a 4.3MB zip file, at http://jessekornblum.com/tools/tiger/ti
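If you would rather build the files locally than download them, something like the following Python sketch should produce equivalent inputs. It is not how the originals were made (the post does not say). The function names `write_repeated` and `make_test_files` are hypothetical, and the default size is taken from the `ls -l` listing in this post.

```python
import os


def write_repeated(path, byte_value, count, chunk_size=1 << 20):
    """Write `count` copies of the single byte `byte_value` to `path`."""
    chunk = byte_value * chunk_size
    remaining = count
    with open(path, "wb") as out:
        while remaining > 0:
            n = min(remaining, chunk_size)
            # Write in 1 MiB pieces so the whole file never sits in memory.
            out.write(chunk[:n])
            remaining -= n


def make_test_files(directory, large_size=4586445864):
    """Create small.txt ('abc', no newline) and large.txt ('A' repeated)."""
    small_path = os.path.join(directory, "small.txt")
    large_path = os.path.join(directory, "large.txt")
    with open(small_path, "wb") as out:
        out.write(b"abc")
    write_repeated(large_path, b"A", large_size)
    return small_path, large_path


# Example (writes roughly 4.6 GB to the current directory):
# make_test_files(".")
```

The call is left commented out because it needs several gigabytes of free disk space.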
Here's an example of the tigerdeep problem in action. Here are our input files:
$ xxd small.txt
0000000: 6162 63  abc

$ xxd large.txt | head -n 5
0000000: 4141 4141 4141 4141 4141 4141 4141 4141  AAAAAAAAAAAAAAAA
0000010: 4141 4141 4141 4141 4141 4141 4141 4141  AAAAAAAAAAAAAAAA
0000020: 4141 4141 4141 4141 4141 4141 4141 4141  AAAAAAAAAAAAAAAA
0000030: 4141 4141 4141 4141 4141 4141 4141 4141  AAAAAAAAAAAAAAAA
0000040: 4141 4141 4141 4141 4141 4141 4141 4141  AAAAAAAAAAAAAAAA

$ ls -l *.txt
-rw-r--r-- 1 jessek staff 4586445864 Jun 11 07:30 large.txt
-rw-r--r-- 1 jessek staff          3 Jun 11 07:19 small.txt
First we run the current version of tigerdeep, version 4.1.1, on these files:
$ tigerdeep -v
4.1.1

$ tigerdeep -b *.txt
2aab1484e8c158f2bfb8c5ff41b57a525129131c957b5f93  small.txt
af7950d627a6b38a51697a4f658343c2f619b4cad6e5cf4b  large.txt
Then we repeat with the new version of tigerdeep, version 4.2. Notice how the hash of the small file is the same, but the hash of the large file is different.
$ ./src/tigerdeep -v
4.2

$ ./src/tigerdeep -b *.txt
2aab1484e8c158f2bfb8c5ff41b57a525129131c957b5f93  small.txt
0514ecc7a5e83fa25572e1acd56de1c3a2280e4f819c4d0d  large.txt

You can now download the Windows binary or *nix source code for these programs.
[1] libgcrypt, http://www.gnupg.org/documentation/manu
[2] Tiger algorithm test vectors, http://www.cs.technion.ac.il/~biham/Rep