A Geek Raised by Wolves
jessekornblum

ssdeep 2.11 Released [Sep. 11th, 2014|04:04 pm]

I have published version 2.11 of ssdeep. This is an important update, as described below, and you are encouraged to update immediately. You can download Windows binaries or the *nix source code.

This version corrects a bug in the signature generation code. That is, version 2.10 was generating signatures which were slightly different from those generated by version 2.9. In some cases, the trailing character of each portion of the signature was being truncated. You can see this with an example. Let's look at the /usr/bin/bc file which ships with OS X. It has a SHA-256 hash of cc8e7502045c4698d96e6aa5e2e80ebb52c3b9c266993a8da232add49c797f3e and you can see it on VirusTotal.
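
If you want to confirm you're looking at the same file, a quick check (a minimal sketch, using the stock shasum utility on OS X):

$ shasum -a 256 /usr/bin/bc
cc8e7502045c4698d96e6aa5e2e80ebb52c3b9c266993a8da232add49c797f3e  /usr/bin/bc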

When you hash this file with version 2.9, you get:

1536:MsjYdR3Bul8hcURWhEcg4/btZzDcQflbCUPEBEh8wkcGDioxMYeo7:TYf8l8htRWA4ztZsGlWUPEBEh8wmxMYe

With version 2.10:

1536:MsjYdR3Bul8hcURWhEcg4/btZzDcQflbCUPEBEh8wkcGDioxMYeo7:TYf8l8htRWA4ztZsGlWUPEBEh8wmxMY

Note that the trailing 'e' character disappears in the second hash. What was 'mxMYe' is now 'mxMY'. The new version of ssdeep, version 2.11, restores the original signatures:

1536:MsjYdR3Bul8hcURWhEcg4/btZzDcQflbCUPEBEh8wkcGDioxMYeo7:TYf8l8htRWA4ztZsGlWUPEBEh8wmxMYe
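
If you want to check whether your own stored signatures are affected, one quick approach is to hash the same files with both binaries and diff the output. A sketch, assuming you've kept both versions around under hypothetical versioned names:

$ ssdeep-2.10 /usr/bin/bc > sigs-2.10.txt
$ ssdeep-2.11 /usr/bin/bc > sigs-2.11.txt
$ diff sigs-2.10.txt sigs-2.11.txt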

Alert readers will notice that VirusTotal has the ssdeep hash from version 2.10. This leads to my next point, which is that any ssdeep hashes you've created with version 2.10 should be recomputed. The signatures aren't wrong per se. They're just not as good as they should be. For reference, version 2.10 was released in July 2013, and so you should update any hashes produced after that date.
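
Recomputing is easy with ssdeep's recursive mode. A minimal sketch, where the evidence/ directory and file names are hypothetical:

$ ssdeep -l -r evidence/ > evidence-hashes-2.11.txt

Your old 2.10 results should still work for matching in the meantime (again, the signatures aren't wrong, just not as good as they should be):

$ ssdeep -m evidence-hashes-2.10.txt -r new-files/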

Hashdeep version 4.4 released [Jan. 29th, 2014|12:41 pm]

I have just published Hashdeep version 4.4 (aka md5deep). There is one new feature: -E mode adds case-insensitive auditing. Otherwise this version has a lot of bug fixes and cleanup, but is not a high-priority update. This is my first use of GitHub's new 'Releases' tool, so please let me know what you think!

https://github.com/jessek/hashdeep/releases/tag/release-4.4
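
If you haven't used the audit mode before, the basic flow looks something like this (a minimal sketch; the files/ directory is hypothetical):

$ hashdeep -c md5,sha256 -r files/ > known.txt
$ hashdeep -a -k known.txt -r files/

The new -E flag gives you the case-insensitive variant of the audit.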

Searching the NSRL online [Jan. 13th, 2014|02:48 pm]

You can now search the NSRL online, thanks to SA David Black: http://www.hashsets.com/nsrl/search/.

Visualizing Recovered Executables from Memory Images [Aug. 12th, 2013|11:52 pm]

I like to use a picture to help explain how we can recover executables from memory images. For example, here's the image I was using in 2008:

[Image: stylized diagram of how PE executables are loaded into memory and recovered]

This post will explain what's happening in that picture—how PE executables are loaded and recovered—and provide a different visualization of the process. Instead of just a stylized representation, we can produce pictures from actual data. This post explains how to do that and the tools used in the process.


When executables are loaded from the disk, Windows uses the PE header to determine how many pages, and with which permissions, will be allocated for each section. The header describes the size and location of each section on the disk and its size and location in memory. Because the sections need to be page-aligned in memory, but not on the disk, this generally results in some space being added between the sections when they're loaded into memory. There are also changes made in memory due to relocations and imported functions.
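
You can see both sets of sizes and locations for yourself by dumping the section headers. A quick sketch, assuming an objdump built with PE support (the mingw-w64 cross tools include one):

$ i686-w64-mingw32-objdump -h cmd.exe

The 'VMA' column shows where each section will sit in memory and 'File off' shows where it sits on the disk; the gaps between consecutive sections in memory are the page-alignment padding described above.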

When we recover executables from memory, we can use the PE header to map the sections back to the sizes and locations they had on the disk. Generally, memory forensics tools don't undo the other modifications made by the Windows loader; the changes made in memory remain in the version we recover. In addition, due to paging and other limitations, we don't always get all of the pages of the executable from memory. They could have been paged out, be invalid, or never have been loaded in the first place.

That's a tidy description of the picture above. The reality, of course, is a little messier. I've used my colorize and filecompare tools to produce visualizations of the executable as it appeared on the disk, as it looked in memory, and as it looked when recovered from the memory image. In addition to those tools, I used the Volatility™ memory forensics framework [1] and the Picasion tool for making animated GIFs [2]. For the memory image, I'm using the xp-laptop memory image from the NIST CFReDS project [3]. In particular, we'll be looking at cmd.exe, process 3256.

Here's a representation of the original executable from the disk, as produced with colorize. This image is a little different from some of the others I've posted before. Instead of being vertically oriented, it's horizontal: the data starts at the top left, goes down, and then moves right. I've also changed the image to be 512 pixels wide instead of the default 100. I made the image this way to make it appear similar to the image at the start of this post. Here's the command I used to generate the picture:

$ colorize -o -w 512 cmd.exe

and here's the result: http://jessekornblum.com/tools/colorize/img/cmd.exe.bmp

It gets interesting when we compare this picture to the data we can recover from the memory image. First, we can recover the in-memory representation of the executable using the Volatility™ plugin procmemdump. In the files generated by this plugin the pages are memory aligned, not disk aligned. Here's the command line to run the plugin:

$ python vol.py -f cases/xp-laptop-2005-07-04-1430.vmem --profile=WinXPSP2x86 procmemdump --pid=3256 --dump-dir=output
Volatile Systems Volatility Framework 2.3_alpha
Process(V) ImageBase  Name                 Result
---------- ---------- -------------------- ------
0x8153f480 0x4ad00000 cmd.exe              OK: executable.3256.exe


Here's how we can colorize it:

$ mv executable.3256.exe executable-procmemdump.3256.exe
$ colorize -o -w 512 executable-procmemdump.3256.exe


Which leads to this result: http://jessekornblum.com/tools/colorize/img/executable-procmemdump.3256.exe.bmp

There's a lot going on here, but things will become clearer with a third image. For the third picture we'll recover the executable again, but this time realigning the sections back to how they were on the disk. This is done by parsing the PE header in memory and using it to undo some of the changes made when it was loaded. We can do this using the procexedump plugin, like this:

$ python vol.py -f xp-laptop-2005-07-04-1430.vmem --profile=WinXPSP2x86 procexedump --pid=3256 --dump-dir=output
Volatile Systems Volatility Framework 2.3_alpha
Process(V) ImageBase  Name                 Result
---------- ---------- -------------------- ------
0x8153f480 0x4ad00000 cmd.exe              OK: executable.3256.exe


We repeat the process for colorizing this sample:

$ mv executable.3256.exe executable-procexedump.3256.exe
$ colorize -o -w 512 executable-procexedump.3256.exe


Which produces this image: http://jessekornblum.com/tools/colorize/img/executable-procexedump.3256.exe.bmp

First, let's compare the recovered executable back to the original. Even before we start our visualizations, we can see there were changes between the original and this version. The MD5 hashes of the two files are different:

$ md5deep -b cmd.exe executable-procexedump.3256.exe
eeb024f2c81f0d55936fb825d21a91d6  cmd.exe
ff8a9a332a9471e1bf8d5cebb941fc66  executable-procexedump.3256.exe


Amazingly, however, they match using fuzzy hashing via the ssdeep tool [4]:

$ ssdeep -bda cmd.exe executable-procexedump.3256.exe
executable-procexedump.3256.exe matches cmd.exe (66)


There's also a match with the sdhash similarity detection tool [5]:

$ sdhash -g -t 0 cmd.exe executable-procexedump.3256.exe
cmd.exe|executable-procexedump.3256.exe|046


(You haven't heard of sdhash? Don't get tunnel vision! There are many similarity detection tools.)

Those matches are good signs. But attempting to compare the colorized image of the recovered executable back to the original is a little tricky. To make it easier, I made a kind of blink comparator. The free site Picasion allows you to make animated GIFs from submitted pictures. Combined with some annotations on the pictures, here's the result:

[Animated GIF: blink comparison between the original cmd.exe and the executable recovered with procexedump]
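
If you'd rather build this kind of comparison locally instead of uploading your images, ImageMagick can produce the same effect. A sketch, using the two colorized bitmaps from above:

$ convert -delay 100 -loop 0 cmd.exe.bmp executable-procexedump.3256.exe.bmp blink.gif
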
There are two important things to notice here. First, we didn't recover all of the executable. The bands of black which appear on the left-hand side in the recovered image are pages which weren't found in memory. Also notice how much of the data from the end of the file is missing, too. Almost all of it! (Isn't it amazing that fuzzy hashing can still generate a match between these two files?)

The second thing to notice is the changes in the data. It's a little hard to see in the GIF, but you can get a better view using the filecompare and colorize tools together. We can compare the two files at the byte level and then colorize the result:

$ filecompare -b 1 cmd.exe executable-procexedump.3256.exe > orig-to-exe.dat
$ colorize -o -w 512 orig-to-exe.dat


Here's the result: http://jessekornblum.com/tools/colorize/img/orig-to-exe.dat.bmp

Here we can clearly see, in red, the changes throughout the file. The blocks of mostly red, or heavily speckled red, are the places where we weren't able to recover data from the memory image. Because some of the values in the original executable were zeros, those appear to match the zeros we recovered from the memory image; hence the speckled pattern. In the changes to the executable you can clearly see a pattern of dashed red lines.

Finally, we can visualize the changes between the in-memory representation of the file and the disk representation of the file. I've made another animated GIF, this time between the versions of the executable recovered by procexedump and procmemdump:

[Animated GIF: blink comparison between the procexedump and procmemdump recoveries]

The most obvious difference between these two pictures is the black band on the left-hand side of the image. That's the space added by the Windows loader to page-align the first section of the executable when the sections are realigned from disk to memory.


References

[1] The Volatility™ framework, https://code.google.com/p/volatility/. Volatility™ is a trademark of Verizon. Jesse Kornblum is not sponsored or approved by, or affiliated with, Verizon.

[2] Picasion.com, http://picasion.com/.

[3] The Computer Forensic Reference Data Sets project, National Institute of Standards and Technology, http://www.cfreds.nist.gov/.

[4] Jesse Kornblum, ssdeep, http://ssdeep.sf.net/.

[5] Vassil Roussev, sdhash, http://sdhash.org/.

Where is sha3deep? The Standardization Process [Aug. 7th, 2013|10:01 am]

You might think that because NIST chose KECCAK to be the SHA-3 standard, we can all start using SHA-3 right away. Unfortunately, it's not quite that simple. In my last post on the topic I mentioned a few "flavors" of SHA-3. These correspond to input parameters to the underlying algorithm. They control the performance of the code and the size of the output. As you might imagine, there is an inverse relationship between performance and resistance to attack. The KECCAK team gave a detailed talk on these issues in February [1].

NIST will eventually make a decision about these trade-offs and declare one (or more) standard versions of SHA-3 in a Federal Information Processing Standard (FIPS). When that document is released, I can get to work making a sha3deep [2].

The NIST standardization process is not opaque! Later this fall NIST will have a public comment period before making a decision in 2014 [3]. I strongly encourage you to make your opinions known to NIST. The opinions of practitioners—the people who are going to be using the algorithm—matter as much as the opinions of mathematicians.



[1] http://csrc.nist.gov/groups/ST/hash/sha-3/documents/Keccak-slides-at-NIST.pdf

[2] There is some experimental SHA-3 code in the Hashdeep git repo now, but you don't want to use it.

[3] http://csrc.nist.gov/groups/ST/hash/sha-3/sha-3_standardization.html

The Forensics of Things Starts With Cruise Ships [Aug. 6th, 2013|11:22 pm]

The Internet of Things is going to lead us to the Forensics of Things. Do you think your {thermostat, electric meter, car, DVR, elevator, refrigerator, hotel door lock} isn't going to get questioned at some point? It's coming faster than you think.

It starts with cruise ship forensics: see the paper "Modern ships Voyage Data Recorders: A forensics perspective on the Costa Concordia shipwreck."

Filecompare and Colorize updated to handle large files [Jul. 19th, 2013|08:32 am]

I've updated both filecompare and colorize to handle large files on Windows. Special thanks to M— for reporting the issue and testing out the fix. If you'd like to stop reading now, here are links to the updated Windows binary and source code.

The technical explanation is that the programs were failing because the fseek() and ftell() functions couldn't handle files larger than 4GB. Switching to the Microsoft 64-bit functions _ftelli64() and _fseeki64() solved the problem. Unfortunately, however, the cross-compiler I was using didn't support them. Until now I've been cross-compiling from OS X to Windows using the MacPorts MinGW32. That compiler was using code last updated in 2006. Although newer is not always better, one has to assume that things have improved since the movie 300 was released.

Thanks to some great work over at the MinGW-w64 project, I am now able to use these 64-bit functions in my programs. This is great news for filecompare and colorize, of course. But you can expect to see my other projects switch over to the new compilers as well.
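
If you'd like to verify the large-file fix yourself, one quick test is to create a file bigger than 4GB and run the tools over it. A sketch, with hypothetical file names and using the same flags shown in my earlier posts:

$ dd if=/dev/zero of=big.bin bs=1048576 count=4200
$ filecompare -b 4096 big.bin big.bin > self.dat

With the old 32-bit seek calls, the programs would fail somewhere past the 4GB boundary; with the new compiler they should run cleanly to the end of the file.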

Happy Friday!

Search All The Strings! [Jul. 18th, 2013|09:31 pm]

Here's something which is both simple and useful regarding strings. Search all the strings!

For better or worse, the 'strings' program varies widely between operating systems. It flat out doesn't come with Windows; on Linux it doesn't search the whole file or look for Unicode strings by default; and on OS X it simply can't search for Unicode strings. What is a forensic examiner to do?

Here's a little wrapper script around the srch_strings program from the Sleuth Kit. It ensures that you search the whole file, get both ASCII and Unicode strings together, and get consistent results on both Linux and OS X. Special thanks to chort for the idea to use srch_strings in the first place. (Sorry Windows users, but there is no pre-built srch_strings binary, which means I can't write a batch script around it.)

To use the script, copy the text below to a new file. I call mine allstrings.sh. Make it executable

$ chmod 755 allstrings.sh

and then put it in a directory in your PATH. Places like ~/bin or /usr/local/bin should work.
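
If ~/bin isn't already in your PATH, something like this will set it up (a sketch for bash-style shells):

$ mkdir -p ~/bin
$ mv allstrings.sh ~/bin/
$ echo 'export PATH="$HOME/bin:$PATH"' >> ~/.bash_profile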

You may need to update the script for your system depending on where srch_strings is installed. You can determine that location using the which command:

$ which srch_strings
/opt/local/bin/srch_strings


If there's no output, you don't have srch_strings installed. See http://www.sleuthkit.org/sleuthkit/download.php to get it directly or, on OS X, get it through MacPorts, http://macports.org/.

If the output is not /opt/local/bin/srch_strings, replace that value in the script with the value you get.

Here's a sample of the script in action. I'm using /usr/bin/awk as a source of both Unicode and ASCII strings. I'm running srch_strings on the whole file, and then searching the whole file for Unicode strings. Finally, I use my script on awk.

$ srch_strings -a /usr/bin/awk | wc -l
     929
$ srch_strings -a -e l /usr/bin/awk | wc -l
     187
$ allstrings.sh /usr/bin/awk | wc -l 
    1116


When searching the entire file, the srch_strings program found 929 ASCII strings and then, separately, 187 Unicode strings. Adding these together, 929 + 187 = 1116, which is the number of strings found with the allstrings.sh script.

Although this script can process multiple files on the command line, it can't process standard input.

Okay! Enough talk. Here's the script:
#!/bin/bash
#
# allstrings.sh - extract both ASCII and Unicode strings from each
# file given on the command line, using the Sleuth Kit's srch_strings.

# Adjust this path if 'which srch_strings' reports a different location.
SRCH_STRINGS=/opt/local/bin/srch_strings

# -a: scan the entire file, not just the initialized data sections
FLAGS=-a

for FILE in "$@"
do
	$SRCH_STRINGS $FLAGS "$FILE"
	# -e l: also search for 16-bit little-endian (Unicode) strings
	$SRCH_STRINGS $FLAGS -e l "$FILE"
done

ssdeep 2.10 released [Jul. 17th, 2013|12:19 am]

Thank you to everybody who beta tested the new ssdeep! I've fixed the bugs you found and have released the new version 2.10. As a reminder, this version contains a complete re-write of the hashing engine itself. The code, written by Helmut Grohne, is now thread-safe. Please note the ssdeep program is not multi-threaded—yet. But the library on which it's based can now easily be used in multi-threaded applications.

This version also fixes a long-standing bug regarding long file paths on Windows. By default, paths are limited to 255 characters. There is a way to extend that limit, however, which I have finally gotten around to implementing.

You can directly download the Windows binaries or source code.

Beta version of ssdeep 2.10 [Jul. 10th, 2013|12:09 am]

I have published a new (beta) version of the ssdeep tool for fuzzy hashing. Although reserved for testing purposes only at this point, this is an exciting development for several reasons.

First, this version contains a complete re-write of the hashing engine itself. The code, written by Helmut Grohne, is now thread-safe. Please note the ssdeep program is not multi-threaded—yet. But the library on which it's based can now easily be used in multi-threaded applications.

Second, this version fixes a long-standing bug regarding long file paths on Windows. By default, paths are limited to 255 characters. There is a way to extend that limit, however, which I have finally gotten around to implementing.

Third and finally, this version contains a few miscellaneous bug fixes.

That being said, of course, this is a beta version and I'm sure there are a few new bugs I've introduced. The first person who reports each new, reproducible bug gets a free beer from me. Please post those bug reports to the SourceForge site.

Thanks, and happy fuzzy hashing!
