
A Geek Raised by Wolves


Why LiveJournal [Mar. 21st, 2017|09:10 am]
Somebody recently asked me why I stick with LiveJournal after all of these years. It's because of Frank: http://www.livejournal.com/site/goat.bml

Beta version of sha3deep produces SHA-384 hashes [Feb. 26th, 2017|12:28 am]

After reading the news about SHA-1 falling this week, I decided to finally implement SHA-3 in Hashdeep. I have published a beta version of hashdeep/md5deep which includes the SHA-3 algorithm.

Like SHA2, the SHA3 specification has four bit lengths: 224, 256, 384, and 512 bits. The four SHA2 variants are technically called SHA2-224, SHA2-256, SHA2-384, and SHA2-512. For better or worse, SHA2-256 has become the de facto standard. When people refer to "SHA256", they are actually referring to SHA2-256.

When making sha3deep, I had to choose a variant. To avoid any confusion, I chose SHA3-384, the 384-bit variant, as it is stronger than the 256-bit variant and produces output with a different length. SHA2-256 hashes are 64 characters long when represented in hex. The hashes generated by sha3deep, for now, are 96 characters. This unique length should make them recognizable even without context. Will these new hashes become "SHA384"? "SHA3"? I don't know. Let me know what you think in the comments.

You can download and try out the new code at https://github.com/jessek/hashdeep/tree/sha3deep.

Here are some test vectors for SHA1, SHA2, and SHA3: http://www.di-mgt.com.au/sha_testvectors.html. For example, here are sha256deep and sha3deep in action:

$ echo -n 'abc' | ./sha256deep

$ echo -n 'abc' | ./sha3deep
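If you'd like to check the test vectors without building the tools, Python's built-in hashlib can reproduce them. This example is my own illustration, not part of sha3deep; note the difference in output lengths:

```python
import hashlib

# The classic 'abc' test vector, as used on the di-mgt.com.au page.
data = b"abc"

sha2_256 = hashlib.sha256(data).hexdigest()    # 64 hex characters
sha3_384 = hashlib.sha3_384(data).hexdigest()  # 96 hex characters

print(sha2_256)
print(sha3_384)
```

The distinct lengths (64 vs. 96 characters) are what make SHA3-384 hashes recognizable without context.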

Special thanks to Andrey Jivsov for publishing a SHA3 implementation in the public domain, https://github.com/brainhub/SHA3IUF.

The ssdeep team, version 2.13, and moving to GitHub [Jun. 5th, 2015|12:21 am]

Here are a few updates about ssdeep. In this post I'll be talking about the people in the project, the latest release, and our upcoming move to Github. For the impatient, we have published ssdeep version 2.13. You can download a Windows binary from http://sourceforge.net/projects/ssdeep/files/ssdeep-2.13/ssdeep-2.13.zip/download and the *nix source code from http://sourceforge.net/projects/ssdeep/files/ssdeep-2.13/ssdeep-2.13.tar.gz/download.

Before talking about the changes in the program itself, I want to introduce the two people who have done the majority of the work in this release. These folks have generously volunteered their time and considerable expertise to make the program better. I'd like to acknowledge them here so that they can get the credit they have earned.

First, Helmut Grohne has been working on the code since 2013, including doing a major update of the fuzzy hashing engine. In his words, he "hacks on free software related to Debian and on software related to quality assurance in general." His current pet project is making the core of Debian cross buildable; see https://wiki.debian.org/HelmutGrohne/rebootstrap for details.

Second, Tsukasa OI has written several improvements and major bug fixes in the fuzzy hashing engine. You will find several of his innovations in the latest release, described below. These changes made ssdeep faster, better behaved on unusual files, and, in general, a better program.

Both of these people have made significant improvements to the program for version 2.13. In this release we've added some new features and fixed a few bugs. The most visible change is that the bug fixes will change the hash computation and hash comparisons for a small number of files. First, the program can now handle inputs up to 192 GB; previously, both hashing and comparison were limited to much smaller sizes. Next, the hash generation and comparison functions have been improved when working with relatively simple files, such as small files or files with low entropy. Finally, we've fixed some portability issues to get the program to compile and run on different systems. Please let us know if you have any questions about what we've done or have ideas for future enhancements.

Finally, I am moving the project to Github. There are several reasons for this change, but I am hoping that the new UI will enable us to develop the program more quickly and share it more openly with the community. The move will take some time, and restoring the releases will take a little while longer. I'll do my best to get things up and running again quickly, but I ask for your patience during this process!

ssdeep 2.11 Released [Sep. 11th, 2014|04:04 pm]

I have published version 2.11 of ssdeep. This is an important update, as described below, and you are encouraged to update immediately. You can download Windows binaries or the *nix source code.

This version corrects a bug in the signature generation code. That is, version 2.10 was generating signatures which were slightly different from those in version 2.9. In some cases, the trailing character of each portion of the signature was being truncated. You can see this with an example. Let's look at the /usr/bin/bc file which ships with OS X. It has a SHA-256 hash of cc8e7502045c4698d96e6aa5e2e80ebb52c3b9c266993a8da232add49c797f3e and you can see it on VirusTotal.

When you hash this file with version 2.9, you get:


With version 2.10:


Note that the trailing 'e' character disappears in the second hash. What was 'mxMYE' is now 'mxMY'. The new version of ssdeep, version 2.11, restores the original signatures:


Alert readers will notice that VirusTotal has the ssdeep hash from version 2.10. This leads to my next point, which is that any ssdeep hashes you've created with version 2.10 should be recomputed. The signatures aren't wrong per se. They're just not as good as they should be. For reference, version 2.10 was released in July 2013, and so you should update any hashes produced after that date.
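To illustrate the effect of the bug, here's a sketch in Python. An ssdeep signature has the form blocksize:hash1:hash2, and the 2.10 bug dropped the trailing character of each hash portion in some cases. The signature below is made up for illustration; it is not a real hash of /usr/bin/bc:

```python
# A made-up ssdeep-style signature of the form "blocksize:hash1:hash2".
good = "3072:abcdmxMYE:wxyzQRSTU"

def truncate_portions(sig):
    """Mimic the version 2.10 bug: drop the last character of each hash portion."""
    blocksize, h1, h2 = sig.split(":")
    return ":".join([blocksize, h1[:-1], h2[:-1]])

buggy = truncate_portions(good)
print(buggy)  # 3072:abcdmxMY:wxyzQRST -- the trailing 'E' and 'U' are gone
```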

Hashdeep version 4.4 released [Jan. 29th, 2014|12:41 pm]

I have just published Hashdeep version 4.4 (aka md5deep). There is one new feature: the -E mode adds case-insensitive auditing. Otherwise this version contains a lot of bug fixes and cleanup, but it is not a high-priority update. This is my first use of Github's new 'Releases' tool, so please let me know what you think!


Searching the NSRL online [Jan. 13th, 2014|02:48 pm]

You can now search the NSRL online thanks to SA David Black, http://www.hashsets.com/nsrl/search/.

Visualizing Recovered Executables from Memory Images [Aug. 12th, 2013|11:52 pm]

I like to use a picture to help explain how we can recover executables from memory images. For example, here's the image I was using in 2008:

This post will explain what's happening in that picture—how PE executables are loaded and recovered—and provide a different visualization of the process. Instead of just a stylized representation, we can produce pictures from actual data. This post explains how to do that and the tools used in the process.

When executables are loaded from the disk, Windows uses the PE header to determine how many pages, and with which permissions, will be allocated for each section. The header describes the size and location of each section on the disk and its size and location in memory. Because the sections need to be page aligned in memory, but not on the disk, this generally results in some space being added between the sections when they're loaded into memory. There are also changes made in memory due to relocations and imported functions.
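The page-alignment arithmetic described above can be sketched in a few lines of Python. The section size and alignment values below are typical but made up for illustration; real values come from the PE header's FileAlignment and SectionAlignment fields:

```python
FILE_ALIGNMENT = 0x200      # sections are typically 512-byte aligned on disk
SECTION_ALIGNMENT = 0x1000  # sections are 4 KB page aligned in memory

def align_up(value, alignment):
    """Round value up to the next multiple of alignment."""
    return (value + alignment - 1) // alignment * alignment

# A hypothetical .text section with 0x4c00 bytes of raw data.
raw_size = 0x4C00
size_on_disk = align_up(raw_size, FILE_ALIGNMENT)       # 0x4c00
size_in_memory = align_up(raw_size, SECTION_ALIGNMENT)  # 0x5000

print(hex(size_on_disk), hex(size_in_memory))
```

The 0x400 bytes of padding in this example are the "space added between the sections" that makes the in-memory layout bigger than the on-disk layout.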

When we recover executables from memory, we can use the PE header to map the sections back to the sizes and locations they had on the disk. Memory forensics tools generally don't undo the other modifications made by the Windows loader; those changes remain in the version we recover. In addition, due to paging and other limitations, we don't always get all of the pages of the executable from memory. They could have been paged out, marked invalid, or never loaded in the first place.

That's a tidy description of the picture above. The reality, of course, is a little messier. I've used my colorize and filecompare tools to produce visualizations for an executable on the disk, what it looked like in memory, and what it looked like when recovered from the memory image. In addition to those tools, I used the Volatility™ memory forensics framework [1] and the Picasion tool for making animated gifs [2]. For the memory image, I'm using the xp-laptop memory image from the NIST CFReDS project [3]. In particular, we'll be looking at cmd.exe, process 3256.

Here's a representation of the original executable from the disk as produced with colorize. This image is a little different from some of the others I've posted before. Instead of being vertically oriented, it's horizontal. The data starts at the top left, goes down, and then to the right. I've also changed the images to be 512 pixels wide instead of the default 100. I made the image this way to make it appear similar to the image at the start of this post. Here's the command I used to generate the picture:

$ colorize -o -w 512 cmd.exe

and here's the result: http://jessekornblum.com/tools/colorize/img/cmd.exe.bmp

It gets interesting when we compare this picture to the data we can recover from the memory image. First, we can recover the in-memory representation of the executable using the Volatility™ plugin procmemdump. In the files generated by this plugin the pages are memory aligned, not disk aligned. Here's the command line to run the plugin:

$ python vol.py -f cases/xp-laptop-2005-07-04-1430.vmem --profile=WinXPSP2x86 procmemdump --pid=3256 --dump-dir=output
Volatile Systems Volatility Framework 2.3_alpha
Process(V) ImageBase  Name                 Result
---------- ---------- -------------------- ------
0x8153f480 0x4ad00000 cmd.exe              OK: executable.3256.exe

Here's how we can colorize it:

$ mv executable.3256.exe executable-procmemdump.3256.exe
$ colorize -o -w 512 executable-procmemdump.3256.exe

Which leads to this result: http://jessekornblum.com/tools/colorize/img/executable-procmemdump.3256.exe.bmp

There's a lot going on here, but things will get clearer with a third image. For the third picture we'll recover the executable again, but this time realigning the sections back to how they were on the disk. This is done by parsing the PE header in memory and using it to undo some of the changes made when the executable was loaded. We can do this using the procexedump plugin, like this:

$ python vol.py -f xp-laptop-2005-07-04-1430.vmem --profile=WinXPSP2x86 procexedump --pid=3256 --dump-dir=output
Volatile Systems Volatility Framework 2.3_alpha
Process(V) ImageBase  Name                 Result
---------- ---------- -------------------- ------
0x8153f480 0x4ad00000 cmd.exe              OK: executable.3256.exe
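Conceptually, the realignment that procexedump performs looks something like the following sketch. This is an illustrative model of the technique, not Volatility's actual code; each PE section header records a disk offset (PointerToRawData) and size (SizeOfRawData) as well as a memory offset (VirtualAddress), and rebuilding the on-disk layout means copying each section's bytes from its virtual address back to its raw file offset:

```python
def realign(memory_image, sections, header_size):
    """Rebuild the on-disk layout of a memory-aligned PE dump.

    sections: list of (virtual_address, pointer_to_raw_data, size_of_raw_data).
    """
    out = bytearray(memory_image[:header_size])  # PE headers keep their position
    for va, raw_ptr, raw_size in sections:
        if len(out) < raw_ptr + raw_size:
            out.extend(b"\x00" * (raw_ptr + raw_size - len(out)))
        # Copy the section from its memory offset back to its disk offset.
        out[raw_ptr:raw_ptr + raw_size] = memory_image[va:va + raw_size]
    return bytes(out)

# A toy "memory image": 0x200 bytes of headers, a gap created by page
# alignment, then one section at virtual address 0x1000.
mem = b"H" * 0x200 + b"\x00" * 0xE00 + b"S" * 0x200
disk = realign(mem, [(0x1000, 0x200, 0x200)], 0x200)
print(len(disk))  # 0x400 bytes: the alignment gap is gone
```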

We repeat the process for colorizing this sample:

$ mv executable.3256.exe executable-procexedump.3256.exe
$ colorize -o -w 512 executable-procexedump.3256.exe

Which produces this image: http://jessekornblum.com/tools/colorize/img/executable-procexedump.3256.exe.bmp

First, let's compare the recovered executable back to the original. Even before we start our visualizations, we can see there were changes between the original and this version. The MD5 hashes of the two files are different:

$ md5deep -b cmd.exe executable-procexedump.3256.exe
eeb024f2c81f0d55936fb825d21a91d6  cmd.exe
ff8a9a332a9471e1bf8d5cebb941fc66  executable-procexedump.3256.exe

Amazingly, however, they match using fuzzy hashing via the ssdeep tool [4]:

$ ssdeep -bda cmd.exe executable-procexedump.3256.exe
executable-procexedump.3256.exe matches cmd.exe (66)

There's also a match with the sdhash similarity detection tool [5]:

$ sdhash -g -t 0 cmd.exe executable-procexedump.3256.exe

(You haven't heard of sdhash? Don't get tunnel vision! There are many similarity detection tools.)

Those matches are good signs. But attempting to compare the colorized image of the recovered executable back to the original is a little tricky. To make it easier, I made a kind of blink comparator. The free site Picasion allows you to make animated GIFs from submitted pictures. Combined with some annotations on the pictures, here's the result:

There are two important things to notice here. First, we didn't recover all of the executable. The bands of black which appear on the left-hand side in the recovered image are pages which weren't found in memory. Also notice how much of the data from the end of the file is missing, too. Almost all of it! (Isn't it amazing that fuzzy hashing can still generate a match between these two files?)

The second thing to notice is the changes in the data. It's a little hard to see in the GIF, but you can get a better view using the filecompare and colorize tools together. We can compare the two files at the byte level and then colorize the result:

$ filecompare -b 1 cmd.exe executable-procexedump.3256.exe > orig-to-exe.dat
$ colorize -o -w 512 orig-to-exe.dat

Here's the result: http://jessekornblum.com/tools/colorize/img/orig-to-exe.dat.bmp

Here we can clearly see, in red, the changes throughout the file. The blocks of mostly red, or heavily speckled red, are the places where we weren't able to recover data from the memory image. Because some of the values in the original executable were zeros, those appear to match the zeros we recovered from the memory image, hence the speckled pattern. In the changes to the executable, you can clearly see a pattern of dashed red lines.
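For a rough idea of what a byte-level comparison like this produces, here's a sketch in Python. This is an illustrative model, not the actual filecompare implementation: emit one flag per byte, marking whether the two inputs match at that offset, which is the kind of data colorize then renders as matching or differing pixels:

```python
def compare_bytes(a, b):
    """Return one flag byte per offset: 0 where the inputs match, 1 where
    they differ. Offsets past the end of the shorter input count as differences."""
    longest = max(len(a), len(b))
    flags = bytearray(longest)
    for i in range(longest):
        byte_a = a[i] if i < len(a) else None
        byte_b = b[i] if i < len(b) else None
        flags[i] = 0 if byte_a == byte_b else 1
    return bytes(flags)

print(compare_bytes(b"abcd", b"abXd"))  # b'\x00\x00\x01\x00'
```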

Finally, we can visualize the changes between the in-memory representation of the file and the disk representation of the file. I've made another animated GIF, this time between the versions of the executable recovered by procexedump and procmemdump:

The most obvious difference between these two pictures is the black band on the left-hand side of the image. That's the space added by the Windows loader to page align the first section of the executable when the sections were realigned from disk to memory.


[1] The Volatility™ framework, https://code.google.com/p/volatility/. Volatility™ is a trademark of Verizon. Jesse Kornblum is not sponsored or approved by, or affiliated with Verizon.

[2] Picasion.com, http://picasion.com/.

[3] The Computer Forensic Reference Data Sets project, National Institute of Standards and Technology, http://www.cfreds.nist.gov/.

[4] Jesse Kornblum, ssdeep, http://ssdeep.sf.net/.

[5] Vassil Roussev, sdhash, http://sdhash.org/.

Where is sha3deep? The Standardization Process [Aug. 7th, 2013|10:01 am]

You might think that because NIST chose KECCAK to be the SHA-3 standard, we can all start using SHA-3 right away. Unfortunately, it's not quite that simple. In my last post on the topic I mentioned a few "flavors" of SHA-3. These correspond to input parameters to the underlying algorithm. They control the performance of the code and the size of the output. As you might imagine, there is an inverse relationship between performance and resistance to attack. The KECCAK team gave a detailed talk on these issues in February [1].

NIST will eventually make a decision about these trade offs and declare one (or more) standard versions of SHA-3 in a Federal Information Processing Standard (FIPS). When that document is released, I can get to work making a sha3deep [2].

The NIST standardization process is not opaque! Later this fall NIST will have a public comment period before making a decision in 2014 [3]. I strongly encourage you to make your opinions known to NIST. The opinions of practitioners—the people who are going to be using the algorithm—matter as much as the opinions of mathematicians.

[1] http://csrc.nist.gov/groups/ST/hash/sha-3/documents/Keccak-slides-at-NIST.pdf

[2] There is some experimental SHA-3 code in the Hashdeep git repo now, but you don't want to use it.

[3] http://csrc.nist.gov/groups/ST/hash/sha-3/sha-3_standardization.html

The Forensics of Things Starts With Cruise Ships [Aug. 6th, 2013|11:22 pm]

The Internet of Things is going to lead us to the Forensics of Things. Do you think your {thermostat, electric meter, car, DVR, elevator, refrigerator, hotel door lock} isn't going to get questioned at some point? It's coming faster than you think.

It's starting with cruise ship forensics: see "Modern ships Voyage Data Recorders: A forensics perspective on the Costa Concordia shipwreck."

Filecompare and Colorize updated to handle large files [Jul. 19th, 2013|08:32 am]

I've updated both filecompare and colorize to handle large files on Windows. Special thanks to M— for reporting the issue and testing out the fix. If you'd like to stop reading now, here are links to the updated Windows binary and source code.

The technical explanation is that the program was failing because the fseek() and ftell() functions couldn't handle files larger than 4GB. Switching to the Microsoft 64-bit functions _ftelli64() and _fseeki64() solved the problem. Unfortunately, however, the cross-compiler I was using didn't support them. Until now I've been cross-compiling from OS X to Windows using the MacPorts MinGW32 compiler, which uses code last updated in 2006. Although newer is not always better, one has to assume that things have improved since the movie 300 was released.

Thanks to some great work over at the MinGW-w64 project, I am now able to use these 64-bit functions in my programs. This is great news for filecompare and colorize, of course. But you can expect to see my other projects switch over to the new compilers as well.

Happy Friday!
