A Geek Raised by Wolves
jessekornblum

Fuzzy Hashing in FTK [Mar. 26th, 2009|10:47 pm]

So apparently I've been asleep at the switch. Fuzzy hashing has been incorporated into AccessData's flagship Forensic Toolkit! Not only have they added the feature but they've also written a great paper describing fuzzy hashing and how it works in FTK.

Now I know what some of you are thinking. How did AccessData include fuzzy hashing, which is licensed under the GPL2, in a proprietary program like FTK? Well, to tell you the truth, in all this excitement I kind of lost track myself. (Wait... wrong speech.)

I think AccessData rewrote fuzzy hashing. The edit distance code, for example, has been replaced with some database calls. I don't know how they're computing the rolling and FNV hashes, but if they took the time to rewrite the edit distance code they probably rewrote the rest too. The edit distance code dates back to 1989 and was last updated in 1993. There's no sense in rewriting something that's been working for fifteen years unless you absolutely must.
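
For anyone who hasn't met edit distance before, here is a minimal sketch of the general idea in Python. It is a plain, unweighted Levenshtein distance, which is an assumption made for illustration: it is not the code shipped with the fuzzy hashing tools, and I have no idea what FTK's database calls do. The function name `edit_distance` is hypothetical.

```python
def edit_distance(a, b):
    """Count the single-character inserts, deletes, and substitutions
    needed to turn string a into string b (plain Levenshtein distance)."""
    # previous[j] holds the distance between the processed prefix of a
    # and the first j characters of b.
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute, or keep a match
            ))
        previous = current
    return previous[-1]
```

Two signatures that differ in only a few characters give a small distance, which is the property a similarity score can be built on.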

Regardless, go forth and be fuzzy!

Revised journal articles on BitLocker, Buffalo [Feb. 5th, 2009|12:06 am]

Good news everyone! Thanks to a revised legal agreement from the journal Digital Investigation I have been able to publish the edited versions of my papers Implementing BitLocker Drive Encryption for Forensic Analysis and Using Every Part of the Buffalo in Windows Memory Analysis. Although there isn't much new content in the latter, the former was almost entirely rewritten between its original submission and the present form. Enjoy!

md5deep and Cygwin Ports [Jul. 19th, 2008|01:07 pm]

Thanks to a blog post by Mark Stam about using md5deep, I've discovered that md5deep has been added to the Cygwin Ports project. The project "provides Cygwin binary and source packages for a large variety of programs and libraries, including the GNOME and KDE desktop environments." This means that Cygwin users can download a binary package of md5deep and its associated tools.

Because I'm not a Cygwin user it's hard for me to test the automatic installation method, but it appears that you should be able to use Cygwin's Setup program to get those ports by adding ftp://sunsite.dk/projects/cygwinports to the server list.

And yes, the screenshot in Mark's post does look a little odd. I'm looking into it.

md5deep version 3.0 alpha1 [Mar. 12th, 2008|11:42 pm]

I have published an alpha version of md5deep 3.0. Although not much has changed for our friends md5deep, sha1deep, etc., I have created a new program, hashdeep. This program supports multihashing, that is, computing more than one hash algorithm at a time. I'll post more details later, including a description of the new audit mode, but in the meantime please remember this is alpha quality code.

By default the program computes both MD5 and SHA-256 hashes:

$ hashdeep foo bar
%%%% HASHDEEP-1.0
%%%% size,md5,sha256,filename
## Invoked from: /Users/jessekornblum
## $ hashdeep foo bar
29,69a3a1f6e6f671a1a158ee09c7016ec7,f9650a0cf19e246a158318399d35e3d1a27697ceea2ac4abdc6a4ca2b6b6b75c,/Users/jessekornblum/foo
29,30290eea368926965343ce8ff30a458e,26d7b73c6ffd2fa09c0e30d947c776f229d1d6315dd4ab7e012f484c1bad2ed0,/Users/jessekornblum/bar


You can specify more (or fewer) hashes to compute with the -c flag.

$ hashdeep -c md5,tiger,whirlpool,sha256 foo bar
%%%% HASHDEEP-1.0
%%%% size,md5,sha256,tiger,whirlpool,filename
## Invoked from: /Users/jessekornblum
## $ hashdeep -c md5,tiger,whirlpool,sha256 foo bar
29,69a3a1f6e6f671a1a158ee09c7016ec7,f9650a0cf19e246a158318399d35e3d1a27697ceea2ac4abdc6a4ca2b6b6b75c,7c29873518894c1c6bd793f2f22d2f766fd4cebe4580782e,70f541f3b09a8fbea0f0b5cb4dc4ce86ca2dfe1f50f6e6e6328bb00451ecaad62afbc44ac2d3872c3610f2f540a2027f6f930cbad32b38480d4a05bb70da8ec2,/Users/jessekornblum/foo
29,30290eea368926965343ce8ff30a458e,26d7b73c6ffd2fa09c0e30d947c776f229d1d6315dd4ab7e012f484c1bad2ed0,035d6097b7d7ec26cca39a843949ed7c0d789b4d1a5c0def,556d60677e47cf3b1befbdf4595cef6c1e7aaea8a32255d2039d5d5c6b710503d7576fdf76c81482ea6b29e13f75ca33661fb12bb1b8f75ea2d931afa77e3054,/Users/jessekornblum/bar


Note that the output records the command line arguments and indicates which kinds of hashes a file contains.
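
The multihashing idea can be sketched in a few lines of Python. This is a rough illustration under stated assumptions, not how hashdeep itself is written: the names `multihash` and `format_record` are hypothetical, the record layout simply mirrors the size,md5,sha256,filename columns shown above without the header lines, and only algorithms that Python's hashlib happens to provide will work.

```python
import hashlib
import os


def multihash(path, algorithms=("md5", "sha256"), chunk_size=65536):
    """Read the file once and feed every chunk to each hash object."""
    hashers = [hashlib.new(name) for name in algorithms]
    size = 0
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            size += len(chunk)
            for hasher in hashers:
                hasher.update(chunk)
    return size, [hasher.hexdigest() for hasher in hashers]


def format_record(path, algorithms=("md5", "sha256")):
    """Build one comma-separated line: size, each digest, absolute path."""
    size, digests = multihash(path, algorithms)
    return ",".join([str(size), *digests, os.path.abspath(path)])
```

The point of the design is that the file is read from disk only once no matter how many algorithms are requested, so adding a second or third hash costs CPU time but no extra I/O.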

Fuzzy Hashing and the Russian Business Network [Mar. 3rd, 2008|08:47 am]

Although I haven't had a chance to read it, the Shadowserver Foundation has published a whitepaper titled RBN "Rizing". The paper describes the Russian Business Network and their activities in Turkey late last year. What jumps right out at me is that they used fuzzy hashing to find correlations between malware samples. (Geek to English translation: They used one of my tools to find a pattern of bad guy activity.) Absolutely a wonderful thing to wake up to on Monday morning!

Hash Algorithm Contest [Nov. 13th, 2007|07:00 pm]

The National Institute of Standards and Technology (NIST) is holding a cryptographic hash algorithm contest. Details from their web site:
NIST has opened a public competition to develop a new cryptographic hash algorithm, which converts a variable length message into a short “message digest” that can be used for digital signatures, message authentication and other applications. The competition is NIST’s response to recent advances in the cryptanalysis of hash functions. The new hash algorithm will be called “SHA-3” and will augment the hash algorithms currently specified in FIPS 180-2, Secure Hash Standard. Entries for the competition must be received by October 31, 2008. The competition is announced in the Federal Register Notice published on November 2, 2007
There are many more details available on the competition homepage. I'm hopeful that md5deep will support the winning SHA-3 algorithm.

DFRWS Day 1 [Aug. 14th, 2007|01:49 am]

After a full day of geekery (and an excellent dinner thanks to a recommendation from ...... ..... .....), here is my writeup from the first day of the 7th Annual Digital Forensic Research Workshop. This is a conference for academics and practitioners working in computer forensics, and as best as I can tell, is the best deal going for seeing what will be cutting edge next year. I say next year because all of the ideas presented here will require some serious elbow grease before they can be moved into a production environment. But the people here are creating the future of our industry.

Something nifty: my work on fuzzy hashing has been expanded by another group of researchers into Multi-Resolution Similarity Hashing. Obviously I haven't had time to evaluate their paper, but I'm thrilled that other people are looking into the problem.

The results from the file carving challenge are also impressive. The science of file carving has come a long way since the early days of foremost. As I'm updating that program to version 2.0 now, I'm hoping to use some of the new techniques developed here. (Like I said, this conference showcases what will be cutting edge in 2008.)

In an unrelated foremost note, you can now use foremost to carve files without pressing a button... but you must think in Russian!

And so, after a day of far too much coffee, I am off to bed.

md5deep is popular, being updated soon [May. 22nd, 2007|12:00 pm]

Last night I was doing some updates on md5deep and looked at the project statistics. If my math is right, it appears that over the past year, md5deep has been downloaded about 115 times per day. I don't know if that counts copies that are included with *nix distributions, but WOW!

Anyway, a new version is coming Real Soon Now, although there won't be too many changes from the user's perspective. I'm fixing what I thought was a cosmetic bug on Windows but which, it turns out, can cause the program to fail. Also new will be support for HP/UX, a new mode to only produce warnings for improperly formatted hashes in the known files, and a more robust piecewise hashing mode. Under the hood, I'm changing the way the program's state is maintained. This could turn out to be a version 2.0 release (as the new architecture may require extensive testing) but should hopefully be easier to maintain in the long run. Look for a beta version by mid-June!

Open Source Encase Tools [May. 13th, 2007|10:53 pm]

Although I had heard of the project before, this week I got a chance to play around with the tools from the libewf project. It's designed to parse the Expert Witness Compression Format (EWF) which is used in both EnCase and SMART images. Very cool!

Parsing web server logs for search queries [Feb. 24th, 2007|10:39 am]

I've been looking at my web server logs recently because, well, I'm a geek and my wife is out of town. Using some homebrew scripts I've been able to get a peek at what people have been searching for when they reach my web site.

Every time you view a web page your computer sends a bunch of information to the web server sending you the page. Not only does your computer ask for a particular page, but it also says which page you are coming from. For example, if you're viewing http://www.whitehouse.gov/ and click on the link for http://svr.gov.ru/ [1], the svr.gov.ru web server is told that you came from www.whitehouse.gov. The site that sent you to a new web site is called the referring site and the specific page you were on is called the referring URL [2].

The referring URL can be very informative when dealing with search engines. When you run a search on Google, for example, your search terms appear in the URL. Searching for "Happy Puppies" on Google sends me to the page http://www.google.com/search?hl=en&q=%22Happy+Puppies%22&btnG=Google+Search. See the words "Happy Puppies" in there? If the user then clicks on a link from that page, the Google URL, including the search term, is sent to the web site hosting the search result.

For example, let's say that somebody searches for the phrase "jesse kornblum" on Google and then follows a link to my site. My computer will record something like this in the log:

18.72.0.3 - - [01/Jan/1904:15:34:11 -0500] "GET /porn/goats/hotgoat07.html HTTP/1.0" 200 8417 "http://www.google.com/search?hl=en&q=jesse+kornblum" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322; .NET CLR 2.0.50727)"

In this record, we see that somebody from the IP address 18.72.0.3 used Google to search for "jesse kornblum" at 3:34pm on 1 Jan 1904. The logging format, although complete, can be a little tedious to read by hand. Thus I wrote a quick and dirty script to parse out the search terms.
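
The original script isn't reproduced here, so what follows is only a minimal Python sketch of the same idea, with several assumptions: the log is in the common "combined" format shown above, the search engine is Google, and the search terms live in the `q` parameter of the referring URL. The names `LOG_PATTERN` and `search_terms` are hypothetical.

```python
import re
from urllib.parse import parse_qs, urlsplit

# Combined log format:
# host ident user [time] "request" status bytes "referrer" "user-agent"
LOG_PATTERN = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"$'
)


def search_terms(log_line, param="q"):
    """Return the decoded search phrase from a Google referrer, or None."""
    match = LOG_PATTERN.match(log_line.strip())
    if match is None:
        return None  # not a combined-format record
    referrer = urlsplit(match.group("referrer"))
    if "google." not in referrer.netloc:
        return None  # only Google referrers are handled in this sketch
    # parse_qs turns '+' into spaces and undoes %XX escapes.
    values = parse_qs(referrer.query).get(param)
    return values[0] if values else None
```

Fed the record above, `search_terms` gives back the phrase jesse kornblum. Other search engines use different parameter names, so a real script would need a small table of hosts and parameters.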

Most of the searches involved my name, tools I had written, presentations I had given, or something else that made sense. But along with those have been some really interesting queries:

Zoey the dog
zoey naked
naked zoey
luke for zoey
WHY DOGS HIDE UNDER THE BED
caninity
a bigger dog
Zoey ER
tennis zoey
play bows
zoey 18
7508736 BIOS
3efea3144abee232fda1719d2c1a4066
under the clothes
18 inches
goat farm
cow and goat
your a goat
farm cow
koala gif
out of time the horse
horse run
goat milks
horse cow
in peril


[1] This is the web site for the Служба внешней разведки (Sluzhba Vneshney Razvedki), or the Russian Foreign Intelligence Service.

[2] My proxy server, Privoxy, helps me by always sending the referring URL as belonging to the new site I'm visiting. For example, if I'm visiting http://www.whitehouse.gov/pages/context04.html and click on the link for http://svr.gov.ru/, my computer sends the referring URL as http://www.svr.ru/.

Fuzzy Hashing MP3s [Jan. 18th, 2007|09:41 am]

A number of people (granting, brad, thewronghands among them) have discussed using my fuzzy hashing program to find identical MP3 files with different ID3 tags. For example, let's say you had two copies of Hocus Pocus by Focus. One of them is tagged with the title "HOCUS POCUS" and the other as "Hocus Pocus (Focus)". The songs are identical; they came from the same (legal) ripping. The MD5 hashes of these two songs show that they're different:
C:\> md5deep -b hocuspocus.mp3 "Focus - Hocus Pocus.mp3"
e1b50b4fcbb3b505bf0bd2d5f773dd74  hocuspocus.mp3
7522632eb4c29cf255a65a5e397834ac  Focus - Hocus Pocus.mp3
But because the only differences between the files are in the ID3 tags at the end, a fuzzy hash is a great way to find the association, even if they're in different directories. The -d switch prints one-way matches (e.g. A matches B) and the -p mode prints two-way matches (e.g. A matches B and B matches A).

C:\> ssdeep -rd mp3
C:\mp3\unsorted\hocuspocus.mp3 matches C:\mp3\Focus\Moving Waves\01 Hocus Pocus.mp3


Or, for the two way matches:

C:\> ssdeep -rp mp3
C:\mp3\unsorted\hocuspocus.mp3 matches C:\mp3\Focus\Moving Waves\01 Hocus Pocus.mp3

C:\mp3\Focus\Moving Waves\01 Hocus Pocus.mp3 matches C:\mp3\unsorted\hocuspocus.mp3


Either way, one command line finds all of your matching files!
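
To see why hashing in pieces survives a tag change when a whole-file MD5 does not, here is a deliberately simplified Python sketch. It uses fixed-size blocks, which is an assumption made to keep the example short: ssdeep picks its block boundaries from the file's content rather than at fixed offsets, and this is not its algorithm. The names `piecewise_md5` and `matching_fraction` are hypothetical.

```python
import hashlib


def piecewise_md5(data, block_size=4096):
    """Hash each fixed-size block of the data separately."""
    return [
        hashlib.md5(data[offset:offset + block_size]).hexdigest()
        for offset in range(0, len(data), block_size)
    ]


def matching_fraction(a, b, block_size=4096):
    """Fraction of block positions whose hashes agree in both inputs."""
    blocks_a = piecewise_md5(a, block_size)
    blocks_b = piecewise_md5(b, block_size)
    total = max(len(blocks_a), len(blocks_b))
    if total == 0:
        return 1.0  # two empty inputs are trivially identical
    same = sum(1 for x, y in zip(blocks_a, blocks_b) if x == y)
    return same / total
```

Two byte strings that share the same audio data and differ only in a trailing tag will disagree in their final block and agree everywhere else, even though their whole-file MD5 hashes have nothing in common. Fixed blocks fall out of alignment as soon as bytes are inserted or removed earlier in the file, which is the weakness that content-based block boundaries are meant to avoid.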

Buffalo Paper Accepted [Jan. 8th, 2007|08:29 am]

I am pleased to announce that my paper "Using Every Part of the Buffalo in Windows Memory Analysis" has been accepted for publication in the journal Digital Investigation! I should have a preprint available soon, but here is the abstract:
All Windows memory analysis techniques depend on the examiner’s ability to translate the virtual addresses used by programs and operating system components into the true locations of data in a memory image. In some memory images up to 20% of all the virtual addresses in use point to so called “invalid” pages that cannot be found using a naive method for address translation. This paper explains virtual address translation, enumerates the different states of invalid memory pages, and presents a more robust strategy for address translation. This new method incorporates invalid pages and even the paging file to greatly increase the completeness of the analysis. By using every available page, every part of the buffalo as it were, the examiner can better recreate the state of the machine as it existed at the time of imaging.

New Curriculum Vitae [Nov. 18th, 2006|11:36 am]

Having just published one paper and (hopefully) submitting another one soon, I decided it was time to give my Curriculum Vitae a new look. The old version was HTML, and although functional, a real pain to update. This new version was made with LaTeX thanks to a template from Jason Blevins. What do you think?

Computer Forensics Podcast [Dec. 20th, 2005|07:11 am]

Hey computer forensic geeks! Check out the new CyberSpeak podcast all about, well, computer forensics! Recorded by two former federal agents, these shows appear both interesting and informative. The latest show even has an interview with our own nickharbour!
