Thursday, March 28, 2013

Crafting good keywords in EnCase and using conditions to refine results

I was contacted today by an examiner asking about some search hits that contained a very common keyword and how to quickly and easily refine them down to a relevant subset. This is pretty basic EnCase training stuff, but I figured I would put together a quick post that explains some things to think about when crafting keywords, and then the basic mechanics of creating a filter that can help you narrow down your hits. For this article, I am only going to discuss conducting a 'live' keyword search across an image, not using the index feature.

Filters & Conditions are probably one of the most underutilized features in EnCase and they can certainly help with issues like this when you have already run a search and then later realized the keyword was not the best choice. The best way to avoid this is to think ahead and use a good keyword that eliminates a lot of the noise for you and will hopefully leave you with just relevant hits.

Let's first talk about some ideas for crafting your keyword(s) that will save you time later. For the purposes of this article, let's assume I want to search for the ANSI keyword "soft". As you can imagine, this keyword is likely to return thousands and thousands of search hits that are not relevant and create a lot of noise that I would have to filter through to get to the good stuff.

Admittedly, most people could predict that a keyword of 'soft' would result in a lot of hits (software, Microsoft, etc). But sometimes it's not that apparent until after you start the search and later realize that your unique keyword is not so unique after all. This is where we will leverage a condition to help, but let's get back to choosing a good keyword.

In the past, I have seen a lot of examiners create a keyword such as " soft ". Yes, that's a space in front of and a space after the keyword itself. While this may work for a small percentage of the hits, it will most definitely miss things you would probably want to see. I do not recommend you ever use a keyword like this except in very rare, unique cases. While this keyword would certainly exclude hits such as "Microsoft" & "software", you are also going to miss "soft." or "soft" at the beginning of a line.

Using a test image, a search of " soft " (space before and after the keyword) results in 167 search hits being found.



Another approach is to use the "Whole Word" option in EnCase:


Using the same test image, a search using this option resulted in 6,036 search hits.



Using the same test image, a search of 'soft' (disregarding where it may appear in other words or sentences) resulted in 2,894,317 search hits:



Using a GREP statement of '[^a-z]soft[^a-z]' on the same test image resulted in 6,272 search hits:


So what does the GREP search include that the "Whole Word" option does not? The GREP expression catches every occurrence of 'soft' that is neither preceded nor followed by an alpha character (ASCII dec 65-90 or dec 97-122). All other characters are acceptable (ASCII dec 0-64, dec 91-96 & dec 123-255).

The "Whole Word" option does not catch all the non-alpha characters. It catches the common punctuation characters, but excludes high-ASCII characters. You can see an example here of what the GREP term found, but the "Whole word" option missed:


The other thing to notice is that with the "Whole Word" option, the preceding and following characters are not part of the hit, whereas with the GREP term they are. This changes your total hit size and beginning offset, which matters if you are running some type of script that requires knowing the starting offset of where the keyword hit begins.
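To make the difference between these keyword styles concrete, here is a rough sketch in Python. It uses Python's `re` module, which is not EnCase's GREP engine, and the sample buffer is made up. The lookaround pattern is only my stand-in for a match that leaves the neighbouring characters out of the hit; it is not how EnCase implements "Whole Word".

```python
import re

# Made-up sample buffer; 0xA0 stands in for a high-ASCII byte next to the keyword.
# Lower-casing the bytes keeps the offsets intact and makes the search case-insensitive.
data = b"Microsoft software. soft. (soft) soft\xa0pillow; a soft bed".lower()


def hits(pattern: bytes):
    """Return (offset, length) for each non-overlapping match in the buffer."""
    return [(m.start(), m.end() - m.start()) for m in re.finditer(pattern, data)]


plain = hits(rb"soft")                      # every occurrence, including inside words
spaced = hits(rb" soft ")                   # literal space before and after
grep = hits(rb"[^a-z]soft[^a-z]")           # neighbours are consumed as part of the hit
bounded = hits(rb"(?<![a-z])soft(?![a-z])") # neighbours are checked but not consumed

print("plain:", len(plain), "spaced:", len(spaced),
      "grep:", len(grep), "bounded:", len(bounded))

# Same hits, but the GREP-style hit starts one byte earlier and is two bytes longer.
for (g_off, g_len), (b_off, b_len) in zip(grep, bounded):
    print(f"grep hit at {g_off} (len {g_len})  vs  bounded hit at {b_off} (len {b_len})")
```

Any script that works from hit offsets would need to account for that one-byte shift when the keyword was a GREP term of this form.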

As a reminder, GREP searches generally require more time to run than non-GREP keywords. The exception is the "Whole Word" option, which is treated much like a GREP search: there are multiple valid characters that could precede or follow the keyword, and each has to be checked (thus requiring more time).

So what if you have run your keyword search, have lots of false hits, and want to start narrowing down the good hits? This is where a condition really comes into play. Remember that in EnCase v6, the filter and condition pane is exclusive to the display tab you are currently viewing (entries, search hits, keywords, etc). The first thing is to switch to the search hits tab. Then select the "conditions" tab in the lower-left pane, right-click and choose "new".


Name your condition, then right-click on "Main" and choose "new":


Once the "term" window appears, choose the "preview" property, select MATCHES from the operator options, and then enter in the value field all the common keywords that are appearing in your results and that you want to exclude. In my example, I want to exclude "microsoft", "software", etc. Click on Okay and you are back to the initial condition window that looks like this:


Right-click on the condition and choose "NOT":


The condition should now say "if NOT Preview matches.......":


Click on "Ok" to complete the condition and then double click on it to run it:

Understand that, depending on the number of hits you are trying to filter down, it could take a while to apply this filter.

The other downside to this approach is that a search hit may be valid and consist only of 'soft', yet the word 'microsoft' appears a few words before or after it. This condition would hide that search hit, since we are asking it to hide anything (if NOT preview matches "microsoft" or "software") where the words 'microsoft' or 'software' appear anywhere in the preview field (128 bytes before the hit and 128 bytes after the hit, including the hit). The only way to remove hits that fall inside common words of no interest, without risking the loss of a valid hit because those words appear elsewhere in the preview field, is to use an actual filter (EnScript code) that checks each hit and excludes only those where the text of the hit itself is part of a word on a list of excluded words.
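The difference between the two approaches can be sketched in a few lines of Python. This is a model of the logic only, not EnScript and not EnCase's API: the function names, the `EXCLUDED` list and the sample text are all made up for illustration, and the 128-character window simply mirrors the preview size described above.

```python
EXCLUDED = {"microsoft", "software"}  # hypothetical list of noise words


def containing_word(text: str, offset: int, length: int) -> str:
    """Expand a hit to the full run of letters around it and return it lower-cased."""
    start, end = offset, offset + length
    while start > 0 and text[start - 1].isalpha():
        start -= 1
    while end < len(text) and text[end].isalpha():
        end += 1
    return text[start:end].lower()


def keep_by_preview(text: str, offset: int, length: int, window: int = 128) -> bool:
    """Condition-style check: drop the hit if a noise word is anywhere in the preview."""
    preview = text[max(0, offset - window): offset + length + window].lower()
    return not any(word in preview for word in EXCLUDED)


def keep_by_hit_word(text: str, offset: int, length: int) -> bool:
    """Filter-style check: drop the hit only if the hit itself sits inside a noise word."""
    return containing_word(text, offset, length) not in EXCLUDED


text = "Microsoft shipped it. The soft copy is on the desk."
first = text.lower().find("soft")             # the 'soft' inside 'Microsoft'
second = text.lower().find("soft", first + 1)  # the stand-alone 'soft'

for off in (first, second):
    print(off, containing_word(text, off, 4),
          "preview keeps:", keep_by_preview(text, off, 4),
          "hit-word keeps:", keep_by_hit_word(text, off, 4))
```

The preview-based check throws away both hits because 'Microsoft' is nearby, while the hit-word check keeps the stand-alone 'soft' and drops only the one inside 'Microsoft'.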

Here is a link to some brief documentation that I wrote regarding writing your own filters.

The underlying message to this article is that picking your keyword(s) from the beginning may be more important than you think and can save you a lot of time later, so try and think them through from the beginning ;)

Wednesday, March 20, 2013

File Entropy explained

I posted a quick EnScript yesterday that calculates the entropy of any selected object in EnCase. One of the comments I received asked for more information about what entropy is and what the values represent. This post provides a little more detail about what entropy is and how it can be useful.

Entropy is technically defined here and is central to the second law of thermodynamics. The technical explanation of entropy in computing terms is described here. Simply put, entropy as it relates to digital information is the measurement of randomness in a given set of values (data).

The equation used by Shannon, when applied to byte values, has a resulting value somewhere between zero (0) and eight (8). The closer the number is to zero, the more orderly or non-random the data is. The closer the number is to eight, the more random the data is. The formula used by Shannon to represent binary data looks like this:
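For reference, the standard textbook form of Shannon entropy over byte values is written out below. The symbols are my own labels: p_i is assumed to be the fraction of bytes in the data that have the value i, and any term with p_i = 0 is treated as contributing zero.

```latex
H = -\sum_{i=0}^{255} p_i \log_2 p_i , \qquad 0 \le H \le \log_2 256 = 8
```

With 256 possible byte values, the maximum of eight is reached only when every value is equally likely.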

Here is a simple example. Imagine you have a file that is 100 bytes in length and that is filled with the value of zero (0).


Using the above formula, you get a result of zero, meaning the probability of any value other than zero appearing is zero. Now consider the same 100 byte file filled with half zeros and half ones:

Using the same formula as above, the result would be one. 
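These two results are easy to reproduce. The following is a minimal Python sketch of the standard byte-level Shannon entropy calculation; it is not the EnScript mentioned in this post, and the function name is my own.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of a byte string, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)  # how many times each byte value occurs
    # Sum of p * log2(1/p), which is the same as -sum(p * log2(p)).
    return sum((n / total) * math.log2(total / n) for n in counts.values())


all_zeros = bytes(100)                     # 100 bytes of 0x00
half_and_half = b"\x00" * 50 + b"\x01" * 50
every_value_once = bytes(range(256))       # each byte value exactly once

print(shannon_entropy(all_zeros))          # 0.0
print(shannon_entropy(half_and_half))      # 1.0
print(shannon_entropy(every_value_once))   # 8.0
```

Note that the calculation only looks at how often each value occurs, not at the order of the values, so a perfectly predictable sequence such as 0, 1, 2, ... 255 still scores the maximum of eight.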

Take this same file and compress it using Rar:


Entropy = 5.0592468625650353

Take the same message (half zeros and half ones) and encrypt it with PGP and you get this:

Entropy = 7.8347915272089166

Encrypt a blank volume in TrueCrypt and you get this:


Entropy = 7.9999997904861599

The closer you get towards truly random data, the closer the entropy value will be to the maximum value of eight, meaning there is no pattern or probability to guess what the next value might be. 

How is this useful:

Entropy can be used in many different ways, but quite commonly it is used to detect encryption and compression, since truly random data is not common in typical user data. This is especially true with executables that have purposely been encrypted with a real-time decryption routine. This prevents an AV engine from seeing "inside" the executable as it sits on the disk in order to detect strings or patterns. It is also very helpful in identifying files that have a high amount of randomness (as illustrated above), which could indicate an encrypted container/volume that may otherwise go unnoticed.

In the original post, there was some discussion on a forensic message board about using entropy to detect the use of a wiping tool against unallocated space. In that example, you would be looking for repeating patterns that occur over a large area of unallocated space. Again, the higher the entropy value, the more random the data, vs. the smaller value, which indicates more uniformity of the data.
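One way to look for that kind of pattern is to compute entropy block by block rather than over the whole object. The sketch below is a Python illustration only, not the EnScript from this blog. The 4096-byte block size and the threshold of 1.0 are arbitrary values I picked for the example, and a wiping tool that writes random data would not be caught by a low-entropy check like this one.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of a byte string, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    return sum((n / total) * math.log2(total / n) for n in Counter(data).values())


def low_entropy_blocks(data: bytes, block_size: int = 4096, threshold: float = 1.0):
    """Return (offset, entropy) for each block whose entropy is below the threshold."""
    flagged = []
    for offset in range(0, len(data), block_size):
        h = shannon_entropy(data[offset:offset + block_size])
        if h < threshold:
            flagged.append((offset, h))
    return flagged


# Made-up buffer: a zero-filled block, a block using every byte value evenly,
# and a block filled with a single repeating byte (0xAA).
sample = bytes(4096) + bytes(range(256)) * 16 + b"\xAA" * 4096

for offset, h in low_entropy_blocks(sample):
    print(f"offset {offset}: entropy {h:.4f}")
```

Long runs of adjacent flagged blocks are what you would expect from a single-pattern wipe; isolated low-entropy blocks are common in ordinary data and mean much less on their own.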

Reference: "File Entropy", McCreight, Shawn

Tuesday, March 19, 2013

EnCase EnScript to calculate entropy of selected file(s)

I saw a recent post on one of the forensic mailing lists about calculating the entropy of unallocated space to determine the "randomness" of the data to help determine if a wiping utility had been used.

I had written this quick little EnScript a while back to calculate the entropy of any selected file, but since EnCase treats Unallocated Clusters the same as a file (an object within EnCase), it works just fine for calculating the entropy of unallocated space.

Simply select (blue check) the files you want to calculate and run the EnScript. Each file and its entropy value will be written to the console tab.



Download here (EnCase v6)

Tuesday, March 5, 2013

EnScript to parse setupapi.dev.log

This EnCase EnScript was written to parse the Vista/7 'setupapi.dev.log' for USB events. This log contains a lot of information about hardware events, including when USB devices are attached.
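For readers who want to see the general idea outside of EnCase, here is a rough Python sketch. It is not the EnScript offered here. It assumes the commonly seen layout in which each section opens with a `>>>  [Device Install ...]` header line followed by a `>>>  Section start` line carrying a timestamp; the sample lines, device strings and function name are made up for illustration, and real logs should be checked against that assumption.

```python
import re
from datetime import datetime

# Assumed line shapes; adjust if the log being examined differs.
HEADER = re.compile(r"^>>>\s+\[(?P<title>.+)\]\s*$")
START = re.compile(
    r"^>>>\s+Section start (?P<ts>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}\.\d{3})"
)


def usb_sections(lines):
    """Return (timestamp, section title) for sections whose title mentions USB."""
    events = []
    pending = None  # title of the most recent USB-related header, if any
    for line in lines:
        line = line.rstrip("\r\n")
        m = HEADER.match(line)
        if m:
            title = m.group("title")
            pending = title if "USB" in title.upper() else None
            continue
        m = START.match(line)
        if m and pending is not None:
            ts = datetime.strptime(m.group("ts"), "%Y/%m/%d %H:%M:%S.%f")
            events.append((ts, pending))
            pending = None
    return events


# Made-up sample in the assumed layout.
SAMPLE = r"""
>>>  [Device Install (Hardware initiated) - USBSTOR\Disk&Ven_Example&Prod_Drive&Rev_1.00\0123456789&0]
>>>  Section start 2013/03/05 10:15:30.123
<<<  Section end 2013/03/05 10:15:31.456
>>>  [Device Install (Hardware initiated) - PCI\VEN_0000&DEV_0000]
>>>  Section start 2013/03/05 10:20:00.000
"""

for ts, title in usb_sections(SAMPLE.splitlines()):
    print(ts.isoformat(), title)
```

The sketch keeps only the section title and start time; anything else recorded inside a section is ignored.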

 
This EnScript extracts and displays relevant USB events:
 
 
 
Download here (EnCase v6)

Computer Forensics, Malware Analysis & Digital Investigations
