EnCase filter that uses MSSQL for faster filtering of files by hash values
Oliver Höpli from Switzerland recently emailed me a filter he wrote that may be very useful to some EnCase users. With his permission, I am posting the filter along with his description.
"The script is similar to the "Unique Files by Hash" filter provided by Guidance.
Because the script uses an MSSQL server to store the hashes instead of a NameListClass, it is much faster: in tests it filtered about 220,000 entries in 3 minutes. The filter-apply time that EnCase displays is also very close to the total time the filter actually runs.
To use this script, you must have a running instance of MSSQL Server, either locally or on your network.
Please use credentials with sufficient permissions to create and modify databases and tables.
The filter creates one table per dongle ID, so you can use it simultaneously on different EnCase installations in your lab. However, do not run the filter simultaneously in two or more EnCase instances on the same examiner machine.
The Express edition of MSSQL Server 2008 R2 (freely available) can be downloaded from:
http://www.microsoft.com/germany/express/products/database.aspx"
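To illustrate the idea behind the filter, here is a minimal stand-in sketch in Python. It uses SQLite instead of MSSQL (so it runs without a server), and the table and column names are illustrative, not taken from Oliver's script; the real filter creates one table per dongle ID on the MSSQL server and lets the database reject duplicate hashes.

```python
import sqlite3

# Stand-in for the filter's dedup logic: SQLite instead of MSSQL.
# Table name mimics the "one table per dongle ID" scheme; the actual
# names used by the EnScript filter are not shown in the post.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE hashes_dongle_1234 (hash TEXT PRIMARY KEY)")

def is_unique(hash_value):
    """Return True the first time a hash is seen, False for repeats."""
    try:
        conn.execute("INSERT INTO hashes_dongle_1234 (hash) VALUES (?)",
                     (hash_value,))
        return True
    except sqlite3.IntegrityError:
        # PRIMARY KEY violation -> the hash is already stored, so the
        # file is a duplicate and gets filtered out.
        return False

results = [is_unique(h) for h in ["aa", "bb", "aa", "cc", "bb"]]
print(results)  # [True, True, False, True, False]
```

Letting the database enforce uniqueness via a primary key keeps the filter itself simple: every entry is just one INSERT attempt.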
Download here
4 comments:
I would be curious to compare the performance against Guidance's (John Stewart's?) binary-tree algorithm. I made the modifications below to the filter from memory, so it may not compile. In the past I have used a tested version, and it is much faster. I'm not sure why Guidance doesn't use this data structure.
include "GSI_Basic";
class MainClass {
  NameListClass HashList;
  BinaryTreeClass Tree;
  MainClass() :
    HashList(),
    Tree(HashList, new NodeStringCompareClass())
  {
    if (SystemClass::CANCEL == SystemClass::Message(SystemClass::ICONINFORMATION | SystemClass::MBOKCANCEL, "Unique Files By Hash",
        "Note:\nFiles must be hashed prior to running this filter."))
      SystemClass::Exit();
  }
  bool Main(EntryClass entry) {
    HashClass hash = entry.HashValue();
    if (!Tree.FastFind(hash))
      Tree.FastInsert(new NameListClass(null, hash), hash);
    else
      return false;
    return true;
  }
}
Jon is brilliant at this type of stuff, and because of him I have used similar binary searches for hash values and other types of keys that I can sort.
A binary search can find a record among a million in about 20 steps (since 2^20 > 1,000,000), whereas a standard linear search may take up to one million iterations.
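The step count above can be checked with a short sketch. This is generic Python, not the EnScript BinaryTreeClass; it just demonstrates that binary search over a million sorted records never needs more than about 20 comparisons.

```python
import math

# Binary search that counts how many times it halves the range.
def binary_search_steps(sorted_items, target):
    lo, hi, steps = 0, len(sorted_items) - 1, 0
    while lo <= hi:
        steps += 1
        mid = (lo + hi) // 2
        if sorted_items[mid] == target:
            return steps
        if sorted_items[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return steps  # target not present; range exhausted

records = list(range(1_000_000))
# Probe both ends and a middle value; the worst case stays around
# ceil(log2(1,000,000)) = 20 steps.
worst = max(binary_search_steps(records, t) for t in (0, 999_999, 123_456))
print(worst, math.ceil(math.log2(1_000_000)))
```

Each step discards half of the remaining candidates, which is why the worst case grows only logarithmically with the number of records.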
Thanks for the input, jtorrap@gmail.com. I made some modifications to your code (mostly syntax fixes) and tested it against the same case data I used for the MSSQL Server test. Your filter is really fast: it took about 20 seconds to filter the content, while my filter takes about 3 minutes for the same data.
This is the filter I used:
include "GSI_Basic";
class MainClass {
  NameListClass HashList;
  BinaryTreeClass Tree;
  MainClass() :
    HashList(),
    Tree(HashList, new NodeStringCompareClass())
  {
    if (SystemClass::CANCEL == SystemClass::Message(SystemClass::ICONINFORMATION | SystemClass::MBOKCANCEL, "Unique Files By Hash",
        "Note:\nFiles must be hashed prior to running this filter."))
      SystemClass::Exit();
  }
  bool Main(EntryClass entry) {
    HashClass hash = entry.HashValue();
    if (entry.HashValue()) {
      if (!Tree.FastFind(hash))
        Tree.FastInsert(new NameListClass(null, hash), hash);
      else
        return false;
      return true;
    }
    else {
      return true;
    }
  }
}
Lance, you do me wrong! I'm never going to live up to people's expectations this way.
There are some things here that I'll try to blog about in more depth.