Comments on Computer Forensics, Malware Analysis & Digital Investigations: Filter to remove duplicates for export (Lance Mueller)

Anonymous, 2009-09-23:

Many thanks, this just cut the number of review files in half.

John

Brian Larsen, 2009-05-01:

Lance,
Here is a similar filter, with better performance, using a suggestion by Shawn Mcreight on the GSI forum. It simply uses a hash array rather than a name class array, since hash comparisons are orders of magnitude faster than string comparisons. It is also easily modified to dedupe the entire case.

    typedef HashClass[] HashArray;

    class MainClass {
      HashArray HashList;
      SearchClass Search;

      MainClass() :
        HashList(),
        Search()
      {
      }

      bool Main(EntryClass entry) {
        if (entry.IsSelected()) {
          HashClass hash = Search.ComputeHash(entry);
          if (HashList.Find(hash) == -1) {
            HashList.Add(hash);
            return true;
          }
          else
            Console.WriteLine("Duplicate File:\t" + entry.FullPath() + "\t" + hash);
        }
        return false;
      }
    }

Thanks for the great resources available on your blog.

Brian Larsen
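For readers without EnCase, the same keep-first, hash-set dedupe idea can be sketched in Python. This is a minimal illustration, not the filter itself: the `dedupe_by_hash` function and the `(path, bytes)` pairs standing in for case entries are assumptions for the example, and MD5 is used only because it is the hash EnCase traditionally computes.

```python
import hashlib

def dedupe_by_hash(files):
    """Keep the first file seen for each content hash; report later duplicates.

    `files` is a hypothetical list of (path, bytes) pairs standing in for
    EnCase entries; the set lookup plays the role of the HashArray above.
    """
    seen = set()            # content hashes already kept for export
    kept, dupes = [], []
    for path, data in files:
        digest = hashlib.md5(data).hexdigest()  # MD5, as EnCase computes
        if digest not in seen:
            seen.add(digest)
            kept.append(path)   # first occurrence: keep it
        else:
            dupes.append(path)  # same content already kept: flag as duplicate
    return kept, dupes
```

A set lookup here is O(1) on average, which mirrors why comparing fixed-size hashes beats comparing full file names or contents entry by entry.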