Wednesday, May 26, 2010

EnScript to parse TIFF Metadata

An investigator contacted me this week about an investigation involving several hundred TIFF files that had been generated from a fax machine. The investigator needed to quickly extract all the metadata from the TIFF files. Several external programs can do this, for example, ExifTool by Phil Harvey.

My goal was to create a quick EnScript to parse the TIFFs and provide the data without having to export the files out of EnCase. This caused me to take a closer look at the TIFF format and the metadata that is stored inside. TIFF files are pretty common, especially in environments where scanned documents are found. This would include the processing stage of some e-Discovery jobs.

The TIFF file format is well documented. It can be found here. Several fields inside a TIFF file may be valuable, particularly in a TIFF that was generated as part of a fax transmission process. A TIFF file that is generated from a fax is commonly referred to as a TIFF-F or a bilevel TIFF. There is an excellent discussion about the TIFF-F format in RFC 2306.

There are several tags (fields) that are commonly associated with a TIFF fax file that may be useful. The ones I have identified and included so far are:

Image File Width
Image File Length
Compression (identifies it clearly as a fax)

Image Description
Page Numbers
Software
Date/Time
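
To make the tag list above a little more concrete, here is a rough Python sketch (not the EnScript itself) of how these fields can be pulled out of the first image file directory (IFD) of a TIFF. The tag numbers and field types come from the TIFF 6.0 specification; the function name read_first_ifd and the lookup tables are just names I made up for this illustration, and the sketch only handles the field types these particular tags use.

```python
import struct

# Tag numbers from the TIFF 6.0 specification.
TAGS = {
    256: "ImageWidth",
    257: "ImageLength",
    259: "Compression",
    270: "ImageDescription",
    297: "PageNumber",
    305: "Software",
    306: "DateTime",
}

# Field type -> (struct code, size in bytes): BYTE, ASCII, SHORT, LONG.
TYPES = {1: ("B", 1), 2: ("s", 1), 3: ("H", 2), 4: ("I", 4)}

# Compression values that point to a fax encoding (TIFF 6.0).
FAX_COMPRESSION = {
    2: "CCITT modified Huffman RLE",
    3: "CCITT Group 3 (T.4)",
    4: "CCITT Group 4 (T.6)",
}


def read_first_ifd(data):
    """Return {tag name: value} for the tags of interest in the first IFD."""
    # Bytes 0-1 give the byte order: "II" little-endian, "MM" big-endian.
    order = {b"II": "<", b"MM": ">"}.get(data[:2])
    if order is None:
        raise ValueError("not a TIFF file")
    magic, ifd_offset = struct.unpack(order + "HI", data[2:8])
    if magic != 42:
        raise ValueError("not a TIFF file")

    (count,) = struct.unpack_from(order + "H", data, ifd_offset)
    result = {}
    for i in range(count):
        entry = ifd_offset + 2 + 12 * i  # each IFD entry is 12 bytes
        tag, ftype, n = struct.unpack_from(order + "HHI", data, entry)
        if tag not in TAGS or ftype not in TYPES:
            continue
        code, size = TYPES[ftype]
        if n * size <= 4:
            pos = entry + 8  # small values are stored inline in the entry
        else:
            (pos,) = struct.unpack_from(order + "I", data, entry + 8)
        if ftype == 2:  # ASCII, NUL-terminated
            value = data[pos:pos + n].rstrip(b"\x00").decode("ascii", "replace")
        else:
            values = struct.unpack_from(order + str(n) + code, data, pos)
            value = values[0] if n == 1 else values
        result[TAGS[tag]] = value
    return result
```

Calling read_first_ifd(open(path, "rb").read()) returns a dictionary of whichever of these tags are present; a Compression value found in FAX_COMPRESSION is what marks the file as a fax. A multi-page fax has one IFD per page, chained through the 4-byte offset that follows the last entry, so a fuller version would need to walk that chain.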

There are additional standard TIFF tags, such as data offsets, resolution, resolution unit, etc., but they don't have much value from an investigative standpoint. In addition to the standard TIFF tags, there can also be non-standard custom tags that are added by additional software, such as the Microsoft Document Imager (MDI) that is commonly used when a Windows OS computer is used as a fax.

When the MDI tags are present, there can be additional information that is useful to the investigator. For example:

Title
Author (Windows user account)
Last Saved by (Windows user account)
Last Edit Timestamp
Last Print Time Timestamp
Create Date Timestamp
Last Saved Timestamp
Page count
Word count
Char count
....and several others...

These tags are basically the same ones that you typically find in a Microsoft Office OLE file (doc, xls, ppt).
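
One quick way to see whether a TIFF carries any non-standard tags at all is to list the private tag numbers it contains. The TIFF 6.0 specification reserves tag numbers 32768 and above for private use, so the Python sketch below (again, not the EnScript) walks every IFD and collects those. The function name private_tags is made up for this example. The MDI_TAGS table is an assumption on my part: those numbers are how ExifTool's tag tables label the Microsoft document imaging tags, and I have not verified them against enough sample files.

```python
import struct

PRIVATE_TAG_START = 32768  # TIFF 6.0: 32768 and above are private tags

# Assumption: tag numbers as listed in ExifTool's tag tables.
MDI_TAGS = {
    37679: "MSDocumentText",
    37680: "MSPropertySetStorage",
    37681: "MSDocumentTextPosition",
}


def private_tags(data):
    """Return the sorted private tag numbers found in all IFDs of a TIFF."""
    order = {b"II": "<", b"MM": ">"}.get(data[:2])
    if order is None or struct.unpack(order + "H", data[2:4])[0] != 42:
        raise ValueError("not a TIFF file")
    (offset,) = struct.unpack_from(order + "I", data, 4)

    found, seen = set(), set()
    # Follow the IFD chain; 'seen' guards against a looping chain.
    while offset and offset not in seen:
        seen.add(offset)
        (count,) = struct.unpack_from(order + "H", data, offset)
        for i in range(count):
            (tag,) = struct.unpack_from(order + "H", data, offset + 2 + 12 * i)
            if tag >= PRIVATE_TAG_START:
                found.add(tag)
        # The offset of the next IFD follows the last 12-byte entry.
        (offset,) = struct.unpack_from(order + "I", data, offset + 2 + 12 * count)
    return sorted(found)
```

If any(tag in MDI_TAGS for tag in private_tags(data)) is true, the file is a good candidate for containing the OLE-style property information described above.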

The EnScript below will parse out all the standard TIF tags mentioned above. In addition, if there is OLE information, it will currently parse out the document name, author & last saved by name. I am still working on parsing some of the other MDI tags, but I don't have many sample TIF files that have this MDI information. If you have access to any TIF files that contain MDI information and are willing to share them, please contact me at lance(at)forensickb.com.

Meanwhile, you can run the EnScript below against any selected TIF files in EnCase and it will bookmark the tag fields mentioned above, as well as print out the metadata information to the console tab. 


If MDI information is present, the fields that are currently being parsed will appear in the bookmarks, and the message "There is MDI information present" will appear in the console.



Please contact me if you have any TIF files that contain MDI information so I can continue to develop the EnScript to parse the additional pertinent fields.

