This chapter explains various concepts involved in Microsoft Windows forensics and the important artifacts that an investigator can obtain during the investigation process.
Artifacts are the objects or areas within a computer system that hold important information related to the activities performed by the computer user. The type and location of this information depend upon the operating system. During forensic analysis, these artifacts play a very important role in confirming or refuting the investigator’s observations.
Windows artifacts are significant for the following reasons −

Around 90% of the world’s computer traffic comes from computers running Windows as their operating system, which makes Windows artifacts essential for digital forensics examiners.

The Windows operating system stores many types of evidence related to user activity on the computer system. This is another reason why Windows artifacts matter for digital forensics.

Many times, investigators focus on old and traditional areas such as user-created data. Windows artifacts can lead the investigation towards non-traditional areas such as system-created data and artifacts.

Windows provides a great abundance of artifacts, which are helpful to investigators as well as to companies and individuals performing informal investigations.

The increase in cyber-crime in recent years is another reason why Windows artifacts are important.
In this section, we are going to discuss some Windows artifacts and the Python scripts used to fetch information from them.
The Recycle Bin is one of the important Windows artifacts for forensic investigation. It contains files that have been deleted by the user but not yet physically removed by the system. Even if the user completely removes a file from the system, it serves as an important source of evidence, because the examiner can extract valuable information from the deleted files, such as the original file path and the time the file was sent to the Recycle Bin.
Note that the storage of Recycle Bin evidence depends upon the version of Windows. The following Python script deals with Windows 7, where the Recycle Bin creates two files for each deletion: the $R file, which contains the actual content of the recycled file, and the $I file, which contains the original file name, path and file size recorded when the file was deleted.
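That $I layout can be parsed with nothing but the standard library. Below is a minimal, self-contained sketch of the Windows Vista/7 record structure − an 8-byte version header (value 1), a little-endian 64-bit file size, a 64-bit FILETIME deletion timestamp and a 520-byte UTF-16 path. The record and the path C:\secret.txt below are synthetic, built purely for illustration; note that Windows 10 uses a different, version-2 layout.

```python
import struct
import datetime

I_HEADER = struct.pack('<q', 1)  # version-1 header used by Vista/7


def parse_dollar_i(data):
    """Parse a Windows 7 $I record: bytes 0-7 header,
    8-15 file size, 16-23 FILETIME, 24+ UTF-16 path."""
    if data[:8] != I_HEADER:
        return None
    size, filetime = struct.unpack('<qq', data[8:24])
    deleted = (datetime.datetime(1601, 1, 1) +
               datetime.timedelta(microseconds=filetime / 10.0))
    path = data[24:24 + 520].decode('utf-16-le').split('\x00')[0]
    return {'file_size': size, 'deleted_time': deleted, 'file_path': path}


# Build a synthetic record to demonstrate (the path is hypothetical)
record = (I_HEADER + struct.pack('<q', 4096) +
          struct.pack('<q', 130266420000000000) +
          'C:\\secret.txt'.encode('utf-16-le').ljust(520, b'\x00'))
parsed = parse_dollar_i(record)
print(parsed['file_path'], parsed['file_size'])
```

These are the same offsets that the full script below reads with read_random() inside read_dollar_i().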
For this Python script, we need to install the third-party modules pytsk3, pyewf and unicodecsv. We can use pip to install them. We can follow the steps below to extract information from the Recycle Bin −
First, we need to use a recursive method to scan through the $Recycle.bin folder and select all the files starting with $I.

Next, we will read the contents of those files and parse the available metadata structures.

Now, we will search for the associated $R file.

At last, we will write the results into a CSV file for review.
Let us see how to use Python code for this purpose −
First, we need to import the following Python libraries −
```python
from __future__ import print_function
from argparse import ArgumentParser
import datetime
import os
import struct

from utility.pytskutil import TSKUtil
import unicodecsv as csv
```
Next, we need to provide an argument for the command-line handler. Note that here it will accept three arguments − first is the path to the evidence file, second is the type of evidence file and third is the desired output path for the CSV report, as shown below −
```python
if __name__ == '__main__':
    parser = ArgumentParser('Recycle Bin evidences')
    parser.add_argument('EVIDENCE_FILE', help="Path to evidence file")
    parser.add_argument('IMAGE_TYPE', help="Evidence file format",
                        choices=('ewf', 'raw'))
    parser.add_argument('CSV_REPORT', help="Path to CSV report")
    args = parser.parse_args()
    main(args.EVIDENCE_FILE, args.IMAGE_TYPE, args.CSV_REPORT)
```
Now, define the main() function that will handle all the processing. It will search for $I file as follows −
```python
def main(evidence, image_type, report_file):
    tsk_util = TSKUtil(evidence, image_type)
    dollar_i_files = tsk_util.recurse_files("$I", path='/$Recycle.bin',
                                            logic="startswith")
    if dollar_i_files is not None:
        processed_files = process_dollar_i(tsk_util, dollar_i_files)
        write_csv(report_file,
                  ['file_path', 'file_size', 'deleted_time',
                   'dollar_i_file', 'dollar_r_file', 'is_directory'],
                  processed_files)
    else:
        print("No $I files found")
```
Now, if we found $I files, they must be sent to the process_dollar_i() function, which will accept the tsk_util object as well as the list of $I files, as shown below −
```python
def process_dollar_i(tsk_util, dollar_i_files):
    processed_files = []
    for dollar_i in dollar_i_files:
        file_attribs = read_dollar_i(dollar_i[2])
        if file_attribs is None:
            continue
        file_attribs['dollar_i_file'] = os.path.join(
            '/$Recycle.bin', dollar_i[1][1:])
```
Now, search for $R files as follows −
```python
        recycle_file_path = os.path.join(
            '/$Recycle.bin', dollar_i[1].rsplit("/", 1)[0][1:])
        dollar_r_files = tsk_util.recurse_files(
            "$R" + dollar_i[0][2:], path=recycle_file_path,
            logic="startswith")

        if dollar_r_files is None:
            dollar_r_dir = os.path.join(recycle_file_path,
                                        "$R" + dollar_i[0][2:])
            dollar_r_dirs = tsk_util.query_directory(dollar_r_dir)
            if dollar_r_dirs is None:
                file_attribs['dollar_r_file'] = "Not Found"
                file_attribs['is_directory'] = 'Unknown'
            else:
                file_attribs['dollar_r_file'] = dollar_r_dir
                file_attribs['is_directory'] = True
        else:
            dollar_r = [os.path.join(recycle_file_path, r[1][1:])
                        for r in dollar_r_files]
            file_attribs['dollar_r_file'] = ";".join(dollar_r)
            file_attribs['is_directory'] = False
        processed_files.append(file_attribs)
    return processed_files
```
Now, define the read_dollar_i() method to read the $I files; in other words, to parse the metadata. We will use the read_random() method to read the first eight bytes of the signature. This will return None if the signature does not match. After that, we will read and unpack the values from the $I file if it is a valid file −
```python
def read_dollar_i(file_obj):
    if file_obj.read_random(0, 8) != '\x01\x00\x00\x00\x00\x00\x00\x00':
        return None
    raw_file_size = struct.unpack('<q', file_obj.read_random(8, 8))
    raw_deleted_time = struct.unpack('<q', file_obj.read_random(16, 8))
    raw_file_path = file_obj.read_random(24, 520)
```
Now, after extracting these values, we need to interpret the integers into human-readable values using the sizeof_fmt() function, as shown below −
```python
    file_size = sizeof_fmt(raw_file_size[0])
    deleted_time = parse_windows_filetime(raw_deleted_time[0])
    file_path = raw_file_path.decode("utf16").strip("\x00")
    return {'file_size': file_size, 'file_path': file_path,
            'deleted_time': deleted_time}
```
Now, we need to define sizeof_fmt() function as follows −
```python
def sizeof_fmt(num, suffix='B'):
    for unit in ['', 'Ki', 'Mi', 'Gi', 'Ti', 'Pi', 'Ei', 'Zi']:
        if abs(num) < 1024.0:
            return "%3.1f%s%s" % (num, unit, suffix)
        num /= 1024.0
    return "%.1f%s%s" % (num, 'Yi', suffix)
```
Now, define a function to interpret the integers into a formatted date and time as follows −
```python
def parse_windows_filetime(date_value):
    microseconds = float(date_value) / 10
    ts = datetime.datetime(1601, 1, 1) + datetime.timedelta(
        microseconds=microseconds)
    return ts.strftime('%Y-%m-%d %H:%M:%S.%f')
```
Now, we will define write_csv() method to write the processed results into a CSV file as follows −
```python
def write_csv(outfile, fieldnames, data):
    with open(outfile, 'wb') as open_outfile:
        csvfile = csv.DictWriter(open_outfile, fieldnames)
        csvfile.writeheader()
        csvfile.writerows(data)
```
When you run the above script, you will get the data from the $I and $R files.
Windows Sticky Notes replace the real-world habit of writing with pen and paper. These notes float on the desktop with different options for colors, fonts, etc. In Windows 7, the Sticky Notes file is stored as an OLE file; hence, in the following Python script we will investigate this OLE file to extract metadata from Sticky Notes.
For this Python script, we need to install the third-party modules olefile, pytsk3, pyewf and unicodecsv. We can use pip to install them.
We can follow the steps discussed below to extract the information from the Sticky Notes file, namely StickyNotes.snt −

First, open the evidence file and find all the StickyNotes.snt files.

Then, parse the metadata and content from the OLE stream and write the RTF content to files.

Lastly, create a CSV report of this metadata.
Let us see how to use Python code for this purpose −
First, import the following Python libraries −
```python
from __future__ import print_function
from argparse import ArgumentParser
import unicodecsv as csv
import os
import StringIO

from utility.pytskutil import TSKUtil
import olefile
```
Next, define a global variable which will be used across this script −
```python
REPORT_COLS = ['note_id', 'created', 'modified', 'note_text', 'note_file']
```
Next, we need to provide an argument for the command-line handler. Note that here it will accept three arguments − first is the path to the evidence file, second is the type of evidence file and third is the desired output path, as follows −
```python
if __name__ == '__main__':
    parser = ArgumentParser('Evidence from Sticky Notes')
    parser.add_argument('EVIDENCE_FILE', help="Path to evidence file")
    parser.add_argument('IMAGE_TYPE', help="Evidence file format",
                        choices=('ewf', 'raw'))
    parser.add_argument('REPORT_FOLDER', help="Path to report folder")
    args = parser.parse_args()
    main(args.EVIDENCE_FILE, args.IMAGE_TYPE, args.REPORT_FOLDER)
```
Now, we will define main() function which will be similar to the previous script as shown below −
```python
def main(evidence, image_type, report_folder):
    tsk_util = TSKUtil(evidence, image_type)
    note_files = tsk_util.recurse_files('StickyNotes.snt', '/Users',
                                        'equals')
```
Now, let us iterate through the resulting files. We will call the parse_snt_file() function to process each file, and then write the RTF file with the write_note_rtf() method, as follows −
```python
    report_details = []
    for note_file in note_files:
        user_dir = note_file[1].split("/")[1]
        file_like_obj = create_file_like_obj(note_file[2])
        note_data = parse_snt_file(file_like_obj)
        if note_data is None:
            continue
        write_note_rtf(note_data, os.path.join(report_folder, user_dir))
        report_details += prep_note_report(note_data, REPORT_COLS,
                                           "/Users" + note_file[1])
    write_csv(os.path.join(report_folder, 'sticky_notes.csv'),
              REPORT_COLS, report_details)
```
Next, we need to define various functions used in this script.
First of all, we will define the create_file_like_obj() function, which reads the size of the file from the pytsk file object. Then we will define the parse_snt_file() function, which accepts the file-like object as its input and is used to read and interpret the sticky note file.
```python
def parse_snt_file(snt_file):
    if not olefile.isOleFile(snt_file):
        print("This is not an OLE file")
        return None
    ole = olefile.OleFileIO(snt_file)
    note = {}

    for stream in ole.listdir():
        if stream[0].count("-") == 3:
            if stream[0] not in note:
                note[stream[0]] = {
                    "created": ole.getctime(stream[0]),
                    "modified": ole.getmtime(stream[0])
                }
            content = None
            if stream[1] == '0':
                content = ole.openstream(stream).read()
            elif stream[1] == '3':
                content = ole.openstream(stream).read().decode("utf-16")
            if content:
                note[stream[0]][stream[1]] = content
    return note
```
Now, create an RTF file by defining the write_note_rtf() function as follows −
```python
def write_note_rtf(note_data, report_folder):
    if not os.path.exists(report_folder):
        os.makedirs(report_folder)
    for note_id, stream_data in note_data.items():
        fname = os.path.join(report_folder, note_id + ".rtf")
        with open(fname, 'w') as open_file:
            open_file.write(stream_data['0'])
```
Now, we will translate the nested dictionary into a flat list of dictionaries that is more appropriate for a CSV report. This is done by defining the prep_note_report() function. Lastly, we will define the write_csv() function −
```python
def prep_note_report(note_data, report_cols, note_file):
    report_details = []
    for note_id, stream_data in note_data.items():
        report_details.append({
            "note_id": note_id,
            "created": stream_data['created'],
            "modified": stream_data['modified'],
            "note_text": stream_data['3'].strip("\x00"),
            "note_file": note_file
        })
    return report_details


def write_csv(outfile, fieldnames, data):
    with open(outfile, 'wb') as open_outfile:
        csvfile = csv.DictWriter(open_outfile, fieldnames)
        csvfile.writeheader()
        csvfile.writerows(data)
```
After running the above script, we will get the metadata from Sticky Notes file.
Windows registry files contain many important details, making them a treasure trove of information for a forensic analyst. The registry is a hierarchical database that holds details related to operating system configuration, user activity, software installation, etc. In the following Python script, we are going to access common baseline information from the SYSTEM and SOFTWARE hives.
For this Python script, we need to install the third-party modules pytsk3, pyewf and python-registry (imported as Registry). We can use pip to install them.
We can follow the steps given below for extracting the information from Windows registry −
First, find the registry hives to process, by name as well as by path.

Then, we need to open these files using the StringIO and Registry modules.

At last, we need to process each and every hive and print the parsed values to the console for interpretation.
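Before handing a file to the Registry module, it can be useful to confirm that it really is a hive by inspecting the REGF base block at the start of the file. The sketch below uses only the standard library and follows the commonly documented regf header layout (signature, two sequence numbers, last-written FILETIME, major/minor version); the header bytes at the end are synthetic, for demonstration only:

```python
import struct
import datetime


def check_regf_header(data):
    """Validate the first bytes of a registry hive's REGF base block.

    Assumed layout (documented regf format): offset 0 holds the ASCII
    signature 'regf', offsets 4 and 8 the two sequence numbers, offset
    12 an 8-byte FILETIME of the last write, and offsets 20/24 the
    major/minor format version."""
    if data[:4] != b'regf':
        return None
    seq1, seq2 = struct.unpack('<II', data[4:12])
    filetime, = struct.unpack('<Q', data[12:20])
    major, minor = struct.unpack('<II', data[20:28])
    last_written = (datetime.datetime(1601, 1, 1) +
                    datetime.timedelta(microseconds=filetime / 10.0))
    # Mismatched sequence numbers mean the hive was not cleanly flushed
    return {'dirty': seq1 != seq2, 'last_written': last_written,
            'version': '{}.{}'.format(major, minor)}


# Synthetic header built purely for demonstration
header = (b'regf' + struct.pack('<II', 9, 9) +
          struct.pack('<Q', 130266420000000000) +
          struct.pack('<II', 1, 5))
print(check_regf_header(header))
```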
Let us see how to use Python code for this purpose −
First, import the following Python libraries −
```python
from __future__ import print_function
from argparse import ArgumentParser
import datetime
import StringIO
import struct

from utility.pytskutil import TSKUtil
from Registry import Registry
```
Now, provide an argument for the command-line handler. Here it will accept two arguments − first is the path to the evidence file, second is the type of evidence file, as shown below −
```python
if __name__ == '__main__':
    parser = ArgumentParser('Evidence from Windows Registry')
    parser.add_argument('EVIDENCE_FILE', help="Path to evidence file")
    parser.add_argument('IMAGE_TYPE', help="Evidence file format",
                        choices=('ewf', 'raw'))
    args = parser.parse_args()
    main(args.EVIDENCE_FILE, args.IMAGE_TYPE)
```
Now, we will define the main() function, which searches for the SYSTEM and SOFTWARE hives within the /Windows/System32/config folder as follows −
```python
def main(evidence, image_type):
    tsk_util = TSKUtil(evidence, image_type)
    tsk_system_hive = tsk_util.recurse_files(
        'system', '/Windows/system32/config', 'equals')
    tsk_software_hive = tsk_util.recurse_files(
        'software', '/Windows/system32/config', 'equals')
    system_hive = open_file_as_reg(tsk_system_hive[0][2])
    software_hive = open_file_as_reg(tsk_software_hive[0][2])
    process_system_hive(system_hive)
    process_software_hive(software_hive)
```
Now, define the function for opening the registry file. For this purpose, we need to gather the size of the file from the pytsk metadata as follows −
```python
def open_file_as_reg(reg_file):
    file_size = reg_file.info.meta.size
    file_content = reg_file.read_random(0, file_size)
    file_like_obj = StringIO.StringIO(file_content)
    return Registry.Registry(file_like_obj)
```
Now, with the help of the following method, we can process the SYSTEM hive −
```python
def process_system_hive(hive):
    root = hive.root()
    current_control_set = root.find_key("Select").value("Current").value()
    control_set = root.find_key(
        "ControlSet{:03d}".format(current_control_set))

    raw_shutdown_time = struct.unpack(
        '<Q', control_set.find_key("Control").find_key("Windows").value(
            "ShutdownTime").value())
    shutdown_time = parse_windows_filetime(raw_shutdown_time[0])
    print("Last Shutdown Time: {}".format(shutdown_time))

    time_zone = control_set.find_key("Control").find_key(
        "TimeZoneInformation").value("TimeZoneKeyName").value()
    print("Machine Time Zone: {}".format(time_zone))

    computer_name = control_set.find_key("Control").find_key(
        "ComputerName").find_key("ComputerName").value(
        "ComputerName").value()
    print("Machine Name: {}".format(computer_name))

    last_access = control_set.find_key("Control").find_key(
        "FileSystem").value("NtfsDisableLastAccessUpdate").value()
    last_access = "Disabled" if last_access == 1 else "Enabled"
    print("Last Access Updates: {}".format(last_access))
```
Now, we need to define functions to interpret integers into formatted date and time values as follows −
```python
def parse_windows_filetime(date_value):
    microseconds = float(date_value) / 10
    ts = datetime.datetime(1601, 1, 1) + datetime.timedelta(
        microseconds=microseconds)
    return ts.strftime('%Y-%m-%d %H:%M:%S.%f')


def parse_unix_epoch(date_value):
    ts = datetime.datetime.fromtimestamp(date_value)
    return ts.strftime('%Y-%m-%d %H:%M:%S.%f')
```
Now, with the help of the following method, we can process the SOFTWARE hive −
```python
def process_software_hive(hive):
    root = hive.root()
    nt_curr_ver = root.find_key("Microsoft").find_key(
        "Windows NT").find_key("CurrentVersion")

    print("Product name: {}".format(
        nt_curr_ver.value("ProductName").value()))
    print("CSD Version: {}".format(
        nt_curr_ver.value("CSDVersion").value()))
    print("Current Build: {}".format(
        nt_curr_ver.value("CurrentBuild").value()))
    print("Registered Owner: {}".format(
        nt_curr_ver.value("RegisteredOwner").value()))
    print("Registered Org: {}".format(
        nt_curr_ver.value("RegisteredOrganization").value()))

    raw_install_date = nt_curr_ver.value("InstallDate").value()
    install_date = parse_unix_epoch(raw_install_date)
    print("Installation Date: {}".format(install_date))
```
After running the above script, we will get the metadata stored in Windows Registry files.