Hacking Exposed


by Aaron Philipp


  Figure 12-2 A Word document that has been modified and quick saved

  First off, you need to understand why Quick Save exists. When a document gets big, it can be very time-consuming to save the document, tying up resources and basically slowing down the whole show. With Office’s Auto Save feature, saving can become distracting and time-consuming while you are working. So Quick Save was created to save documents quickly and painlessly with minimum disruption to the user. It does this by not making changes to the body of the document; instead, it appends the changes, and information about where the changes appear goes at the end of the document. Once a certain file size is exceeded, the save goes back, incorporates all the changes into the main body of the document, and shrinks the file size back down. From a forensic investigation standpoint, this can be a great thing because data that a user thinks is deleted actually still exists in the document.

  Let’s go back to our example. If you open the file in a binary editor (I recommend XEmacs for non-forensics work), you can look for information that may have been “deleted” but not removed from the file. As you can see in Figure 12-3, a simple search of the document reveals information that appears as though it were still included in the document.

  To confirm, we open Word and perform an undo to see what comes back. As you can see in Figure 12-4, the data deleted from the document in Figure 12-3 has been recovered.

  This technique will work through multiple changes to the document and can actually go pretty far back in the revision history. You can typically find the data you are looking for using a keyword text search on the document with a tool such as EnCase or a binary editor, and then go back into Word to reconstruct the document.

  Figure 12-3 Locating deleted data in a Word document

  Figure 12-4 The data after a single undo

  Word 97 MAC Address

  If you are lucky enough to find a document that was created in Word 97, you can actually get the MAC address of the machine on which the document was created. A MAC address is like the fingerprint of a network card and is typically a number formatted like so: 00-09-5B-E6-24-5D. In the Word document, however, it’s formatted a bit differently. Take a look at Figure 12-5, which shows the MAC address in the document itself.

  To find the MAC address in a document, open the file in a binary editor and do a search for PID. This will bring up the entry.

  Let’s look at the PID_GUID for the Melissa virus document:

  PID_GUID {572858EA-36DD-11D2-885F-004033E0078E}

  Figure 12-5 The MAC address in a Word document

  If you look at the last chunk of data, 004033E0078E, and break it down, you get 00-40-33-E0-07-8E; this is clearly the MAC address of the machine on which the document was created. It must be stated, however, that this number can be modified and is nonauthoritative.

  You can sanity-check a MAC address by looking at its first three pairs of hex digits; this is the vendor ID. You can use any number of Internet database lookup sites to find out which manufacturer owns that vendor ID and therefore who made the card. If you are certain that you know on what machine a document was created, you can use this information for cross-validation purposes. If the vendor ID and the actual maker of the card do not match, that is a red flag that tampering has occurred.
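  The breakdown described above can be sketched in a few lines of Python. This is only an illustration, assuming the GUID has already been copied out of the binary editor as text; the function names mac_from_guid and vendor_prefix are made up for this example and are not part of any forensic tool.

```python
import re

# A GUID as it appears after PID_GUID: 8-4-4-4-12 hex digits, braces optional.
GUID_RE = re.compile(
    r"\{?[0-9A-Fa-f]{8}-[0-9A-Fa-f]{4}-[0-9A-Fa-f]{4}-"
    r"[0-9A-Fa-f]{4}-([0-9A-Fa-f]{12})\}?"
)


def mac_from_guid(guid: str) -> str:
    """Format the last 12 hex digits of a GUID as a MAC address."""
    match = GUID_RE.fullmatch(guid.strip())
    if match is None:
        raise ValueError(f"not a GUID: {guid!r}")
    node = match.group(1).upper()
    # Split the 12 digits into six pairs joined by hyphens.
    return "-".join(node[i:i + 2] for i in range(0, 12, 2))


def vendor_prefix(mac: str) -> str:
    """Return the first three pairs of a MAC address (the vendor ID)."""
    return "-".join(mac.split("-")[:3])


mac = mac_from_guid("{572858EA-36DD-11D2-885F-004033E0078E}")
print(mac)                 # 00-40-33-E0-07-8E
print(vendor_prefix(mac))  # 00-40-33
```

  The vendor prefix is what you would feed into a lookup site. As noted above, the value can be modified, so treat the result as a lead rather than proof.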

  When opening an Office document, the program does a couple of very basic file size checks to make sure that nothing has been modified. If the document won’t even open in Office, that should be a red flag that modification of metadata has occurred.

  Past Filenames

  Older Office (pre-Office 2003) documents actually store every filename under which they have ever been saved in the file. This can be very handy if you are looking for directories to go after or network drives that may have been used, or if you need to subpoena removable media to conduct further investigation. The key to this technique is that the filenames are stored in Unicode instead of straight ASCII, so you need to use an application such as strings.exe from Sysinternals to extract the filenames. Running strings.exe with the -u argument will output only Unicode text strings from the document. Here’s an example of running the strings program on a Word document:

  Strings -u tester.doc

  Strings v2.1

  Copyright (C) 1999-2003 Mark Russinovich

  Systems Internals - www.sysinternals.com

  …

  D:\mystuff\test.doc

  …

  Times New Roman

  Root Entry

  C:\draft.doc

  As you can see, multiple filenames and paths are stored in the document. You can then use your image to trace back these files, and if they point to network shares, you can use this data as a reason to conduct further discovery during litigation.
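  If strings.exe is not at hand, the same idea can be approximated in Python. The sketch below is not how strings.exe itself is implemented; it assumes the filenames use only printable ASCII characters stored as UTF-16LE (each character followed by a zero byte), and the function names are invented for this example.

```python
import re


def unicode_strings(data: bytes, min_len: int = 4) -> list[str]:
    """Find runs of printable ASCII characters stored as UTF-16LE."""
    # One printable ASCII byte followed by a zero byte, repeated min_len+ times.
    pattern = re.compile(rb"(?:[\x20-\x7e]\x00){%d,}" % min_len)
    return [m.group().decode("utf-16-le") for m in pattern.finditer(data)]


def unicode_strings_in_file(path: str, min_len: int = 4) -> list[str]:
    """Read a file in binary mode and return its UTF-16LE strings."""
    with open(path, "rb") as fh:
        return unicode_strings(fh.read(), min_len)


# In-memory demonstration: a path embedded between non-text bytes.
sample = b"\x01\x02" + "C:\\draft.doc".encode("utf-16-le") + b"\x00\x00\xff"
print(unicode_strings(sample))  # ['C:\\draft.doc']
```

  Always run this kind of script against a working copy of the evidence, and validate its output against a known tool before relying on it.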

  Working with Office Documents

  When you’re working with Office documents, remember to be creative and always look beyond what you see when you open the document. You can pull a wealth of information from these documents if you know where to look for it. In fact, EnCase has built support for reading and searching the Unicode into the latest version to make this type of investigation easier. One caveat, however, is that the data is nonauthoritative by itself. If you base your court case solely upon this data, you are going to have a bad time. Use this information to corroborate evidence you’ve obtained from other sources or to develop new leads that you can follow. That said, a little bit of time with an Office document and a low-level editor can point you in the direction you need to go to investigate your case effectively.

  TRACKING WEB USAGE

  As an investigator, you will frequently find yourself reconstructing a user’s web activity. Lucky for you, it seems as though everyone who decides to write a forensic tool writes it in a way that reads a browser’s cookies and history. The process of going through the working files and reconstructing activity is actually pretty straightforward, and when properly validated it can be reasonably authoritative. To help you understand what we are going to be looking at, we’ll discuss what kinds of records a web browser keeps that denote user activity.

  First, you have to look at what sites a user visited while using the browser. This information can be obtained from the history file, which stores information on every URL a user has loaded, going back for months. Even if a user has tried to cover her tracks by deleting the history, it may still be recoverable and useful in an investigation. Once you have the URLs that she has visited, you need a way to find out what she did while she was there. Conventionally, you can do this using two methods: by looking at the cookies for the site to determine user behavior or by reconstructing the web pages from the temporary Internet files. Let’s look at how to conduct an investigation for the two most popular browsers: Internet Explorer and Firefox/Netscape.

  Internet Explorer Forensics

  Internet Explorer (IE) has been the default web browser for the Microsoft Windows platform since Windows 95. In fact, later versions of Windows have built IE to interact very closely with the operating system, opening some interesting paths for forensic investigation of activity. Covering your tracks in IE is a nontrivial task. Even if you delete the history using the IE facilities, it can still be recovered because of its close interaction with the OS.

  Viewing the History

  The history utility in IE, shown in Figure 12-6, creates a convenient audit trail for what a user likes to do on the Internet. It can be used to show whether the user frequents certain types of sites, if she lands on a site inadvertently, and what she is doing when she visits a site. This information is useful in everything from policy violation cases all the way up to criminal activities.

  EnCase comes with an EnScript feature that will automatically search the image for IE history and present it in a report format. If you use EnCase, this can greatly speed your investigation, although you should make sure you understand what the script does and how it does it.

  Figure 12-6 Internet Explorer’s history utility

  Table 12-1 Breakdown of File Entries in Windows XP

  Luckily, as long as you know where to look, you can use tons of tools to make this job easy. For the sake of demonstration, we will use a freeware command-line utility from Foundstone called Pasco. While completely devoid of any kind of flash or bells and whistles that other commercial products have, it gets the job done. It takes an index.dat file and converts the data into a tab-delimited format. Once you have that, you can import it into Excel and slice and dice it as you see fit. Then the fun begins. If you do a search for index.dat, you will find about five to ten entries. As you can quickly see from looking at any one of them, several different types of entries are included. Table 12-1 shows a breakdown of those that exist in Windows XP, their location, and what each one does.

  If you are investigating an older version of Internet Explorer, here are some directories and file locations to look for that will hold the same information:

  • C:\Windows\Cookies\index.dat

  • C:\Windows\History\index.dat

  • C:\Windows\History\MSHistXXXXXXXXXXXXXXXXXX\index.dat

  • C:\Windows\History\History.IE5\index.dat

  • C:\Windows\History\History.IE5\MSHistXXXXXXXXXXXXXXXXXX\index.dat

  • C:\Windows\Temporary Internet Files\index.dat (only in Internet Explorer 4.x)

  • C:\Windows\Temporary Internet Files\Content.IE5\index.dat

  • C:\Windows\UserData\index.dat

  • C:\Windows\Profiles\<username>\Cookies\index.dat

  • C:\Windows\Profiles\<username>\History\index.dat

  • C:\Windows\Profiles\<username>\History\MSHistXXXXXXXXXXXXXXXXXX\index.dat

  • C:\Windows\Profiles\<username>\History\History.IE5\index.dat

  • C:\Windows\Profiles\<username>\History\History.IE5\MSHistXXXXXXXXXXXXXXXXXX\index.dat

  • C:\Windows\Profiles\<username>\Temporary Internet Files\index.dat

  • C:\Windows\Profiles\<username>\Temporary Internet Files\Content.IE5\index.dat

  • C:\Windows\Profiles\<username>\UserData\index.dat

  Now that you know where to look, let’s examine how these interconnect and how you can use them to trace user activity. The first place you want to go is to the main history to locate what Web sites the user has visited. Here’s a listing of the History.IE5 directory:

  As you can see, five different directories start with MSHist01 followed by a string of numbers. Let’s decipher the sequence that MS uses for this structure.

  The number 2004062820040629, for example, looks pretty meaningless at first glance. If you break it up a bit, though, a pattern emerges: 2004-06-28 and 2004-06-29. If you look at the created time, this suspicion is verified. This is how you tell what dates the directory holds. For our purposes, let’s try to find an event that occurred on 2004-06-28, so we would use the index.dat in MSHist012004062120040628. You would go into the directory and actually extract the data from the file.
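  Decoding the directory name by hand gets tedious when there are many of them. Here is a minimal sketch that splits the name into its two dates, assuming the MSHist01 prefix followed by two eight-digit dates as described above; the function name mshist_range is made up for this example.

```python
from datetime import date, datetime

PREFIX = "MSHist01"


def mshist_range(dirname: str) -> tuple[date, date]:
    """Split an MSHist01YYYYMMDDYYYYMMDD directory name into two dates."""
    digits = dirname[len(PREFIX):]
    if not dirname.startswith(PREFIX) or len(digits) != 16 or not digits.isdigit():
        raise ValueError(f"unexpected directory name: {dirname!r}")
    # strptime also rejects impossible dates such as month 13.
    start = datetime.strptime(digits[:8], "%Y%m%d").date()
    end = datetime.strptime(digits[8:], "%Y%m%d").date()
    return start, end


start, end = mshist_range("MSHist012004062120040628")
print(start, end)  # 2004-06-21 2004-06-28
```

  Running this over every directory under History.IE5 gives a quick map of which index.dat covers which dates.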

  C:\Documents and Settings\<username>\Local Settings\History\History.IE5\MSHist012004062120040628>“C:\Documents and Settings\<username>\Desktop\Pasco\pasco.exe” index.dat | more

  History File: index.dat

  TYPE, URL, MODIFIED TIME, ACCESS TIME, FILENAME, DIRECTORY, HTTP HEADERS

  URL, :2004062120040628: @http://www.gnu.org/copyleft/gpl.html, Wed Jun 23 11:37:15 2004, Mon Jun 28 16:12:12 2004, URL,, URL

  This is one line from the raw output of Pasco. As you can see, several fields are stored in the record. You need to determine what each one represents, as shown in Table 12-2.

  For those who are unfamiliar with the command line, you can use the following command to dump the history into a text file that you can import into Excel:

  Pasco >

  Once you have created the text file and imported it into Excel, you should see something similar to the data shown in Figure 12-7.

  From here, you can filter and sort the data to find the information relevant to the case. Most of the all-in-one forensics investigation tools have facilities for searching the history. That being the case, there is still something to be said for this method, because you can leverage the powerful searching and sorting features of a tool such as grep or Excel to help speed the investigation along, while still having a step-by-step process to show the court.
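  If you prefer a script to a spreadsheet, the same filtering can be sketched in Python. This assumes the Pasco output has been saved as tab-delimited text with the seven fields listed earlier; the function name find_urls and the sample record (built from the gnu.org line above, with the last three fields left empty) are illustrative only.

```python
import csv
import io

FIELDS = ["TYPE", "URL", "MODIFIED TIME", "ACCESS TIME",
          "FILENAME", "DIRECTORY", "HTTP HEADERS"]


def find_urls(stream, needle: str) -> list[dict]:
    """Return the records whose URL field contains needle (case-insensitive)."""
    hits = []
    # QUOTE_NONE: treat quote characters in the HTTP headers as plain text.
    for row in csv.reader(stream, delimiter="\t", quoting=csv.QUOTE_NONE):
        # Skip the banner, blank lines, and the header row itself.
        if len(row) < len(FIELDS) or row[0].strip() == "TYPE":
            continue
        record = dict(zip(FIELDS, (cell.strip() for cell in row)))
        if needle.lower() in record["URL"].lower():
            hits.append(record)
    return hits


# Hypothetical sample text standing in for a saved Pasco output file.
sample = (
    "History File: index.dat\n\n"
    + "\t".join(FIELDS) + "\n"
    + "\t".join(["URL",
                 ":2004062120040628: @http://www.gnu.org/copyleft/gpl.html",
                 "Wed Jun 23 11:37:15 2004",
                 "Mon Jun 28 16:12:12 2004", "", "", ""]) + "\n"
)

for rec in find_urls(io.StringIO(sample), "gnu.org"):
    print(rec["ACCESS TIME"], rec["URL"])
```

  For a real file, pass open(path, newline="") instead of the StringIO object. Keeping the script with your case notes preserves the step-by-step record.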

  Table 12-2 What Each Field Represents

  Figure 12-7 History data imported into Excel

  Finding Information in Cookies

  Cookies have become the predominant way for Web sites to store tracking information about their users. Every time a site remembers you and logs you in automatically, a cookie is involved. A cookie is a small text file earmarked with special data that is pertinent to a specific Web site. The information held in these cookies can be invaluable to forensics. Often, the cookie holds information about the username, the user’s preferences, and the frequency with which the user visits the site. As with the history, pulling information out of cookies is straightforward, but the devil is in the details. The first thing you want to do is investigate the index.dat history file in the C:\Documents and Settings\<username>\Cookies directory. This file is identical in structure to the main index.dat files, but instead of URLs, it stores the history of cookies. Here’s a line of sample output:

  TYPE, URL, MODIFIED TIME, ACCESS TIME, FILENAME, DIRECTORY, HTTP HEADERS

  URL, Cookie:@imrworldwide.com/cgi-bin, Sun Mar 21 05:25:33 2004, Thu Jun 24 15:08:31 2004, @cgi-bin[1].txt,,URL

  The most notable aspect is the fact that the FILENAME field is populated with the name of the cookie as it’s stored on the local hard disk. Notice as well that the filename of the cookie has nothing to do with the Web site from which it came. Some of the “shadier” Web sites will often name cookies to make it more difficult for you to discover that they are tracking you. This is why it’s important to use the history file, because it will show you where the cookie originated and what server-side code produced it. Since the cookie history is identical in structure to the other histories, you can use the same techniques to search and find specific filenames of cookies.

  Oftentimes, the mere existence of a cookie is enough to show that a user was visiting a site. But sometimes you’ll need to delve deeper into the user activity and look inside the cookie itself. To do this, you can use a Foundstone tool called Galleta. It operates identically to the history tool used in the preceding section.

  A cookie is nothing more than a data structure with a series of variable names and values. However, several fields of metadata are of interest and need explanation, as shown in the following table:

  Here is a line created from the Galleta program run on a popular Web site, www.google.com:

  C:\Documents and Settings\Admin\Cookies>“C:\Documents and Settings\Admin\Desktop\galleta\galleta.exe” “admin@google[1].txt”

  Cookie File: admin@google[1].txt

  SITE VARIABLE VALUE CREATION TIME EXPIRE TIME FLAGS

  google.com/ PREF ID=7757897559c7c13d:FF=4:TB=2:LD=en:NR=10:TM=1063258910:LM=1076737164:S=VyefrLtaPC0FoJTZ Sat Feb 14 05:39:23 2004 Sun Jan 17 19:14:07 2038 1536

  Here you can see the variable PREF (presumably for user preferences) with a string value that Google accesses every time this browser goes to the home page. You can often look to the content inside the cookies to validate that a user spent time and actually logged into a Web site and didn’t just land on it by accident. However, the existence of a cookie by itself isn’t enough. Before you make statements regarding intent, make sure you empirically test how the cookie was created and what the values inside it can show you.
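  As a starting point for that kind of testing, the value string can be split into its named parts. The sketch below assumes the colon-separated NAME=value layout seen in the sample; reading TM and LM as Unix timestamps is a hypothesis, not something Google documents here, and the function names are invented for this example.

```python
from datetime import datetime, timezone


def parse_cookie_value(value: str) -> dict[str, str]:
    """Split a colon-separated NAME=value string into a dictionary."""
    pairs = (item.split("=", 1) for item in value.split(":") if "=" in item)
    return {name: val for name, val in pairs}


def as_utc(epoch_seconds: str) -> datetime:
    """Interpret a string of digits as seconds since 1970-01-01 UTC."""
    return datetime.fromtimestamp(int(epoch_seconds), tz=timezone.utc)


# Value string taken from the Galleta sample above.
value = ("ID=7757897559c7c13d:FF=4:TB=2:LD=en:NR=10:"
         "TM=1063258910:LM=1076737164:S=VyefrLtaPC0FoJTZ")

fields = parse_cookie_value(value)
print(fields["LD"])          # en
print(as_utc(fields["LM"]))  # 2004-02-14 05:39:24+00:00
```

  Read this way, LM lands within a second of the creation time Galleta reports for the cookie, which supports the timestamp hypothesis without proving it. Confirm the behavior on a test machine before drawing conclusions from it.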

  Reconstructing Activity from the Cache

  To speed up Internet browsing, IE caches most of the pages you visit on your hard drive in case you want to go back. Good for forensics examiners, bad for suspects with something to hide. If you can navigate the maze that is the caching structure, you can re-create pages that the user saw and interacted with, including their forms data. There is a problem with caching Internet files, however. Think about what would happen if you cached everything under its original filename. The number of collisions in the cache would render the cache nearly useless (consider the number of pages named index.html, for example). As such, Microsoft has created a naming system that prevents that from occurring. In the cache directory, an index.dat file maps the pages on Web sites to files and directories in the cache.

  The process for finding things in the cache is identical to the process for finding things in the history. Convert the index.dat to a readable format, slice and dice it to find the files that are important to the investigation, and then use the FILENAME and DIRECTORY fields to locate the files themselves. This time, the directory that we care about is C:\Documents and Settings\<username>\Local Settings\Temporary Internet Files\Content.IE5.

  Let’s look at sample output from the index.dat file:

  TYPE URL MODIFIED TIME ACCESS TIME FILENAME DIRECTORY HTTP HEADERS

  URL http://hp.msn.com/17/7M{T57_]6423LU+]0D]QKP.jpg Sat Jun 26 00:52:59 2004 Mon Jun 28 22:01:05 2004 7M{T57_]6423LU+]0D]QKP[1].jpg 0PQLIJYD HTTP/1.1 200 OK Content-Length: 2547 Content-Type: image/jpeg ETag: “6ee55ded175bc41:8b1” P3P: CP=“BUS CUR CONo FIN IVDo ONL OUR PHY SAMo TELo”

  Let’s try to make sense of this mess. First, notice that the original URL ties back to an MSN site. You can see a date when it was added to the cache and a date when it was last accessed. The areas where this differs from the history are the FILENAME, DIRECTORY, and HTTP HEADERS fields. The FILENAME and DIRECTORY fields tell you where the cached copy lives under Content.IE5, and the headers field can hold valuable information about the context in which the file was retrieved.
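  To locate the cached copy itself, join the cache directory with the DIRECTORY and FILENAME values from the record. This is a minimal sketch assuming the cached file sits in the named subdirectory of Content.IE5; the function name and the E:\evidence root are hypothetical stand-ins for wherever your working copy of the image is mounted.

```python
from pathlib import PureWindowsPath


def cached_file_path(cache_root: str, directory: str, filename: str) -> PureWindowsPath:
    """Build the path of a cached file from its index.dat record fields."""
    return PureWindowsPath(cache_root) / directory.strip() / filename.strip()


# Hypothetical location of the Content.IE5 directory on the working copy.
root = r"E:\evidence\Content.IE5"

# DIRECTORY and FILENAME values from the sample record above.
path = cached_file_path(root, "0PQLIJYD", "7M{T57_]6423LU+]0D]QKP[1].jpg")
print(path)
```

  PureWindowsPath only manipulates the path text, so the sketch behaves the same on a non-Windows analysis workstation.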

 
