Exploring NTFS: Forensic Extraction of Data Jason Medeiros Grayscale Research 2008

  • Exploring NTFS: Forensic Extraction of Data

    Jason Medeiros, Grayscale Research, 2008

  • About This Presentation

    This presentation is a complement to the Grayscale white paper, NTFS Forensics: A Programmers View of Raw Filesystem Data Extraction.

    This paper is freely available in the Whitepapers and Research section of www.grayscale-research.org.

  • Describing our Intent

    Open a raw physical disk

    Read through its raw file system structures to find a file which is locked by the kernel.

    Read the sectors off the disk into a file which can be read later.

  • Why is this Useful

    Certain files such as the page file are locked by the kernel so that improper accesses cannot occur.

    This is typically done to prevent race conditions and to keep hackers and other malcontents from accessing portions of the disk that the operating system does not want accessed.

  • Full Access

    By going through the raw disk instead of the file system routines, we get full access to any file on the disk as well as any sectors on the disk that may be inaccessible through file system locking mechanisms.

  • Step-by-step process overview

    1. Find the partition entry which holds the NTFS partition.

    2. Navigate to the beginning of the Master File Table

    3. Find the root directory metafile entry in the MFT and extract its index allocation attribute data.

    4. Process the INDX records found within the index allocation attribute data one by one recursively until you find a file that matches the one you are looking for.

    5. In the index entry that matched, find the MFT record number and move to that record position within the MFT.

    6. Read the MFT entry and process its standard attribute headers one by one until the data attribute is encountered.

    7. Use the process of attribute data extraction in order to retrieve the data attribute, which contains the contents of the file that is being accessed.

  • About NTFS

    The NTFS file system has gone through several iterations, each of which brought new features to end users and application designers.

    The current version (3.1) is the primary focus of this document, but the technical details are the same for previous versions, so the material applies broadly.

  • About NTFS Continued

    NTFS was created as a high-performance replacement for the antiquated FAT file system.

    This new file system featured robust journaling, alternate data streams, disk quotas, sparse files, reparse points, encryption, compression, and several more features that are necessary in modern filesystems.

  • Filesystems And Trees

    It should be noted that most modern filesystems use tree structures in order to store data hierarchically and efficiently.

    There are many different tree types available, but the most common type used for file systems is the B+ tree. A B+ tree is not a binary tree: each node can hold many keys and many children.

    NTFS at its core is organized around B+ tree indexes.

  • B+ Trees

    A B+ tree represents sorted data in a way which allows for extremely efficient insertion, removal, and retrieval of data.

    All data is identified by a key within the tree which serves the purpose of being a multilevel index to nodes on the tree.

  • Explanation of B+ Operations

    In a B+ tree, each node in the tree holds record identifiers (keys), referred to here as hashes.

    Each of these hashes can point either to data on the disk or to another node.

    Each node being pointed to can have the same properties as the node before it.

    This hierarchical storage mechanism allows for a theoretically unlimited depth with fast lookup times, as long as the storage algorithm keeps the data sorted while it is being stored.
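
    The lookup described above can be sketched in C. This is a minimal in-memory illustration, not the on-disk NTFS index layout: the BPlusNode type, the MAX_KEYS fan-out, and the bplus_find function are hypothetical names chosen for this example, and integer keys stand in for the record identifiers.

```c
#include <stddef.h>

#define MAX_KEYS 4  /* illustrative fan-out; real trees use much larger nodes */

/* Hypothetical in-memory node: keys are kept sorted within each node. */
typedef struct BPlusNode {
    int is_leaf;
    size_t num_keys;
    int keys[MAX_KEYS];
    struct BPlusNode *children[MAX_KEYS + 1]; /* used by internal nodes */
    const char *values[MAX_KEYS];             /* used by leaf nodes */
} BPlusNode;

/* Descend from the root to a leaf, then scan the leaf for the key.
   Returns the stored value, or NULL when the key is absent. */
static const char *bplus_find(const BPlusNode *node, int key)
{
    while (node != NULL && !node->is_leaf) {
        size_t i = 0;
        /* Child i holds keys below keys[i]; a key equal to a separator goes right. */
        while (i < node->num_keys && key >= node->keys[i])
            i++;
        node = node->children[i];
    }
    if (node == NULL)
        return NULL;
    for (size_t i = 0; i < node->num_keys; i++)
        if (node->keys[i] == key)
            return node->values[i];
    return NULL;
}
```

    Because every node is sorted, each step of the descent discards all but one subtree, which is where the fast lookup time comes from.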

  • Figure of an Example B+

  • Beginning the Process of Raw Data Extraction

    Use the platform file system routines in order to open a physical device handle.

    Set the file pointer to the beginning of the device.

    Read in 512 bytes from the start of the disk.
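
    As a rough sketch of these steps, the function below reads one whole sector with standard C stdio. It is written against a disk image file so that it stays portable; on Windows a physical device handle is more typically opened with CreateFile on a path such as \\.\PhysicalDrive0, which this example does not do. The names read_sector and SECTOR_SIZE are assumptions made for illustration.

```c
#include <stdint.h>
#include <stdio.h>

#define SECTOR_SIZE 512

/* Read one whole sector at the given sector index into buf.
   Returns 0 on success, -1 on failure. */
static int read_sector(FILE *disk, long sector_index, uint8_t buf[SECTOR_SIZE])
{
    /* fseek takes a long, so this sketch is limited to small images on
       platforms where long is 32 bits wide. */
    if (fseek(disk, sector_index * (long)SECTOR_SIZE, SEEK_SET) != 0)
        return -1;
    /* Always request a full sector; partial reads count as failure. */
    return fread(buf, 1, SECTOR_SIZE, disk) == SECTOR_SIZE ? 0 : -1;
}
```

    Calling read_sector(disk, 0, buf) fetches the first sector of the disk, which is the starting point for everything that follows.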

  • Microsoft API caveat

    When reading data from a raw disk on a Microsoft operating system, note that all drive reads will fail unless they are performed in multiples of 512 bytes (the sector size).

    Additionally, the real structures used for the operations documented in this presentation are found in the figures sections of the accompanying white paper.

  • The Master Boot Record

    The 512 bytes which were just read in constitute the first sector of the disk. This data contains the Master Boot Record (MBR).

    The first 440 bytes of the MBR are used as its code area and are of no use to us for file extraction.

    Past the code area there is a four-byte optional disk signature, followed by two bytes of null padding.

  • Master boot record continued

    Directly after the null padding, at offset 446 (0x1be), you will find the system partition table.

    Each partition entry is 16 bytes long, and the table holds at most four entries.

    These entries contain the data which will allow us to identify what types of partitions we are looking at.

  • Partition Table

    Each 16-byte partition record stored in the MBR contains, at byte offset 4, a one-byte value which represents the partition type.

    The citations and references section of the white paper accompanying this presentation links to a list of all the different partition types which can be encountered.

    However, for the purposes of this presentation it is only important to know that if this value is 7 (0x07), it represents an NTFS partition.

  • Partition Table Continued

    At byte offset 8 from the beginning of the partition entry identified as NTFS, there is a four-byte value which gives the logical block address (LBA) at which the partition starts.

    The byte offset of the partition is simply the size of one disk sector (512 bytes) multiplied by this logical block value.

    As soon as this value has been calculated, set the file pointer of your disk handle to that offset.
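
    The partition table walk from the last few slides might look like the following sketch. The function and macro names are illustrative assumptions; the offsets (446 for the table, 4 for the type byte, 8 for the starting LBA) are the ones given above, and the LBA is read as a little-endian value.

```c
#include <stdint.h>

#define MBR_PARTITION_TABLE_OFFSET 446  /* 0x1BE */
#define MBR_PARTITION_ENTRY_SIZE   16
#define MBR_PARTITION_COUNT        4
#define PARTITION_TYPE_NTFS        0x07
#define SECTOR_SIZE                512

/* Read a little-endian 32-bit value from a byte buffer. */
static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Scan the four partition entries for type 0x07.
   On success stores the partition's byte offset and returns 0; otherwise -1. */
static int find_ntfs_partition(const uint8_t mbr[512], uint64_t *byte_offset)
{
    for (int i = 0; i < MBR_PARTITION_COUNT; i++) {
        const uint8_t *entry =
            mbr + MBR_PARTITION_TABLE_OFFSET + i * MBR_PARTITION_ENTRY_SIZE;
        if (entry[4] == PARTITION_TYPE_NTFS) {
            /* Starting LBA times the sector size gives the byte offset. */
            *byte_offset = (uint64_t)read_le32(entry + 8) * SECTOR_SIZE;
            return 0;
        }
    }
    return -1;
}
```

    The multiplication is done in 64 bits because a 32-bit LBA times 512 can exceed what a 32-bit integer holds.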

  • The NTFS Boot Sector

    After advancing your file pointer to the correct sector offset, you'll be ready to read in a structure known as the NTFS boot sector.

    This boot sector holds several very important pieces of information. Of particular note, it contains a structure called the BIOS parameter block.

  • About The BIOS Parameter Block

    The BIOS parameter block contains various pieces of information about an NTFS partition.

    This information includes the number of bytes per sector, the number of sectors per cluster, the number of total sectors, and the volume serial number of the partition in question.

    Most importantly, however, it contains the logical cluster number of the Master File Table.

  • The Master File Table

    The Master File Table (MFT) is a series of clusters set aside on the disk for the purpose of holding an extremely large structure array, with each array entry 1024 bytes long.

    Each entry in this Master File Table represents one file on the disk. Every single file on the disk has a Master file table entry.

  • Finding the MFT

    To find the start of the MFT extract the bytes per sector member of the BIOS parameter block and multiply it by the number of sectors per cluster also found in the BIOS parameter block.

    This gives you the number of bytes per cluster, where clusters are the smallest storage allocation size for the file system.

  • Finding The MFT Continued

    Take the newly calculated bytes per cluster value and multiply it by the logical cluster number of the MFT. Adding the result to the byte offset of the start of the partition gives the starting position of the Master File Table.

    As a note, the majority of the calculations required in processing raw NTFS are relatively simple and documented fully and algorithmically within the accompanying white paper.
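
    A minimal sketch of this calculation follows. The field offsets used (0x0B for bytes per sector, 0x0D for sectors per cluster, 0x30 for the MFT logical cluster number) are the standard NTFS boot sector positions but are not spelled out in this presentation, so treat them, along with the function names, as assumptions of the example. It also assumes the sectors per cluster byte is a plain count.

```c
#include <stdint.h>

/* Field offsets within the NTFS boot sector (assumed standard layout). */
#define BPB_BYTES_PER_SECTOR    0x0B  /* 2 bytes */
#define BPB_SECTORS_PER_CLUSTER 0x0D  /* 1 byte  */
#define BPB_MFT_LCN             0x30  /* 8 bytes */

/* Read a little-endian unsigned value of nbytes (1 to 8) bytes. */
static uint64_t read_le(const uint8_t *p, int nbytes)
{
    uint64_t v = 0;
    for (int i = nbytes - 1; i >= 0; i--)
        v = (v << 8) | p[i];
    return v;
}

/* Bytes per cluster = bytes per sector * sectors per cluster. */
static uint64_t bytes_per_cluster(const uint8_t boot[512])
{
    return read_le(boot + BPB_BYTES_PER_SECTOR, 2) * boot[BPB_SECTORS_PER_CLUSTER];
}

/* Absolute byte offset of the start of the MFT on the disk. The cluster
   number is relative to the partition, so the partition offset is added. */
static uint64_t mft_byte_offset(const uint8_t boot[512], uint64_t partition_offset)
{
    return partition_offset + read_le(boot + BPB_MFT_LCN, 8) * bytes_per_cluster(boot);
}
```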

  • MFT Record Numbers

    When we were discussing B+ trees earlier in this presentation, we talked a bit about nodes and hash lookup/record numbers.

    In terms of NTFS, these lookup records are in essence MFT record numbers.

    The MFT record number is itself an array index into the MFT.
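
    Since a record number is an array index, turning it into a disk position is a single multiplication. The sketch below assumes 1024-byte records and assumes the record lies within the first contiguous stretch of the MFT; mft_record_offset is a hypothetical helper name. Record 5 is the well-known position of the root directory metafile.

```c
#include <stdint.h>

#define MFT_RECORD_SIZE     1024
#define MFT_RECORD_ROOT_DIR 5     /* root directory metafile */

/* Byte offset of an MFT record, given the byte offset of the MFT start. */
static uint64_t mft_record_offset(uint64_t mft_start, uint64_t record_number)
{
    return mft_start + record_number * MFT_RECORD_SIZE;
}
```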

  • Filesystem Data Storage Semantics

    Data stored on a filesystem is kept in what are known as data runs, each of which is a contiguous series of clusters.

    The MFT is no different, in the sense that it can occupy multiple data runs.

    However, the offset we calculated in the previous slides brings us to the start of the first data run of the MFT, which has special significance.

  • MFT First Data Run and Metafiles

    Metafiles are special files utilized by NTFS in order to perform specific functions.

    The first data run contains all pertinent system metafiles as well as the root directory metafile.

    These metafiles are extremely useful for finding files quickly and efficiently.

  • NTFS And The Existence Of Multiple Data Runs Per File

    If a file created by the operating system is relatively small, it will literally be stored inside of its MFT record entry.

    However, if its data does not fit inside the 1024-byte record alongside the record's headers and other attributes, it must be stored in data runs on the disk.

    The distinction between these two cases is called resident and nonresident data storage.

  • MFT Records And Data Runs

    As was said earlier, each MFT record represents a unique file on the file system.

    Each MFT record also describes the data runs (their lengths and offsets) that contain the actual file content.

    That being said, the MFT itself has an MFT entry which can be used to extract it in full.

  • Extracting Data From MFT Records: Theory

    The concept of extracting data from MFT records relies on what are known as MFT record attributes.

    There are 16 total attribute types which you will encounter when dealing with MFT entries. The data attribute is the one most relevant to data extraction.

    Extracting the data attribute and retrieving its runs yields the actual content of the file to which the MFT record belongs.

  • About MFT Record Attributes

    An MFT record entry is actually comprised of multiple attribute headers. These attribute headers are responsible for identifying where the particular attribute resides on disk, as well as what type of attribute it is.

    The next slide has a table of all available attribute header types that may be found in an MFT record.

  • MFT Record Attributes Table

    Attribute Name                  Hexadecimal Value
    Unused                          0x00
    Standard Information            0x10
    File Name                       0x30
    Object ID                       0x40
    Security Descriptor             0x50
    Volume Name                     0x60
    Volume Information              0x70
    Data                            0x80
    Index Root                      0x90
    Index Allocation                0xa0
    Bitmap                          0xb0
    Reparse Point                   0xc0
    EA Information                  0xd0
    EA                              0xe0
    Property Set                    0xf0
    Logged Utility Stream           0x100
    First User Defined Attribute    0x1000
    End of Attributes (records)     0xffffffff

  • Accessing MFT Attributes

    In order to access the MFT attributes, read one 1024-byte MFT record from the start of the file's entry in the MFT into a usable program buffer.

    The first structure at the beginning of this buffer is the MFT file entry header.

    The seventh member of this structure (wAttribOffset) is the start-of-attributes offset. This is where we can start finding our relevant attribute structures.

  • MFT File Entry Header

    typedef struct _NTFS_MFT_FILE_ENTRY_HEADER {
        char     fileSignature[4];
        WORD     wFixupOffset;
        WORD     wFixupSize;
        LONGLONG n64LogSeqNumber;
        WORD     wSequence;
        WORD     wHardLinks;
        WORD     wAttribOffset;
        WORD     wFlags;
        DWORD    dwRecLength;
        DWORD    dwAllLength;
        LONGLONG n64BaseMftRec;
        WORD     wNextAttrID;
        WORD     wFixupPattern;
        DWORD    dwMFTRecNumber;
    } NTFS_MFT_FILE_ENTRY_HEADER, *P_NTFS_MFT_FILE_ENTRY_HEADER;

  • Navigating MFT Attributes

    MFT attribute structures are dynamic in type and purpose.

    The internal structures of an attribute are combined in a union, which takes on different purposes depending on a flag stored among the static members at the beginning of the attribute header (uchNonResFlag selects between the resident and nonresident forms).

    The next slide shows the NTFS attribute's static members.

  • NTFS Attribute Structure

    typedef struct _NTFS_ATTRIBUTE {
        DWORD dwType;
        DWORD dwFullLength;
        BYTE  uchNonResFlag;
        BYTE  uchNameLength;
        WORD  wNameOffset;
        WORD  wFlags;
        WORD  wID;
        union ATTR { } Attr;
    } _NTFS_ATTRIBUTE, *P_NTFS_ATTRIBUTE;
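
    One way to walk the attributes of a record is sketched below. It reads fields by byte offset instead of casting to the structures above, which avoids structure packing concerns: wAttribOffset sits at byte 20 of the file entry header, and dwType and dwFullLength are the first two DWORDs of each attribute. The helper names are assumptions of this example, and the sketch does not apply the record's fixup values before parsing.

```c
#include <stddef.h>
#include <stdint.h>

#define ATTR_TYPE_DATA 0x80u
#define ATTR_TYPE_END  0xFFFFFFFFu

static uint16_t read_le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Return the byte offset of the first attribute of the wanted type inside
   an MFT record, or -1 if it is absent or the record looks malformed. */
static long find_attribute(const uint8_t *record, size_t record_size, uint32_t wanted)
{
    if (record_size < 24)
        return -1;
    size_t pos = read_le16(record + 20);               /* wAttribOffset */
    while (pos + 8 <= record_size) {
        uint32_t type = read_le32(record + pos);       /* dwType */
        uint32_t length = read_le32(record + pos + 4); /* dwFullLength */
        if (type == ATTR_TYPE_END)
            return -1;
        if (length < 16 || length > record_size - pos)
            return -1;                                 /* would leave the record */
        if (type == wanted)
            return (long)pos;
        pos += length;                                 /* step to the next attribute */
    }
    return -1;
}
```

    For file extraction, the call of interest is find_attribute(record, 1024, ATTR_TYPE_DATA).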

  • NTFS Attribute Internal Union

    union ATTR {
        struct RESIDENT {
            DWORD dwLength;
            WORD  wAttrOffset;
            BYTE  uchIndexedTag;
            BYTE  uchPadding;
        } Resident;

        struct NONRESIDENT {
            LONGLONG n64StartVCN;
            LONGLONG n64EndVCN;
            WORD     wDatarunOffset;
            WORD     wCompressionSize;
            BYTE     uchPadding[4];
            LONGLONG n64AllocSize;
            LONGLONG n64RealSize;
            LONGLONG n64StreamSize;
        } NonResident;
    } Attr;

  • The Extraction Of Relevant Attribute Data

    Earlier in this presentation it was said that if a file was small enough, it would be stored in the MFT record itself.

    Data attributes are just like any other attribute: if they are marked resident, they are stored in the MFT record.

  • Resident Data Extraction Continued

    Extract the attribute header's resident union structure.

    Extract the offset member from that structure.

    Move to the beginning of the attribute header within your MFT entry, and advance that position by the number of bytes specified in the offset.

    From that position you can read in the attribute data directly, with a length also found in the resident union structure.
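
    These steps could be sketched as follows, again using byte offsets: uchNonResFlag is at byte 8 of the attribute, and the resident dwLength and wAttrOffset members fall at bytes 16 and 20. The function name resident_data is a hypothetical choice for this example, and the data offset is taken relative to the start of the attribute.

```c
#include <stddef.h>
#include <stdint.h>

static uint16_t read_le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Given a pointer to the start of an attribute, return a pointer to its
   resident data and store the data length. Returns NULL when the attribute
   is nonresident or its fields point outside the attribute. */
static const uint8_t *resident_data(const uint8_t *attr, size_t attr_size,
                                    uint32_t *length)
{
    if (attr_size < 24 || attr[8] != 0)        /* uchNonResFlag */
        return NULL;
    uint32_t data_len = read_le32(attr + 16);  /* Resident.dwLength */
    uint16_t data_off = read_le16(attr + 20);  /* Resident.wAttrOffset */
    if (data_off > attr_size || data_len > attr_size - data_off)
        return NULL;
    *length = data_len;
    return attr + data_off;
}
```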

  • Concluding Resident Data Extraction

    That is all there is to extracting a resident attribute from an MFT record.

    Nonresident data extraction however is much more complicated.

  • The Extraction Of Nonresident Attribute Data

    Start by extracting the nonresident union structure from the attribute structure in question.

    Create a one-byte value which will hold the sizes of the run's length and offset fields.

    Move the read position to the nonresident union structure's data run offset (wDatarunOffset, measured from the start of the attribute) and read the first byte into the newly created one-byte value.

  • The Extraction Of Nonresident Data Continued

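    The byte that was read is in reality two four-bit values.

    You can split these with a union: the low four bits give the number of bytes used to store the run's length, and the high four bits give the number of bytes used to store the run's offset.

  • A Four Bit Union Structure Which Can Be Utilized

    /* Bit-field ordering is compiler dependent; this layout assumes the
       first declared field occupies the low four bits. */
    typedef struct _LEN_OFFS_BITFIELD {
        union {
            BYTE val;
            struct {
                unsigned char len:  4;   /* low nibble: size of the length field */
                unsigned char offs: 4;   /* high nibble: size of the offset field */
            } bitfield;
        };
    } LEN_OFFS_BITFIELD, *P_LEN_OFFS_BITFIELD;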

  • Continuing Extraction

    Create an eight-byte value for large integer calculations.

    This value will hold a copy of the run length for the first data run.

    Using the length member of the union bit field in the previous slide, copy that number of bytes into the new value. This is the length of the data run, in clusters, as a large integer.

    Advance the read position that number of bytes forward.

  • Still Continuing Extraction (almost done)

    With the read position now past the length bytes which you read, create another eight-byte value.

    This next value is our offset placeholder.

    Copy into this new variable the number of bytes specified in the offset bit field from the previous union.

    This eight-byte value now contains the offset, in clusters, of our first data run. For later runs the value is a signed offset relative to the start of the previous run.

  • Finishing Up

    If we were to now move our file pointer to the data run offset (converted from clusters to bytes), and read in the number of clusters specified by the data run length, we would be able to extract a data run from a file.

    Considering that a file can have multiple data runs, and that their records are all stored contiguously, it's often best to read in all the data run variables before extracting the actual run data.

    However, the final implementation methodology is left to the developer.
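
    As one possible implementation, the sketch below decodes every run record up front, as suggested above. It uses masks and shifts instead of a bit-field union so that the result does not depend on the compiler. In the NTFS run list format the low four bits of the header byte give the size of the length field and the high four bits give the size of the offset field; a header byte of zero ends the list. The DataRun type and decode_data_runs function are illustrative names, and sparse runs (offset size of zero) are not given special treatment.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint64_t length_clusters; /* run length, in clusters */
    int64_t  start_lcn;       /* absolute starting cluster of the run */
} DataRun;

/* Decode a run list into out[]. Returns the number of runs decoded, or -1
   if the list is truncated or does not fit in out[]. */
static int decode_data_runs(const uint8_t *runs, size_t size,
                            DataRun *out, int max_runs)
{
    size_t pos = 0;
    int64_t lcn = 0;
    int count = 0;

    while (pos < size && runs[pos] != 0) {
        unsigned len_size = runs[pos] & 0x0F;  /* bytes in the length field */
        unsigned off_size = runs[pos] >> 4;    /* bytes in the offset field */
        pos++;
        if (len_size > 8 || off_size > 8 ||
            size - pos < len_size + off_size || count == max_runs)
            return -1;

        uint64_t length = 0;
        for (unsigned i = 0; i < len_size; i++)
            length |= (uint64_t)runs[pos + i] << (8 * i);
        pos += len_size;

        uint64_t raw = 0;
        for (unsigned i = 0; i < off_size; i++)
            raw |= (uint64_t)runs[pos + i] << (8 * i);
        /* The offset is signed: sign-extend when its top bit is set. */
        if (off_size > 0 && off_size < 8 && (runs[pos + off_size - 1] & 0x80))
            raw |= ~(uint64_t)0 << (8 * off_size);
        pos += off_size;

        lcn += (int64_t)raw;  /* offsets are relative to the previous run */
        out[count].length_clusters = length;
        out[count].start_lcn = lcn;
        count++;
    }
    return count;
}
```

    To read a run's data, multiply start_lcn and length_clusters by the bytes per cluster value and add the partition's byte offset to the starting position.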

  • Additional Information

    Linux NTFS Driver Project. "NTFS documentation", Richard Russon and Yuval Fledel 2005 http://data.linux-ntfs.org/ntfsdoc.pdf

    Wikipedia entry on NTFS "NTFS", http://en.wikipedia.org/wiki/NTFS

    Wikipedia Entry On The BIOS Parameter Block "BIOS Parameter Block", http://en.wikipedia.org/wiki/BIOS_parameter_block

    Wikipedia Entry on the Master Boot Record "Master Boot Record", http://en.wikipedia.org/wiki/Master_boot_record

  • Theory Application Demo
