

FSA Terminology

Note the following terminology:

Scanned file database

The FSA uses a scanned file database to track the status of each file in a scanning job. For each scanned file, the database contains a hash (see below) to uniquely identify that version of the file plus its ‘last scanned’ date. For scheduled scanning jobs, the FSA checks the hashes in the scanned file database to see whether a file has changed or moved since the last scan. This allows the FSA to skip files which have not changed.

Similarly, if a scanning job is interrupted before it finishes (for example, because of a network or system failure), the FSA checks the hashes in the database when the job next runs and skips any files that have not been modified since they were last scanned.

Note: Files (binary data) found during database scans are not stored in the scanned file database.

NIST database

Also known as the National Software Reference Library (NSRL), the NIST database is a list of known benign and malicious files, maintained by the National Institute of Standards and Technology (NIST). Its purpose is to ease the burden of investigating computer files. A desktop computer can contain over 100,000 files, so investigators need to exclude as many known files as possible from review. From the NIST Web site:

"The NSRL provides a repository of known software, file profiles, and file signatures for use by law enforcement and other organizations in computer forensics investigations."

The FSA can use this database to identify files that do not need scanning. It checks scanned files for specific profiles and signatures; if any match known files in the NSRL database, the FSA can omit these files from the scan. The NSRL also enables the FSA to identify files that are not what they claim to be (for example, a file with the same name, size, and date as a known file, but not the same content).
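
The lookup described above can be illustrated with a minimal Python sketch. It is not the FSA's implementation: the `KnownFile` record, the `classify` function, the in-memory list of known files, and the use of SHA-256 over file content are all assumptions made for illustration. The real NSRL data set and the FSA's matching rules are not described in this topic.

```python
import hashlib
import os
from typing import NamedTuple


class KnownFile(NamedTuple):
    """Hypothetical stand-in for one entry in a known-file list."""
    name: str
    size: int
    content_sha256: str


def content_digest(path: str) -> str:
    """Hash the file's bytes in chunks so large files are not loaded whole."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def classify(path: str, known: list) -> str:
    """Return 'known', 'suspicious', or 'unknown' for the file at path."""
    name = os.path.basename(path)
    size = os.path.getsize(path)
    digest = content_digest(path)
    if any(k.content_sha256 == digest for k in known):
        return "known"       # content matches a known file: can be omitted
    if any(k.name == name and k.size == size for k in known):
        return "suspicious"  # looks like a known file, but the content differs
    return "unknown"         # no match: scan as usual
```

In this sketch, a file is skipped only when its content digest matches an entry; a file that merely shares a name and size with an entry is flagged rather than skipped.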

Note: Files (binary data) found during database scans are not checked against the NIST database.

File hashes

Before scanning a file, the FSA computes a SHA-256 cryptographic hash from the file’s name, path, size, and last modified date. The result is a string that serves as a digital fingerprint of that file. In FSA terms, this fingerprint is the file hash, sometimes also called a hash code, hash sum, or hash value. File hashes are stored in the FSA scanned file database (see above).

Each time a scanning job is repeated, the FSA compares a file’s newly generated hash with the hash from the previous scan. If they differ, the FSA infers that the file has changed since the previous scan and scans it again; if they are the same, it infers that the file is unchanged and skips it.
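
As a rough illustration of the comparison described above, the following Python sketch hashes a file's name, path, size, and last modified time with SHA-256 and compares the result with the value from an earlier run. The function names, the field order, the NUL separator, and the dictionary standing in for the scanned file database are assumptions; the FSA's actual encoding of these fields is not documented here.

```python
import hashlib
import os


def metadata_hash(path: str) -> str:
    """SHA-256 hex digest over the file's name, path, size and mtime."""
    absolute = os.path.abspath(path)
    st = os.stat(absolute)
    fields = [
        os.path.basename(absolute),   # file name
        os.path.dirname(absolute),    # containing path
        str(st.st_size),              # size in bytes
        str(st.st_mtime_ns),          # last modified time, nanoseconds
    ]
    # NUL cannot occur inside these fields, so it is a safe separator.
    payload = "\0".join(fields).encode("utf-8", "surrogateescape")
    return hashlib.sha256(payload).hexdigest()


def needs_scan(path: str, previous_hashes: dict) -> bool:
    """True if the file is new or its hash differs from the previous scan.

    previous_hashes is a plain dict standing in for the scanned file
    database; it is updated with the latest hash on every call.
    """
    key = os.path.abspath(path)
    current = metadata_hash(path)
    changed = previous_hashes.get(key) != current
    previous_hashes[key] = current
    return changed
```

Because the hash in this sketch covers metadata only, a file that is renamed, moved, resized, or re-saved produces a different hash and is scanned again, while an untouched file is skipped without its content being read.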

Note: Hashes are not generated for files (binary data) found during database scans.

DoD deletion

This is forensic deletion, so called because the storage media are purged or ‘sanitized’ to guarantee that data cannot be recovered and used to obtain evidence in legal discovery. ‘DoD’ is a reference to Department of Defense approved methods for purging storage media.

Unlike conventional delete operations where the file header is overwritten, a DoD deletion overwrites the disk sector multiple times in a prescribed pattern to ensure that deleted files cannot be recovered.
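
The idea of a multi-pass overwrite can be sketched in Python as follows. This is an illustration only: the function name and the three passes used here (zeros, ones, then random bytes) are assumptions, since this topic does not state which pattern the FSA applies. Note also that overwriting a file through the file system, as this sketch does, does not guarantee that the underlying disk sectors are overwritten on every storage type (for example, SSDs or journaling file systems).

```python
import os


def overwrite_and_delete(path: str, passes=(b"\x00", b"\xff", None)) -> None:
    """Overwrite a file's contents several times, then remove it.

    Each entry in passes is a single byte to repeat, or None for a pass
    of random bytes. The default pattern is an illustrative assumption.
    """
    size = os.path.getsize(path)
    with open(path, "r+b") as f:
        for pattern in passes:
            f.seek(0)
            remaining = size
            while remaining > 0:
                n = min(remaining, 65536)
                data = os.urandom(n) if pattern is None else pattern * n
                f.write(data)
                remaining -= n
            # Push this pass to the device before starting the next one.
            f.flush()
            os.fsync(f.fileno())
    os.remove(path)
```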

Note: DoD deletions are not available for scanned items in Exchange Public Folders and SharePoint sites, or for scanned database records.

DSN

A Data Source Name (DSN) describes a connection to a specific database through an ODBC driver. The DSN specifies the parameters of the connection, including the host server, port, database name, and other information. You can use a DSN in an application to query the database. The FSA installer uses DSNs to connect to the scanned file database and NIST database.

More information:

How Do I Delete a Scanned File Database?