
User Guide - Table of Contents

1. What is Spot The Difference

Spot The Difference is a file integrity checker. Its goal is to detect signs of intrusion by looking for suspicious changes in system files. To carry out an attack, or simply to make sure they can get back into the system later, crackers often change configuration files, executables and/or log files (usually with rootkits), thus leaving traces of the break-in.

An integrity checker works in two phases:

Information about what files to check, which checks to perform and the connection to the database is set out in the configuration file.

The database created during the update phase can't be modified (you can't add, remove or change its records). You can, of course, create a new database that reflects the new state of the filesystem.
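The update/check cycle can be sketched in a few lines of Python. This is only an illustration of the idea, not the code used by Spot The Difference: the function names snapshot and compare are hypothetical, and the "database" is just an in-memory dictionary holding a few stat fields.

```python
import os

def snapshot(root):
    """Update phase: record a few stat fields for every path below root."""
    state = {}
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)  # lstat: describe symlinks, do not follow them
            state[path] = (st.st_mode, st.st_size, st.st_mtime_ns)
    return state

def compare(known, current):
    """Check phase: list paths added, removed or changed since the snapshot."""
    added = sorted(set(current) - set(known))
    removed = sorted(set(known) - set(current))
    changed = sorted(p for p in set(known) & set(current)
                     if known[p] != current[p])
    return added, removed, changed
```

A known-state snapshot is taken once with snapshot(root); every later call to compare(known, snapshot(root)) then reports what differs from that known state.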

2. Installation

Spot The Difference is written entirely in Python, so you need to have Python installed (version 2.3.x or later). If you don't already have it, you can download it here. Using dbm database files to store file information doesn't require the installation of any additional Python module. If you wish to use another database (MySQL, PostgreSQL and SQLite are supported), you might need to install database-specific modules.

Python database modules used by Spot The Difference are:

2.1 Unix/Linux

To install Spot The Difference on a Unix/Linux system follow these few steps:

This will copy all modules into the third-party modules directory and the scripts into the local executables directory (usually /usr/local/bin on UN*X systems). A sample configuration file (stdiff.conf.sample) will be copied to /etc.

2.2 Windows

To install Spot The Difference on a Windows system, just run the graphical installer (stdiff-0.2.1.win32.exe); you will be asked a couple of questions:

After you click 'Next' a couple of times, the installer will copy the scripts into the python scripts directory (<python_dir>\Scripts\) and the modules into the third-party modules directory (<python_dir>\Lib\site-packages\). A sample configuration file (stdiff.conf.sample) will be copied to <python_dir>\etc\.

3. Databases

The next step after installation is to create the database that will hold file information. Spot The Difference supports several open source databases (MySQL, PostgreSQL, SQLite and dbm files). If you want to use dbm or SQLite files, you don't need to create the database now: it will be automatically created at runtime.

The advantage of dbm files is their simplicity and portability. You can find a lot of software on the internet for viewing and managing their content, and you don't need to install any additional software or python module.

SQLite databases are also stored in files and thus don't require setting up a database server. They are much faster than dbm files, but require the installation of an additional python module.

If you wish to use Spot The Difference with MySQL or PostgreSQL databases, you will need to create the database and the table that will hold file information. To do this, simply run the script
  stdiff_install_db db_type
(where db_type can be either mysql or pgsql). It will guide you through the creation of the database and/or tables: you will be prompted to answer a few questions, after which the database will be created for you.

Using a database server, like MySQL or PostgreSQL, allows you to hold data from multiple monitored machines in a single repository. All machines can query/update a single, centralized database. The security of the database server machine becomes, of course, fundamental. To view the content of the database you can use the database server tools.

Since the configuration file must contain the password to access the database, it is recommended to create/update the database with a privileged user and then run the later checks with an unprivileged user that has been granted only SELECT.

3.1 dbm File

dbm (Data Base Management) files are binary databases of key-value pairs. They are local files and their integrity must be preserved by setting them as read-only (read-only NFS, read-only medium, chflags) after their creation.
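To give an idea of what a key-value store of file information looks like, here is a minimal sketch using Python's standard dbm module. The record layout (one JSON string per path) and the function names are assumptions made for this example; they are not the format Spot The Difference actually writes.

```python
import dbm
import json

def save_records(db_path, records):
    """Write {path: info_dict} into a new dbm file, one key per path."""
    with dbm.open(db_path, "n") as db:  # "n": always create a new, empty database
        for path, info in records.items():
            # dbm stores bytes, so both key and value are encoded
            db[path.encode()] = json.dumps(info).encode()

def load_record(db_path, path):
    """Open the dbm file read-only and return the stored info, or None."""
    with dbm.open(db_path, "r") as db:  # "r": existing database, read-only
        raw = db.get(path.encode())
    return None if raw is None else json.loads(raw)
```

Opening the file with the "r" flag only protects against accidental writes from the same program; it does not replace the read-only medium or file flags recommended above.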

3.2 SQLite File

"SQLite is a small C library that implements a self-contained, embeddable, zero-configuration SQL database engine". SQLite databases are local files and, like dbm files, their integrity must be preserved by setting them as read-only (read-only NFS, read-only medium, chflags) after their creation.
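The following sketch shows how one table per monitored host could be kept in a SQLite file with Python's standard sqlite3 module. It reuses the field names this guide lists for the MySQL and PostgreSQL tables; the column types, the function names and the assumption that the SQLite table has the same layout are illustrative only.

```python
import sqlite3

INT_FIELDS = ("st_mode", "st_ino", "st_dev", "st_nlink", "st_uid",
              "st_gid", "st_size", "st_atime", "st_mtime", "st_ctime")
FIELDS = ("path", "md5", "sha") + INT_FIELDS

def _check_table_name(table):
    # Table names cannot be bound as SQL parameters, so validate them first.
    if not table.isidentifier():
        raise ValueError("unsafe table name: %r" % table)

def create_table(conn, table):
    """Create the table if it does not exist yet."""
    _check_table_name(table)
    columns = ["path TEXT PRIMARY KEY", "md5 TEXT", "sha TEXT"]
    columns += ["%s INTEGER" % name for name in INT_FIELDS]
    conn.execute("CREATE TABLE IF NOT EXISTS %s (%s)"
                 % (table, ", ".join(columns)))

def insert_record(conn, table, record):
    """Insert one file record given as a dict keyed by field name."""
    _check_table_name(table)
    placeholders = ", ".join(":" + name for name in FIELDS)
    conn.execute("INSERT INTO %s (%s) VALUES (%s)"
                 % (table, ", ".join(FIELDS), placeholders), record)
```

Because path is the primary key, inserting the same path twice raises sqlite3.IntegrityError, which matches the idea of one record per file.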

3.3 MySQL Database

MySQL is "the world's most popular open source database". After the installation, the command:
  stdiff_install_db mysql
will start an interactive script that will guide you through the creation of the database.

You can also create the database and the table yourself. Though you can't change field names, data types are customizable. These are the default values:

  Field     Type                              Description
  path      VARCHAR(255) BINARY PRIMARY KEY   Full path of the file or directory (255 characters max)
  md5       CHAR(32)                          md5 file checksum (16 bytes, stored as 32 hex digits)
  sha       CHAR(40)                          sha1 file checksum (20 bytes, stored as 40 hex digits)
  st_mode   SMALLINT UNSIGNED                 File permissions in decimal format
  st_ino    MEDIUMINT UNSIGNED                File inode number (3 bytes: 16777215 max)
  st_dev    SMALLINT UNSIGNED                 File device (2 bytes: 65535 max)
  st_nlink  SMALLINT UNSIGNED                 Number of links (2 bytes: 65535 max)
  st_uid    INT UNSIGNED                      User ID (4 bytes: 4294967295 max)
  st_gid    INT UNSIGNED                      Group ID (4 bytes: 4294967295 max)
  st_size   BIGINT UNSIGNED                   File size (8 bytes: 18446744073 GBytes max)
  st_atime  INT UNSIGNED                      Access time (timestamp: 4 bytes)
  st_mtime  INT UNSIGNED                      Modification time (timestamp: 4 bytes)
  st_ctime  INT UNSIGNED                      Change time (timestamp: 4 bytes)

3.4 PostgreSQL Database

"PostgreSQL is an object-relational database management system (ORDBMS) based on POSTGRES, Version 4.2, developed at the University of California at Berkeley Computer Science Department". After the installation, the command
  stdiff_install_db pgsql
will start an interactive script that will guide you through the creation of the database.

You can also create the database and the table yourself. Though you can't change field names, data types are customizable. These are the default values:

  Field     Type                      Description
  path      VARCHAR(255) PRIMARY KEY  Full path of the file or directory (255 characters max)
  md5       CHAR(32)                  md5 file checksum (16 bytes)
  sha       CHAR(40)                  sha1 file checksum (20 bytes)
  st_mode   INT                       File permissions in decimal format
  st_ino    INT                       File inode number (4 bytes)
  st_dev    INT                       File device (4 bytes)
  st_nlink  INT                       Number of links (4 bytes)
  st_uid    INT                       User ID (4 bytes)
  st_gid    INT                       Group ID (4 bytes)
  st_size   BIGINT                    File size (8 bytes)
  st_atime  INT                       Access time (timestamp: 4 bytes)
  st_mtime  INT                       Modification time (timestamp: 4 bytes)
  st_ctime  INT                       Change time (timestamp: 4 bytes)

4. Configuration File

The next step after creating the database is to edit the configuration file, which defines the run-time behaviour of Spot The Difference. It includes information about connecting to the database, the files to check and which checks to perform on those files.

It is made up of:

variables
variables provide all the information needed for Spot The Difference to connect to the database. Required information varies from one database to another (see below). E-mail notification parameters (server and recipients) are also set through variables;
rules
rules contain strings which represent the paths (files and directories) to check and the checks to perform on those paths. Pay close attention when writing the rules: poorly written rules may generate false positives and/or not detect actual intrusions;
comments
comments start with a hash sign (#) and may be inline or take up a whole line.
A sample configuration file (stdiff.conf.sample) is provided with the software and placed in /etc (<python_dir>\etc on Windows systems).

4.1 Variables

Variables provide all the information needed for Spot The Difference to connect to the database. Firstly, you have to set the value of the db_type variable to the database type to use (legal values are: dbm, sqlite, mysql and pgsql). For example:
  db_type = mysql

The other variables that can be set are login variables (user and passwd), server variables (host, port or unix_socket) and database variables (db and table). Not all database types require the setting of all these variables (e.g. dbm and SQLite database files don't require login or host and port specification). See below for database-specific variables.
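As a rough illustration of how "name = value" lines with inline comments can be read, here is a small Python sketch. It is not the parser shipped with Spot The Difference; the function name parse_variables is hypothetical, and the sketch would truncate a value (for instance a password) that itself contains a hash sign.

```python
import re

VALID_DB_TYPES = {"dbm", "sqlite", "mysql", "pgsql"}
# A variable line is an identifier, an equal sign and a value.
VARIABLE_RE = re.compile(r"^([A-Za-z_]\w*)\s*=\s*(.*)$")

def parse_variables(text):
    """Return the variables found in a configuration text as a dict."""
    variables = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop whole-line and inline comments
        match = VARIABLE_RE.match(line)
        if match:  # rules and blank lines do not match and are skipped
            variables[match.group(1)] = match.group(2).strip()
    if variables.get("db_type") not in VALID_DB_TYPES:
        raise ValueError("db_type must be one of: %s"
                         % ", ".join(sorted(VALID_DB_TYPES)))
    return variables
```

Lines that do not look like a variable assignment, such as rules beginning with a path or with one of the prefixes !, $ or =, are simply ignored by this sketch.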

If you wish to receive the final report by e-mail (-e option), you have to set a couple of additional variables:

mail_server
containing the SMTP server name or address. If it doesn't use the default port (25), you can specify the port number with the usual syntax server:port. For example:
  mail_server = mailserver.my.domain:2500
mail_recipients
containing a list of whitespace separated e-mail addresses. For example:
  mail_recipients = foo@my.domain bar@my.domain
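The two e-mail variables can be split into usable values as in the sketch below. The function names are hypothetical, and the simple server:port split assumes a host name or IPv4 address (an IPv6 literal would need different handling).

```python
def parse_mail_server(value, default_port=25):
    """Split 'server' or 'server:port' into a (host, port) tuple."""
    host, sep, port = value.strip().partition(":")
    return host, int(port) if sep else default_port

def parse_mail_recipients(value):
    """Split a whitespace separated list of addresses."""
    return value.split()
```

With the values used above, parse_mail_server returns the host together with port 2500, and falls back to port 25 when no port is given.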

4.1.1 'dbm' Variables

For a dbm file, you only need to specify its absolute path; it must be assigned to the db variable. E.g.:
  db = /root/stdiff/stdiff.dbm

4.1.2 SQLite Variables

If you use a SQLite database file, you need to specify its absolute path (in the db variable) and the name of the table (in the table variable) in which to insert file information. E.g.:
  db = /root/stdiff/stdiff.sql
  table = my_hostname

4.1.3 MySQL Variables

To connect to a MySQL server, you have to set:

Configuration file entries for a MySQL server connection would look like this:
  user = my_user
  passwd = my_password
  host = localhost
  unix_socket = /var/run/mysql/mysql.sock
  db = Spot
  table = my_hostname

To connect to the database through a TCP port instead of a socket, the fourth entry would instead be:
  port = 3306

4.1.4 PostgreSQL Variables

To connect to a PostgreSQL server, you have to set:

Configuration file entries for a PostgreSQL server connection (through a UNIX socket) would look like this:
  user = my_user
  passwd = my_password
  host = /tmp
  db = Spot
  table = my_hostname

To connect to a remote database, you should assign its name or address to the host variable and set the port variable:
  host = 1.2.3.4
  port = 5432

4.2 Rules

Rules specify the paths (files and directories) to check and the checks to perform. Each rule takes one line and consists of one or two whitespace separated fields:

There are four types of rules, identified by their prefix:

No prefix
Rules with no prefix are 'root rules'. They specify which files and directories must be checked. They are made up of a pathname and a checks string. If the pathname is a directory, checks extend to all files and directories below it, recursively. There can be any number of root rules. The following rule:
   /etc      5iplzc
means that all files and directories in /etc must be checked, recursively. For the meaning of the checks string, see below.
!
Pathnames preceded by an exclamation mark are ignored. These rules are only made up of a path. If it is a directory, everything below it is ignored. The following rules:
  /etc         5iplzc
  !/etc/motd
  !/etc/X11
check all the files and directories in /etc except the file /etc/motd and the whole directory tree below /etc/X11.
$
A directory or a file inside the directory tree of a root rule may need special checks. Simply write the pathname of that file or directory, preceded by a dollar sign, and the checks to perform on that path. The following rules:
  /etc             5iplzc
  $/etc/inetd.conf 5siplzc
  $/etc/ssh        5siplzc
perform extra checks on files inside the directory /etc/ssh and on the file /etc/inetd.conf. Such rules are non-cascading, i.e. they don't get inherited by subdirectories. In the previous example, directories in /etc/ssh (if any) wouldn't inherit the checks from the /etc/ssh rule, but from the /etc rule.
=
Rules made up of an equal sign followed by a directory pathname mean that all directories below that pathname must be ignored. The following rules:
  /etc       5iplzc
  =/etc/X11
check the whole /etc directory tree except the directories below /etc/X11. Files located directly in /etc/X11 are still checked.
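The way the four rule types interact can be sketched as follows. This is one reading of the descriptions above, not the actual stdiff implementation: the function names are hypothetical, a '$' rule is assumed to apply to the named path itself and to the files directly inside it, and when several root rules match, the longest one is assumed to win.

```python
import posixpath

def parse_rules(lines):
    """Sort rule lines into the four kinds, keyed by their prefix."""
    rules = {"": {}, "$": {}, "!": set(), "=": set()}
    for line in lines:
        fields = line.split("#", 1)[0].split()
        if not fields:
            continue
        prefix = fields[0][0] if fields[0][0] in "!$=" else ""
        path = fields[0][len(prefix):].rstrip("/") or "/"
        if prefix in ("!", "="):
            rules[prefix].add(path)           # path only, no checks string
        else:
            rules[prefix][path] = fields[1]   # path plus checks string
    return rules

def is_below(path, base):
    """True if path lies strictly inside the directory tree of base."""
    return path.startswith(base.rstrip("/") + "/")

def checks_for(path, is_dir, rules):
    """Return the checks string for path, or None if it is not checked."""
    parent = posixpath.dirname(path)
    # '!' rules: the path itself and everything below it are ignored.
    if any(path == p or is_below(path, p) for p in rules["!"]):
        return None
    # '=' rules: directories below the pathname are ignored, its files are not.
    for base in rules["="]:
        if is_below(path, base) and (is_dir or parent != base):
            return None
    # '$' rules are non-cascading: the path itself and its direct files only.
    if path in rules["$"]:
        return rules["$"][path]
    if not is_dir and parent in rules["$"]:
        return rules["$"][parent]
    # Root rules cascade to the whole tree below them.
    matches = [r for r in rules[""] if path == r or is_below(path, r)]
    return rules[""][max(matches, key=len)] if matches else None
```

For example, with the rules /etc 5iplzc and $/etc/ssh 5siplzc, a file directly inside /etc/ssh gets 5siplzc, while a subdirectory of /etc/ssh falls back to 5iplzc from the /etc rule.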

4.2.1 Checks

As stated previously, some rules must contain the list of checks to perform on a specific pathname. Below is a list of available file checks; each one is identified by a single character:

  5  md5 checksum
  s  sha1 checksum
  p  permissions
  i  inode number
  d  device
  l  number of links
  u  user ID
  g  group ID
  z  size
  a  Most recent access time
  m  Time of the most recent modification of the content of the file
  c  Time of the last modification of inode 'metadata' (on UNIX) or creation
     date (on Windows)

You must specify all the checks you want to be performed (there is no special 'all' string) with no whitespace between. The following rule:
  /etc  5sugmc
will:

Checksums are calculated only on files, not on directories. Calculating a checksum requires opening the file for reading, which modifies its access time. Combining the a check with 5 or s in the same rule will therefore produce many false positives.
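A checks string can be decoded character by character, as in the following sketch. The mapping mirrors the list of checks given above; the function names are hypothetical and the code is only meant to illustrate how a string such as 5sugmc is read.

```python
CHECKS = {
    "5": "md5 checksum",
    "s": "sha1 checksum",
    "p": "permissions",
    "i": "inode number",
    "d": "device",
    "l": "number of links",
    "u": "user ID",
    "g": "group ID",
    "z": "size",
    "a": "access time",
    "m": "modification time",
    "c": "change time (UNIX) or creation date (Windows)",
}

def describe_checks(checks):
    """Translate a checks string into a list of check descriptions."""
    unknown = [char for char in checks if char not in CHECKS]
    if unknown:
        raise ValueError("unknown check(s): %s" % "".join(unknown))
    return [CHECKS[char] for char in checks]

def conflicts_with_atime(checks):
    """True if a checksum is combined with the access time check."""
    return "a" in checks and ("5" in checks or "s" in checks)
```

For the rule above, describe_checks("5sugmc") lists the md5 checksum, sha1 checksum, user ID, group ID, modification time and change time checks, and conflicts_with_atime can be used to flag rules that are likely to produce false positives.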

For critical files, it is recommended to calculate both md5 and sha1 checksums, since it's theoretically possible to modify a file and pad it to leave its checksum unchanged. Don't forget, however, that some rootkits serve up the original file (hidden somewhere) when you open it for reading and the compromised file when you execute it. So pay close attention to new, unexpected files.
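Both checksums can be computed in a single pass over the file, for example with Python's standard hashlib module. The function below is a minimal sketch (its name and chunk size are arbitrary choices for this example), not the routine used by Spot The Difference.

```python
import hashlib

def file_checksums(path, chunk_size=65536):
    """Return (md5, sha1) hex digests of a file, reading it only once."""
    md5 = hashlib.md5()
    sha1 = hashlib.sha1()
    with open(path, "rb") as fh:
        # Read fixed-size chunks so large files are not loaded into memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            md5.update(chunk)
            sha1.update(chunk)
    return md5.hexdigest(), sha1.hexdigest()
```

The hexadecimal digests are 32 and 40 characters long respectively, which is why the md5 and sha fields of the database tables are defined as CHAR(32) and CHAR(40).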

5. Usage

Well, so far we have created the database and edited the configuration file. What we need to do now is to update the database and then schedule a periodic check of the filesystem. The syntax of Spot The Difference is:

    stdiff.py [-h] [-v|-q] [-C config_file] [-c|-u] [-o output_file] [-e]

Almost all parameters are optional. You must, however, specify whether a filesystem check (-c) or a database update (-u) is required. The options are as follows:

-C, --configfile
Specify the configuration file path. Default is /etc/stdiff.conf
-u, --update
Create a new 'known-state' database or overwrite an existing database
-c, --check
Check filesystem integrity
-o, --outfile
Specify the pathname of the final report. Default is stdiff.out in the current directory
-e, --email
Turn on email notification
-v, --verbose
Verbose mode
-q, --quiet
Quiet (almost dumb) mode
-h, --help
Print a short help message and exit
--version
Print the version number and exit
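The constraints expressed by the syntax line (-v and -q exclude each other, exactly one of -c and -u is required) can be modelled with Python's standard argparse module. This is only a sketch of the command-line grammar described above; it is not how stdiff.py itself parses its arguments, and the version string is a placeholder.

```python
import argparse

def build_parser():
    """Build a parser that mirrors the stdiff.py syntax line."""
    parser = argparse.ArgumentParser(prog="stdiff.py")  # -h/--help is added automatically
    verbosity = parser.add_mutually_exclusive_group()
    verbosity.add_argument("-v", "--verbose", action="store_true")
    verbosity.add_argument("-q", "--quiet", action="store_true")
    mode = parser.add_mutually_exclusive_group(required=True)
    mode.add_argument("-c", "--check", action="store_true")
    mode.add_argument("-u", "--update", action="store_true")
    parser.add_argument("-C", "--configfile", default="/etc/stdiff.conf")
    parser.add_argument("-o", "--outfile", default="stdiff.out")
    parser.add_argument("-e", "--email", action="store_true")
    # Placeholder version text, used for illustration only.
    parser.add_argument("--version", action="version",
                        version="%(prog)s (version placeholder)")
    return parser
```

Calling build_parser().parse_args(["-u", "-v"]) yields the default configuration and report paths, while passing both -c and -u, or neither, makes argparse exit with a usage error.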

Below are some examples. To update the database, preserving all the default settings, simply run:
  # stdiff.py -u
This will parse the default configuration file (/etc/stdiff.conf) and create a new 'known-state' database (or drop and recreate a pre-existing one). The final report will be saved to stdiff.out in the current directory.

If you want to override the default settings, the command:
  # stdiff.py -u -o /root/stdiff/stdiff.out -C /root/stdiff/stdiff.conf -v
will update the database taking database parameters and rules from the configuration file /root/stdiff/stdiff.conf (-C option). The name of all the files inserted in the database will be displayed (verbose mode, -v) and the final report will be saved to /root/stdiff/stdiff.out, as specified by the -o option.

Once you have populated the database, you should schedule periodic checks of the filesystem. The command:
  # stdiff.py -C /root/stdiff/stdiff.conf -c -o /root/stdiff/stdiff.out -e
will compare the current filesystem to the one recorded in the database. It will save the final report to /root/stdiff/stdiff.out and e-mail it (-e option) to the addresses specified in the configuration file /root/stdiff/stdiff.conf.

6. Final Report

After the creation/update of the database, a detailed report is generated. It contains statistics on the update process:

After a filesystem check, the generated report provides all the above data plus a detailed list of:

This is a sample report generated after a database update and this one is generated after a filesystem check.

7. Bugs

Thanks to Jens Engel for pointing out an unhandled exception when a broken symlink was found. Release 0.2.1 has fixed this issue and now, when updating the database, stdiff will report broken links in the final report:
  Could not open these files:

    [...]

    /usr/bin/brokenlink
      No such file or directory: '/usr/bin/brokenlink'
Of course this prevents the broken symlink from being inserted into the database. On the next filesystem check, the broken symlink will then be considered a new file, unless you delete it or tell stdiff to ignore it by adding a rule to the configuration file:
  !/usr/bin/brokenlink

Spot The Difference has been tested on *BSD, Linux and Windows systems. Please send bug reports and comments by email.

8. Author and Copyright

Copyright (c) 2004, Daniele Mazzocchio
All rights reserved.

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS 'AS IS' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.