Operation

The Sawmill software offers a web-based interface for configuring and generating reports. The login screen is as follows:

Sawmill Login Screen

Once the user is successfully authenticated, they are presented with the main screen as follows:

Sawmill Main Screen

More details to come regarding how to configure and generate reports.

Generate Logs

This section details the log format we will use for the Report Service, and how to set up your Apache HTTP or Apache Tomcat web servers to produce the logs.

Log Format

The log format we will be using is the Apache/NCSA Combined Format. The format is as follows:

remotehost rfc1413 authuser [date] "request" status bytes "referer" "user-agent"

The fields in the format above are described in the table below:

Field        Description
remotehost   The IP address of the client (remote host) which made the request to the server.
rfc1413      The RFC 1413 identity of the client. A "hyphen" in the output indicates that the requested piece of information is not available.
authuser     The userid of the person requesting the document as determined by HTTP authentication.
[date]       The time that the request was received [day/month/year:hour:minute:second zone].
"request"    The request line from the client is given in double quotes, including, in order, the method used by the client, the requested resource, and the protocol used.
status       The status code that the server sends back to the client.
bytes        The size of the object returned to the client, not including the response headers.
referer      The "Referer" [sic] HTTP request header. This gives the site that the client reports having been referred from.
user-agent   The User-Agent HTTP request header. This is the identifying information that the client browser reports about itself.
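
As an illustration of the field layout described above, the following is a minimal Python sketch that splits one Combined Format line into the named fields. The regular expression, the function name parse_combined, and the sample line are assumptions made for this example; they are not part of Sawmill or of the servers. The sketch does not handle escaped double quotes inside quoted fields.

```python
import re
from datetime import datetime

# Group names follow the field table; the pattern itself is illustrative only.
COMBINED_RE = re.compile(
    r'(?P<remotehost>\S+) (?P<rfc1413>\S+) (?P<authuser>\S+) '
    r'\[(?P<date>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<bytes>\d+|-) '
    r'"(?P<referer>[^"]*)" "(?P<user_agent>[^"]*)"'
)


def parse_combined(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = COMBINED_RE.match(line)
    if match is None:
        return None
    entry = match.groupdict()
    entry["status"] = int(entry["status"])
    # A hyphen in the bytes field means no body was returned.
    entry["bytes"] = 0 if entry["bytes"] == "-" else int(entry["bytes"])
    # Month abbreviations are parsed with the current locale (English assumed here).
    entry["date"] = datetime.strptime(entry["date"], "%d/%b/%Y:%H:%M:%S %z")
    return entry


# Hypothetical sample line in the Combined Format.
SAMPLE = (
    '127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] '
    '"GET /apache_pb.gif HTTP/1.0" 200 2326 '
    '"http://www.example.com/start.html" "Mozilla/4.08 [en] (Win98; I ;Nav)"'
)

if __name__ == "__main__":
    print(parse_combined(SAMPLE))
```

A line that does not follow the format yields None, which makes it easy to count or skip malformed entries before they reach a report.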

We are using this log format because it is widely used and because its referer and user-agent fields provide the information Sawmill needs to ignore extraneous log entries (e.g., bots, web crawlers, worms).
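
To make the idea of ignoring extraneous entries concrete, here is a rough Python sketch that drops lines whose user-agent looks like a crawler. It does not reproduce how Sawmill filters entries; the marker list BOT_MARKERS, the function names, and the sample lines are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical substrings; real crawler lists are longer and change over time.
BOT_MARKERS = ("bot", "crawler", "spider")

# In the Combined Format the user-agent is the last double-quoted field on the line.
LAST_QUOTED_RE = re.compile(r'"([^"]*)"\s*$')


def user_agent(line):
    """Return the last quoted field of the line, or an empty string."""
    match = LAST_QUOTED_RE.search(line)
    return match.group(1) if match else ""


def is_probable_bot(line):
    """Case-insensitive substring check of the user-agent against BOT_MARKERS."""
    agent = user_agent(line).lower()
    return any(marker in agent for marker in BOT_MARKERS)


def drop_bots(lines):
    """Keep only the lines that do not look like crawler traffic."""
    return [line for line in lines if not is_probable_bot(line)]


# Hypothetical sample lines: one crawler request and one browser request.
SAMPLE_LINES = [
    '192.0.2.10 - - [10/Oct/2000:13:55:36 -0700] "GET /robots.txt HTTP/1.1" 200 68 '
    '"-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '192.0.2.11 - - [10/Oct/2000:13:56:02 -0700] "GET /index.html HTTP/1.1" 200 5120 '
    '"http://www.example.com/" "Mozilla/4.08 [en] (Win98; I ;Nav)"',
]

if __name__ == "__main__":
    for kept in drop_bots(SAMPLE_LINES):
        print(kept)
```

This kind of check is only possible because the Combined Format records the user-agent; the shorter Common Log Format omits it.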

Set Up Apache Tomcat Server

The Apache Tomcat server produces access logs that record all requests processed by the web container. The following are the steps needed to configure the server to create the Combined Format logs:

  1. Log into the machine hosting the web server you would like to gather metrics from.
  2. Open $CATALINA_HOME/conf/server.xml for editing.
  3. Find the Host XML node. It should resemble the following:
    <Host name="localhost"  appBase="webapps"
          unpackWARs="true" autoDeploy="true"
          xmlValidation="false" xmlNamespaceAware="false">
          .
          .
          .
    </Host>
                
  4. Within this block, add the following XML to enable the access logs:
    <Valve className="org.apache.catalina.valves.AccessLogValve" directory="logs"  
           prefix="localhost_access_log." suffix=".txt" pattern="combined"
           resolveHosts="false"/>
                
  5. A few things to note about the above XML:
    • Be sure to check if this Valve is enabled elsewhere in the Host block. If so, it will now produce a second set of access logs.
    • The value for directory is the absolute or relative pathname of a directory in which log files created by this valve will be placed.
    • If more than one access log Valve is enabled, verify that their prefix values are different so the Valves do not overwrite each other's files.
    • See the Tomcat configuration documentation for other Valve attributes, including path, file name suffix, etc.
  6. Restart the Tomcat server.
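
The checks in step 5 can be partly automated. The sketch below, written under the assumption that server.xml is well-formed XML, lists every AccessLogValve and reports any that write to the same directory, prefix, and suffix. The helper names and the sample XML are hypothetical; for a real file, the root element could be obtained with ET.parse(path).getroot().

```python
import xml.etree.ElementTree as ET
from collections import Counter

ACCESS_LOG_VALVE = "org.apache.catalina.valves.AccessLogValve"


def access_log_valves(root):
    """Return the attribute dicts of every AccessLogValve under the root element."""
    return [
        dict(valve.attrib)
        for valve in root.iter("Valve")
        if valve.get("className") == ACCESS_LOG_VALVE
    ]


def clashing_targets(valves):
    """Return (directory, prefix, suffix) tuples used by more than one valve."""
    counts = Counter(
        (v.get("directory"), v.get("prefix"), v.get("suffix")) for v in valves
    )
    return [target for target, count in counts.items() if count > 1]


# Hypothetical server.xml fragment with the same valve enabled twice.
SAMPLE_XML = """
<Server>
  <Service name="Catalina">
    <Engine name="Catalina" defaultHost="localhost">
      <Host name="localhost" appBase="webapps">
        <Valve className="org.apache.catalina.valves.AccessLogValve" directory="logs"
               prefix="localhost_access_log." suffix=".txt" pattern="common"/>
        <Valve className="org.apache.catalina.valves.AccessLogValve" directory="logs"
               prefix="localhost_access_log." suffix=".txt" pattern="combined"/>
      </Host>
    </Engine>
  </Service>
</Server>
"""

if __name__ == "__main__":
    valves = access_log_valves(ET.fromstring(SAMPLE_XML))
    print(len(valves), "access log valve(s) found")
    for target in clashing_targets(valves):
        print("Clashing log target:", target)
```

Valves that omit an attribute are grouped under None for that attribute here, so two valves relying on the server defaults would also be reported as clashing.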

Set Up Apache HTTP Server

The Apache HTTP server produces access logs that record all requests processed by the server. The following are the steps needed to configure the server to create the Combined Format logs:

  1. Log into the machine hosting the server you would like to gather metrics from.
  2. Open $APACHE_HOME/conf/httpd.conf for editing.
  3. Find the section of the file where LogFormats are declared (if one exists) and add the following:
    LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-agent}i\"" combined
    CustomLog "||$APACHE_HOME/bin/rotatelogs \
    $LOG_DIR/access_log.%Y-%m-%d.txt 86400" combined
                
  4. A few things to note about the above configuration:
    • Be sure to check if these values are already set elsewhere in httpd.conf.
    • $APACHE_HOME is the home directory for the Apache server.
    • $LOG_DIR is the directory where you would like to place the log files.
    • The CustomLog directive above pipes the log through the Apache rotatelogs program, which rotates the logs daily (every 86400 seconds) instead of creating one large access_log file.
    • The double pipe (||) starts the rotatelogs program directly, without opening another shell.
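
To see which file a given request would land in, the following Python sketch estimates the name rotatelogs generates for the configuration above. It assumes the default rotatelogs behaviour of aligning each rotation period to a multiple of the interval in UTC (no local-time option), and the function name rotated_log_name is hypothetical.

```python
import time

ROTATION_SECONDS = 86400                  # rotation interval from the CustomLog line
NAME_PATTERN = "access_log.%Y-%m-%d.txt"  # strftime pattern from the CustomLog line


def rotated_log_name(epoch_seconds, interval=ROTATION_SECONDS, pattern=NAME_PATTERN):
    """Format the pattern with the start of the rotation period containing the timestamp."""
    # Assumption: periods start at multiples of the interval, measured in UTC.
    period_start = epoch_seconds - (epoch_seconds % interval)
    return time.strftime(pattern, time.gmtime(period_start))


if __name__ == "__main__":
    # Name of the file that would receive entries written right now.
    print(rotated_log_name(int(time.time())))
```

With an interval of 86400 seconds, every entry from the same UTC day maps to the same file name, which is what produces one log file per day.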