Installation

This document describes how to install the Harvest Tool contained in the harvest package. It covers the system requirements, unpacking the package, configuring the environment, and verifying the installation.

System Requirements

This section details the system requirements for installing and operating the Harvest Tool.

Java Runtime Environment

The Harvest Tool was developed in Java and will run on any platform with a supported Java Runtime Environment (JRE). The software was compiled for and tested with Java version 1.8. The following commands test the local Java installation in a UNIX-based environment:

% which java
/usr/bin/java

% java -version
java version "1.8.0_101"
Java(TM) SE Runtime Environment (build 1.8.0_101-b13)
Java HotSpot(TM) 64-Bit Server VM (build 25.101-b13, mixed mode)
        

The first command checks whether the java executable is in the environment's path, and the second reports the version. If Java is not installed, or the version is earlier than 1.8, Java must be downloaded and installed in the current environment. Consult the local system administrator about installing this software. Alternatively, download Java from the Oracle Java Download page. The suggested package is the Java Standard Edition (SE) 8, as either the JDK or the JRE. The JDK is not required to run the software, but it is useful if Java software will also be developed and compiled in the current environment.
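The version check described above can be scripted. The following is a minimal sketch, not part of the harvest package: the function name java_major_version is hypothetical, and the script assumes that the first line of the java -version output carries the version in double quotes, as in the sample output shown earlier.

```shell
#!/bin/sh
# Hypothetical helper: reduce a Java version string to its major version.
# "1.8.0_101" -> 8 (legacy 1.x scheme), "11.0.2" -> 11 (newer scheme).
java_major_version() {
    case "$1" in
        1.*) echo "$1" | cut -d. -f2 ;;
        *)   echo "$1" | cut -d. -f1 ;;
    esac
}

if command -v java >/dev/null 2>&1; then
    # java -version writes to stderr, so redirect it before parsing.
    # Assumption: the first line holds the quoted version string.
    version=$(java -version 2>&1 | head -n 1 | cut -d'"' -f2)
    major=$(java_major_version "$version")
    if [ "$major" -ge 8 ] 2>/dev/null; then
        echo "Java $version found: OK"
    else
        echo "Java $version found: version 1.8 or later is required"
    fi
else
    echo "java not found in PATH"
fi
```

If the first line of output is not a version line (for example, when the JVM prints a notice first), the comparison fails and the script reports the version as unsuitable, so the output should be read with that in mind.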

Unpacking the Package

Download the harvest package from the PDS FTP site. The binary distribution is available as either a zip or a tar/gzip package, and the two have identical contents. The installation directory varies from environment to environment. In UNIX-based environments, software packages are typically installed in the /usr/local directory; in Windows-based environments, they are typically installed in the C:\Program Files directory. Unpack the selected binary distribution file with one of the following commands:

% unzip harvest-2.0.0-bin.zip
or
% tar -xzvf harvest-2.0.0-bin.tar.gz
      

Note: Depending on the platform, the native version of tar may produce an error when unpacking the distribution file because many of the file paths are longer than 100 characters. The GNU version of tar, if available, resolves this problem. If GNU tar is not available and cannot be installed, the zip package can be used instead in a UNIX environment.
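The choice between the two packages can be automated. The sketch below is illustrative only and assumes that GNU tar identifies itself with the text "GNU tar" in the first line of its --version output; the function name is_gnu_tar is hypothetical, and the archives are expected in the current directory.

```shell
#!/bin/sh
# Succeed when the given `tar --version` text identifies GNU tar.
is_gnu_tar() {
    case "$1" in
        *"GNU tar"*) return 0 ;;
        *)           return 1 ;;
    esac
}

# Native tar implementations may reject --version; discard any error.
tar_version=$(tar --version 2>/dev/null | head -n 1)

if is_gnu_tar "$tar_version" && [ -f harvest-2.0.0-bin.tar.gz ]; then
    tar -xzvf harvest-2.0.0-bin.tar.gz
elif [ -f harvest-2.0.0-bin.zip ]; then
    # Fall back to the zip package, which avoids the long-path problem.
    unzip harvest-2.0.0-bin.zip
else
    echo "no usable archive found in the current directory"
fi
```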

Unpacking creates the harvest-2.0.0 directory with the following structure:

  • README.txt

    A README file directing the user to the available documentation for the project.

  • LICENSE.txt

    The copyright notice from the California Institute of Technology detailing the restrictions regarding the use and distribution of this software. Although the license is strictly worded, the software has been classified as Technology and Software Publicly Available (TSPA) and is available for anyone to download and use.

  • bin/

    This directory contains batch and shell scripts for executing the tool.

  • doc/

    This directory contains a local web site with the Harvest Tool documentation, javadoc, unit test results, and other configuration management information. Open the index.html file in this directory with a web browser to view it.

  • harvest-conf/

    This directory contains examples and specific instances of harvest-related policy files used for specifying how the Harvest Tool discovers products and extracts metadata for registration.

  • lib/

    This directory contains the dependent jar files for the tool along with the executable jar file (harvest-2.0.0.jar) containing the Harvest Tool software.

  • search-conf/

    This directory contains examples and specific instances of search-related policy files used for specifying how the Harvest Tool extracts metadata and generates search index files.

  • resources/

    This directory contains a JSON-formatted file listing the Resource Products registered at the PDS Engineering Node at the time of the Software Build release. This file is used to populate the resource_ref.* fields set in the Search Core configuration files.
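The layout listed above can be checked after unpacking. This is a minimal sketch: the function name check_layout and the HARVEST_HOME variable are hypothetical and are not defined by the harvest package, and the default path assumes installation under /usr/local.

```shell
#!/bin/sh
# Print each expected top-level entry that is missing from the directory.
check_layout() {
    dir=$1
    for entry in README.txt LICENSE.txt bin doc harvest-conf lib search-conf resources; do
        [ -e "$dir/$entry" ] || echo "missing: $entry"
    done
}

# HARVEST_HOME is a hypothetical variable; the default is an assumed location.
check_layout "${HARVEST_HOME:-/usr/local/harvest-2.0.0}"
```

No output means that every entry from the list was found.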

Configuring the Environment

Before the Harvest Tool can be executed, the local environment must be configured appropriately. This section describes how to set up the user environment on UNIX-based and Windows machines.

UNIX-Based Environment

This section details the environment setup for UNIX-based machines. The binary distribution includes a couple of shell scripts that must be executed from the command line. Adding the location of the scripts to the PATH environment variable enables them to be executed from any location on the local machine.

The following command sets the PATH environment variable in a Bourne-compatible shell by appending to its current setting:

% export PATH=${PATH}:/usr/local/harvest-2.0.0/bin
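When the command above is placed in a shell startup file, it appends the same directory each time the file is read. The following sketch avoids duplicate entries; the function name path_append is hypothetical, and the directory is the example install location used above.

```shell
#!/bin/sh
# Print the PATH-style list in $1 with directory $2 appended,
# unless $2 is already one of its entries.
path_append() {
    case ":$1:" in
        *":$2:"*) echo "$1" ;;
        *)        echo "$1:$2" ;;
    esac
}

# Assumed install location; adjust to the local environment.
PATH=$(path_append "$PATH" /usr/local/harvest-2.0.0/bin)
export PATH
```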
        

In addition, the shell scripts require that the JAVA_HOME environment variable be set to the appropriate location of the Java installation on the local machine. The following command demonstrates how to set the JAVA_HOME environment variable:

% export JAVA_HOME=/path/to/java/home
        

The system administrator for the local machine may need to be consulted for this location. The specified path should have a bin sub-directory that contains the java executable. This variable may also be defined within the scripts. To do so, edit the scripts (the files without the .bat extension) and change the JAVA_HOME line, shown in the example above, to match the local Java installation.
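The requirement that JAVA_HOME contain a bin sub-directory with the java executable can be tested before running the tool. This is a minimal sketch; the function name valid_java_home is hypothetical.

```shell
#!/bin/sh
# Succeed when the given directory contains an executable bin/java.
valid_java_home() {
    [ -n "$1" ] && [ -x "$1/bin/java" ]
}

if valid_java_home "${JAVA_HOME:-}"; then
    echo "JAVA_HOME looks usable: $JAVA_HOME"
else
    echo "JAVA_HOME is unset or has no executable bin/java: ${JAVA_HOME:-unset}"
fi
```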

Windows Environment

This section details the environment setup for Windows machines. The binary distribution includes a couple of batch scripts that must be executed from the command line. Adding the location of the scripts to the PATH environment variable enables them to be executed from any location on the local machine.

The following command sets the PATH environment variable by appending to its current setting. Do not put spaces around the equals sign, because the set command treats them as part of the variable name and value:

C:\> set PATH=%PATH%;C:\Program Files\harvest-2.0.0\bin

In addition, the batch scripts require that the JAVA_HOME environment variable be set to the location of the Java installation on the local machine. This may already have been set when Java was installed. If it has not, run the following command to set the JAVA_HOME environment variable, again without spaces around the equals sign:

C:\> set JAVA_HOME=C:\path\to\java\home

The system administrator for the local machine may need to be consulted for this location. The specified path should have a bin sub-directory that contains the java executable. This variable may also be defined within the scripts. To do so, edit the scripts (the files with the .bat extension) and change the JAVA_HOME line, shown in the example above, to match the local Java installation. Additional methods for setting Windows environment variables can be found in the Windows System Properties document.

Installation Location

Both the shell and batch scripts determine the installation home directory using system commands that may not be available on all platforms. If these commands are not available in the current environment, replace their use in the scripts by setting the PARENT_DIR variable to the actual installation path. Modify the UNIX-based shell scripts as follows (the actual installation path may be different in the current environment):

SCRIPT_DIR=`dirname $0`
PARENT_DIR=`cd ${SCRIPT_DIR}/.. && pwd`

  should be replaced with:

PARENT_DIR=/usr/local/harvest-2.0.0
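As an alternative to replacing the lines outright, the derived path can be kept where the commands work and the fixed path used only as a fallback. The following is a sketch of that idea, not the logic shipped in the scripts; the names resolve_parent_dir and FALLBACK_DIR are hypothetical.

```shell
#!/bin/sh
# Print the parent of the directory that holds the given script path,
# using the same dirname, cd, and pwd steps as the original lines.
resolve_parent_dir() {
    script_dir=$(dirname "$1")
    (cd "$script_dir/.." && pwd)
}

# FALLBACK_DIR is a hypothetical name for the hard-coded installation path.
FALLBACK_DIR=/usr/local/harvest-2.0.0

if command -v dirname >/dev/null 2>&1; then
    # Use the fixed path if the derived one cannot be resolved.
    PARENT_DIR=$(resolve_parent_dir "$0") || PARENT_DIR=$FALLBACK_DIR
else
    PARENT_DIR=$FALLBACK_DIR
fi
echo "PARENT_DIR=$PARENT_DIR"
```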
        

Modify the Windows-based batch scripts as follows (the actual installation path may be different in the current environment):

set SCRIPT_DIR=%~dps0
set PARENT_DIR=%SCRIPT_DIR%..

  should be replaced with:

set PARENT_DIR=C:\Program Files\harvest-2.0.0
        

Verifying the Installation

Verify that the tool was installed correctly by running the shell script or Windows batch file with no arguments. The output should resemble the following:

% harvest

Type 'harvest -h' for usage
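On UNIX-based machines, this check can be scripted by looking for the usage hint in the output. The sketch below assumes that the hint text matches the sample above exactly; the function name has_usage_hint is hypothetical.

```shell
#!/bin/sh
# Succeed when the given text contains the usage hint shown above.
has_usage_hint() {
    case "$1" in
        *"Type 'harvest -h' for usage"*) return 0 ;;
        *)                               return 1 ;;
    esac
}

if command -v harvest >/dev/null 2>&1; then
    # Capture both output streams, since the hint may go to either one.
    output=$(harvest 2>&1)
    if has_usage_hint "$output"; then
        echo "harvest appears to be installed correctly"
    else
        echo "unexpected output from harvest:"
        echo "$output"
    fi
else
    echo "harvest not found in PATH"
fi
```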