Operation

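This document describes how to operate the Search Core software in a generic environment. The following topics can be found in this document:

  • Tool Execution
  • Index Generation Examples
  • Search Core Configurations
  • Post To Operations
  • Did It Work?
  • Common Errors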

Note: The command-line examples in this section have been broken into multiple lines for readability. The commands should be reassembled into a single line prior to execution.

Tool Execution

Search Core can be executed in various ways. This section describes how to run the tool, as well as its behaviors and caveats.

Command-Line Options

The following table describes the command-line options available:

usage: search-core [options]

===========================================================
Command-line Options              Description
===========================================================
 -a,--all                         Run all components of the Search Core
                                  [default]
 -c,--config-home <directory>     Specify the product class configuration home
                                  directory.
                                  (Default: $SEARCH_CORE_HOME/conf/pds/)
 -C,--clean-dirs                  Removal of all directories from previous
                                  Search Core execution output. These
                                  directories will still be backed up in the
                                  Search Home directory. (Default: True)
 -d,--debug                       Turn on developer debugger.
 -e,--extractor                   Execute component to extract data from
                                  the registry. In order for products to be 
                                  extracted from a Registry Service, they must 
                                  have a status of Approved.
 -H,--search-home <directory>     Specify the Search Home directory. The tool
                                  will output the index files to this directory.
                                  When using the Search Service, this should be
                                  the $SEARCH_SERVICE_HOME/pds directory
                                  (Default: $SEARCH_SERVICE_HOME/pds directory)
 -h,--help                        Display usage.
 -i,--solr-indexer                Execute component to generate a Solr Index
 -l,--log-file <file-name>        Specify a log file name. Default is standard
                                  out.
 -m,--query-max <integer>         Specify the maximum number of registry values
                                  to be returned from query. (Default: 999999999)
 -P,--solr-post                   Execute component to post the index to the
                                  Search Service.
 -p,--properties-file <files>     Specify properties file containing Search
                                  Home, Registry URL, and search core
                                  configurations home directory. Multiple files
                                  can be specified.
 -r,--primary-registry <urls>     Specify the primary Registry Service
                                  instance(s) to query. Multiple registries can
                                  be specified. These registries will be used
                                  for all queries.
 -R,--secondary-registry <urls>   Specify secondary Registry Service instance(s)
                                  to query. Multiple registries can be
                                  specified. These registries will only be used
                                  after a query fails against all primary
                                  registries.
 -s,--service-url <url>           Specify the Search Service URL
                                  endpoint. Default:
                                  http://localhost:8080/search-service
 -v,--verbose <level>             Specify the severity level and above to
                                  include in the log: (0=Debug, 1=Info,
                                  2=Warning, 3=Error). Default is Info and above
                                  (level 1).
 -V,--version                     Display application version.
        

Execute Search Core

This section demonstrates execution of the tool using the command-line options. The examples below execute the tool via the batch/shell script. Search Core requires, at a minimum, that a Search Home directory be specified on the command line. The following is the format for the command:

% search-core -H <search-home> [options]
        

Search Home refers to the home directory of the Solr Core for which the index is generated. In a typical Search Service installation, Search Home is /usr/local/search-service/pds (SEARCH_SERVICE_HOME/pds). The following demonstrates how to run Search Core with SEARCH_SERVICE_HOME=/usr/local/search-service, a primary Registry URL of http://localhost:8080/registry, an output log file of run.log, and a configuration home of /usr/local/search-core/conf/defaults/pds/pds3:

% search-core -H /usr/local/search-service/pds -r http://localhost:8080/registry \
-c /usr/local/search-core/conf/defaults/pds/pds3 -l run.log
        

By default, the command above runs all components of the Search Core software and produces Solr XML Documents from the Registry Service data. The Solr XML Documents are files formatted for addition to the Search Service index and will appear in the SEARCH_SERVICE_HOME/pds/index directory.
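
One quick way to confirm that a run produced output is to count the Solr XML Documents in the index directory. The following is a minimal sketch and not part of the Search Core distribution: the function name count_index_files is hypothetical, and it assumes the generated files follow the solr_index.xml.* naming pattern mentioned in the Post To Operations section.

```shell
# Hypothetical helper: count the generated Solr XML Documents under
# <search-home>/index. Assumes the files are named solr_index.xml.*.
count_index_files() {
    index_dir=$1/index

    # Reuse the positional parameters to hold the glob matches.
    set -- "$index_dir"/solr_index.xml.*

    # When nothing matches, the shell leaves the unexpanded pattern in $1.
    if [ ! -e "$1" ]; then
        echo 0
        return 0
    fi
    echo "$#"
}
```

For example, count_index_files /usr/local/search-service/pds prints the number of matching files, or 0 if none were found.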

The following command does not specify a configuration home directory, so the default of SEARCH_CORE_HOME/conf/pds/pds3 is used, and the log is written to standard out:

% search-core -H /usr/local/search-service/pds -r http://localhost:8080/registry
        

The Search Core Tool also provides the capability to run each component separately; however, the components must be run in the following order:

  1. The following command will run the Registry Extractor component of the Search Core to generate temporary XML metadata files for each Registry context product type specified in the configurations. By default, the output appears in the directory /usr/local/search-service/pds/solr-docs/<config-title>/:

    % search-core -e -H /usr/local/search-service/pds -r http://localhost:8080/registry
                
  2. The following command will run the Solr Indexer component of the Search Core to parse the XML metadata files produced by the Registry Extractor, and generate Lucene Solr Documents located in /usr/local/search-service/pds/index:

    % search-core -i -H /usr/local/search-service/pds -r http://localhost:8080/registry
                
  3. The following command will run the Solr Post component of the Search Core to use HTTP Post and HTTP Get to submit the Lucene Solr Documents to the Search Service Solr Index. The default Search Service end point is http://localhost:8080/search-service/ and assumes the Solr Documents are in /usr/local/search-service/pds/index:

    % search-core -P
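
The ordering requirement above can be captured in a small wrapper. The following is a minimal sketch and not part of the Search Core distribution: the function name run_components and the SEARCH_CORE variable are hypothetical, it assumes search-core exits with a non-zero status when a component fails (this document does not confirm that), and passing -H together with -P is also an assumption, since the example above relies on the default Search Home.

```shell
# Hypothetical wrapper: run the three components in the required order and
# stop at the first failure. Assumes search-core returns a non-zero exit
# status when a component fails (not confirmed by this document).
SEARCH_CORE=${SEARCH_CORE:-search-core}

run_components() {
    search_home=$1
    registry_url=$2

    # 1. Registry Extractor: pull metadata from the Registry Service.
    "$SEARCH_CORE" -e -H "$search_home" -r "$registry_url" || return 1

    # 2. Solr Indexer: turn the extracted metadata into Solr Documents.
    "$SEARCH_CORE" -i -H "$search_home" -r "$registry_url" || return 2

    # 3. Solr Post: submit the Solr Documents to the Search Service.
    #    Combining -P with -H is an assumption made for this sketch.
    "$SEARCH_CORE" -P -H "$search_home" || return 3
}
```

The return value identifies the component that failed (1, 2, or 3), so a caller can report which step to rerun. A typical call would be run_components /usr/local/search-service/pds http://localhost:8080/registry.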
                

The following command demonstrates how to test the Search Core with a SEARCH_SERVICE_HOME=/usr/local/search-service and only query 5 products for indexing (useful for testing purposes):

% search-core -H /usr/local/search-service/pds -r http://localhost:8080/registry \
-m 5
        

The following command demonstrates how to specify a primary registry and configuration home via a Search Core properties file:

% search-core -H /usr/local/search-service/pds \
-p /usr/local/search-core/conf/defaults/pds/pds3/core.properties
        

The following is an example Search Core properties file (core.properties):

search.core.primary-registry = http://localhost:8080/registry
search.core.config-home = /usr/local/search-core/conf/defaults/pds/pds3
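
When several environments need their own properties file, the file can be generated rather than edited by hand. The following is a minimal sketch: the function name write_core_properties is hypothetical, and it writes only the two keys shown in the example above.

```shell
# Hypothetical helper: write a Search Core properties file containing the
# two keys shown in the example above.
write_core_properties() {
    out_file=$1
    registry_url=$2
    config_home=$3

    cat > "$out_file" <<EOF
search.core.primary-registry = $registry_url
search.core.config-home = $config_home
EOF
}
```

The resulting file can then be passed to search-core with the -p option.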
        

If Search Core is deployed to a Windows environment, the path specification for the search.core.config-home property should be updated accordingly (including the drive letter) for each configuration.

The following command demonstrates how to specify multiple registries:

% search-core -H /usr/local/search-service/pds -r http://localhost:8080/registry \
http://localhost:8080/registry-psa
        

The following command demonstrates how to specify multiple Search Core property files:

% search-core -p /usr/local/search-core/conf/defaults/pds/pds3/core.properties \
/usr/local/search-core/conf/defaults/psa/pds3/core.properties
        

Index Generation Examples

Most users will only require one Registry Service instance and one set of configuration files for the product classes to include in the index. Using the default Search Service and Search Core installations specified in the documentation, the following example generates an index of PDS3 Context Products for a basic installation:

% search-core -H /usr/local/search-service/pds \
-p /usr/local/search-core/conf/defaults/pds/pds3/core.properties
      

For PDS4 Context and Data Products:

% search-core -H /usr/local/search-service/pds \
-p /usr/local/search-core/conf/defaults/pds/pds4/core.properties
      

For PSA Context Products:

% search-core -H /usr/local/search-service/pds \
-p /usr/local/search-core/conf/defaults/psa/pds3/core.properties
      

For all available PDS3 and PDS4 data:

% search-core -H /usr/local/search-service/pds \
-p /usr/local/search-core/conf/defaults/pds/pds3/core.properties \
   /usr/local/search-core/conf/defaults/psa/pds3/core.properties \
   /usr/local/search-core/conf/defaults/pds/pds4/core.properties
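
The properties files used in the examples above differ only in their location under the defaults directory, so a script can select one by a short label. The following is a minimal sketch: the function name properties_for and the labels pds3, pds4, and psa are hypothetical, and the default SEARCH_CORE_HOME of /usr/local/search-core matches the installation path used in this document.

```shell
# Assumed install location; override SEARCH_CORE_HOME if it differs.
SEARCH_CORE_HOME=${SEARCH_CORE_HOME:-/usr/local/search-core}

# Hypothetical helper: map a short label to the matching default
# core.properties file used in the examples above.
properties_for() {
    case $1 in
        pds3) echo "$SEARCH_CORE_HOME/conf/defaults/pds/pds3/core.properties" ;;
        pds4) echo "$SEARCH_CORE_HOME/conf/defaults/pds/pds4/core.properties" ;;
        psa)  echo "$SEARCH_CORE_HOME/conf/defaults/psa/pds3/core.properties" ;;
        *)    echo "unknown index type: $1" >&2
              return 1 ;;
    esac
}
```

The output can be substituted directly into the -p option, for example -p "$(properties_for pds4)".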
      

After generating an index, verify that the data is available through the Search Service as described in the Did It Work? section.

Search Core Configurations

Search Core is driven by XML configuration files, which must include query information, data source specifications, and the fields to be included in the index. The following sections outline the basic schema for creating a configuration file. Once a configuration file has been created, you can specify its location using the -c command-line option or in the properties file.

Defaults

Default configurations are provided for the following data types. The paths assume Search Core is installed at /usr/local/search-core; if it is not, update the file paths as needed:

File Name             Product Class
--------------------  --------------------------------------------------------

PDS3 Products (/usr/local/search-core/conf/defaults/pds/pds3)
archiveinfo.xml       Product_Context (filters on name="*Archive Information")
attribute.xml         Product_Attribute_Definition
class.xml             Product_Class_Definition
context.xml           Product_Context
dataset.xml           Product_Data_Set_PDS3
instrument.xml        Product_Instrument_PDS3
instrumenthost.xml    Product_Instrument_Host_PDS3
investigation.xml     Product_Mission_PDS3
service.xml           Product_Service
target.xml            Product_Target_PDS3

PDS4 Products (/usr/local/search-core/conf/defaults/pds/pds4)
attribute.xml         Product_Attribute_Definition
bundle.xml            Product_Bundle
class.xml             Product_Class_Definition
collection.xml        Product_Collection
context.xml           Product_Context
document.xml          Product_Document
observational.xml     Product_Observational
service.xml           Product_Service

PDS3 PSA Products (/usr/local/search-core/conf/defaults/psa/pds3)
dataset.xml           Product_Data_Set_PDS3

Format

The following is an example snippet of one of the Search Core configuration files:

<?xml version="1.0" encoding="UTF-8"?>
<product>
  <specification>
    <title>PDS4-Observational</title>
    <query>
      <registryPath>objectType</registryPath>
      <value>Product_Observational</value>
    </query>
    <query>
      <registryPath>status</registryPath>
      <value>Approved</value>
    </query>
    <checkAssociations>true</checkAssociations>
  </specification>

  <indexFields>
    <!-- Identifier Fields -->
    <field name="search_id" type="required">
      <outputString format="text">pds4:{lid}</outputString>
    </field>
    <field name="identifier" type="required">
      <registryPath>lid</registryPath>
    </field>
    <field name="version_id" type="string">
      <registryPath>version_id</registryPath>
    </field>
    ...
  </indexFields>
</product>
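
When reviewing a configuration file, it can help to list its title and the index fields it defines. The following is a minimal sketch using sed: the function names config_title and list_index_fields are hypothetical, and the approach is line-based rather than a real XML parser, so it assumes each title and field element starts on its own line as in the snippet above.

```shell
# Hypothetical helper: print the <title> of a configuration file.
# Line-based matching, not a real XML parser.
config_title() {
    sed -n 's/.*<title>\(.*\)<\/title>.*/\1/p' "$1"
}

# Hypothetical helper: print the name attribute of every <field> element,
# one per line. Assumes at most one <field> element per line.
list_index_fields() {
    sed -n 's/.*<field name="\([^"]*\)".*/\1/p' "$1"
}
```

Run against the snippet above, config_title would print PDS4-Observational and list_index_fields would print search_id, identifier, and version_id on separate lines.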
        

Post To Operations

Some installations require an index to be replicated across multiple secured machines, that is, machines where POSTing data remotely is forbidden. Instead of generating a new index on each machine, the $SEARCH_CORE_HOME/bin/ops-index script uses rsync to copy a previously generated index from a remote machine and then POSTs the data to a local Search Service installation. The following describes how to use the script:

Using Environment Variables

  • Update $SEARCH_CORE_HOME/bin/env-vars with, at minimum, the following information:
    • SEARCH_HOME - Path to the home directory of the local Search Service instance.
    • SEARCH_SERVICE_URL - URL of the Search Service instance.
    • SOURCE - Name of the source machine where the initial index was generated.
    • SOURCE_USER - Username used to connect to the source machine.
    • SOURCE_PATH - Optional. Path on the source machine to the directory containing solr_index.xml.*. Defaults to $SEARCH_HOME/index.
  • Run the ops-index script:
    % $SEARCH_CORE_HOME/bin/ops-index
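
Before running ops-index, it may be worth confirming that the required variables are set. The following is a minimal sketch: the function name check_ops_env is hypothetical, and it assumes env-vars is a shell fragment of VARIABLE=value assignments that can be sourced, which this document does not state.

```shell
# Hypothetical pre-flight check: verify that the variables listed above as
# required are set and non-empty. SOURCE_PATH is optional and not checked.
check_ops_env() {
    for name in SEARCH_HOME SEARCH_SERVICE_URL SOURCE SOURCE_USER; do
        # Look up the value of the variable whose name is stored in $name.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "missing required variable: $name" >&2
            return 1
        fi
    done
    return 0
}
```

Under that assumption, the check could be run as . "$SEARCH_CORE_HOME/bin/env-vars" && check_ops_env before invoking the script.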
                

Using Command-Line Arguments

The ops-index script can also be run using command-line arguments in lieu of updating env-vars:

% $SEARCH_CORE_HOME/bin/ops-index <SEARCH_HOME> <SEARCH_SERVICE_URL> \
<SOURCE> <SOURCE_USER> [<SOURCE_PATH>]
        

For example:

% ./ops-index /usr/local/search-service/pds http://localhost:8080/search-service/pds \
pds-gamma.jpl.nasa.gov root
        

Did It Work?

Once Search Core has run with all components, the data should be available through the Search Service interface. Go to http://localhost:8080/search-service/search/?q=*:* to verify that data is available (modify the domain and port as needed). See the Search Service - Operate page for more information on how to query data.
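
The same check can be scripted. The following is a minimal sketch: the function names search_query_url and check_search_service are hypothetical, curl is assumed to be installed, and an HTTP 200 response only shows that the query endpoint answered, not that the index contains documents, so the response body should still be inspected.

```shell
# Hypothetical helper: build the match-all query URL shown above from a
# Search Service base URL, tolerating a trailing slash.
search_query_url() {
    base=${1%/}
    echo "$base/search/?q=*:*"
}

# Hypothetical helper: succeed if the query endpoint answers with HTTP 200.
# Requires curl; the default base URL is the one used in this document.
check_search_service() {
    url=$(search_query_url "${1:-http://localhost:8080/search-service}")
    status=$(curl -s -o /dev/null -w '%{http_code}' "$url")
    [ "$status" = "200" ]
}
```

For example, check_search_service http://localhost:8080/search-service returns success when the service responds, which makes it usable in an if statement at the end of an indexing script.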

Common Errors

The following sub-sections detail some of the common errors that might occur for a given installation.

search-core: command not found

This error occurs when the system cannot find the search-core script in the PATH. See the Installation document for more information on adding the search-core executable to the PATH.
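
A script can detect this condition before it fails. The following is a minimal sketch: the function name ensure_search_core_on_path is hypothetical, and the directory passed to it (for example $SEARCH_CORE_HOME/bin) is an assumption about where the search-core script is installed.

```shell
# Hypothetical helper: make sure search-core can be found, adding the given
# directory to PATH for the current shell if necessary.
ensure_search_core_on_path() {
    bin_dir=$1

    # Already resolvable: nothing to do.
    if command -v search-core >/dev/null 2>&1; then
        return 0
    fi

    # Otherwise prepend the directory if it holds an executable search-core.
    if [ -x "$bin_dir/search-core" ]; then
        PATH="$bin_dir:$PATH"
        export PATH
        return 0
    fi

    echo "search-core not found in PATH or in $bin_dir" >&2
    return 1
}
```

The change to PATH lasts only for the current shell session; a permanent change belongs in the shell profile, as covered by the Installation document.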

Error running Registry Extractor

This error occurs when Search Core cannot connect to the Registry. The following are potential mitigation strategies:

  • Verify the Registry URL specified is correct and accessible.
  • Verify the Registry is populated with data by going to $REGISTRY_URL/report.

501 Method Not Implemented

This error occurs when using the solr-post script and the HTTP POST method to ingest data into Solr. It usually means that the Search Service URL specified is either incorrect or points to a port that does not accept the HTTP POST method. The following are potential mitigation strategies:

  • Verify Search Service URL is correct by navigating to the page. The Solr Welcome screen should appear.
  • Try changing the URL to access the localhost or an intranet-only accessible port (e.g., 8080). Servers often do not allow the HTTP POST method through port 80 when accessing Tomcat. For instance, if the Search Service URL used is http://my-host/search-service, try http://localhost:8080/search-service or http://my-host:8080/search-service (ports will differ according to the Tomcat installation).
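
The URL rewrite described in the second strategy can be sketched as a small helper. The function name with_port is hypothetical, the port to try depends on the Tomcat installation, and the sketch does not handle IPv6 literal hosts.

```shell
# Hypothetical helper: add an explicit port to a URL that has none.
# URLs that already specify a port are printed unchanged.
with_port() {
    url=$1
    port=$2

    scheme=${url%%://*}      # e.g. http
    rest=${url#*://}         # e.g. my-host/search-service
    host=${rest%%/*}         # e.g. my-host

    # Keep the path, if any, including its leading slash.
    case $rest in
        */*) path=/${rest#*/} ;;
        *)   path= ;;
    esac

    case $host in
        *:*) echo "$url" ;;
        *)   echo "$scheme://$host:$port$path" ;;
    esac
}
```

For example, with_port http://my-host/search-service 8080 prints http://my-host:8080/search-service, which can then be passed to search-core with the -s option.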

No Products Indexed in the Search Service

Any of the above errors will cause this situation. However, if Search Core appears to have completed successfully and there are still no results when testing the Search Service, the likely cause is that the content in the target Registry Service(s) has a status of Submitted instead of Approved. See the Registry Service Operation document for instructions on how to approve products in the registry.