This document describes how to operate the Search Core software in a generic environment. For more detailed operating instructions for the Engineering Node installation, see the Engineering Node Operation Procedures.
Note: The command-line examples in this section have been broken into multiple lines for readability. The commands should be reassembled into a single line prior to execution.
Search Core can be executed in various ways. This section describes how to run the tool, as well as its behaviors and caveats.
The following table describes the command-line options available:
usage: search-core [options]

Command-line Option | Description |
---|---|
-a,--all | Run all components of the Search Core [default] |
-c,--config-home <directories> | Specify the product class configuration home directory. Multiple directories can be specified to accompany multiple registries. (Default: $SEARCH_CORE_HOME/conf/pds/) |
-C,--clean-dirs | Removal of all directories from previous Search Core execution output. These directories will still be backed up in the Search Home directory. (Default: True) |
-d,--debug | Turn on developer debugger. |
-e,--extractor | Execute component to extract data from registry |
-H,--search-home <directory> | Specify the Search Home directory. The tool will output the index files to this directory. When using the Search Service, this should be the $SEARCH_SERVICE_HOME/pds directory (Default: $SEARCH_SERVICE_HOME/pds directory) |
-h,--help | Display usage. |
-i,--solr-indexer | Execute component to generate a Solr Index |
-l,--log-file <file name> | Specify a log file name. Default is standard out. |
-m,--query-max <integer> | Specify the maximum number of registry values to be returned from query. (Default: 999999999) |
-P,--solr-post | Execute component to post the index to the Search Service. |
-p,--properties-file <files> | Specify properties file containing Search Home, Registry URL, and search core configurations home directory. Multiple files can be specified. |
-r,--primary-registry <urls> | Specify the primary Registry Service instance(s) to query. Multiple registries can be specified. These registries will be used for all queries. |
-R,--secondary-registry <urls> | Specify secondary Registry Service instance(s) to query. Multiple registries can be specified. These registries will only be used after a query fails against all primary registries. |
-s,--service-url <url> | Specify the Search Service URL endpoint. Default: http://localhost:8080/search-service |
-v,--verbose <level> | Specify the severity level and above to include in the log: (0=Debug, 1=Info, 2=Warning, 3=Error). Default is Info and above (level 1). |
-V,--version | Display application version. |
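The difference between the -r and -R options can be easier to follow as code. The following Python sketch is only an illustration of the behavior described in the table, not the Search Core implementation: the function name and the `query` callable are hypothetical, and it assumes a failed registry query surfaces as an OSError.

```python
from typing import Callable, Iterable, List


def query_with_fallback(
    query: Callable[[str], List[dict]],
    primary: Iterable[str],
    secondary: Iterable[str],
) -> List[dict]:
    """Query the primary registries; use secondaries only if all primaries fail.

    `query` is a hypothetical callable that takes a registry URL and returns
    a list of results, raising OSError when the registry cannot be queried.
    """
    for group in (list(primary), list(secondary)):
        results: List[dict] = []
        succeeded = False
        for url in group:
            try:
                results.extend(query(url))
                succeeded = True
            except OSError:
                continue  # this registry failed; try the next one in the group
        if succeeded:
            # At least one registry in this group answered, so stop here.
            return results
    raise OSError("query failed against all primary and secondary registries")
```

With this sketch, the secondary registries are never contacted as long as at least one primary registry answers.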
This section demonstrates execution of the tool using the command-line options. The examples below execute the tool via the batch/shell script.
The Search Core requires, at minimum, a Search Home directory be specified via command-line. The following is the format for the command:
% search-core -H <search-home> [options]
Search Home refers to the home directory of the Solr Core we want to generate an index for. With the common Search Service installation, Search Home will be /usr/local/search-service/pds (SEARCH_SERVICE_HOME/pds). The following demonstrates how to run the Search Core with a SEARCH_SERVICE_HOME=/usr/local/search-service, Primary Registry URL of http://localhost:8080/registry, an output log file of run.log, and config home of /usr/local/search-core/conf/pds/pds3 :
% search-core -H /usr/local/search-service/pds -r http://localhost:8080/registry \
    -c /usr/local/search-core/conf/pds/pds3 -l run.log
By default, the command above runs all components of the Search Core software and produces Solr XML Documents from the Registry Service data. The Solr XML Documents are files formatted for addition to the Search Service index and will appear in the SEARCH_SERVICE_HOME/pds/index directory.
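When the tool is driven from a script, it can help to assemble the arguments as a list instead of a single string. The following Python sketch builds the invocation shown above; the helper name and its parameters are hypothetical, and only flags listed in the options table are used.

```python
import shlex
from typing import List, Optional, Sequence


def build_search_core_command(
    search_home: str,
    registries: Sequence[str] = (),
    config_homes: Sequence[str] = (),
    log_file: Optional[str] = None,
    query_max: Optional[int] = None,
) -> List[str]:
    """Assemble a search-core argument list from the documented options."""
    cmd = ["search-core", "-H", search_home]
    if registries:
        cmd += ["-r", *registries]      # one or more primary registries
    if config_homes:
        cmd += ["-c", *config_homes]    # one or more configuration homes
    if log_file:
        cmd += ["-l", log_file]
    if query_max is not None:
        cmd += ["-m", str(query_max)]   # limit the number of registry values
    return cmd


cmd = build_search_core_command(
    "/usr/local/search-service/pds",
    registries=["http://localhost:8080/registry"],
    config_homes=["/usr/local/search-core/conf/pds/pds3"],
    log_file="run.log",
)
# Print the command as a single shell line for inspection.
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` once search-core is on the PATH.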
The following command does not specify a configuration home directory, so the default of SEARCH_CORE_HOME/conf/pds/pds3 is used, and the logs are written to standard out:
% search-core -H /usr/local/search-service/pds -r http://localhost:8080/registry
The Search Core Tool also provides the capability to run each component separately; however, the components must be run in the following order: Registry Extractor, Solr Indexer, Solr Post.
The following command will run the Registry Extractor component of the Search Core to generate temporary XML metadata files for each Registry context product type specified in the configurations. By default, the output appears in the directory /usr/local/search-service/pds/registry-data/<object-type>/:
% search-core -e -H /usr/local/search-service/pds -r http://localhost:8080/registry
The following command will run the Solr Indexer component of the Search Core to parse the XML metadata files produced by the Registry Extractor, and generate Lucene Solr Documents located in /usr/local/search-service/pds/index :
% search-core -i -H /usr/local/search-service/pds -r http://localhost:8080/registry
The following command will run the Solr Post component of the Search Core to use HTTP Post and HTTP Get to submit the Lucene Solr Documents to the Search Service Solr Index. The default Search Service URL is http://localhost:8080/search-service/ and assumes the Solr Documents are in /usr/local/search-service/pds/index :
% search-core -P
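The three component commands above can be chained so that a later component only runs when the earlier one has finished. The following Python sketch copies the three command lines from this section; the `run_steps` helper is hypothetical, and it assumes search-core exits with a non-zero status on failure, which this document does not state.

```python
import subprocess
from typing import Callable, List

SEARCH_HOME = "/usr/local/search-service/pds"
REGISTRY = "http://localhost:8080/registry"

# The three component invocations from this section, in the required order.
STEPS: List[List[str]] = [
    ["search-core", "-e", "-H", SEARCH_HOME, "-r", REGISTRY],  # Registry Extractor
    ["search-core", "-i", "-H", SEARCH_HOME, "-r", REGISTRY],  # Solr Indexer
    ["search-core", "-P"],                                     # Solr Post
]


def run_steps(steps: List[List[str]], run: Callable = subprocess.run) -> int:
    """Run each step in order, stopping at the first failure.

    Returns the number of steps that completed successfully.
    Assumption: a non-zero return code means the component failed.
    """
    completed = 0
    for cmd in steps:
        result = run(cmd)
        if result.returncode != 0:
            break
        completed += 1
    return completed
```

Calling `run_steps(STEPS)` returns 3 when all components succeed; the `run` parameter exists only so the sequencing can be exercised without a Search Core installation.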
The following command demonstrates how to test the Search Core with a SEARCH_SERVICE_HOME=/usr/local/search-service and only query 5 products for indexing (useful for testing purposes):
% search-core -H /usr/local/search-service/pds -r http://localhost:8080/registry \
    -m 5
The following command demonstrates how to specify a primary registry and configuration home via a Search Core properties file:
% search-core -H /usr/local/search-service/pds \
    -p /usr/local/search-core/conf/pds/pds3/core.properties
An example Search Core properties file looks like this:
search.core.primary-registry = http://localhost:8080/registry
search.core.config-home = /usr/local/search-core/conf/pds/pds3
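To check what a properties file will hand to the tool, the key/value pairs can be read with a few lines of Python. This is a simplified sketch that handles only `key = value` lines and comments; it does not cover the full Java properties syntax (for example line continuations or `:` separators), and the function name is hypothetical.

```python
from typing import Dict


def parse_properties(text: str) -> Dict[str, str]:
    """Parse simple `key = value` lines, skipping blanks and comments."""
    props: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", "!")):
            continue  # blank line or comment
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not a key/value line
        props[key.strip()] = value.strip()
    return props


example = """\
search.core.primary-registry = http://localhost:8080/registry
search.core.config-home = /usr/local/search-core/conf/pds/pds3
"""
props = parse_properties(example)
for key, value in props.items():
    print(f"{key} -> {value}")
```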
The following command demonstrates how to specify multiple registries:
% search-core -H /usr/local/search-service/pds -r http://localhost:8080/registry \
    http://localhost:8080/registry-psa
The following command demonstrates how to specify multiple Search Core property files:
% search-core -p /usr/local/search-core/conf/pds/pds3/core.properties \
    /usr/local/search-core/conf/psa/pds3/core.properties
Most users will require only one registry and one set of product class configuration files to include in the index. Using the default Search Service and Search Core installations specified in the documentation, the following example generates an index for PDS3 Context Products on a basic installation:
% search-core -H /usr/local/search-service/pds \
    -p /usr/local/search-core/conf/pds/pds3/core.properties
For PDS4 Product Search:
% search-core -H /usr/local/search-service/pds \
    -p /usr/local/search-core/conf/pds/pds4/core.properties
For PSA Context Products:
% search-core -H /usr/local/search-service/pds \
    -p /usr/local/search-core/conf/psa/pds3/core.properties
For all available PDS3 and PDS4 data:
% search-core -H /usr/local/search-service/pds \
    -p /usr/local/search-core/conf/pds/pds3/core.properties \
    /usr/local/search-core/conf/psa/pds3/core.properties \
    /usr/local/search-core/conf/pds/pds4/core.properties
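The four variants above differ only in which properties files follow -p. A small Python sketch of that selection is shown below; the short labels (`pds3`, `psa`, `pds4`) and the helper name are hypothetical conveniences, while the file paths are the ones used in the examples above.

```python
from typing import Dict, List, Sequence

# Properties files from the examples above, keyed by hypothetical short labels.
PROPERTIES_FILES: Dict[str, str] = {
    "pds3": "/usr/local/search-core/conf/pds/pds3/core.properties",
    "psa": "/usr/local/search-core/conf/psa/pds3/core.properties",
    "pds4": "/usr/local/search-core/conf/pds/pds4/core.properties",
}


def index_command(search_home: str, data_sets: Sequence[str]) -> List[str]:
    """Build a search-core command that passes one properties file per data set."""
    files = [PROPERTIES_FILES[name] for name in data_sets]
    return ["search-core", "-H", search_home, "-p", *files]


# Equivalent to the "all available PDS3 and PDS4 data" example.
print(index_command("/usr/local/search-service/pds", ["pds3", "psa", "pds4"]))
```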
After generating the indexes, verify that the data is available through the Search Service interface, as described later in this document.
Running the Search Core is based around XML configuration files that must include query information, data source specifications, and the fields to be included in the index. The following sections will outline the basic schema for creating a configuration file. Once a configuration file has been created, you can specify its location using the -c command-line option or in the properties file.
Default configurations are provided for the following data types. The paths assume Search Core is installed at /usr/local/search-core; if it is not, update the file paths as needed:
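Location | File Name | Object Type | Comments |
---|---|---|---|
/usr/local/search-core/conf/pds/pds3 | | | Queries PDS3 versioned PDS products |
/usr/local/search-core/conf/pds/pds3 | archiveinfo.xml | Product_Context | Also filters where name="*Archive Information" |
/usr/local/search-core/conf/pds/pds3 | dataset.xml | Product_Data_Set_PDS3 | |
/usr/local/search-core/conf/pds/pds3 | instrument.xml | Product_Instrument_PDS3 | |
/usr/local/search-core/conf/pds/pds3 | instrumenthost.xml | Product_Instrument_Host_PDS3 | |
/usr/local/search-core/conf/pds/pds3 | investigation.xml | Product_Mission_PDS3 | |
/usr/local/search-core/conf/pds/pds3 | target.xml | Product_Target_PDS3 | |
/usr/local/search-core/conf/pds/pds4 | | | Queries PDS4 versioned PDS products |
/usr/local/search-core/conf/pds/pds4 | bundle.xml | Product_Bundle | |
/usr/local/search-core/conf/pds/pds4 | collection.xml | Product_Collection | |
/usr/local/search-core/conf/pds/pds4 | context.xml | Product_Context | |
/usr/local/search-core/conf/pds/pds4 | observational.xml | Product_Observational | |
/usr/local/search-core/conf/psa/pds3 | | | Queries PDS3 versioned PSA Products |
/usr/local/search-core/conf/psa/pds3 | psa-dataset.xml | Product_Data_Set_PDS3 | |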
<?xml version="1.0" encoding="UTF-8"?>
<product>
  <specification>
    <title>DataSet</title>
    <registryObjectType>Product_Data_Set_PDS3</registryObjectType>
  </specification>
  <indexFields>
    <field name="identifier" type="required">
      <registryPath>lid</registryPath>
    </field>
    <field name="title" type="required">
      <registryPath>name</registryPath>
    </field>
    <field name="description" type="required">
      <registryPath>data_set_terse_description</registryPath>
    </field>
    <field name="resLocation" type="required">
      <outputString>/ds-view/pds/viewDataset.jsp?dsid={data_set_id}</outputString>
    </field>
    <field name="objectType" type="string">
      <registryPath>objectType</registryPath>
    </field>
    <field name="agency_name" type="string" default="Unknown">
      <registryPath>node_ref.agency_ref.alternate_id</registryPath>
    </field>
  </indexFields>
</product>
Description of schema TBD.
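Until the schema description is available, the sample configuration above can be inspected programmatically to see which index fields it declares. The following Python sketch reads a trimmed copy of that sample with the standard library; it relies only on the element and attribute names visible in the sample and makes no claim about how the Search Core itself interprets them.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List, Optional

# Trimmed copy of the sample configuration shown above.
SAMPLE_CONFIG = """
<product>
  <specification>
    <title>DataSet</title>
    <registryObjectType>Product_Data_Set_PDS3</registryObjectType>
  </specification>
  <indexFields>
    <field name="identifier" type="required">
      <registryPath>lid</registryPath>
    </field>
    <field name="resLocation" type="required">
      <outputString>/ds-view/pds/viewDataset.jsp?dsid={data_set_id}</outputString>
    </field>
    <field name="agency_name" type="string" default="Unknown">
      <registryPath>node_ref.agency_ref.alternate_id</registryPath>
    </field>
  </indexFields>
</product>
"""


def summarize_fields(xml_text: str) -> List[Dict[str, Optional[str]]]:
    """Return one dictionary per <field> element in the configuration."""
    root = ET.fromstring(xml_text)
    fields = []
    for field in root.findall("./indexFields/field"):
        fields.append({
            "name": field.get("name"),
            "type": field.get("type"),
            "default": field.get("default"),              # None when absent
            "registryPath": field.findtext("registryPath"),
            "outputString": field.findtext("outputString"),
        })
    return fields


root = ET.fromstring(SAMPLE_CONFIG)
object_type = root.findtext("./specification/registryObjectType")
fields = summarize_fields(SAMPLE_CONFIG)
print(object_type, [f["name"] for f in fields])
```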
Once you have run the Search Core with all components, the data should be available through the Search Service interface. Go to http://localhost:8080/search-service/pds/search/?q=*:* to verify data is available (modify domain and port as needed). See the Search Service - Operate page for more information on how to query data.
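The same check can be scripted. The Python sketch below builds the verification URL from a Search Service base URL and reports whether it responds with HTTP 200; the function names are hypothetical, and a 200 response only shows that the endpoint answers, not that the index contains the expected documents.

```python
import urllib.request
from urllib.parse import urlencode


def search_url(service_url: str, query: str = "*:*") -> str:
    """Build the search URL used above for a given Search Service base URL."""
    # Keep '*' and ':' unescaped so the URL matches the one shown in the text.
    return service_url.rstrip("/") + "/pds/search/?" + urlencode({"q": query}, safe="*:")


def is_available(url: str, timeout: float = 10.0) -> bool:
    """Return True if the URL answers with HTTP 200, False on any I/O error."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except OSError:  # includes URLError, HTTPError, and timeouts
        return False


url = search_url("http://localhost:8080/search-service")
print(url)
```

Call `is_available(url)` on a machine where the Search Service is running to perform the actual check.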
Some installations require an index to be replicated across multiple secured machines, that is, machines where POSTing data remotely is forbidden. Instead of generating a new index on each machine, the $SEARCH_CORE_HOME/bin/ops-index script was developed to rsync a previously generated index from a remote machine and POST the data to a local Search Service installation. The following describes how to use the script:
% $SEARCH_CORE_HOME/bin/ops-index
The ops-index script can also be run using command-line arguments in lieu of updating environment variables:
% $SEARCH_CORE_HOME/bin/ops-index <SEARCH_HOME> <SEARCH_SERVICE_URL> \
    <SOURCE> <SOURCE_USER> [<SOURCE_PATH>]
For example:
% ./ops-index /usr/local/search-service/pds http://localhost:8080/search-service/pds \
    pdsbeta.jpl.nasa.gov root
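Because ops-index takes positional arguments, their order matters. The following Python sketch assembles the argument list in the order given in the usage line above, treating the source path as optional; the helper name and the `script` default are hypothetical.

```python
from typing import List, Optional


def ops_index_command(
    search_home: str,
    search_service_url: str,
    source: str,
    source_user: str,
    source_path: Optional[str] = None,
    script: str = "./ops-index",
) -> List[str]:
    """Assemble ops-index arguments in the documented positional order."""
    cmd = [script, search_home, search_service_url, source, source_user]
    if source_path is not None:
        cmd.append(source_path)  # optional fifth argument
    return cmd


# Mirrors the example above.
print(ops_index_command(
    "/usr/local/search-service/pds",
    "http://localhost:8080/search-service/pds",
    "pdsbeta.jpl.nasa.gov",
    "root",
))
```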
Once this script completes, the data should be available through the Search Service interface. Go to http://localhost:8080/search-service/pds/search/?q=*:* to verify data is available (modify domain and port as needed). See the Search Service - Operate page for more information on how to query data.
This error arises when the Search Core cannot connect to the Registry. The following are potential mitigation strategies:
This error occurs when using the solr-post script and the HTTP POST method to ingest data into Solr. It usually means that the Search Service URL specified is either incorrect or points to a port that is not open to the HTTP POST method. The following are potential mitigation strategies:
This error occurs when the system cannot find the search-core script in the PATH. See Installation Instructions for more information on adding the search-core to the PATH.
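A quick way to confirm whether the script is visible on the PATH is to look it up the same way the shell would. The Python sketch below uses the standard library's `shutil.which`; the helper name is hypothetical.

```python
import shutil
from typing import Optional


def find_on_path(name: str = "search-core") -> Optional[str]:
    """Return the full path of an executable found on the PATH, or None."""
    return shutil.which(name)


location = find_on_path()
if location is None:
    print("search-core was not found on the PATH")
else:
    print(f"search-core found at {location}")
```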