Operation

This document describes how to operate the Harvest Tool software. The following topics can be found in this document:

Note: The command-line examples in this section have been broken into multiple lines for readability. The commands should be reassembled into a single line prior to execution.

Tool Execution

Harvest Tool can be executed in various ways. This section describes how to run the tool, as well as its behaviors and caveats.

Command-Line Options

The following table describes the command-line options available:
Execute Harvest Tool

The Harvest Tool operates with a policy file to register product metadata. Details on how to create this policy file can be found in the Harvest Policy File section. This section demonstrates some of the ways that the tool can be executed:
Registering Products From a Single Target

The following command demonstrates how to register products to a non-secured registry instance from a target directory, $HOME/directory, where only files that end with a .xml file extension will be processed:

% harvest $HOME/directory -e "*.xml" -c policy.xml

Registering Products From Multiple Targets

The following command demonstrates how to register products to a non-secured registry instance from two target directories, $HOME/directory1 and $HOME/directory2, using the policy file, policy.xml. Only files that end with a .xml file extension will be processed. The output will go to a log file, log-file.txt:

% harvest $HOME/directory1, $HOME/directory2 -e "*.xml" -c policy.xml -l log-file.txt

Registering Products From Targets Specified In The Policy File

Targets can be specified either on the command line or in the policy file. Any targets specified on the command line will override any targets specified in the policy file. The following command demonstrates registering products based on targets specified in the policy file, policy.xml:

% harvest -c policy.xml

Registering Products To A Secured Registry Instance

The following command demonstrates how to register products to a secured registry instance from a target directory, $HOME/directory, using the policy file, policy.xml:

% harvest $HOME/directory -c policy.xml -u {username} -p {password} \
  -k {keystorePassword}

Excluding Sub-Directories To Traverse From a Target

The following command demonstrates registering products from a target directory, $HOME/directory, where the tool will not traverse the sub-directory CONTEXT:

% harvest $HOME/directory -c policy.xml -D "CONTEXT"

Registering PDS3 Products From a Single Target

The following command demonstrates registering PDS3 products to a non-secured registry instance from a target directory, $HOME/pds3Directory, where only files that end with a .LBL file extension will be processed:

% harvest $HOME/pds3Directory -pds3 -e "*.LBL" -c policy.xml

See the PDS Product Registration section for more details on how Harvest Tool supports PDS3 product registration.

Persistence Mode

The Harvest Tool can be run in persistence mode through an XML-RPC accessible web service called a daemon. Under this scenario, the Harvest Tool wakes up periodically, inspects a target directory or directories, and registers the latest products. This section details how to set this up.

To run the tool through the daemon, the command-line option flags -P and -w must be used. These tell the Harvest Tool the port number to use and how long to sleep in between crawls, respectively. When the daemon is running, it can be accessed through the following URL: http://localhost:{port number}/xmlrpc.

The following command demonstrates launching the Harvest Tool through the daemon on port 9001, where it will wait 120 seconds in between crawls:

% harvest -c policy.xml -u {username} -p {password} \
  -k {keystorePassword} -l log.txt -P 9001 -w 120

After running the above command, the daemon will be accessible at http://localhost:9001/xmlrpc.

To stop the daemon, a daemon controller is needed. The bin/ directory of the Harvest Tool release package contains a shell script, harvest-ctrl, and a batch file, harvest-ctrl.bat, which are used to gracefully shut down the daemon service on UNIX-like and Windows systems, respectively. They can also provide a few additional statistics about the crawling. The following table describes the command-line options available for harvest-ctrl:
The following table describes the operation names available to pass with the --operation command-line option:
The following examples demonstrate how to run harvest-ctrl using a few of the different operations. For demonstration purposes, assume that the daemon service is located at the following URL: http://localhost:9001/xmlrpc.

Determine the Status of the Daemon Service

The following command is used to find out if the daemon service is still running:

% harvest-ctrl --url http://localhost:9001/xmlrpc --operation --isRunning

Shutdown the Daemon Service

The following command demonstrates shutting down the daemon service:

% harvest-ctrl --url http://localhost:9001/xmlrpc --operation --stop

Harvest Policy File

The Harvest policy file is an XML-based configuration file that the tool uses to find products and register their metadata. The schema for the policy file can be found in the Harvest Policy Schema section. This section details how to set up the policy file to do PDS product registration.

PDS4 Product Registration

The following is an example of a policy file to perform registration of PDS4 data products:

<policy> <registryPackage> <name>Harvest Package Example</name> <description>This is an example of a Harvest run.</description> </registryPackage> <collections> <file>$HOME/VG2PLS_archive/data/Collection_Data.xml</file> <file>$HOME/VG2PLS_archive/document/Collection_document.xml</file> </collections> <directories> <path>$HOME/VG2PLS_archive</path> <fileFilter> <include>*.xml</include> </fileFilter> <directoryFilter> <exclude>CONTEXT</exclude> </directoryFilter> </directories> <accessUrls registerFileUrls="true"> <accessUrl> <baseUrl>http://pds.nasa.gov/pds4</baseUrl> <offset>$HOME</offset> </accessUrl> </accessUrls> <checksums generate="true"> <manifest>$HOME/VG2PLS_archive/vg2pls_archive.md5</manifest> </checksums> <storageIngestion> <serverUrl>http://localhost:9000</serverUrl> </storageIngestion> <candidates> <namespace prefix="dph" uri="http://pds.nasa.gov/schema/pds4/dph/v01"/> <productMetadata objectType="Product_Document"> <xPath slotName="information_model_version"> 
//Identification_Area/information_model_version </xPath> <xPath slotName="product_class"> //Identification_Area/product_class </xPath> <xPath slotName="alternate_id"> //Identification_Area/Alias_List/Alias/alternate_id </xPath> <xPath slotName="alternate_title"> //Identification_Area/Alias_List/Alias/alternate_title </xPath> <xPath slotName="citation_author_list"> //Identification_Area/Citation_Information/author_list </xPath> <xPath slotName="citation_editor_list"> //Identification_Area/Citation_Information/editor_list </xPath> <xPath slotName="citation_publication_year"> //Identification_Area/Citation_Information/publication_year </xPath> <xPath slotName="citation_keywords"> //Identification_Area/Citation_Information/keywords </xPath> <xPath slotName="citation_description"> //Identification_Area/Citation_Information/description </xPath> <xPath slotName="modification_date"> //Identification_Area/Modification_History/Modification_Detail/modification_date </xPath> <xPath slotName="modification_version_id"> //Identification_Area/Modification_History/Modification_Detail/version_id </xPath> <xPath slotName="modification_description"> //Identification_Area/Modification_History/Modification_Detail/description </xPath> <xPath slotName="external_reference_description"> //Reference_List/External_Reference/description </xPath> <xPath slotName="document_revision_id"> //Document_Description/revision_id </xPath> <xPath slotName="document_name"> //Document_Description/document_name </xPath> <xPath slotName="document_doi"> //Document_Description/doi </xPath> <xPath slotName="document_author_list"> //Document_Description/author_list </xPath> <xPath slotName="document_editor_list"> //Document_Description/editor_list </xPath> <xPath slotName="document_acknowledgement_text"> //Document_Description/acknowledgement_text </xPath> <xPath slotName="document_copyright"> //Document_Description/copyright </xPath> <xPath slotName="document_description"> //Document_Description/description 
</xPath> <xPath slotName="document_publication_date"> //Document_Description/publication_date </xPath> </productMetadata> <productMetadata objectType="Product_Observational"> <xPath slotName="information_model_version"> //Identification_Area/information_model_version </xPath> <xPath slotName="product_class"> //Identification_Area/product_class </xPath> <xPath slotName="alternate_id"> //Identification_Area/Alias_List/Alias/alternate_id </xPath> <xPath slotName="alternate_title"> //Identification_Area/Alias_List/Alias/alternate_title </xPath> <xPath slotName="citation_author_list"> //Identification_Area/Citation_Information/author_list </xPath> <xPath slotName="citation_editor_list"> //Identification_Area/Citation_Information/editor_list </xPath> <xPath slotName="citation_publication_year"> //Identification_Area/Citation_Information/publication_year </xPath> <xPath slotName="citation_keywords"> //Identification_Area/Citation_Information/keywords </xPath> <xPath slotName="citation_description"> //Identification_Area/Citation_Information/description </xPath> <xPath slotName="modification_date"> //Identification_Area/Modification_History/Modification_Detail/modification_date </xPath> <xPath slotName="modification_version_id"> //Identification_Area/Modification_History/Modification_Detail/version_id </xPath> <xPath slotName="modification_description"> //Identification_Area/Modification_History/Modification_Detail/description </xPath> <xPath slotName="observation_comment"> //Observation_Area/comment </xPath> <xPath slotName="observation_start_date_time"> //Observation_Area/Time_Coordinates/start_date_time </xPath> <xPath slotName="observation_stop_date_time"> //Observation_Area/Time_Coordinates/stop_date_time </xPath> <xPath slotName="observation_local_mean_solar_time"> //Observation_Area/Time_Coordinates/local_mean_solar_time </xPath> <xPath slotName="observation_local_true_solar_time"> //Observation_Area/Time_Coordinates/local_true_solar_time </xPath> <xPath 
slotName="observation_solar_longitute"> //Observation_Area/Time_Coordinates/solar_longitude </xPath> <xPath slotName="primary_result_type"> //Observation_Area/Primary_Result_Description/type </xPath> <xPath slotName="primary_result_purpose"> //Observation_Area/Primary_Result_Description/purpose </xPath> <xPath slotName="primary_result_data_regime"> //Observation_Area/Primary_Result_Description/data_regime </xPath> <xPath slotName="primary_result_reduction_level"> //Observation_Area/Primary_Result_Description/reduction_level </xPath> <xPath slotName="primary_result_description"> //Observation_Area/Primary_Result_Description/description </xPath> <xPath slotName="investigation_name"> //Observation_Area/Investigation_Area/name </xPath> <xPath slotName="investigation_type"> //Observation_Area/Investigation_Area/type </xPath> <xPath slotName="observing_system_name"> //Observation_Area/Observing_System/name </xPath> <xPath slotName="observing_system_description"> //Observation_Area/Observing_System/description </xPath> <xPath slotName="observing_system_component_name"> //Observation_Area/Observing_System/Observing_System_Component/name </xPath> <xPath slotName="observing_system_component_description"> //Observation_Area/Observing_System/Observing_System_Component/description </xPath> <xPath slotName="observing_system_component_type"> //Observation_Area/Observing_System/Observing_System_Component/\ observing_system_component_type </xPath> <xPath slotName="target_name"> //Observation_Area/Target_Identification/name </xPath> <xPath slotName="target_alternate_designation"> //Observation_Area/Target_Identification/alternate_designation </xPath> <xPath slotName="target_type"> //Observation_Area/Target_Identification/type </xPath> <xPath slotName="target_description"> //Observation_Area/Target_Identification/description </xPath> <xPath slotName="spacecraft_clock_start_count"> //Observation_Area/Mission_Area/dph:spacecraft_clock_start_count </xPath> <xPath 
slotName="spacecraft_clock_stop_count"> //Observation_Area/Mission_Area/dph:spacecraft_clock_stop_count </xPath> <xPath slotName="external_reference_description"> //Reference_List/External_Reference/description </xPath> </productMetadata> </candidates> </policy>

The policy file is made up of the following complex type elements: registryPackage, collections, directories, checksums, storageIngestion, accessUrls, candidates, and productMetadata.

registryPackage

Each time the Harvest Tool runs, it creates a package in the registry to group the product registrations together. Specify this element to give a registry package a name and/or description. The following table describes the elements that are allowed:
collections

Specify this element to tell the Harvest Tool to register the collections first, before crawling a target directory. This is required when the target directory contains collections that are co-located with their members, so that the tool can distinguish primary from secondary members. The following table describes the elements that are allowed:
In the example above, the Harvest Tool will register the following collections before crawling the target directory:
Once these collections are registered, the primary and secondary members are cached in memory. As the Harvest Tool crawls through a target directory, any secondary members are identified and are not registered. In addition, a SKIP message is issued in the log report to indicate that the tool has identified a non-primary member.

When the target directory is a hierarchy in which the collection product is located one level above its members, much like the PDS context bundle, there is no need to specify the collections in the Harvest policy file. Under this scenario, the collections are registered first, before the Harvest Tool traverses down into the sub-directory containing the members.

directories

Specify this element to tell the Harvest Tool where to crawl for data products. The following table describes the elements that are allowed:
In the example above, the Harvest Tool will crawl the directory location, $HOME/VG2PLS_archive, looking for files that have a .xml file extension. If the fileFilter element is omitted from the policy file, the default is to process all files in the directory. In addition, the CONTEXT directory will be ignored while traversing the target directory.

accessUrls

Specify this element to provide links to the physical data products. The links will be placed in the registry as slots under the slot name accessUrl. An optional attribute named registerFileUrls can be specified which, if set to true, will create file URL links. The accessUrls element can contain multiple accessUrl element tags. The following table describes the elements that are allowed within the accessUrl tag:
In the policy example above, the Harvest Tool will strip the leading $HOME from the absolute path of a product before appending the remainder to the base URL of http://pds.nasa.gov/pds4. The following example demonstrates what the resulting access URL will be for a registered product located at $HOME/VG2PLS_archive/browse/Collection_Browse.xml:

http://pds.nasa.gov/pds4/VG2PLS_archive/browse/Collection_Browse.xml

checksums

Checksum generation is turned off by default in the Harvest Tool. To turn it on, set the generate attribute to true. The following table describes the elements that are allowed within the checksums tag:
The following describes the tool behavior based on the different checksum settings:

Checksum Manifest File Provided and Generate Flag Set To true

Harvest will generate a checksum for each file encountered and verify it against the supplied checksum manifest file. If the data file checksum was supplied in the label, Harvest will verify it as well. A warning message will be issued in the log report if a mismatch occurs. In any case, the generated checksum value is included in the associated Product_File_Repository product.

Checksum Manifest File Provided and Generate Flag Set To false (or not set at all)

Harvest will not generate checksums, but will use the value from the checksum manifest file to populate the associated Product_File_Repository product. If a data file checksum was supplied in the label, Harvest will compare the value from the manifest against the value supplied in the label and issue a warning if there is a mismatch.

Checksum Manifest File Not Provided and Generate Flag Set To true

Harvest will generate a checksum for each file encountered and verify it against the checksum supplied in the label, if there is one. If there is a mismatch, a warning message will be issued in the log report. The generated value is included in the associated Product_File_Repository product.

Checksum Manifest File Not Provided and Generate Flag Set To false

Harvest will not generate checksums. If the data file checksum was supplied in the label, Harvest will populate the associated Product_File_Repository product with that value.

storageIngestion

Specify this element to tell the Harvest Tool to ingest data products to the PDS Storage Service. The following table describes the elements that are allowed:
In the example above, the Harvest Tool will ingest data products to the PDS Storage Service at http://localhost:9000. When a data product is ingested, the PDS Storage Service returns a product id, which is a reference to the ingested product. This id is placed as a slot in the registry under the slot name storageServiceProductId.

candidates

Specify this element to tell the Harvest Tool what product types to register and what metadata to extract from a data product. This is a required element in the policy file. The following table describes the elements that are allowed:
By default, the Harvest Tool defines the default namespace to be the PDS namespace. To override this default, specify the default attribute in the namespace element and give it a value of true. The following sets the dph namespace to be the default namespace in Harvest:

<candidates>
  <namespace prefix="dph" uri="http://pds.nasa.gov/schema/pds4/dph/v01" default="true"/>
  ...

Namespaces need to be defined in the Harvest policy file only if the metadata to be extracted exists in a namespace other than the PDS namespace. In the example above, a namespace with the prefix dph and uri http://pds.nasa.gov/schema/pds4/dph/v01 has been defined. This means that any xPath expressions defined in the policy file can use the dph prefix to extract metadata that are within the dph namespace. xPaths are explained in greater detail in the productMetadata section.

productMetadata

Specify this element to tell the Harvest Tool what metadata to register. It requires an attribute called objectType that tells the Harvest Tool what product types to register. The following table describes the elements that are allowed:
In the example above, the policy file tells the Harvest Tool to look for and register the Product_Document and Product_Observational object types. Also in the example is a set of xPath elements found under each productMetadata element. These define what metadata to extract from the different products.

XPath is a query language that uses path expressions to select nodes in an XML document. These path expressions look very much like paths in a traditional computer file system. In its simplest form, prepending // to a name will find the element no matter where it is in the XML file.

The following XPath expression will find the start_date_time element within the default namespace, no matter where this element is located in the file:

//start_date_time

The following XPath expression will find the spacecraft_clock_start_count element within the dph namespace, no matter where this element is located in the file:

//dph:spacecraft_clock_start_count

The following XPath expression will find all information_model_version elements that are children of Identification_Area within the default namespace:

//Identification_Area/information_model_version

The following XPath expression will find all name elements that are children of Target_Identification and that have a value of MARS:

//Target_Identification/name[text()="MARS"]

For a more detailed explanation of XPath, search online for an XPath tutorial.

The slotName attribute within the xPath element allows the renaming of metadata element names when they are registered as slots in the registry. By default, the slot name is set to the element name that results from an XPath expression. For example, for the XPath expression //Target_Identification/name, the slot name will be set to name.
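The XPath expressions and slot naming described above can be tried outside of the Harvest Tool. The following is a minimal sketch in Python using the standard library's xml.etree.ElementTree module; it is not how the Harvest Tool itself evaluates expressions. The sample label, the namespace URIs, and the helper names extract and extract_slots are assumptions made for illustration. ElementTree supports only a subset of XPath, so the sketch writes .// in place of //, gives the default namespace an explicit pds prefix, and uses [.='MARS'] in place of [text()="MARS"].

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed label used only for this illustration.
# It is not a valid PDS4 product and the namespace URIs are assumed.
LABEL = """\
<Product_Observational xmlns="http://pds.nasa.gov/schema/pds4/pds"
                       xmlns:dph="http://pds.nasa.gov/schema/pds4/dph/v01">
  <Identification_Area>
    <information_model_version>0.8.0.0.k</information_model_version>
    <product_class>Product_Observational</product_class>
  </Identification_Area>
  <Observation_Area>
    <Time_Coordinates>
      <start_date_time>1979-07-09T00:00:00Z</start_date_time>
    </Time_Coordinates>
    <Target_Identification>
      <name>MARS</name>
    </Target_Identification>
    <Target_Identification>
      <name>PHOBOS</name>
    </Target_Identification>
    <Mission_Area>
      <dph:spacecraft_clock_start_count>1234</dph:spacecraft_clock_start_count>
    </Mission_Area>
  </Observation_Area>
</Product_Observational>
"""

# Prefix-to-URI map. ElementTree needs an explicit prefix even for
# elements that live in the document's default namespace.
NS = {
    "pds": "http://pds.nasa.gov/schema/pds4/pds",
    "dph": "http://pds.nasa.gov/schema/pds4/dph/v01",
}


def extract(root, path):
    """Return the stripped text of every element matched by path."""
    return [(el.text or "").strip() for el in root.findall(path, NS)]


def extract_slots(root, slot_paths):
    """Map each slot name to the values found by its path expression."""
    return {slot: extract(root, path) for slot, path in slot_paths.items()}


root = ET.fromstring(LABEL)

# Element anywhere in the file, default (pds) namespace.
print(extract(root, ".//pds:start_date_time"))
# Element anywhere in the file, dph namespace.
print(extract(root, ".//dph:spacecraft_clock_start_count"))
# Children of Identification_Area only.
print(extract(root, ".//pds:Identification_Area/pds:information_model_version"))
# name children of Target_Identification whose value is MARS.
print(extract(root, ".//pds:Target_Identification/pds:name[.='MARS']"))

# Renaming on extraction, in the spirit of the slotName attribute.
print(extract_slots(root, {
    "target_identification_name": ".//pds:Target_Identification/pds:name",
}))
```

The final call returns both target names under the single key target_identification_name, which mirrors how a slotName replaces the default slot name taken from the matched element.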
The following demonstrates setting the policy file to find any name elements that are children of Target_Identification and setting the slot name to target_identification_name:

<xPath slotName="target_identification_name">//Target_Identification/name</xPath>

PDS3 Product Registration

By default, the tool registers discovered PDS3 products under the Product_Proxy_PDS3 objectType in the registry. Additionally, the tool has to dynamically create certain metadata in order to support ingestion of PDS3 data products into the registry. The following is an example of a policy file to perform product registration of PDS3 products:

<policy>
  <!-- Specify a single directory containing the PDS3 data products to register. -->
  <pds3Directory>
    <path>/Users/mcayanan/pds3</path>
    <fileFilter>
      <include>*.LBL</include>
    </fileFilter>
  </pds3Directory>
  <candidates>
    <!-- Harvest will register PDS3 data products under the objectType 'Product_Proxy_PDS3'. -->
    <pds3ProductMetadata>
      <!-- Tells Harvest what element values to use to create the LID. -->
      <lidContents prefix="urn:nasa:jpl">
        <elementName>DATA_SET_ID</elementName>
        <elementName>INSTRUMENT_ID</elementName>
        <elementName>PRODUCT_ID</elementName>
      </lidContents>
      <titleContents>
        <elementName>DATA_SET_ID</elementName>
        <elementName>PRODUCT_ID</elementName>
      </titleContents>
      <staticMetadata>
        <slot name="information_model_version">
          <value>0.8.0.0.k</value>
        </slot>
        <slot name="target_ref">
          <value>urn:nasa:pds:target:MARS::1.0</value>
        </slot>
        <slot name="mission_ref">
          <value>urn:nasa:pds:mission.MER</value>
        </slot>
      </staticMetadata>
      <!-- Register any additional metadata. Default is to register metadata defined in the identification area of the Product_Proxy_PDS3 schema. -->
      <ancillaryMetadata>
        <elementName slotName="start_time">
          START_TIME
        </elementName>
        <elementName slotName="stop_time">
          STOP_TIME
        </elementName>
      </ancillaryMetadata>
      <includePaths>
        <path>/data/pds3/label</path>
      </includePaths>
    </pds3ProductMetadata>
  </candidates>
</policy>

This policy file is made up of the following complex type elements: pds3Directory, pds3ProductMetadata, lidContents, titleContents, staticMetadata, ancillaryMetadata, and includePaths.

pds3Directory

Specify this element to tell the Harvest Tool the directory location to crawl. The following table describes the elements that are allowed:
In the example above, the Harvest Tool will crawl for PDS3 data products starting at the location /Users/mcayanan/pds3, looking for files with a .LBL file extension.

pds3ProductMetadata

Specify this element to tell the Harvest Tool what metadata to ingest into the registry when registering PDS3 data products. This element must be specified within the candidates tag as shown in the example. The following table describes the elements that are allowed:
lidContents

Specify this element to tell the Harvest Tool how to generate the logical identifier. The prefix attribute specifies what to prepend to the logical identifier. This is a required attribute. In the example above, the logical identifier of every discovered PDS3 data product will be prefixed with urn:nasa:jpl. An optional attribute called appendFilename can also be specified. When this attribute is set to true, the Harvest Tool will append the product label filename (minus the extension) to the end of the logical identifier. The following table describes the elements that are allowed:
In the policy example above, the logical identifier will be formed using the following contents: prefix + DATA_SET_ID + ":" + INSTRUMENT_ID + ":" + PRODUCT_ID.

titleContents

Specify this element to tell the Harvest Tool how to generate the title of the registered PDS3 product. An optional attribute called appendFilename can be specified. When this attribute is set to true, the Harvest Tool will append the product label filename (minus the extension) to the end of the title. The following table describes the elements that are allowed:
In the policy example above, the title will be formed using the following contents: DATA_SET_ID + " " + PRODUCT_ID.

staticMetadata

Specify this element to tell the Harvest Tool to register a set of static metadata with the discovered PDS3 data products. The following table describes the elements that are allowed:
The policy example above specifies that for every PDS3 data product that is registered, the following metadata will be added as slots: information_model_version, target_ref, and mission_ref.

ancillaryMetadata

Specify this element to tell the Harvest Tool what additional metadata to register. The following table describes the elements that are allowed:
In the example above, the values from the following elements will be extracted from a PDS3 product label: START_TIME and STOP_TIME.

includePaths

Specify this element to tell the Harvest Tool where to find file references specified in a label. By default, the tool will look for the file reference in the location of the label file. The following table describes the elements that are allowed:
In the example above, the tool will look at the /data/pds3/label directory for file references if they cannot be found in the same location as the label file.

Report Format

This section describes the contents of the Harvest Tool report. At this time, the Harvest Tool only outputs a series of log messages. The log reports the success or failure of each attempt to register a discovered product. Additionally, any syntactical errors in a discovered product are reported. A log entry consists of a severity level, file name, and a message. The following is an example of some of the log messages that can be expected from the Harvest Tool:

PDS Harvest Tool Log Version Version 1.3.0-dev Time Wed, Sep 19 2012 at 01:56:18 PM Severity Level INFO Registry Location http://localhost:8080/registry Registry Package Name Harvest Package Example Registration Package GUID urn:uuid:c1e92c9c-8eca-4742-b8d8-5a329aae89a5 INFO: XML extractor set to the following default namespace: \ http://pds.nasa.gov/pds4/pds/v09 INFO: [/Users/mcayanan/pds4/VG2PLS_archive/vg2pls_archive.md5] \ Processing checksum manifest. INFO: [/Users/mcayanan/pds4/VG2PLS_archive/Product_Bundle.xml] \ Begin processing. INFO: [/Users/mcayanan/pds4/VG2PLS_archive/Product_Bundle.xml] \ line 69: Mapping reference type 'bundle_has_browse_collection' to 'collection_ref'. INFO: [/Users/mcayanan/pds4/VG2PLS_archive/Product_Bundle.xml] \ line 74: Mapping reference type 'bundle_has_context_collection' to 'collection_ref'. INFO: [/Users/mcayanan/pds4/VG2PLS_archive/Product_Bundle.xml] \ line 79: Mapping reference type 'bundle_has_data_collection' to 'collection_ref'. INFO: [/Users/mcayanan/pds4/VG2PLS_archive/Product_Bundle.xml] \ line 84: Mapping reference type 'bundle_has_document_collection' to 'collection_ref'. INFO: [/Users/mcayanan/pds4/VG2PLS_archive/Product_Bundle.xml] \ line 89: Mapping reference type 'bundle_has_schema_collection' to 'collection_ref'. 
INFO: [/Users/mcayanan/pds4/VG2PLS_archive/Product_Bundle.xml] \ Setting LID-based association, 'urn:nasa:pds:example.DPH.sampleArchive:browse', \ under slot name 'collection_ref'. INFO: [/Users/mcayanan/pds4/VG2PLS_archive/Product_Bundle.xml] \ Setting LID-based association, 'urn:nasa:pds:example.DPH.sampleArchive:context', \ under slot name 'collection_ref'. INFO: [/Users/mcayanan/pds4/VG2PLS_archive/Product_Bundle.xml] \ Setting LID-based association, 'urn:nasa:pds:example.DPH.sampleArchive:data', \ under slot name 'collection_ref'. INFO: [/Users/mcayanan/pds4/VG2PLS_archive/Product_Bundle.xml] \ Setting LID-based association, 'urn:nasa:pds:example.DPH.sampleArchive:document', \ under slot name 'collection_ref'. INFO: [/Users/mcayanan/pds4/VG2PLS_archive/Product_Bundle.xml] \ Setting LID-based association, 'urn:nasa:pds:example.DPH.sampleArchive:schema', \ under slot name 'collection_ref'. INFO: [/Users/mcayanan/pds4/VG2PLS_archive/Product_Bundle.xml] \ Created access url: http://pds.nasa.gov/pds4/VG2PLS_archive/Product_Bundle.xml INFO: [/Users/mcayanan/pds4/VG2PLS_archive/Product_Bundle.xml] \ Created access url: file:///Users/mcayanan/pds4/VG2PLS_archive/Product_Bundle.xml SUCCESS: [/Users/mcayanan/pds4/VG2PLS_archive/Product_Bundle.xml] \ Successfully registered product: urn:nasa:pds:example.DPH.sampleArchive::1.0 INFO: [/Users/mcayanan/pds4/VG2PLS_archive/Product_Bundle.xml] \ Product has the following GUID: urn:uuid:75ab2390-208b-4bf6-a753-d40ae428f2ae INFO: [/Users/mcayanan/pds4/VG2PLS_archive/Product_Bundle.xml] \ Capturing file information for Product_Bundle.xml INFO: [/Users/mcayanan/pds4/VG2PLS_archive/Product_Bundle.xml] \ Generated checksum '8e82ad4c7c1b3d3f91744f038a0af0af' matches \ the supplied checksum '8e82ad4c7c1b3d3f91744f038a0af0af' in the manifest for file object \ '/Users/mcayanan/pds4/VG2PLS_archive/Product_Bundle.xml'. 
INFO: [/Users/mcayanan/pds4/VG2PLS_archive/Product_Bundle.xml] \ No checksum to compare against in the product label for file object \ '/Users/mcayanan/pds4/VG2PLS_archive/Product_Bundle.xml'. INFO: [/Users/mcayanan/pds4/VG2PLS_archive/Product_Bundle.xml] \ Capturing file object metadata for README.TXT INFO: [/Users/mcayanan/pds4/VG2PLS_archive/Product_Bundle.xml] \ Generated checksum '5ef7af310b99d8189e670830c954a290' matches the \ supplied checksum '5ef7af310b99d8189e670830c954a290' in the manifest for file object \ '/Users/mcayanan/pds4/VG2PLS_archive/README.TXT'. INFO: [/Users/mcayanan/pds4/VG2PLS_archive/Product_Bundle.xml] \ Generated checksum '5ef7af310b99d8189e670830c954a290' matches the \ supplied checksum '5ef7af310b99d8189e670830c954a290' in the produt label for file object \ '/Users/mcayanan/pds4/VG2PLS_archive/README.TXT'. INFO: [/Users/mcayanan/pds4/VG2PLS_archive/Product_Bundle.xml] \ Created access url: http://pds.nasa.gov/pds4/VG2PLS_archive/Product_Bundle.xml INFO: [/Users/mcayanan/pds4/VG2PLS_archive/Product_Bundle.xml] \ Created access url: file:///Users/mcayanan/pds4/VG2PLS_archive/Product_Bundle.xml SUCCESS: [/Users/mcayanan/pds4/VG2PLS_archive/Product_Bundle.xml] \ Successfully registered product: urn:nasa:pds:example.DPH.sampleArchive:Product_Bundle.xml::1.0 INFO: [/Users/mcayanan/pds4/VG2PLS_archive/Product_Bundle.xml] \ Product has the following GUID: urn:uuid:73cf357f-3ab9-4c0b-9824-23bbeca24b28 INFO: [/Users/mcayanan/pds4/VG2PLS_archive/Product_Bundle.xml] \ Created access url: http://pds.nasa.gov/pds4/VG2PLS_archive/README.TXT INFO: [/Users/mcayanan/pds4/VG2PLS_archive/Product_Bundle.xml] \ Created access url: file:///Users/mcayanan/pds4/VG2PLS_archive/README.TXT SUCCESS: [/Users/mcayanan/pds4/VG2PLS_archive/Product_Bundle.xml] \ Successfully registered product: urn:nasa:pds:example.DPH.sampleArchive:README.TXT::1.0 INFO: [/Users/mcayanan/pds4/VG2PLS_archive/Product_Bundle.xml] \ Product has the following GUID: 
urn:uuid:18bef1c9-2ed9-4879-88d9-01bf4c3fafe1 ... Summary: 17 of 17 file(s) processed, 3 other file(s) skipped 0 error(s), 11 warning(s) 17 of 17 products registered. 88 of 88 ancillary products registered. Product Types Registered: 2 Product_Document 1 Product_Mission_PDS3 1 Product_Browse 1 Product_Observational 2 Product_File_Text 4 Product_XML_Schema 1 Product_Bundle 88 Product_File_Repository 5 Product_Collection 82 of 82 generated checksums matched their supplied value in the manifest, 6 value(s) not checked. 70 of 71 generated checksums matched the supplied value in their product label, 17 value(s) not checked. 102 of 102 associations registered. End of Log

Common Errors

Execution of the Harvest Tool may result in the following message appearing in the log:

INFO: XML extractor set to the following default namespace: \
  http://pds.nasa.gov/schema/pds4/pds
INFO: [/pds4/VG2PLS_archive/Product_Bundle.xml] Begin processing.
SKIP: [/pds4/VG2PLS_archive/Product_Bundle.xml] No product_class element found.

The message above is normally the result of a namespace mismatch between the Harvest Tool configuration and the product labels being registered. See the PDS4 Product Registration section above for details on specifying the namespace in the configuration file. Alternatively, the message may be accurate: the product label really does not contain the <product_class> element. In that case, the file is not a valid PDS product label.
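When a label is skipped with this message, it can help to check which namespace the label's product_class element actually lives in, and compare it with the default namespace reported at the top of the log. The following is a rough diagnostic sketch in Python using only the standard library. It is not part of the Harvest Tool, and the function names split_tag and diagnose, as well as the sample label, are hypothetical.

```python
import xml.etree.ElementTree as ET


def split_tag(tag):
    """Split an ElementTree tag of the form '{uri}local' into (uri, local)."""
    if tag.startswith("{"):
        uri, _, local = tag[1:].partition("}")
        return uri, local
    return "", tag


def diagnose(label_xml, expected_ns):
    """Classify a label as 'ok', 'missing', or a namespace mismatch.

    expected_ns is the default namespace URI printed in the Harvest log.
    """
    root = ET.fromstring(label_xml)
    # Collect the namespace of every product_class element in the label.
    found = [
        split_tag(el.tag)[0]
        for el in root.iter()
        if split_tag(el.tag)[1] == "product_class"
    ]
    if not found:
        return "missing"
    if expected_ns in found:
        return "ok"
    return "namespace mismatch: product_class is in '%s'" % found[0]


# Hypothetical label fragment for demonstration purposes only.
SAMPLE = (
    '<Product_Bundle xmlns="http://pds.nasa.gov/pds4/pds/v09">'
    "<Identification_Area>"
    "<product_class>Product_Bundle</product_class>"
    "</Identification_Area>"
    "</Product_Bundle>"
)

print(diagnose(SAMPLE, "http://pds.nasa.gov/schema/pds4/pds"))
print(diagnose(SAMPLE, "http://pds.nasa.gov/pds4/pds/v09"))
```

A result of "missing" corresponds to the second case described above, where the file is not a valid PDS product label; a mismatch result points to the namespace configuration instead.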