DLESE Tools
v1.6.0

org.dlese.dpc.index.writer
Class DleseIMSFileIndexingWriter

java.lang.Object
  extended by org.dlese.dpc.index.writer.FileIndexingServiceWriter
      extended by org.dlese.dpc.index.writer.XMLFileIndexingWriter
          extended by org.dlese.dpc.index.writer.ItemFileIndexingWriter
              extended by org.dlese.dpc.index.writer.DleseIMSFileIndexingWriter
All Implemented Interfaces:
DocWriter

public class DleseIMSFileIndexingWriter
extends ItemFileIndexingWriter

Creates a Lucene Document from a DLESE-IMS XML source file.

The Lucene Document fields created by this class (in addition to the ones listed for FileIndexingServiceWriter) are:

doctype - Set to 'dlese_ims'. Stored. Note: the actual indexing of this field happens in the superclass FileIndexingServiceWriter.
additional fields - A number of additional fields are defined. See the Java code for method addFrameworkFields(Document, Document) for details.
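
The division of labor described above (the superclass writes 'doctype', the subclass adds framework-specific fields) can be illustrated with a minimal, self-contained sketch. It does not use the real DLESE or Lucene classes: BaseWriter, ImsWriter, the Map-based document and the field 'exampleField' are all hypothetical stand-ins chosen for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class DocTypeFieldSketch {

    // Hypothetical stand-in for the superclass: it owns the indexing of 'doctype'.
    abstract static class BaseWriter {
        abstract String getDocType();

        // Subclasses contribute their framework-specific fields here.
        abstract void addFrameworkFields(Map<String, String> newDoc);

        final Map<String, String> buildDoc() {
            Map<String, String> doc = new LinkedHashMap<>();
            doc.put("doctype", getDocType()); // written by the base class, not the subclass
            addFrameworkFields(doc);
            return doc;
        }
    }

    // Hypothetical stand-in for a DLESE-IMS writer.
    static class ImsWriter extends BaseWriter {
        @Override
        String getDocType() {
            return "dlese_ims";
        }

        @Override
        void addFrameworkFields(Map<String, String> newDoc) {
            newDoc.put("exampleField", "exampleValue"); // placeholder, not a real DLESE field
        }
    }

    public static void main(String[] args) {
        // Prints the assembled field map, with 'doctype' first.
        System.out.println(new ImsWriter().buildDoc());
    }
}
```

The point of the sketch is only that the subclass supplies the doctype value while the base class decides where and how it is indexed.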

Author:
John Weatherley, Ryan Deardorff

Constructor Summary
DleseIMSFileIndexingWriter()
          Create a DleseIMSFileIndexingWriter
 
Method Summary
protected  String[] _getIds()
          Gets the id attribute of the DleseIMSFileIndexingWriter object
protected  void addFrameworkFields(org.apache.lucene.document.Document newDoc, org.apache.lucene.document.Document existingDoc)
          Adds custom fields to the index that are unique to DLESE-IMS
protected  void destroy()
          Release map resources for GC after processing.
protected  Date getAccessionDate()
          Returns the accession date, which is null (unknown).
protected  String getAccessionStatus()
          Returns the accession status of this record, for example 'accessioned'.
protected  MmdRec[] getAllMmdRecs()
          Returns the MmdRecs for all records associated with this resource, including myMmdRec.
protected  MmdRec[] getAssociatedMmdRecs()
          Returns the MmdRecs for records in other collections that catalog the same resource.
protected  String getContent()
          Returns null.
protected  String getContentType()
          Returns null.
protected  Date getCreationDate()
          Returns null.
protected  String getCreator()
          Returns the full name of the item's creator.
protected  String getCreatorLastName()
          Returns the last name of the item's creator.
 String getDescription()
          Gets the description attribute of the DleseIMSFileIndexingWriter object
 String getDocType()
          Gets the docType attribute of the DleseIMSFileIndexingWriter, which is 'dlese_ims'.
protected  boolean getHasRelatedResource()
          Returns false (not implemented).
protected  String getKeywords()
          Returns the item's keywords.
protected  MmdRec getMyMmdRec()
          Returns the MmdRec for this record only.
 String getReaderClass()
          Gets the name of the concrete DocReader class that is used to read this type of Document, which is "ItemDocReader".
protected  String[] getRelatedResourceIds()
          Returns the IDs of related resources that are cataloged by ID, or null if none are present
protected  String[] getRelatedResourceUrls()
          Returns the URLs of related resources that are cataloged by URL, or null if none are present
 String getTitle()
          Gets the title attribute of the DleseIMSFileIndexingWriter object
 String[] getUrls()
          Gets the url attribute of the DleseIMSFileIndexingWriter object
protected  String getValidationReport()
          Gets a report detailing any errors found in the validation of the data, or null if no error was found.
protected  Date getWhatsNewDate()
          Returns the date used to determine "What's new" in the library, which is null (unknown).
protected  String getWhatsNewType()
          Returns null (unknown).
 boolean indexFullContentInDefaultAndStems()
          Default and stems fields handled here, so do not index full content.
 void initItem(File source, org.apache.lucene.document.Document existingDoc)
          Initialize the XML map prior to processing
 
Methods inherited from class org.dlese.dpc.index.writer.ItemFileIndexingWriter
addFields, getMyAnnoResultDocs, init
 
Methods inherited from class org.dlese.dpc.index.writer.XMLFileIndexingWriter
addCustomFields, getBoundingBox, getCollections, getDeletedDoc, getDocGroup, getDom4jDoc, getFieldContent, getFieldContent, getFieldName, getIds, getIndex, getMyCollectionDoc, getOaiModtime, getPrimaryId, getRecordDataService, getRelatedIds, getRelatedIdsMap, getRelatedUrls, getRelatedUrlsMap, getTermStringFromStringArray, getXmlIndexer, getXmlIndexerFieldsConfig
 
Methods inherited from class org.dlese.dpc.index.writer.FileIndexingServiceWriter
abortIndexing, addDocToRemove, addToAdminDefaultField, addToDefaultField, create, getConfigAttributes, getDocsource, getFileContent, getFileIndexingPlugin, getFileIndexingService, getLuceneDoc, getPreviousRecordDoc, getSessionAttributes, getSourceDir, getSourceFile, isMakingDeletedDoc, isValidationEnabled, prtln, prtlnErr, setConfigAttributes, setDebug, setFileIndexingPlugin, setFileIndexingService, setIsMakingDeletedDoc, setValidationEnabled
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

DleseIMSFileIndexingWriter

public DleseIMSFileIndexingWriter()
Create a DleseIMSFileIndexingWriter

Method Detail

initItem

public void initItem(File source,
                     org.apache.lucene.document.Document existingDoc)
              throws Exception
Initialize the XML map prior to processing

Specified by:
initItem in class ItemFileIndexingWriter
Parameters:
source - The source file being indexed.
existingDoc - A Document that previously existed in the index for this item, if present
Throws:
Exception - Thrown if an error occurs while reading the XML map

destroy

protected void destroy()
Release map resources for GC after processing.

Specified by:
destroy in class ItemFileIndexingWriter

getReaderClass

public String getReaderClass()
Gets the name of the concrete DocReader class that is used to read this type of Document, which is "ItemDocReader".

Specified by:
getReaderClass in interface DocWriter
Specified by:
getReaderClass in class ItemFileIndexingWriter
Returns:
The String "org.dlese.dpc.index.reader.ItemDocReader".
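
A class name returned as a String is typically useful for loading the class at runtime. The sketch below shows one way that could be done with standard Java reflection. How the DLESE code actually consumes this value is not stated here, so the reflective lookup is an assumption, and DemoReader is a hypothetical stand-in for a real DocReader.

```java
public class ReaderClassSketch {

    // Hypothetical stand-in for a DocReader implementation.
    public static class DemoReader {
        public String describe() {
            return "demo reader";
        }
    }

    // Creates an instance of the named class through its public no-argument constructor.
    // Reflection failures are wrapped in an unchecked exception to keep callers simple.
    static Object instantiate(String className) {
        try {
            return Class.forName(className).getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot instantiate " + className, e);
        }
    }

    public static void main(String[] args) {
        // In the real API this String would come from getReaderClass().
        String readerClass = DemoReader.class.getName();
        Object reader = instantiate(readerClass);
        System.out.println(reader.getClass().getSimpleName());
    }
}
```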

getValidationReport

protected String getValidationReport()
                              throws Exception
Gets a report detailing any errors found in the validation of the data, or null if no error was found.

Specified by:
getValidationReport in class ItemFileIndexingWriter
Returns:
Null if no data validation errors were found, otherwise a String that details the nature of the error.
Throws:
Exception - If an error occurs while performing the validation.
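
The "null means valid" convention is easy to misread, so here is a minimal sketch of a caller that follows it. The validator and its two checks are hypothetical and far simpler than real schema validation. Only the return convention (null when no error is found, otherwise a descriptive String) comes from the description above.

```java
public class ValidationReportSketch {

    // Hypothetical validator: returns null when no problem is found,
    // otherwise a String that describes the problem.
    static String validationReport(String xml) {
        if (xml == null || xml.trim().isEmpty()) {
            return "Record is empty";
        }
        if (!xml.trim().startsWith("<")) {
            return "Record does not look like XML";
        }
        return null;
    }

    // A record is valid exactly when there is no report.
    static boolean isValid(String xml) {
        return validationReport(xml) == null;
    }

    public static void main(String[] args) {
        String report = validationReport("not xml");
        if (report == null) {
            System.out.println("valid");
        } else {
            System.out.println("invalid: " + report);
        }
    }
}
```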

getDocType

public final String getDocType()
Gets the docType attribute of the DleseIMSFileIndexingWriter, which is 'dlese_ims'.

Specified by:
getDocType in interface DocWriter
Specified by:
getDocType in class ItemFileIndexingWriter
Returns:
The docType, which is 'dlese_ims'.

_getIds

protected final String[] _getIds()
                          throws Exception
Gets the id attribute of the DleseIMSFileIndexingWriter object

Specified by:
_getIds in class XMLFileIndexingWriter
Returns:
The id value
Throws:
Exception - If an error occurs

getTitle

public final String getTitle()
                      throws Exception
Gets the title attribute of the DleseIMSFileIndexingWriter object

Specified by:
getTitle in class XMLFileIndexingWriter
Returns:
The title value
Throws:
Exception - If an error occurs

getDescription

public final String getDescription()
                            throws Exception
Gets the description attribute of the DleseIMSFileIndexingWriter object

Specified by:
getDescription in class XMLFileIndexingWriter
Returns:
The description value
Throws:
Exception - If an error occurs

getKeywords

protected String getKeywords()
                      throws Exception
Returns the item's keywords. An empty String or null is acceptable. The String is tokenized, stored and indexed under the field key 'keywords' and is also indexed in the 'default' field.

Specified by:
getKeywords in class ItemFileIndexingWriter
Returns:
The keywords String
Throws:
Exception - This method should throw an Exception with an appropriate error message if an error occurs.
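
The behavior described for keywords (indexed under 'keywords' and also folded into 'default', with null or empty values tolerated) can be sketched as follows. This is an illustration only: it models the index document as a plain Map, indexKeywords is a hypothetical helper, and the real class performs this work through Lucene fields in its superclasses.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class KeywordFieldSketch {

    // Builds the 'keywords' and 'default' entries for one record.
    // A null or blank keywords String adds nothing and is not an error.
    static Map<String, String> indexKeywords(String keywords, String existingDefault) {
        Map<String, String> fields = new LinkedHashMap<>();
        String defaultText = existingDefault == null ? "" : existingDefault;

        if (keywords != null && !keywords.trim().isEmpty()) {
            String trimmed = keywords.trim();
            fields.put("keywords", trimmed);
            // The same text is also appended to the catch-all 'default' field.
            defaultText = defaultText.isEmpty() ? trimmed : defaultText + " " + trimmed;
        }

        fields.put("default", defaultText);
        return fields;
    }

    public static void main(String[] args) {
        System.out.println(indexKeywords("ocean tides", "Tide tables"));
        System.out.println(indexKeywords(null, "Tide tables"));
    }
}
```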

getCreatorLastName

protected String getCreatorLastName()
                             throws Exception
Returns the last name of the item's creator. An empty String or null is acceptable. The String is tokenized, stored and indexed in the 'default' field only.

Specified by:
getCreatorLastName in class ItemFileIndexingWriter
Returns:
The creator's last name String
Throws:
Exception - This method should throw an Exception with an appropriate error message if an error occurs.

getAssociatedMmdRecs

protected MmdRec[] getAssociatedMmdRecs()
Returns the MmdRecs for records in other collections that catalog the same resource. Does not include myMmdRec.

Specified by:
getAssociatedMmdRecs in class ItemFileIndexingWriter
Returns:
null

getAllMmdRecs

protected MmdRec[] getAllMmdRecs()
Returns the MmdRecs for all records associated with this resource, including myMmdRec.

Specified by:
getAllMmdRecs in class ItemFileIndexingWriter
Returns:
null

getMyMmdRec

protected MmdRec getMyMmdRec()
Returns the MmdRec for this record only.

Specified by:
getMyMmdRec in class ItemFileIndexingWriter
Returns:
null

getCreator

protected String getCreator()
                     throws Exception
Returns the full name of the item's creator. An empty String or null is acceptable. The String is tokenized, stored and indexed under the field key 'creator' and is also indexed in the 'default' field.

Specified by:
getCreator in class ItemFileIndexingWriter
Returns:
Creator's full name
Throws:
Exception - This method should throw an Exception with an appropriate error message if an error occurs.

getContent

protected String getContent()
Returns null.

Specified by:
getContent in class ItemFileIndexingWriter
Returns:
null

getContentType

protected String getContentType()
Returns null.

Specified by:
getContentType in class ItemFileIndexingWriter
Returns:
null

getAccessionStatus

protected String getAccessionStatus()
                             throws Exception
Returns the accession status of this record, for example 'accessioned'. The String is tokenized, stored and indexed under the field key 'accessionstatus'.

Specified by:
getAccessionStatus in class ItemFileIndexingWriter
Returns:
The accession status.
Throws:
Exception - This method should throw an Exception with an appropriate error message if an error occurs.

getHasRelatedResource

protected boolean getHasRelatedResource()
                                 throws Exception
Returns false (not implemented).

Specified by:
getHasRelatedResource in class ItemFileIndexingWriter
Returns:
False.
Throws:
Exception - This method should throw an Exception with an appropriate error message if an error occurs.

getRelatedResourceIds

protected String[] getRelatedResourceIds()
                                  throws Exception
Returns the IDs of related resources that are cataloged by ID, or null if none are present

Specified by:
getRelatedResourceIds in class ItemFileIndexingWriter
Returns:
Related resource IDs, or null if none are available
Throws:
Exception - This method should throw an Exception with an appropriate error message if an error occurs.

getRelatedResourceUrls

protected String[] getRelatedResourceUrls()
                                   throws Exception
Returns the URLs of related resources that are cataloged by URL, or null if none are present

Specified by:
getRelatedResourceUrls in class ItemFileIndexingWriter
Returns:
Related resource URLs, or null if none are available
Throws:
Exception - This method should throw an Exception with an appropriate error message if an error occurs.

getUrls

public final String[] getUrls()
                       throws Exception
Gets the url attribute of the DleseIMSFileIndexingWriter object

Specified by:
getUrls in class XMLFileIndexingWriter
Returns:
The url value
Throws:
Exception - If an error occurs

getWhatsNewDate

protected Date getWhatsNewDate()
                        throws Exception
Returns the date used to determine "What's new" in the library, which is null (unknown).

Overrides:
getWhatsNewDate in class ItemFileIndexingWriter
Returns:
The what's new date for the item
Throws:
Exception - This method should throw an Exception with an appropriate error message if an error occurs.

getAccessionDate

protected Date getAccessionDate()
                         throws Exception
Returns the accession date, which is null (unknown).

Specified by:
getAccessionDate in class ItemFileIndexingWriter
Returns:
The accession date for the item, which is null (unknown)
Throws:
Exception - This method should throw an Exception with an appropriate error message if an error occurs.

getCreationDate

protected Date getCreationDate()
                        throws Exception
Returns null.

Specified by:
getCreationDate in class ItemFileIndexingWriter
Returns:
null
Throws:
Exception - This method should throw an Exception with an appropriate error message if an error occurs.

getWhatsNewType

protected String getWhatsNewType()
                          throws Exception
Returns null (unknown).

Overrides:
getWhatsNewType in class ItemFileIndexingWriter
Returns:
null.
Throws:
Exception - This method should throw an Exception with an appropriate error message if an error occurs.

indexFullContentInDefaultAndStems

public boolean indexFullContentInDefaultAndStems()
Default and stems fields handled here, so do not index full content.

Specified by:
indexFullContentInDefaultAndStems in class XMLFileIndexingWriter
Returns:
False

addFrameworkFields

protected final void addFrameworkFields(org.apache.lucene.document.Document newDoc,
                                        org.apache.lucene.document.Document existingDoc)
                                 throws Exception
Adds custom fields to the index that are unique to DLESE-IMS

Specified by:
addFrameworkFields in class ItemFileIndexingWriter
Parameters:
newDoc - The new Document to which the DLESE-IMS fields are added
existingDoc - A Document that previously existed in the index for this item, if present
Throws:
Exception - If an error occurs
