DLESE Tools
v1.6.0

org.dlese.dpc.index.writer
Class DleseCollectionFileIndexingWriter

java.lang.Object
  extended by org.dlese.dpc.index.writer.FileIndexingServiceWriter
      extended by org.dlese.dpc.index.writer.XMLFileIndexingWriter
          extended by org.dlese.dpc.index.writer.DleseCollectionFileIndexingWriter
All Implemented Interfaces:
DocWriter

public class DleseCollectionFileIndexingWriter
extends XMLFileIndexingWriter

Used to write a Lucene Document for a DLESE Collection XML record. The reader for this type of Document is DleseCollectionDocReader.

Author:
John Weatherley
See Also:
XMLDocReader, RecordDataService, FileIndexingServiceWriter

Nested Class Summary
static class DleseCollectionFileIndexingWriter.CollectionAccessionStatusComparator
          Allows sorting of Collection accession status XML Nodes by date, giving precedence to status = accessioned if dates are equal.
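
The ordering described for CollectionAccessionStatusComparator can be illustrated with a minimal sketch. This is not the DLESE implementation: StatusEntry, AccessionStatusOrdering, the use of LocalDate, and the status strings are hypothetical stand-ins, and the ascending direction is an assumption (the page only says that sorting is by date and that accessioned wins on ties).

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical stand-in for one accession status node in a collection record.
final class StatusEntry {
    final String status;   // assumed values, e.g. "accessioned" or "deaccessioned"
    final LocalDate date;

    StatusEntry(String status, LocalDate date) {
        this.status = status;
        this.date = date;
    }
}

final class AccessionStatusOrdering {
    // Sorts by date ascending. On equal dates, "accessioned" sorts last, so the
    // final element of a sorted list is the entry that takes precedence.
    static final Comparator<StatusEntry> BY_DATE_ACCESSIONED_WINS =
        Comparator.comparing((StatusEntry e) -> e.date)
                  .thenComparingInt(e -> "accessioned".equals(e.status) ? 1 : 0);

    // Returns the entry that wins under the ordering above, or null for an empty list.
    static StatusEntry mostRecent(List<StatusEntry> entries) {
        List<StatusEntry> sorted = new ArrayList<>(entries);
        sorted.sort(BY_DATE_ACCESSIONED_WINS);
        return sorted.isEmpty() ? null : sorted.get(sorted.size() - 1);
    }
}
```

Under these assumptions, two entries with the same date resolve to the accessioned one, which is the tie-breaking rule the class summary describes.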
 
Constructor Summary
DleseCollectionFileIndexingWriter()
          Create a DleseCollectionFileIndexingWriter.
 
Method Summary
protected  String[] _getIds()
          Gets the ID of this collection record.
protected  void addFields(org.apache.lucene.document.Document newDoc, org.apache.lucene.document.Document existingDoc, File sourceFile)
          Adds fields to the index that are part of the collection-level Document.
protected  void destroy()
          This method is called at the conclusion of processing and may be used for tear-down.
protected  void finalize()
          Perform finalization...
protected  Date getAccessionDate()
          Returns the accession date or null if this collection is not currently accessioned.
protected  String getAccessionStatus()
          Gets the most recent accession status found in the XML record.
 String getAdditionalMetadata()
          Gets the additional metadata for this collection, or null if none. The metadata is the content supplied inside an additionalMetadata element to org.dlese.dpc.repository.RepositoryManager.putRecord when the collection was created.
protected  String getCollectionStatuses()
          Gets the collectionStatus attribute of the DleseCollectionFileIndexingWriter object
protected  String getCost()
          Gets the cost associated with this collection.
static String getCurrentCollectionStatus(Document doc)
          Gets the status of the collection based on the values in the collection-level record.
 org.apache.lucene.document.Document getDeletedDoc_OFF_2006_08_23(org.apache.lucene.document.Document existingDoc)
          Creates a Lucene Document from an existing CollectionFileIndexing Document by setting the field "deleted" to "true" and making the modtime equal to current time.
 String getDescription()
          The description for the collection.
 String getDocType()
          Gets the docType attribute of the DleseCollectionFileIndexingWriter, which is 'dlese_collect'.
protected  String getFormatOfRecords()
          Gets the format of the records in this collection.
protected  String getFullTitle()
          Returns the full title for the collection.
protected  String[] getGradeRanges()
          Gets the gradeRanges for this collection.
protected  String getKey()
          Gets the collection key used to identify the items in the collection this record refers to.
protected  String getKeywords()
          Gets the keywords associated with this collection.
static long getNumInstances()
          Gets the numInstances attribute of the DleseCollectionFileIndexingWriter class
protected  String getPartOfDRC()
          Gets whether the collection is part of the DRC [true|false].
 String getReaderClass()
          Gets the name of the concrete DocReader class that is used to read this type of Document, which is "DleseCollectionDocReader".
protected  String getReviewProcess()
          Gets the collection's review process statement.
protected  String getReviewProcessUrl()
          Gets the URL to the collection's review process statement.
protected  String getScopeUrl()
          Gets the URL to the collection's scope statement.
protected  String getShortTitle()
          Returns the short title for the collection.
protected  String[] getSubjects()
          Gets the subjects for this collection.
 String getTitle()
          Gets the full title
 String[] getUrls()
          Gets the URL to the collection.
protected  String getValidationReport()
          Gets a report detailing any errors found in the XML validation of the collection record, or null if no error was found.
protected  Date getWhatsNewDate()
          Returns the date used to determine "What's new" in the library.
protected  String getWhatsNewType()
          Returns 'collection'.
 boolean indexFullContentInDefaultAndStems()
          Default and stems fields handled here, so do not index full content.
 void init(File source, org.apache.lucene.document.Document existingDoc)
          Performs the necessary init functions (nothing done).
 
Methods inherited from class org.dlese.dpc.index.writer.XMLFileIndexingWriter
addCustomFields, getBoundingBox, getCollections, getDeletedDoc, getDocGroup, getDom4jDoc, getFieldContent, getFieldContent, getFieldName, getIds, getIndex, getMyAnnoResultDocs, getMyCollectionDoc, getOaiModtime, getPrimaryId, getRecordDataService, getRelatedIds, getRelatedIdsMap, getRelatedUrls, getRelatedUrlsMap, getTermStringFromStringArray, getXmlIndexer, getXmlIndexerFieldsConfig
 
Methods inherited from class org.dlese.dpc.index.writer.FileIndexingServiceWriter
abortIndexing, addDocToRemove, addToAdminDefaultField, addToDefaultField, create, getConfigAttributes, getDocsource, getFileContent, getFileIndexingPlugin, getFileIndexingService, getLuceneDoc, getPreviousRecordDoc, getSessionAttributes, getSourceDir, getSourceFile, isMakingDeletedDoc, isValidationEnabled, prtln, prtlnErr, setConfigAttributes, setDebug, setFileIndexingPlugin, setFileIndexingService, setIsMakingDeletedDoc, setValidationEnabled
 
Methods inherited from class java.lang.Object
clone, equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

DleseCollectionFileIndexingWriter

public DleseCollectionFileIndexingWriter()
Create a DleseCollectionFileIndexingWriter.

Method Detail

finalize

protected void finalize()
                 throws Throwable
Perform finalization... closing resources, etc.

Overrides:
finalize in class Object
Throws:
Throwable - If error

getNumInstances

public static long getNumInstances()
Gets the numInstances attribute of the DleseCollectionFileIndexingWriter class

Returns:
The numInstances value
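
A static instance counter of this kind is commonly kept in a class-level field that the constructor increments. The sketch below shows that pattern only; it assumes nothing about how DleseCollectionFileIndexingWriter actually maintains numInstances. CountedWriter is a hypothetical class, and it decrements in an explicit close() rather than in finalize(), since finalization timing is not predictable.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical class illustrating a live-instance counter.
final class CountedWriter implements AutoCloseable {
    // Shared across all instances; AtomicLong keeps updates thread-safe.
    private static final AtomicLong NUM_INSTANCES = new AtomicLong();

    private boolean closed;

    CountedWriter() {
        NUM_INSTANCES.incrementAndGet();
    }

    // Number of instances created and not yet closed.
    static long getNumInstances() {
        return NUM_INSTANCES.get();
    }

    // Decrements the counter once, even if close() is called repeatedly.
    @Override
    public synchronized void close() {
        if (!closed) {
            closed = true;
            NUM_INSTANCES.decrementAndGet();
        }
    }
}
```

A counter like this is mostly useful for spotting leaks: if the value keeps growing across indexing runs, writers are being created without being released.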

getFullTitle

protected String getFullTitle()
                       throws Exception
Returns the full title for the collection.

Returns:
The fullTitle value
Throws:
Exception - If error reading XML.

getShortTitle

protected String getShortTitle()
                        throws Exception
Returns the short title for the collection.

Returns:
The shortTitle value
Throws:
Exception - If error reading XML.

getTitle

public String getTitle()
                throws Exception
Gets the full title

Specified by:
getTitle in class XMLFileIndexingWriter
Returns:
The title value
Throws:
Exception - If error

getDescription

public String getDescription()
                      throws Exception
The description for the collection.

Specified by:
getDescription in class XMLFileIndexingWriter
Returns:
The description String
Throws:
Exception - If error reading XML.

getAdditionalMetadata

public String getAdditionalMetadata()
Gets the additional metadata for this collection, or null if none. The metadata is the content supplied inside an additionalMetadata element to org.dlese.dpc.repository.RepositoryManager.putRecord when the collection was created.

Returns:
The additional metadata element as a String, or null if none.

getPartOfDRC

protected String getPartOfDRC()
                       throws Exception
Gets whether the collection is part of the DRC [true|false].

Returns:
The partOfDRC Value
Throws:
Exception - If error

getAccessionStatus

protected String getAccessionStatus()
                             throws Exception
Gets the most recent accession status found in the XML record.

Returns:
The most recent accession status.
Throws:
Exception - If error

getCollectionStatuses

protected String getCollectionStatuses()
                                throws Exception
Gets the collectionStatus attribute of the DleseCollectionFileIndexingWriter object

Returns:
The collectionStatus value
Throws:
Exception - If error

getKey

protected String getKey()
                 throws Exception
Gets the collection key used to identify the items in the collection this record refers to. For example, dcc or comet.

Returns:
The Key value
Throws:
Exception - If error

getUrls

public String[] getUrls()
                 throws Exception
Gets the URL to the collection.

Specified by:
getUrls in class XMLFileIndexingWriter
Returns:
The collectionUrl value
Throws:
Exception - If error

getScopeUrl

protected String getScopeUrl()
                      throws Exception
Gets the URL to the collection's scope statement.

Returns:
The URL to the collection's scope statement, or null if none.
Throws:
Exception - If error

getReviewProcessUrl

protected String getReviewProcessUrl()
                              throws Exception
Gets the URL to the collection's review process statement.

Returns:
The URL to the collection's review process statement.
Throws:
Exception - If error

getReviewProcess

protected String getReviewProcess()
                           throws Exception
Gets the collection's review process statement.

Returns:
The collection's review process statement.
Throws:
Exception - If error

getFormatOfRecords

protected String getFormatOfRecords()
                             throws Exception
Gets the format of the records in this collection.

Returns:
The records format.
Throws:
Exception - If error

getCost

protected String getCost()
                  throws Exception
Gets the cost associated with this collection.

Returns:
The cost.
Throws:
Exception - If error

getKeywords

protected String getKeywords()
                      throws Exception
Gets the keywords associated with this collection.

Returns:
All keywords, separated by spaces.
Throws:
Exception - If error
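
The return format of getKeywords (one String with keywords separated by spaces) can be sketched as follows. KeywordJoiner and its input list are hypothetical; how the real method reads keywords from the XML record is not shown on this page, and the trimming and empty-value filtering are assumptions added for tidiness.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper showing a space-separated keyword string.
final class KeywordJoiner {
    // Trims each keyword, drops empty values, and joins the rest with single spaces.
    static String joinKeywords(List<String> keywords) {
        return keywords.stream()
                .map(String::trim)
                .filter(k -> !k.isEmpty())
                .collect(Collectors.joining(" "));
    }
}
```

Note that a space-separated string does not preserve the boundaries of multi-word keywords, which is usually acceptable when the value feeds a tokenized search field.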

getGradeRanges

protected String[] getGradeRanges()
                           throws Exception
Gets the gradeRanges for this collection.

Returns:
The gradeRanges value
Throws:
Exception - If error

getSubjects

protected String[] getSubjects()
                        throws Exception
Gets the subjects for this collection.

Returns:
The subjects value
Throws:
Exception - If error

_getIds

protected String[] _getIds()
                    throws Exception
Gets the ID of this collection record.

Specified by:
_getIds in class XMLFileIndexingWriter
Returns:
The ID
Throws:
Exception - If error

getDocType

public String getDocType()
Gets the docType attribute of the DleseCollectionFileIndexingWriter, which is 'dlese_collect'.

Specified by:
getDocType in interface DocWriter
Specified by:
getDocType in class FileIndexingServiceWriter
Returns:
The docType, which is 'dlese_collect'.

getReaderClass

public String getReaderClass()
Gets the name of the concrete DocReader class that is used to read this type of Document, which is "DleseCollectionDocReader".

Specified by:
getReaderClass in interface DocWriter
Specified by:
getReaderClass in class FileIndexingServiceWriter
Returns:
The String "org.dlese.dpc.index.reader.DleseCollectionDocReader".
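
Because getReaderClass returns a class name as a String, a consumer can instantiate the matching reader reflectively. The sketch below illustrates that general pattern with hypothetical stand-ins (SketchReader, SketchCollectionReader, SketchCollectionWriter, ReaderLoader); it does not reproduce how the DLESE indexing code actually resolves readers.

```java
// Hypothetical stand-in for a reader interface.
interface SketchReader {
    String describe();
}

// Hypothetical reader for documents of type 'dlese_collect'.
final class SketchCollectionReader implements SketchReader {
    @Override
    public String describe() {
        return "reads dlese_collect documents";
    }
}

// Hypothetical writer that advertises its doc type and its reader by class name.
final class SketchCollectionWriter {
    String getDocType() {
        return "dlese_collect";
    }

    String getReaderClass() {
        return SketchCollectionReader.class.getName();
    }
}

final class ReaderLoader {
    // Instantiates the named reader via its no-argument constructor.
    static SketchReader load(String className) {
        try {
            Class<?> cls = Class.forName(className);
            return (SketchReader) cls.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot load reader " + className, e);
        }
    }
}
```

Storing the reader's class name alongside the indexed document lets code that later retrieves the document pick the right reader without a hard-coded mapping from doc type to class.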

getAccessionDate

protected Date getAccessionDate()
                         throws Exception
Returns the accession date or null if this collection is not currently accessioned.

Returns:
The accession date or null
Throws:
Exception - This method should throw an Exception with an appropriate error message if an error occurs.

getWhatsNewDate

protected Date getWhatsNewDate()
                        throws Exception
Returns the date used to determine "What's new" in the library. Just returns the file mod date.

Specified by:
getWhatsNewDate in class XMLFileIndexingWriter
Returns:
The what's new date for the item
Throws:
Exception - This method should throw an Exception with an appropriate error message if an error occurs.

getWhatsNewType

protected String getWhatsNewType()
Returns 'collection'.

Specified by:
getWhatsNewType in class XMLFileIndexingWriter
Returns:
The string 'collection'.

init

public void init(File source,
                 org.apache.lucene.document.Document existingDoc)
          throws Exception
Performs the necessary init functions (nothing done).

Specified by:
init in class XMLFileIndexingWriter
Parameters:
source - The source file being indexed
existingDoc - An existing Document that currently resides in the index for the given resource, or null if none was previously present
Throws:
Exception - If an error occurred during set-up.

destroy

protected void destroy()
This method is called at the conclusion of processing and may be used for tear-down.

Specified by:
destroy in class FileIndexingServiceWriter

getValidationReport

protected String getValidationReport()
                              throws Exception
Gets a report detailing any errors found in the XML validation of the collection record, or null if no error was found.

Overrides:
getValidationReport in class FileIndexingServiceWriter
Returns:
Null if no data validation errors were found, otherwise a String that details the nature of the error.
Throws:
Exception - If error in performing the validation.
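
The "null means valid, otherwise a message" contract can be sketched with the JDK's XML parser. This is a simplified illustration, not the DLESE validator: ValidationReportSketch is hypothetical, and it checks only well-formedness rather than validating against the collection schema.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper following the null-or-report contract.
final class ValidationReportSketch {
    // Returns null if the XML parses cleanly, otherwise a String describing the problem.
    static String report(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            builder.parse(new InputSource(new StringReader(xml)));
            return null;
        } catch (SAXException e) {
            // A parse failure is the "report" case, not an exceptional one.
            return "XML error: " + e.getMessage();
        } catch (IOException | ParserConfigurationException e) {
            // Failure to run the check at all is reported separately.
            throw new IllegalStateException("Could not perform validation", e);
        }
    }
}
```

The sketch keeps the same distinction the method documentation draws: problems in the record come back as a report String, while a failure to perform the check surfaces as an exception.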

indexFullContentInDefaultAndStems

public boolean indexFullContentInDefaultAndStems()
Default and stems fields handled here, so do not index full content.

Specified by:
indexFullContentInDefaultAndStems in class XMLFileIndexingWriter
Returns:
False

addFields

protected final void addFields(org.apache.lucene.document.Document newDoc,
                               org.apache.lucene.document.Document existingDoc,
                               File sourceFile)
                        throws Exception
Adds fields to the index that are part of the collection-level Document.

Specified by:
addFields in class XMLFileIndexingWriter
Parameters:
newDoc - The new Document that is being created for this resource
existingDoc - An existing Document that currently resides in the index for the given resource, or null if none was previously present
sourceFile - The sourceFile that is being indexed.
Throws:
Exception - If an error occurs

getDeletedDoc_OFF_2006_08_23

public org.apache.lucene.document.Document getDeletedDoc_OFF_2006_08_23(org.apache.lucene.document.Document existingDoc)
                                                                 throws Throwable
Creates a Lucene Document from an existing CollectionFileIndexing Document by setting the field "deleted" to "true" and making the modtime equal to current time.

Parameters:
existingDoc - An existing FileIndexingService Document that currently resides in the index for the given resource.
Returns:
A Lucene FileIndexingService Document with the field "deleted" set to "true" and modtime set to current time.
Throws:
Throwable - Thrown if error occurs
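
The transformation described above (copy the existing document, set "deleted" to "true", refresh the modtime) can be sketched without depending on a particular Lucene version. In this sketch a Map stands in for a Lucene Document; DeletedDocSketch is hypothetical, and storing the modtime as epoch milliseconds is an assumption, since the page does not state the encoding.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper; a Map<String, String> stands in for a Lucene Document.
final class DeletedDocSketch {
    // Copies the existing fields, then marks the copy deleted and stamps it with
    // the supplied time. The input map is left unchanged.
    static Map<String, String> toDeleted(Map<String, String> existing, long nowMillis) {
        Map<String, String> deleted = new LinkedHashMap<>(existing);
        deleted.put("deleted", "true");
        deleted.put("modtime", Long.toString(nowMillis)); // assumed encoding
        return deleted;
    }
}
```

A caller would typically pass System.currentTimeMillis() as nowMillis; taking the time as a parameter keeps the helper deterministic and easy to test.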

getCurrentCollectionStatus

public static final String getCurrentCollectionStatus(Document doc)
Gets the status of the collection based on the values in the collection-level record.

Parameters:
doc - A dlese_collect XML Document
Returns:
The currentCollectionStatus value
