|
DLESE Tools v1.6.0 |
java.lang.Object
  org.dlese.dpc.index.writer.FileIndexingServiceWriter
    org.dlese.dpc.index.writer.XMLFileIndexingWriter
      org.dlese.dpc.index.writer.ItemFileIndexingWriter
public abstract class ItemFileIndexingWriter
Abstract class for writing a Lucene Document for a collection of item-level metadata records of a specific format (DLESE IMS, ADN-Item, ADN-Collection, etc.). The reader for this type of Document is XMLDocReader or ItemDocReader.
The Lucene Document fields that are created by this class are (in addition to the ones listed for FileIndexingServiceWriter):
title - The title for the resource. Stored.
description - The description for the resource. Stored.
url - The URL of the resource. Stored.
Stored. Prefixed with a '0' to support wildcard searching.
metadatapfx - The metadata prefix (format) for this record, for example 'adn' or 'oai_dc'. Stored. Prefixed with a '0' to support wildcard searching.
accessionstatus - The accession status for this record. Stored. Prefixed with a '0' to support wildcard searching.
annotypes - Annotation types that refer to this record. Keyword.
annopathways - Annotation pathways that refer to this record. Keyword.
associatedids - A list of record IDs that refer to the same resource. Keyword.
valid - Indicates whether the record is valid [true | false]. Not stored.
validationreport - Text describing an error in the validation of the data for this record. Stored. Only indexed if there was a validation error, indicated by the valid field containing false.
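Several of the stored fields above carry a leading '0'. The sketch below illustrates that convention under one assumption: that the marker exists so a trailing-wildcard query such as `0*` can match every value in the field, since the classic Lucene query syntax does not, by default, accept a wildcard as the first character of a term. The class and method names are hypothetical and are not part of DLESE Tools, and the matching method is a simplified stand-in for a real index query.

```java
// Hypothetical helper, not part of DLESE Tools: illustrates the leading-'0' convention.
final class WildcardPrefixSketch {
    static final char PREFIX = '0';

    /** Returns the value as it would be indexed, with the marker character in front. */
    static String encode(String value) {
        return PREFIX + value;
    }

    /** Strips the marker again; values without it are returned unchanged. */
    static String decode(String indexed) {
        if (!indexed.isEmpty() && indexed.charAt(0) == PREFIX) {
            return indexed.substring(1);
        }
        return indexed;
    }

    /** Simplified stand-in for a trailing-wildcard match such as "0ad*". */
    static boolean matchesPrefixQuery(String indexed, String query) {
        if (query.endsWith("*")) {
            return indexed.startsWith(query.substring(0, query.length() - 1));
        }
        return indexed.equals(query);
    }

    public static void main(String[] args) {
        String indexed = encode("adn");
        System.out.println(indexed);                           // 0adn
        System.out.println(matchesPrefixQuery(indexed, "0*")); // true
        System.out.println(decode(indexed));                   // adn
    }
}
```

With this encoding, a query for `0*` selects every record that has any value in the field, while `0oai*` narrows the match to values that begin with "oai".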
See Also: ItemDocReader, XMLDocReader, RecordDataService, FileIndexingServiceWriter
Constructor Summary | |
---|---|
ItemFileIndexingWriter()
|
Method Summary | |
---|---|
protected void |
addFields(org.apache.lucene.document.Document newDoc,
org.apache.lucene.document.Document existingDoc,
File sourceFile)
Adds fields to the index that are common to all item-level documents. |
protected abstract void |
addFrameworkFields(org.apache.lucene.document.Document newDoc,
org.apache.lucene.document.Document existingDoc)
Adds fields to the index that are unique to the given framework. |
protected abstract void |
destroy()
This method is called at the conclusion of processing and may be used for tear-down. |
protected abstract Date |
getAccessionDate()
Returns the accession date for the item, or null if this item is not accessioned. |
protected abstract String |
getAccessionStatus()
Returns the accession status of this record, for example 'accessioned'. |
protected abstract MmdRec[] |
getAllMmdRecs()
Returns the MmdRecs for all records associated with this resource, including myMmdRec. |
protected abstract MmdRec[] |
getAssociatedMmdRecs()
Returns the MmdRecs for records in other collections that catalog the same resource. |
protected abstract String |
getContent()
Returns the content of the item this record catalogs, or null if not available. |
protected abstract String |
getContentType()
Returns the content type of the item this record catalogs, or null if not available. |
protected abstract Date |
getCreationDate()
Returns the date this item was first created, or null if not available. |
protected abstract String |
getCreator()
Returns the full name of the item's creator. |
protected abstract String |
getCreatorLastName()
Returns the last name of the item's creator. |
abstract String |
getDocType()
Returns a unique document type key for this kind of record, corresponding to the format type. |
protected abstract boolean |
getHasRelatedResource()
Returns true if the item has one or more related resources, false otherwise. |
protected abstract String |
getKeywords()
Returns the item's keywords sorted and separated by the '+' symbol. |
protected ResultDocList |
getMyAnnoResultDocs()
Gets the annotations for this record, null or zero length if none available. |
protected abstract MmdRec |
getMyMmdRec()
Returns the MmdRec for this record only. |
abstract String |
getReaderClass()
Gets the fully qualified name of the concrete DocReader class that is used to read this type of Document, for example "org.dlese.dpc.index.reader.ItemDocReader". |
protected abstract String[] |
getRelatedResourceIds()
Returns the IDs of related resources that are cataloged by ID, or null if none are present. |
protected abstract String[] |
getRelatedResourceUrls()
Returns the URLs of related resources that are cataloged by URL, or null if none are present. |
protected abstract String |
getValidationReport()
Gets a report detailing any errors found in the validation of the data, or null if no error was found. |
protected Date |
getWhatsNewDate()
Returns the date used to determine "What's new" in the library, which is the item's accession date. |
protected String |
getWhatsNewType()
Returns 'itemnew', 'itemannoinprogress', or 'itemannocomplete', whichever came most recently. |
void |
init(File source,
org.apache.lucene.document.Document existingDoc)
Initialize the subclasses and record data service data. |
abstract void |
initItem(File source,
org.apache.lucene.document.Document existingDoc)
This method is called prior to processing and may be used for any necessary set-up. |
Methods inherited from class org.dlese.dpc.index.writer.XMLFileIndexingWriter |
---|
_getIds, addCustomFields, getBoundingBox, getCollections, getDeletedDoc, getDescription, getDocGroup, getDom4jDoc, getFieldContent, getFieldContent, getFieldName, getIds, getIndex, getMyCollectionDoc, getOaiModtime, getPrimaryId, getRecordDataService, getRelatedIds, getRelatedIdsMap, getRelatedUrls, getRelatedUrlsMap, getTermStringFromStringArray, getTitle, getUrls, getXmlIndexer, getXmlIndexerFieldsConfig, indexFullContentInDefaultAndStems |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
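The summary for getKeywords() above says the keywords are returned sorted and separated by the '+' symbol. A minimal sketch of building such a string follows. The class name, the trimming of whitespace, the dropping of blank entries, and the use of natural (case-sensitive) String ordering are all assumptions for illustration; the page does not say how a concrete subclass sorts or cleans its keywords.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of DLESE Tools.
final class KeywordStringSketch {

    /** Trims each keyword, drops blanks, sorts in natural String order, and joins with '+'. */
    static String toKeywordString(List<String> keywords) {
        List<String> cleaned = new ArrayList<String>();
        for (String keyword : keywords) {
            if (keyword != null && !keyword.trim().isEmpty()) {
                cleaned.add(keyword.trim());
            }
        }
        Collections.sort(cleaned);

        StringBuilder joined = new StringBuilder();
        for (String keyword : cleaned) {
            if (joined.length() > 0) {
                joined.append('+');
            }
            joined.append(keyword);
        }
        return joined.toString();
    }

    public static void main(String[] args) {
        List<String> keywords = new ArrayList<String>();
        keywords.add("volcano");
        keywords.add("ash");
        keywords.add("lava");
        System.out.println(toKeywordString(keywords)); // ash+lava+volcano
    }
}
```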
Constructor Detail |
---|
public ItemFileIndexingWriter()
Method Detail |
---|
protected abstract String getKeywords() throws Exception
Exception - This method should throw an Exception with an appropriate error message if an error occurs.
protected abstract String getCreatorLastName() throws Exception
Exception - This method should throw an Exception with an appropriate error message if an error occurs.
protected abstract String getCreator() throws Exception
Exception - This method should throw an Exception with an appropriate error message if an error occurs.
protected abstract String getAccessionStatus() throws Exception
Exception - This method should throw an Exception with an appropriate error message if an error occurs.
protected abstract Date getAccessionDate() throws Exception
Exception - This method should throw an Exception with an appropriate error message if an error occurs.
protected abstract Date getCreationDate() throws Exception
Exception - This method should throw an Exception with an appropriate error message if an error occurs.
protected abstract String getContent()
protected abstract MmdRec[] getAssociatedMmdRecs()
protected abstract MmdRec[] getAllMmdRecs()
protected abstract MmdRec getMyMmdRec()
protected abstract String getContentType()
protected abstract boolean getHasRelatedResource() throws Exception
Exception - This method should throw an Exception with an appropriate error message if an error occurs.
protected abstract String[] getRelatedResourceIds() throws Exception
Exception - This method should throw an Exception with an appropriate error message if an error occurs.
protected abstract String[] getRelatedResourceUrls() throws Exception
Exception - This method should throw an Exception with an appropriate error message if an error occurs.
protected abstract void addFrameworkFields(org.apache.lucene.document.Document newDoc, org.apache.lucene.document.Document existingDoc) throws Exception
Example code:
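protected void addFrameworkFields(Document newDoc, Document existingDoc) throws Exception {
    // Content that is specific to this framework
    String customContent = "Some content";
    // Assumes a Lucene 2.4+ style Field constructor; pick the Store and Index settings the field needs
    newDoc.add(new Field("mycustomfield", customContent, Field.Store.YES, Field.Index.ANALYZED));
}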
newDoc - The new Document that is being created for this resource
existingDoc - An existing Document that currently resides in the index for the given resource, or null if none was previously present
Exception - This method should throw an Exception with an appropriate error message if an error occurs.
public abstract String getDocType() throws Exception
Returns a unique document type key for this kind of record, corresponding to the format type. The key is processed by the StandardAnalyzer, so it must be lowercase and should not contain any stop words.
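The getDocType() description requires a key that is lowercase and free of stop words. A subclass author could check a candidate key with something like the sketch below. It is an illustration under stated assumptions: the class name is hypothetical, the stop word set is a small hand-picked sample rather than the analyzer's actual list, and splitting on non-alphanumeric characters only approximates real tokenization.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical check, not part of DLESE Tools.
final class DocTypeKeyCheck {

    // Assumption: an illustrative sample of English stop words, not the analyzer's real list.
    static final Set<String> SAMPLE_STOP_WORDS = new HashSet<String>(
            Arrays.asList("a", "an", "and", "the", "of", "to", "in", "for", "is", "it"));

    /** Returns true if the key is non-empty, all lowercase, and contains no sample stop word. */
    static boolean isSafeDocType(String key) {
        if (key == null || key.isEmpty()) {
            return false;
        }
        if (!key.equals(key.toLowerCase(Locale.ROOT))) {
            return false;
        }
        // Rough tokenization: split on anything that is not a lowercase letter or digit.
        for (String token : key.split("[^a-z0-9]+")) {
            if (SAMPLE_STOP_WORDS.contains(token)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isSafeDocType("adn"));            // true
        System.out.println(isSafeDocType("ADN"));            // false
        System.out.println(isSafeDocType("the_collection")); // false
    }
}
```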
getDocType in interface DocWriter
getDocType in class FileIndexingServiceWriter
Exception - This method should throw an Exception with an appropriate error message if an error occurs.
public abstract String getReaderClass()
Gets the fully qualified name of the concrete DocReader class that is used to read this type of Document, for example "org.dlese.dpc.index.reader.ItemDocReader".
getReaderClass in interface DocWriter
getReaderClass in class FileIndexingServiceWriter
DocReader.
public abstract void initItem(File source, org.apache.lucene.document.Document existingDoc) throws Exception
source - The source file being indexed
existingDoc - An existing Document that currently resides in the index for the given resource, or null if none was previously present
Exception - If an error occurred during set-up.
protected abstract void destroy()
destroy in class FileIndexingServiceWriter
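protected abstract String getValidationReport() throws Exception
Gets a report detailing any errors found in the validation of the data, or null if no error was found. This method is called after the other methods (XMLFileIndexingWriter.getTitle(), addFrameworkFields(Document, Document), etc.) so that data verification can be done during those calls, if needed.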
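Because getValidationReport() is described as running after calls such as getTitle() and addFrameworkFields(Document, Document), a subclass can record problems while those calls execute and report them at the end. The sketch below shows that pattern in isolation. It is a standalone illustration, not a real subclass: the class, its constructor, and the "; " separator are hypothetical, and only the "null if no error was found" contract is taken from this page.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a writer subclass; it does not extend any DLESE class.
final class ValidationCollectorSketch {
    private final List<String> problems = new ArrayList<String>();
    private final String rawTitle;

    ValidationCollectorSketch(String rawTitle) {
        this.rawTitle = rawTitle;
    }

    /** Returns the trimmed title and records a problem if it is missing. */
    String getTitle() {
        if (rawTitle == null || rawTitle.trim().isEmpty()) {
            problems.add("Missing title");
            return "";
        }
        return rawTitle.trim();
    }

    /** Returns the collected problems joined by "; ", or null if none were recorded. */
    String getValidationReport() {
        if (problems.isEmpty()) {
            return null;
        }
        StringBuilder report = new StringBuilder();
        for (String problem : problems) {
            if (report.length() > 0) {
                report.append("; ");
            }
            report.append(problem);
        }
        return report.toString();
    }

    public static void main(String[] args) {
        ValidationCollectorSketch writer = new ValidationCollectorSketch("   ");
        writer.getTitle();                                // records the problem
        System.out.println(writer.getValidationReport()); // Missing title
    }
}
```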
getValidationReport in class FileIndexingServiceWriter
Exception - If an error occurs while performing the validation.
public void init(File source, org.apache.lucene.document.Document existingDoc) throws Exception
init in class XMLFileIndexingWriter
source - The source file being indexed.
existingDoc - A Document that previously existed in the index for this item, if present
Exception - Thrown if there is an error reading the XML map
protected ResultDocList getMyAnnoResultDocs() throws Exception
getMyAnnoResultDocs in class XMLFileIndexingWriter
Exception - If an error occurs
protected final void addFields(org.apache.lucene.document.Document newDoc, org.apache.lucene.document.Document existingDoc, File sourceFile) throws Exception
addFields in class XMLFileIndexingWriter
newDoc - The new Document that is being created for this resource
existingDoc - An existing Document that currently resides in the index for the given resource, or null if none was previously present
sourceFile - The source file that is being indexed.
Exception - If an error occurs
protected Date getWhatsNewDate() throws Exception
getWhatsNewDate in class XMLFileIndexingWriter
Exception - This method should throw an Exception with an appropriate error message if an error occurs.
protected String getWhatsNewType() throws Exception
getWhatsNewType in class XMLFileIndexingWriter
Exception - If there is an error getting the What's New type.
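The summary for getWhatsNewType() says it returns whichever of three type keys came most recently. A minimal sketch of that selection follows, assuming each key is paired with a date. Pairing 'itemnew' with the accession date follows the getWhatsNewDate() summary; the two annotation dates, the class and parameter names, and the tie-breaking rule are hypothetical, since this page does not show how the class obtains or compares them.

```java
import java.util.Date;

// Hypothetical helper, not part of DLESE Tools.
final class WhatsNewSketch {

    /**
     * Returns the type key whose date is the latest, or null if all dates are null.
     * On a tie, the earlier entry in the list below wins.
     */
    static String whatsNewType(Date accessioned, Date annoInProgress, Date annoComplete) {
        Date[] dates = {accessioned, annoInProgress, annoComplete};
        String[] types = {"itemnew", "itemannoinprogress", "itemannocomplete"};

        String latestType = null;
        Date latestDate = null;
        for (int i = 0; i < dates.length; i++) {
            if (dates[i] != null && (latestDate == null || dates[i].after(latestDate))) {
                latestDate = dates[i];
                latestType = types[i];
            }
        }
        return latestType;
    }

    public static void main(String[] args) {
        Date accessioned = new Date(1000L);
        Date annoInProgress = new Date(3000L);
        Date annoComplete = new Date(2000L);
        System.out.println(whatsNewType(accessioned, annoInProgress, annoComplete)); // itemannoinprogress
    }
}
```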