DLESE Tools v1.6.0
java.lang.Object
  org.dlese.dpc.index.writer.FileIndexingServiceWriter
    org.dlese.dpc.index.writer.XMLFileIndexingWriter
      org.dlese.dpc.index.writer.SimpleXMLFileIndexingWriter
public class SimpleXMLFileIndexingWriter extends XMLFileIndexingWriter
This is the default writer for generic XML formats. Creates a Lucene Document
from any valid XML file by stripping the XML tags to extract and
index the content. The full content of all Elements and Attributes is indexed in the default and
admindefault fields and is stemmed and indexed in the stems field. The reader for this type of Document is
XMLDocReader.
See Also: FileIndexingService, XMLDocReader
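The tag-stripping idea described above can be illustrated with a minimal sketch that uses only the JDK's DOM parser. This is not the DLESE implementation: the class name XmlTextExtractor and its methods are hypothetical, and the sketch only gathers attribute values and element text into one string. It does not build a Lucene Document, populate the default, admindefault, or stems fields, or perform stemming.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper for illustration only; not part of DLESE Tools.
public class XmlTextExtractor {

    /** Returns attribute values and text content in document order, separated by single spaces. */
    public static String extractText(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));
        StringBuilder out = new StringBuilder();
        collect(doc.getDocumentElement(), out);
        return out.toString();
    }

    // Depth-first walk: text and CDATA nodes contribute their value,
    // elements contribute their attribute values and then their children.
    private static void collect(Node node, StringBuilder out) {
        short type = node.getNodeType();
        if (type == Node.TEXT_NODE || type == Node.CDATA_SECTION_NODE) {
            append(node.getNodeValue(), out);
            return;
        }
        NamedNodeMap attrs = node.getAttributes(); // null for non-element nodes
        if (attrs != null) {
            for (int i = 0; i < attrs.getLength(); i++) {
                append(attrs.item(i).getNodeValue(), out);
            }
        }
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            collect(children.item(i), out);
        }
    }

    // Skips whitespace-only values and joins the rest with single spaces.
    private static void append(String value, StringBuilder out) {
        String trimmed = value == null ? "" : value.trim();
        if (trimmed.isEmpty()) {
            return;
        }
        if (out.length() > 0) {
            out.append(' ');
        }
        out.append(trimmed);
    }

    public static void main(String[] args) throws Exception {
        String xml = "<record id=\"r1\"><title>Plate tectonics</title>"
                + "<url>http://example.org</url></record>";
        System.out.println(extractText(xml)); // prints: r1 Plate tectonics http://example.org
    }
}
```

In a real indexer the extracted string would then be analyzed and stored in the search fields; that step depends on the Lucene version in use and is left out here.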
Constructor Summary | |
---|---|
SimpleXMLFileIndexingWriter() | Constructor for the SimpleXMLFileIndexingWriter object. |
Method Summary | |
---|---|
protected String[] | _getIds(): Returns null so that IDs are handled by the superclass. |
protected void | addFields(org.apache.lucene.document.Document newDoc, org.apache.lucene.document.Document existingDoc, File sourceFile): Nothing to do here. |
protected void | destroy(): Does nothing. |
String | getDescription(): Gets the description attribute of the SimpleXMLFileIndexingWriter object. |
String | getDocType(): Gets the XML format for this document, for example "oai_dc", "adn", "dlese_ims", or "dlese_anno". |
String | getReaderClass(): Gets the name of the concrete DocReader class that is used to read this type of Document, which is "org.dlese.dpc.index.reader.XMLDocReader". |
String | getTitle(): Gets the title attribute of the SimpleXMLFileIndexingWriter object. |
String[] | getUrls(): Gets the urls attribute of the SimpleXMLFileIndexingWriter object. |
protected String | getValidationReport(): Gets a report detailing any errors found in the validation of the data, or null if no error was found. |
protected Date | getWhatsNewDate(): Returns the date used to determine "What's new" in the library, which is null (unknown). |
protected String | getWhatsNewType(): Returns null (unknown). |
boolean | indexFullContentInDefaultAndStems(): Places the entire XML content into the default and stems search fields. |
void | init(File sourceFile, org.apache.lucene.document.Document existingDoc): This method is called prior to processing and may be used for any necessary set-up. |
Methods inherited from class org.dlese.dpc.index.writer.XMLFileIndexingWriter |
---|
addCustomFields, getBoundingBox, getCollections, getDeletedDoc, getDocGroup, getDom4jDoc, getFieldContent, getFieldContent, getFieldName, getIds, getIndex, getMyAnnoResultDocs, getMyCollectionDoc, getOaiModtime, getPrimaryId, getRecordDataService, getRelatedIds, getRelatedIdsMap, getRelatedUrls, getRelatedUrlsMap, getTermStringFromStringArray, getXmlIndexer, getXmlIndexerFieldsConfig |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Constructor Detail |
---|
public SimpleXMLFileIndexingWriter()
Method Detail |
---|
public String getDocType() throws Exception
getDocType in interface DocWriter
getDocType in class FileIndexingServiceWriter
Throws:
Exception - If an error occurs.

public String getReaderClass()
Gets the name of the concrete DocReader class that is used to read this type of Document, which is "org.dlese.dpc.index.reader.XMLDocReader".
getReaderClass in interface DocWriter
getReaderClass in class FileIndexingServiceWriter

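This page states only that getReaderClass() returns a class name as a String; it does not say how callers use that name. One common pattern for such a name is reflective instantiation, sketched below under that assumption. Everything in the sketch is hypothetical: the ReaderLoader class, the Reader interface, and StubXmlReader are stand-ins so the example runs without DLESE Tools on the classpath, and they do not mirror the real DocReader API.

```java
// Hypothetical illustration only; none of these types belong to DLESE Tools.
public class ReaderLoader {

    /** Stand-in for a reader type; not the DLESE DocReader class. */
    public interface Reader {
        String readerType();
    }

    /** Stand-in reader with the public no-argument constructor that load() requires. */
    public static class StubXmlReader implements Reader {
        public StubXmlReader() {
        }

        @Override
        public String readerType() {
            return "xml";
        }
    }

    /**
     * Loads the named class, checks that it implements Reader,
     * and creates an instance through its no-argument constructor.
     */
    public static Reader load(String className) throws ReflectiveOperationException {
        Class<? extends Reader> cls = Class.forName(className).asSubclass(Reader.class);
        return cls.getDeclaredConstructor().newInstance();
    }

    public static void main(String[] args) throws Exception {
        // In place of a literal such as "org.dlese.dpc.index.reader.XMLDocReader",
        // the stand-in's own binary name is used so the sketch is self-contained.
        Reader reader = load(StubXmlReader.class.getName());
        System.out.println(reader.readerType()); // prints: xml
    }
}
```

Returning a name rather than an instance keeps the writer free of a compile-time dependency on the reader, at the cost of deferring type errors to run time.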
public void init(File sourceFile, org.apache.lucene.document.Document existingDoc) throws Exception
init in class XMLFileIndexingWriter
Parameters:
sourceFile - The source file being indexed.
existingDoc - An existing Document for this file in the index.
Throws:
Exception - If an error occurs.

protected Date getWhatsNewDate() throws Exception
getWhatsNewDate in class XMLFileIndexingWriter
Throws:
Exception - This method should throw an Exception with an appropriate error message if an error occurs.

protected String getWhatsNewType() throws Exception
getWhatsNewType in class XMLFileIndexingWriter
Throws:
Exception - This method should throw an Exception with an appropriate error message if an error occurs.

protected void destroy()
destroy in class FileIndexingServiceWriter

protected String getValidationReport() throws Exception
getValidationReport in class FileIndexingServiceWriter
Throws:
Exception - If an error occurs in performing the validation.

protected String[] _getIds()
_getIds in class XMLFileIndexingWriter

public String[] getUrls()
getUrls in class XMLFileIndexingWriter

public String getDescription()
getDescription in class XMLFileIndexingWriter

public String getTitle()
getTitle in class XMLFileIndexingWriter

public boolean indexFullContentInDefaultAndStems()
indexFullContentInDefaultAndStems in class XMLFileIndexingWriter

protected void addFields(org.apache.lucene.document.Document newDoc, org.apache.lucene.document.Document existingDoc, File sourceFile) throws Exception
addFields in class XMLFileIndexingWriter
Parameters:
newDoc - The new Document that is being created for this resource.
existingDoc - An existing Document that currently resides in the index for the given resource, or null if none was previously present.
sourceFile - The feature to be added to the CustomFields attribute.
Throws:
Exception - This method should throw an Exception with an appropriate error message if an error occurs.