DLESE Tools v1.6.0 |
java.lang.Object
  org.dlese.dpc.index.writer.IndexingTools
public class IndexingTools
Tools to aid in indexing.
Field Summary | |
---|---|
static String | adminDefaultFieldName - Admin default field 'admindefault' |
static String | defaultFieldName - Default field 'default' |
static String | PHRASE_SEPARATOR - String used to separate and preserve phrases indexed as text, includes leading and trailing white space. |
static String | stemsFieldName - Stems field 'stems' |
Constructor Summary |
---|
IndexingTools() |
Method Summary | |
---|---|
static void | addToAdminDefaultField(org.apache.lucene.document.Document myDoc, String content) - Indexes the given text into the admin default field. |
static void | addToDefaultAndStemsFields(org.apache.lucene.document.Document myDoc, String content) - Indexes the given text into the default and stems fields. |
static String | encodeToTerm(String text) - Same as org.dlese.dpc.index.SimpleLuceneIndex#encodeToTerm(String). |
static String | encodeToTerm(String text, boolean encodeWildCards) - Same as org.dlese.dpc.index.SimpleLuceneIndex#encodeToTerm(String,boolean). |
static String[] | extractSeparatePhrasesFromString(String separatedPhrases) - Extracts the phrases from a String that was created using the method makeSeparatePhrasesFromNodes(List nodes) or makeSeparatePhrasesFromStrings(List strings). |
static String[] | extractStringsFromString(String separatedWords) - Extracts the words from a String that was created using the method makeStringFromNodes(List nodes). |
static String[] | getAnalyzedTerms(String textToParse, String field, org.apache.lucene.analysis.Analyzer analyzer) - Extracts all terms in any field from a Lucene query using the given Analyzer. |
static org.apache.lucene.analysis.Token[] | getAnalyzedTokens(String textToParse, String field, org.apache.lucene.analysis.Analyzer analyzer) - Extracts all Tokens from a Lucene query using the given Analyzer. |
static StringBuffer | getAnalyzerOutput(String textToParse, String field, org.apache.lucene.analysis.Analyzer analyzer) - Creates a StringBuffer to display the tokens created by a given analyzer. |
static String | makeSeparatePhrasesFromNodes(List nodes) - Creates a String separated by the phrase separator term from the text of each of the Element or Attributes dom4j Nodes provided. |
static String | makeSeparatePhrasesFromStrings(List strings) - Creates a String separated by the phrase separator term from each of the Strings provided. |
static String | makeSeparatePhrasesFromStrings(String[] strings) - Creates a String separated by the phrase separator term from each of the Strings provided. |
static String | makeStringFromNodes(List nodes) - Creates a String separated by spaces from the text of each of the Element or Attributes dom4j Nodes provided. |
static String | tokenizeID(String ID) - Tokenizes a DLESE ID by replacing the char - with a blank space. |
static String | tokenizeURI(String uri) - Tokenizes a URI by replacing the unindexable chars with a blank space. |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Field Detail |
---|
public static final String defaultFieldName
public static final String stemsFieldName
public static final String adminDefaultFieldName
public static final String PHRASE_SEPARATOR
Constructor Detail |
---|
public IndexingTools()
Method Detail |
---|
public static final void addToDefaultAndStemsFields(org.apache.lucene.document.Document myDoc, String content)
myDoc
- Document to add to
content
- Content to add
public static final void addToAdminDefaultField(org.apache.lucene.document.Document myDoc, String content)
myDoc
- Document to add to
content
- Content to add
public static final String makeSeparatePhrasesFromNodes(List nodes)
A call to this method might look like:
String value = makeSeparatePhrasesFromNodes(xmlDoc.selectNodes("/news-oppsRecord/topics/topic"));
nodes
- List of Elements or Attributes
public static final String makeSeparatePhrasesFromStrings(List strings)
strings
- List of Strings or null
public static final String makeSeparatePhrasesFromStrings(String[] strings)
strings
- Array of Strings or null
public static final String[] extractSeparatePhrasesFromString(String separatedPhrases)
Extracts the phrases from a String that was created using the method makeSeparatePhrasesFromNodes(List nodes) or makeSeparatePhrasesFromStrings(List strings).
separatedPhrases
- String that contains the phrase separator to separate phrases
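The phrase methods above describe a round trip: phrases are joined with a separator term before indexing and split on that same term afterwards. The following is a minimal sketch of that idea in plain Java, not the class's actual implementation. The separator value `" __sep__ "`, the class and method names, and the handling of null or empty input are all assumptions made for illustration; the real value of PHRASE_SEPARATOR is not given on this page.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class PhraseSeparatorSketch {
    // Hypothetical separator; like the documented PHRASE_SEPARATOR it carries
    // leading and trailing white space, but the real value is not shown here.
    static final String SEPARATOR = " __sep__ ";

    // Joins the phrases with the separator. Returning "" for null or empty
    // input is an assumption of this sketch.
    static String makeSeparatePhrases(List<String> phrases) {
        if (phrases == null || phrases.isEmpty()) {
            return "";
        }
        return String.join(SEPARATOR, phrases);
    }

    // Splits on the literal separator (no regex), so multi-word phrases
    // come back intact.
    static String[] extractSeparatePhrases(String separated) {
        if (separated == null || separated.isEmpty()) {
            return new String[0];
        }
        List<String> out = new ArrayList<>();
        int start = 0;
        int idx;
        while ((idx = separated.indexOf(SEPARATOR, start)) >= 0) {
            out.add(separated.substring(start, idx));
            start = idx + SEPARATOR.length();
        }
        out.add(separated.substring(start));
        return out.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String joined = makeSeparatePhrases(Arrays.asList("climate change", "ocean currents"));
        System.out.println(joined);
        System.out.println(Arrays.toString(extractSeparatePhrases(joined)));
    }
}
```

Because the separator is matched as a whole term, a phrase such as "climate change" survives the round trip as one entry rather than being broken into individual words.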
public static final String makeStringFromNodes(List nodes)
A call to this method might look like:
String value = makeStringFromNodes(xmlDoc.selectNodes("/news-oppsRecord/topics/topic"));
nodes
- List of dom4j Nodes of Elements or Attributes
public static final String[] extractStringsFromString(String separatedWords)
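Extracts the words from a String that was created using the method makeStringFromNodes(List nodes).
separatedWords
- String of space-separated words, as created by makeStringFromNodes(List nodes)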
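For contrast with the phrase separator approach, the space-separated pair of methods can be pictured as a plain join and a whitespace split. The sketch below is an illustration only: it takes a `List<String>` of node texts as a stand-in for the dom4j Nodes, and the class name, method names, trimming, and null handling are assumptions rather than the documented behavior.

```java
import java.util.Arrays;
import java.util.List;

class SpaceJoinSketch {
    // Joins the text of each node with a single space. The List<String> is a
    // stand-in for the text of dom4j Element or Attribute nodes.
    static String makeString(List<String> nodeTexts) {
        if (nodeTexts == null) {
            return "";
        }
        StringBuilder sb = new StringBuilder();
        for (String text : nodeTexts) {
            if (text == null || text.trim().isEmpty()) {
                continue; // skipping blank entries is an assumption of this sketch
            }
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(text.trim());
        }
        return sb.toString();
    }

    // Splits the joined String back into individual words on white space.
    static String[] extractStrings(String separatedWords) {
        if (separatedWords == null || separatedWords.trim().isEmpty()) {
            return new String[0];
        }
        return separatedWords.trim().split("\\s+");
    }

    public static void main(String[] args) {
        String joined = makeString(Arrays.asList("Geology", " Ocean science "));
        System.out.println(joined);
        System.out.println(Arrays.toString(extractStrings(joined)));
    }
}
```

Note that a multi-word value such as "Ocean science" comes back as two separate words here, which is the practical difference from the phrase-preserving methods.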
public static final String tokenizeID(String ID)
ID
- The ID String
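The summary describes tokenizeID as replacing the char - with a blank space. A minimal sketch of that replacement follows; the class name, the null handling, and the sample ID are illustrative assumptions, not taken from the library source.

```java
class TokenizeIdSketch {
    // Replaces every '-' with a blank space so each segment of the ID
    // becomes a separate token. Returning "" for null is an assumption.
    static String tokenizeId(String id) {
        if (id == null) {
            return "";
        }
        return id.replace('-', ' ');
    }

    public static void main(String[] args) {
        // Illustrative ID only.
        System.out.println(tokenizeId("DLESE-000-000-000-001"));
    }
}
```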
public static final String tokenizeURI(String uri)
uri
- A URL or URI
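The page does not say which characters tokenizeURI treats as unindexable. As a rough sketch, assuming that every character other than a letter or digit is replaced, the idea could look like this. The class name, that character rule, the null handling, and the sample URL are all assumptions for illustration.

```java
class TokenizeUriSketch {
    // Replaces each character that is not a letter or digit with a blank
    // space. Which chars the real method treats as unindexable is not
    // documented on this page; this rule is an assumption.
    static String tokenizeUri(String uri) {
        if (uri == null) {
            return "";
        }
        StringBuilder sb = new StringBuilder(uri.length());
        for (int i = 0; i < uri.length(); i++) {
            char c = uri.charAt(i);
            sb.append(Character.isLetterOrDigit(c) ? c : ' ');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Illustrative URL only.
        System.out.println(tokenizeUri("http://www.dlese.org/library/index.jsp"));
    }
}
```

Replacing one character with one space keeps the output the same length as the input, and a whitespace-based analyzer would then see each host and path segment as its own token.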
public static final String encodeToTerm(String text)
text
- Text
public static final String encodeToTerm(String text, boolean encodeWildCards)
text
- Text
encodeWildCards
- True to encode the '*' wildcard char, false to leave unencoded.
public static final org.apache.lucene.analysis.Token[] getAnalyzedTokens(String textToParse, String field, org.apache.lucene.analysis.Analyzer analyzer)
Extracts all Tokens from a Lucene query using the given Analyzer.
textToParse
- The text to analyze with the analyzer
analyzer
- The analyzer to use
field
- The field this Analyzer should interpret the text as, or null to use 'default'
public static final String[] getAnalyzedTerms(String textToParse, String field, org.apache.lucene.analysis.Analyzer analyzer)
Extracts all terms in any field from a Lucene query using the given Analyzer.
textToParse
- The text to analyze with the analyzer
analyzer
- The analyzer to use
field
- The field this Analyzer should interpret the text as, or null to use 'default'
public static final StringBuffer getAnalyzerOutput(String textToParse, String field, org.apache.lucene.analysis.Analyzer analyzer)
textToParse
- The text to analyze with the analyzer
analyzer
- The analyzer to use
field
- The Lucene field name, or null to use default