Package org.apache.lucene.analysis
Class StopwordAnalyzerBase
- java.lang.Object
  - org.apache.lucene.analysis.Analyzer
    - org.apache.lucene.analysis.ReusableAnalyzerBase
      - org.apache.lucene.analysis.StopwordAnalyzerBase
- All Implemented Interfaces: Closeable, AutoCloseable
- Direct Known Subclasses: ClassicAnalyzer, StandardAnalyzer, StopAnalyzer, UAX29URLEmailAnalyzer
public abstract class StopwordAnalyzerBase extends ReusableAnalyzerBase
Base class for Analyzers that need to make use of stopword sets.
Nested Class Summary
Nested classes/interfaces inherited from class org.apache.lucene.analysis.ReusableAnalyzerBase
ReusableAnalyzerBase.TokenStreamComponents
Field Summary
Fields (Modifier and Type, Field, Description)
- protected Version matchVersion
- protected CharArraySet stopwords: An immutable stopword set
Constructor Summary
Constructors (Modifier, Constructor, Description)
- protected StopwordAnalyzerBase(Version version): Creates a new Analyzer with an empty stopword set
- protected StopwordAnalyzerBase(Version version, Set<?> stopwords): Creates a new instance initialized with the given stopword set
Method Summary
Methods (Modifier and Type, Method, Description)
- Set<?> getStopwordSet(): Returns the analyzer's stopword set, or an empty set if the analyzer has no stopwords
- protected static CharArraySet loadStopwordSet(boolean ignoreCase, Class<? extends ReusableAnalyzerBase> aClass, String resource, String comment): Creates a CharArraySet from a file resource associated with a class.
- protected static CharArraySet loadStopwordSet(File stopwords, Version matchVersion): Creates a CharArraySet from a file.
- protected static CharArraySet loadStopwordSet(Reader stopwords, Version matchVersion): Creates a CharArraySet from a reader.
Methods inherited from class org.apache.lucene.analysis.ReusableAnalyzerBase
createComponents, initReader, reusableTokenStream, tokenStream
-
Methods inherited from class org.apache.lucene.analysis.Analyzer
close, getOffsetGap, getPositionIncrementGap, getPreviousTokenStream, setPreviousTokenStream
Field Detail
-
stopwords
protected final CharArraySet stopwords
An immutable stopword set
-
matchVersion
protected final Version matchVersion
-
-
Constructor Detail
-
StopwordAnalyzerBase
protected StopwordAnalyzerBase(Version version, Set<?> stopwords)
Creates a new instance initialized with the given stopword set.
- Parameters:
version - the Lucene version for cross-version compatibility
stopwords - the analyzer's stopword set
-
StopwordAnalyzerBase
protected StopwordAnalyzerBase(Version version)
Creates a new Analyzer with an empty stopword set.
- Parameters:
version - the Lucene version for cross-version compatibility
-
-
Method Detail
-
getStopwordSet
public Set<?> getStopwordSet()
Returns the analyzer's stopword set, or an empty set if the analyzer has no stopwords.
- Returns:
the analyzer's stopword set, or an empty set if the analyzer has no stopwords
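The "immutable stopword set" and "empty set if the analyzer has no stopwords" contract can be illustrated with a small JDK-only sketch. The class names StopwordHolderSketch and EnglishLikeSketch are hypothetical stand-ins, and java.util.Set<String> takes the place of CharArraySet; this is an assumption-laden illustration of the pattern, not the Lucene source.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for the base-class pattern; not a Lucene class.
abstract class StopwordHolderSketch {
    // Held as an unmodifiable defensive copy, so callers cannot change it later.
    protected final Set<String> stopwords;

    // Mirrors the two-argument constructor: null is treated as "no stopwords".
    protected StopwordHolderSketch(Set<String> stopwords) {
        this.stopwords = (stopwords == null)
                ? Collections.<String>emptySet()
                : Collections.unmodifiableSet(new HashSet<String>(stopwords));
    }

    // Mirrors the one-argument constructor: starts with an empty stopword set.
    protected StopwordHolderSketch() {
        this(null);
    }

    // Never returns null; an analyzer without stopwords yields an empty set.
    public Set<String> getStopwordSet() {
        return stopwords;
    }
}

// Hypothetical concrete subclass, used only to make the sketch instantiable.
final class EnglishLikeSketch extends StopwordHolderSketch {
    EnglishLikeSketch(Set<String> stopwords) { super(stopwords); }
    EnglishLikeSketch() { super(); }
}
```

With this shape, mutating the set passed to the constructor afterwards has no effect on the holder, and calling add on the returned set throws UnsupportedOperationException.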
-
loadStopwordSet
protected static CharArraySet loadStopwordSet(boolean ignoreCase, Class<? extends ReusableAnalyzerBase> aClass, String resource, String comment) throws IOException
Creates a CharArraySet from a file resource associated with a class. (See Class.getResourceAsStream(String).)
- Parameters:
ignoreCase - true if the set should ignore the case of the stopwords, otherwise false
aClass - a class that is associated with the given stopword resource
resource - name of the resource file associated with the given class
comment - comment string to ignore in the stopword file
- Returns:
a CharArraySet containing the distinct stopwords from the given file
- Throws:
IOException - if loading the stopwords throws an IOException
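The roles of the ignoreCase and comment parameters can be sketched with plain JDK classes. StopwordResourceSketch and its load method are hypothetical names; the sketch assumes one stopword per line, treats any line starting with the comment string as ignorable, and models case-insensitivity by lower-casing entries. Skipping blank lines is a choice made here for the illustration, not something the documentation above states.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper; a JDK-only stand-in, not Lucene's loader.
final class StopwordResourceSketch {
    private StopwordResourceSketch() {}

    // Reads one stopword per line and returns the distinct entries.
    static Set<String> load(Reader in, String comment, boolean ignoreCase) throws IOException {
        Set<String> words = new LinkedHashSet<String>();
        BufferedReader reader = new BufferedReader(in);
        try {
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.startsWith(comment)) {
                    continue; // comment line: ignored entirely
                }
                String word = line.trim();
                if (word.isEmpty()) {
                    continue; // assumption of this sketch: blank lines are skipped
                }
                // Lower-casing is how this sketch models a case-insensitive set.
                words.add(ignoreCase ? word.toLowerCase(Locale.ROOT) : word);
            }
        } finally {
            reader.close();
        }
        return words;
    }
}
```

A class-relative resource could be fed to such a helper by wrapping aClass.getResourceAsStream(resource) in an InputStreamReader with an explicit charset.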
-
loadStopwordSet
protected static CharArraySet loadStopwordSet(File stopwords, Version matchVersion) throws IOException
Creates a CharArraySet from a file.
- Parameters:
stopwords - the stopwords file to load
matchVersion - the Lucene version for cross-version compatibility
- Returns:
a CharArraySet containing the distinct stopwords from the given file
- Throws:
IOException - if loading the stopwords throws an IOException
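As a rough illustration of loading distinct stopwords from a file, the following JDK-only sketch reads a file line by line. StopwordFileSketch is a hypothetical name, and both the UTF-8 encoding and the one-word-per-line layout are assumptions of the sketch rather than guarantees taken from this page.

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical helper; a JDK-only stand-in, not Lucene's loader.
final class StopwordFileSketch {
    private StopwordFileSketch() {}

    // Reads the file as UTF-8 (an assumption) and collects distinct, trimmed lines.
    static Set<String> load(File stopwords) throws IOException {
        Set<String> words = new LinkedHashSet<String>();
        // try-with-resources closes the reader even if reading fails part-way.
        try (BufferedReader reader =
                     Files.newBufferedReader(stopwords.toPath(), StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                String word = line.trim();
                if (!word.isEmpty()) {
                    words.add(word); // duplicates collapse because the target is a set
                }
            }
        }
        return words;
    }
}
```

Any IOException raised while opening or reading the file propagates to the caller, matching the Throws clause above.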
-
loadStopwordSet
protected static CharArraySet loadStopwordSet(Reader stopwords, Version matchVersion) throws IOException
Creates a CharArraySet from a reader.
- Parameters:
stopwords - the stopwords reader to load
matchVersion - the Lucene version for cross-version compatibility
- Returns:
a CharArraySet containing the distinct stopwords from the given reader
- Throws:
IOException - if loading the stopwords throws an IOException
-
-