Class StempelFilter

  • All Implemented Interfaces:
    Closeable, AutoCloseable

    public final class StempelFilter
    extends org.apache.lucene.analysis.TokenFilter
    Transforms the token stream as per the stemming algorithm.

    Note: the input to the stemming filter must already be in lower case, so apply LowerCaseFilter (or use LowerCaseTokenizer) before this filter in the analysis chain for stemming to work properly.
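
The ordering requirement in the note above can be illustrated with a small plain-Java model. This is only a sketch and does not use Lucene: the class `LowercaseBeforeStemSketch`, its `TABLE`, and the `stem` and `analyze` helpers are hypothetical, and the two-entry table is a toy stand-in for a real stemming table. It assumes that lookups in the table are case-sensitive and that the table holds lower-case forms only, which is why a capitalized token would pass through unstemmed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;

final class LowercaseBeforeStemSketch {
    // Toy table (hypothetical): lower-case surface forms mapped to stems.
    static final Map<String, String> TABLE = Map.of("domami", "dom", "kotami", "kot");

    // Case-sensitive lookup; unknown tokens are returned unchanged.
    public static String stem(String token) {
        return TABLE.getOrDefault(token, token);
    }

    // Runs each token through an optional lowercasing step, then the stemmer.
    public static List<String> analyze(List<String> tokens, boolean lowercaseFirst) {
        List<String> out = new ArrayList<>();
        for (String t : tokens) {
            String in = lowercaseFirst ? t.toLowerCase(Locale.ROOT) : t;
            out.add(stem(in));
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> tokens = List.of("Domami", "kotami");
        System.out.println(analyze(tokens, true));  // prints [dom, kot]
        System.out.println(analyze(tokens, false)); // prints [Domami, kot]
    }
}
```

In a Lucene analysis chain the same ordering would mean wrapping the tokenizer's output in LowerCaseFilter first and passing that stream to the StempelFilter constructor.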

    • Nested Class Summary

      • Nested classes/interfaces inherited from class org.apache.lucene.util.AttributeSource

        org.apache.lucene.util.AttributeSource.AttributeFactory, org.apache.lucene.util.AttributeSource.State
    • Field Summary

      Fields 
      Modifier and Type Field Description
      static int DEFAULT_MIN_LENGTH
      Minimum length of input words to be processed.
      • Fields inherited from class org.apache.lucene.analysis.TokenFilter

        input
    • Constructor Summary

      Constructors 
      Constructor Description
      StempelFilter(org.apache.lucene.analysis.TokenStream in, StempelStemmer stemmer)
      Creates a filter using the supplied stemmer, with the minimum word length set to DEFAULT_MIN_LENGTH.
      StempelFilter(org.apache.lucene.analysis.TokenStream in, StempelStemmer stemmer, int minLength)
      Creates a filter using the supplied stemmer and the given minimum word length.
    • Method Summary

      Modifier and Type Method Description
      boolean incrementToken()
      Advances to the next input token and stems it; returns false when the input stream is exhausted.
      • Methods inherited from class org.apache.lucene.analysis.TokenFilter

        close, end, reset
      • Methods inherited from class org.apache.lucene.util.AttributeSource

        addAttribute, addAttributeImpl, captureState, clearAttributes, cloneAttributes, copyTo, equals, getAttribute, getAttributeClassesIterator, getAttributeFactory, getAttributeImplsIterator, hasAttribute, hasAttributes, hashCode, reflectAsString, reflectWith, restoreState, toString
    • Field Detail

      • DEFAULT_MIN_LENGTH

        public static final int DEFAULT_MIN_LENGTH
        Minimum length of input words to be processed. Shorter words are returned unchanged.
        See Also:
        Constant Field Values
    • Constructor Detail

      • StempelFilter

        public StempelFilter(org.apache.lucene.analysis.TokenStream in,
                             StempelStemmer stemmer)
        Creates a filter using the supplied stemmer, with the minimum word length set to DEFAULT_MIN_LENGTH.
        Parameters:
        in - input token stream
        stemmer - the stemmer applied to each token
      • StempelFilter

        public StempelFilter(org.apache.lucene.analysis.TokenStream in,
                             StempelStemmer stemmer,
                             int minLength)
        Creates a filter using the supplied stemmer and the given minimum word length.
        Parameters:
        in - input token stream
        stemmer - the stemmer applied to each token
        minLength - For performance reasons, words shorter than minLength characters are not stemmed; they are returned unchanged.
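
The effect of minLength can be modelled in a few lines of plain Java. This is a hedged sketch of the behaviour described above, not the filter's actual implementation: `MinLengthGateSketch` and `stemIfLongEnough` are hypothetical names, the stemmer is a stand-in `UnaryOperator<String>`, and treating a `null` result as "no stem found" is an assumption. The boundary follows the wording of the parameter description (strictly shorter words are skipped); how a word of exactly minLength characters is handled should be checked against the implementation.

```java
import java.util.function.UnaryOperator;

final class MinLengthGateSketch {
    // Applies the stemmer only to words of at least minLength characters.
    public static String stemIfLongEnough(String word, int minLength,
                                          UnaryOperator<String> stemmer) {
        if (word.length() < minLength) {
            return word; // too short: returned unchanged, stemmer is never called
        }
        String stem = stemmer.apply(word);
        // Assumption: a null result means the stemmer found no stem.
        return stem != null ? stem : word;
    }

    public static void main(String[] args) {
        // Toy stemmer (hypothetical): strips a trailing "ami".
        UnaryOperator<String> toy =
            w -> w.endsWith("ami") ? w.substring(0, w.length() - 3) : w;

        System.out.println(stemIfLongEnough("domami", 4, toy)); // prints dom
        System.out.println(stemIfLongEnough("ami", 4, toy));    // prints ami
    }
}
```

Skipping short words avoids a stemmer call for tokens that are unlikely to carry a removable suffix, which is the performance motivation the parameter description gives.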
    • Method Detail

      • incrementToken

        public boolean incrementToken()
                               throws IOException
        Advances to the next input token and stems it. Returns true if a token was produced, or false when the input stream is exhausted.
        Specified by:
        incrementToken in class org.apache.lucene.analysis.TokenStream
        Throws:
        IOException