java.lang.Object
org.apache.lucene.analysis.cz.CzechStemmer
Light Stemmer for Czech.
Implements the algorithm described in "Indexing and stemming approaches for the Czech language": http://portal.acm.org/citation.cfm?id=1598600
Constructor Summary
Constructors
CzechStemmer()
Method Summary
Modifier and Type    Method                                  Description
private int          normalize(char[] s, int len)
private int          removeCase(char[] s, int len)
private int          removePossessives(char[] s, int len)
int                  stem(char[] s, int len)                 Stem an input buffer of Czech text.
Constructor Details
CzechStemmer
public CzechStemmer()
Method Details
stem
public int stem(char[] s, int len)
Stem an input buffer of Czech text.
Parameters:
s - input buffer
len - length of input buffer
Returns:
length of input buffer after normalization
NOTE: Input is expected to be in lowercase, but with diacritical marks
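The calling convention of stem(char[] s, int len) can be sketched as follows: the caller lowercases the text, leaves diacritical marks in place, passes the buffer with its length, and then treats only the first "returned length" characters as the result. This is a minimal, self-contained sketch and not Lucene code. BufferStemmer, TOY, TOY_SUFFIXES and stemWord are hypothetical names introduced here, and the toy suffix list is an illustrative assumption, not the rule set CzechStemmer applies.

```java
import java.util.Locale;

public class StemBufferSketch {

    /** Hypothetical stand-in with the same shape as stem(char[] s, int len). */
    interface BufferStemmer {
        int stem(char[] s, int len);
    }

    /** Toy suffix list for illustration only; NOT the rules used by CzechStemmer. */
    static final String[] TOY_SUFFIXES = {"ech", "ach"};

    /** Toy stemmer: reports a shorter length when a listed suffix matches. */
    static final BufferStemmer TOY = (s, len) -> {
        for (String suffix : TOY_SUFFIXES) {
            int n = suffix.length();
            // Assumption: keep at least three characters of the word.
            if (len - n >= 3 && new String(s, len - n, n).equals(suffix)) {
                return len - n;
            }
        }
        return len;
    };

    /** Prepares the buffer as the NOTE requires and applies the returned length. */
    static String stemWord(BufferStemmer stemmer, String word) {
        // Lowercase the input but keep diacritical marks untouched.
        char[] buf = word.toLowerCase(Locale.ROOT).toCharArray();
        int newLen = stemmer.stem(buf, buf.length);
        // Only the first newLen characters of the buffer are the stem.
        return new String(buf, 0, newLen);
    }

    public static void main(String[] args) {
        // "M\u011Bstech" is "Mestech" with a caron on the e, written as an
        // escape so the file compiles under any source encoding.
        System.out.println(stemWord(TOY, "M\u011Bstech"));
        System.out.println(stemWord(TOY, "hrad"));
    }
}
```

With the real class, the same pattern would presumably apply by calling stem on a CzechStemmer instance in place of the toy stand-in. The buffer is reused rather than reallocated, so the returned length, not the array length, marks where the stem ends.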
removeCase
private int removeCase(char[] s, int len)
removePossessives
private int removePossessives(char[] s, int len)
normalize
private int normalize(char[] s, int len)