Class HMMChineseTokenizerFactory
- java.lang.Object
-
- org.apache.lucene.analysis.AbstractAnalysisFactory
-
- org.apache.lucene.analysis.TokenizerFactory
-
- org.apache.lucene.analysis.cn.smart.HMMChineseTokenizerFactory
-
public final class HMMChineseTokenizerFactory extends TokenizerFactory
Factory forHMMChineseTokenizerNote: this class will currently emit tokens for punctuation. So you should either add a WordDelimiterFilter after to remove these (with concatenate off), or use the SmartChinese stoplist with a StopFilterFactory via:
words="org/apache/lucene/analysis/cn/smart/stopwords.txt"- Since:
- 4.10.0
- WARNING: This API is experimental and might change in incompatible ways in the next release.
- SPI Name (case-insensitive: if the name is 'htmlStrip', 'htmlstrip' can be used when looking up the service).
- "hmmChinese"
-
-
Field Summary
Fields Modifier and Type Field Description static StringNAMESPI name-
Fields inherited from class org.apache.lucene.analysis.AbstractAnalysisFactory
LUCENE_MATCH_VERSION_PARAM, luceneMatchVersion
-
-
Constructor Summary
Constructors Constructor Description HMMChineseTokenizerFactory()Default ctor for compatibility with SPIHMMChineseTokenizerFactory(Map<String,String> args)Creates a new HMMChineseTokenizerFactory
-
Method Summary
All Methods Instance Methods Concrete Methods Modifier and Type Method Description Tokenizercreate(AttributeFactory factory)-
Methods inherited from class org.apache.lucene.analysis.TokenizerFactory
availableTokenizers, create, findSPIName, forName, lookupClass, reloadTokenizers
-
Methods inherited from class org.apache.lucene.analysis.AbstractAnalysisFactory
defaultCtorException, get, get, get, get, get, getBoolean, getChar, getClassArg, getFloat, getInt, getLines, getLuceneMatchVersion, getOriginalArgs, getPattern, getSet, getSnowballWordSet, getWordSet, isExplicitLuceneMatchVersion, require, require, require, requireBoolean, requireChar, requireFloat, requireInt, setExplicitLuceneMatchVersion, splitAt, splitFileNames
-
-
-
-
Field Detail
-
NAME
public static final String NAME
SPI name- See Also:
- Constant Field Values
-
-
Method Detail
-
create
public Tokenizer create(AttributeFactory factory)
- Specified by:
createin classTokenizerFactory
-
-