public class HanLPTokenizer
extends org.apache.lucene.analysis.Tokenizer
| Constructor and Description |
|---|
| HanLPTokenizer(com.hankcs.hanlp.seg.Segment segment, Set<String> filter, boolean enablePorterStemming) |
| Modifier and Type | Method and Description |
|---|---|
| void | end() |
| boolean | incrementToken() |
| void | reset(): must be overridden; otherwise indexing fails when files are indexed in batches. |
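The note on reset() is easier to follow with the consumer lifecycle in view: a Lucene tokenizer instance is typically reused across documents, and each document goes through setReader, reset, a loop over incrementToken, and then end. The following is a dependency-free sketch of that lifecycle, not the HanLPTokenizer implementation. The class names `ResetSketch` and `WhitespaceSketchTokenizer`, the whitespace splitting, and the helper `tokenizeAll` are all hypothetical stand-ins, and the assumption that a skipped reset leaves the tokenizer bound to the previous, exhausted input is an illustration of how batch indexing could fail, not a statement about HanLP's internals.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of the tokenizer lifecycle; NOT the HanLPTokenizer source. */
public class ResetSketch {

    /** Stand-in tokenizer that splits on whitespace and is reused across documents. */
    static class WhitespaceSketchTokenizer {
        private Reader pending; // input handed over by setReader, not yet active
        private Reader input;   // input currently being tokenized
        private String term;    // text of the current token
        private int offset;     // characters consumed from the current input

        void setReader(Reader reader) {
            this.pending = reader;
        }

        /** Binds the pending input and clears per-document state. */
        void reset() {
            if (pending != null) {
                input = pending;
                pending = null;
            }
            offset = 0;
            term = null;
        }

        /** Advances to the next token; returns false when the input is exhausted. */
        boolean incrementToken() throws IOException {
            if (input == null) {
                return false;
            }
            StringBuilder sb = new StringBuilder();
            int c;
            while ((c = input.read()) != -1) {
                offset++;
                if (Character.isWhitespace(c)) {
                    if (sb.length() > 0) {
                        break; // a token just ended
                    }
                } else {
                    sb.append((char) c);
                }
            }
            if (sb.length() == 0) {
                return false;
            }
            term = sb.toString();
            return true;
        }

        /** Reports the final offset once the stream has been consumed. */
        int end() {
            return offset;
        }

        String term() {
            return term;
        }
    }

    /** Tokenizes each document with ONE shared tokenizer, as a batch indexer would. */
    public static List<List<String>> tokenizeAll(List<String> docs, boolean resetEachTime) {
        WhitespaceSketchTokenizer tokenizer = new WhitespaceSketchTokenizer();
        List<List<String>> result = new ArrayList<>();
        boolean first = true;
        try {
            for (String doc : docs) {
                tokenizer.setReader(new StringReader(doc));
                if (resetEachTime || first) {
                    tokenizer.reset(); // skipped for later documents when resetEachTime is false
                }
                first = false;
                List<String> tokens = new ArrayList<>();
                while (tokenizer.incrementToken()) {
                    tokens.add(tokenizer.term());
                }
                tokenizer.end();
                result.add(tokens);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> docs = Arrays.asList("a b", "c d");
        System.out.println(tokenizeAll(docs, true));  // prints [[a, b], [c, d]]
        System.out.println(tokenizeAll(docs, false)); // prints [[a, b], []]
    }
}
```

In the sketch, the second document yields no tokens when reset() is skipped, because the tokenizer still reads from the first document's exhausted input. With the real class the same call order applies: setReader(Reader), reset(), incrementToken() until it returns false, end(), then close().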
Methods inherited from class org.apache.lucene.util.AttributeSource: addAttribute, addAttributeImpl, captureState, clearAttributes, cloneAttributes, copyTo, endAttributes, equals, getAttribute, getAttributeClassesIterator, getAttributeFactory, getAttributeImplsIterator, hasAttribute, hasAttributes, hashCode, reflectAsString, reflectWith, removeAllAttributes, restoreState, toString

public final boolean incrementToken()
throws IOException
Specified by: incrementToken in class org.apache.lucene.analysis.TokenStream
Throws: IOException

public void end()
throws IOException
Overrides: end in class org.apache.lucene.analysis.TokenStream
Throws: IOException

public void reset()
throws IOException
Overrides: reset in class org.apache.lucene.analysis.Tokenizer
Throws: IOException

Copyright © 2014–2018 码农场. All rights reserved.