Package org.lionsoul.jcseg
Interface ISegment
-
- All Known Implementing Classes:
ComplexSeg, DelimiterSeg, DetectSeg, MostSeg, NGramSeg, NLPSeg, Segmenter, SimpleSeg
public interface ISegment
Jcseg segmentation interface.
- Author:
- chenxin
-
Nested Class Summary
Nested Classes
Modifier and Type    Interface    Description
static class    ISegment.Type
-
Field Summary
Fields
Modifier and Type    Field    Description
static int    CHECK_CE_MASk    Whether to check for Chinese-English mixed words.
static int    CHECK_CF_MASK    Whether to check for Chinese fractions.
static int    CHECK_EC_MASK    Whether to check for an English-Chinese mixed suffix. Part of the new mixed-word recognition implementation; added 2016/11/22.
static ISegment.Type    COMPLEX
static int    COMPLEX_MODE
static ISegment.Type    DELIMITER
static int    DELIMITER_MODE
static ISegment.Type    DETECT
static int    DETECT_MODE
static ISegment.Type    MOST
static int    MOST_MODE
static ISegment.Type    NGRAM
static int    NGRAM_MODE
static ISegment.Type    NLP
static int    NLP_MODE
static ISegment.Type    SIMPLE    Segmentation type constants
static int    SIMPLE_MODE    Segmentation type index
static int    START_SS_MASK    Whether to start the Latin secondary segmentation.
-
Method Summary
Modifier and Type    Method    Description
int    getStreamPosition()    Gets the current length of the stream.
IWord    next()    Segments a word from a char array, starting at a specified position.
void    reset(Reader input)    Resets the reader.
-
Field Detail
-
SIMPLE
static final ISegment.Type SIMPLE
Segmentation type constants
-
COMPLEX
static final ISegment.Type COMPLEX
-
DETECT
static final ISegment.Type DETECT
-
MOST
static final ISegment.Type MOST
-
NLP
static final ISegment.Type NLP
-
DELIMITER
static final ISegment.Type DELIMITER
-
NGRAM
static final ISegment.Type NGRAM
-
SIMPLE_MODE
static final int SIMPLE_MODE
Segmentation type index
-
COMPLEX_MODE
static final int COMPLEX_MODE
-
DETECT_MODE
static final int DETECT_MODE
-
MOST_MODE
static final int MOST_MODE
-
NLP_MODE
static final int NLP_MODE
-
DELIMITER_MODE
static final int DELIMITER_MODE
-
NGRAM_MODE
static final int NGRAM_MODE
-
CHECK_CE_MASk
static final int CHECK_CE_MASk
Whether to check for Chinese-English mixed words.
- See Also:
- Constant Field Values
-
CHECK_CF_MASK
static final int CHECK_CF_MASK
Whether to check for Chinese fractions.
- See Also:
- Constant Field Values
-
START_SS_MASK
static final int START_SS_MASK
Whether to start the Latin secondary segmentation.
- See Also:
- Constant Field Values
-
CHECK_EC_MASK
static final int CHECK_EC_MASK
Whether to check for an English-Chinese mixed suffix. Part of the new mixed-word recognition implementation; added 2016/11/22.
- See Also:
- Constant Field Values
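The four *_MASK fields are named like bit masks, which suggests they are combined into a single int of flags and tested with bitwise operators. The following is a minimal sketch of that pattern only. The class name MaskSketch, its helper methods, and the bit values are assumptions made for illustration; the real values are on the Constant Field Values page and are not reproduced here.

```java
// Illustrative only: none of these names or values come from Jcseg itself.
final class MaskSketch {
    // Assumed values: each mask occupies its own bit so they can be OR-ed together.
    static final int CHECK_CE = 1 << 0; // stands in for CHECK_CE_MASk
    static final int CHECK_CF = 1 << 1; // stands in for CHECK_CF_MASK
    static final int START_SS = 1 << 2; // stands in for START_SS_MASK
    static final int CHECK_EC = 1 << 3; // stands in for CHECK_EC_MASK

    private MaskSketch() {
    }

    // True when every bit of mask is non-zero in flags (for single-bit masks: the bit is set).
    static boolean isSet(int flags, int mask) {
        return (flags & mask) != 0;
    }

    // Returns flags with the mask bit turned on.
    static int enable(int flags, int mask) {
        return flags | mask;
    }

    // Returns flags with the mask bit turned off.
    static int disable(int flags, int mask) {
        return flags & ~mask;
    }
}
```

With this layout, enabling two checks and later disabling one leaves the other untouched, which is the main reason to keep each option on a separate bit.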
-
Method Detail
-
reset
void reset(Reader input) throws IOException
Resets the reader.
- Throws:
IOException
-
getStreamPosition
int getStreamPosition()
Gets the current length of the stream.
-
next
IWord next() throws IOException
Segments a word from a char array, starting at a specified position.
- Throws:
IOException
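The three methods above suggest a pull-style loop: call reset(Reader) once, then call next() repeatedly. The sketch below illustrates that loop with hypothetical stand-in types (WordSketch, SegmentSketch, WhitespaceSegmentSketch, SegmentLoopDemo) that copy only the documented method signatures; they are not the real ISegment or IWord. It also assumes that next() returns null at the end of input, which this page does not state, and that the position counts characters read so far.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for IWord: exposes only the text of a word.
interface WordSketch {
    String getValue();
}

// Hypothetical stand-in mirroring the three documented ISegment methods.
interface SegmentSketch {
    void reset(Reader input) throws IOException;

    int getStreamPosition();

    WordSketch next() throws IOException;
}

// Toy implementation that splits on whitespace; it does no dictionary lookup.
final class WhitespaceSegmentSketch implements SegmentSketch {
    private Reader reader;
    private int position;

    @Override
    public void reset(Reader input) {
        this.reader = input;
        this.position = 0;
    }

    @Override
    public int getStreamPosition() {
        return position; // assumption: number of chars consumed so far
    }

    @Override
    public WordSketch next() throws IOException {
        if (reader == null) {
            return null;
        }
        StringBuilder sb = new StringBuilder();
        int c;
        while ((c = reader.read()) != -1) {
            position++;
            if (Character.isWhitespace(c)) {
                if (sb.length() > 0) {
                    break; // whitespace after a word ends that word
                }
            } else {
                sb.append((char) c);
            }
        }
        if (sb.length() == 0) {
            return null; // assumption: null signals end of input
        }
        final String value = sb.toString();
        return () -> value;
    }
}

final class SegmentLoopDemo {
    private SegmentLoopDemo() {
    }

    // Resets the segmenter on the text, then drains next() until it returns null.
    static List<String> segmentAll(SegmentSketch seg, String text) {
        List<String> out = new ArrayList<>();
        try {
            seg.reset(new StringReader(text));
            WordSketch word;
            while ((word = seg.next()) != null) {
                out.add(word.getValue());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out;
    }
}
```

Because reset(Reader) takes a new input, one segmenter instance can in principle be reused across several inputs instead of being rebuilt each time.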
-