Class StringUtil


  • public class StringUtil
    extends Object

    A utility class for handling English stop characters, such as English punctuation.

    Author:
    chenxin
    • Constructor Summary

      Constructors 
      Constructor Description
      StringUtil()  
    • Method Summary

      All Methods Static Methods Concrete Methods 
      Modifier and Type Method Description
      static int CJKIndexOf​(String str)  
      static int CJKIndexOf​(String str, int offset)
      get the index of the first CJK char of the specified string
      static String fwsTohws​(String str)
      Replaces the full-width characters in the given string with their half-width equivalents (full-width characters occupy code points 65281-65374).
      static int getEnCharType​(int u)
      Gets the type of the given English character, as one of the constants defined in this class whose names start with EN_.
      static char getPunctuationPair​(char c)
      Gets the matching counterpart of the given pair punctuation character.
      static String hwsTofws​(String str)
      Replaces the half-width characters in the given string with their full-width equivalents.
      static boolean isCJK​(String str)  
      static boolean isCJK​(String str, int beginIndex, int endIndex)
      check if the specified string is all CJK chars
      static boolean isCJKChar​(int c)
      Checks whether the specified character is a CJK, Thai... character.
      static boolean isCnPunctuation​(int c)  
      static boolean isDecimal​(String str)  
      static boolean isDecimal​(String str, int beginIndex, int endIndex)
      Checks whether the specified string is a decimal; full-width characters are also recognized.
      static boolean isDigit​(String str)  
      static boolean isDigit​(String str, int beginIndex, int endIndex)
      Checks whether the specified string is made up of digits. Returns true if it is, false otherwise. This method also recognizes full-width characters.
      static boolean isEnChar​(int c)
      Checks whether the specified character is a basic Latin, Russian, or Greek letter.
      static boolean isENKeepPunctuaton​(char c)
      Checks whether the given character is an English keep punctuation character.
      static boolean isEnLetter​(int u)
      Checks whether the specified character is an English letter, including both the full-width and half-width forms.
      static boolean isEnNumeric​(int u)
      Checks whether the specified character is an English numeral (code points 48-57), including the full-width forms.
      static boolean isEnPunctuation​(int c)
      check if the given char is half-width punctuation
      static boolean isFWEnChar​(int c)
      Checks whether the given character is a full-width character. Note: full-width punctuation is not included here.
      static boolean isHWEnChar​(int c)
      Checks whether the given character is a half-width character.
      static boolean isLatin​(String str)  
      static boolean isLatin​(String str, int beginIndex, int endIndex)
      check if the specified string is all Latin chars
      static boolean isLetter​(String str)  
      static boolean isLetter​(String str, int beginIndex, int endIndex)
      Checks whether the specified string consists of Latin letters.
      static boolean isLetterNumber​(int c)
      Checks whether the specified character is a letter number such as 'Ⅰ' or 'Ⅱ'. Returns true if it is, false otherwise.
      static boolean isLetterOrNumeric​(String str)  
      static boolean isLetterOrNumeric​(String str, int beginIndex, int endIndex)
      Checks whether the specified string consists of Latin numerals or letters.
      static boolean isLowerCaseLetter​(int u)  
      static boolean isNoTailingPunctuation​(char c)
      Checks whether the given punctuation character is one that needs to be cleared.
      static boolean isNumeric​(String str)  
      static boolean isNumeric​(String str, int beginIndex, int endIndex)
      Checks whether the specified string consists of Latin numerals.
      static boolean isOtherNumber​(int c)
      Checks whether the specified character is an "other number" such as '①', '⑩', '⑽', or '㈩'. Returns true if it is, false otherwise.
      static boolean isPairPunctuation​(char c)
      Checks whether the given character is a pair punctuation character.
      static boolean isPunctuation​(int c)
      Checks whether the given character is a punctuation character.
      static boolean isPunctuation​(String str)  
      static boolean isPunctuation​(String str, int beginIndex, int endIndex)
      Check if the specified string is all punctuation chars (English and Chinese punctuation)
      static boolean isUpperCaseLetter​(int u)  
      static boolean isWhitespace​(int c)
      Checks whether the given character is whitespace.
      static int latinIndexOf​(String str)  
      static int latinIndexOf​(String str, int offset)
      get the index of the first Latin char of the specified string
      static int toLowerCase​(int u)  
      static int toUpperCase​(int u)  
    • Constructor Detail

      • StringUtil

        public StringUtil()
    • Method Detail

      • isCJKChar

        public static boolean isCJKChar​(int c)
        Checks whether the specified character is a CJK, Thai... character. Returns true if it is, false otherwise.
        Parameters:
        c -
        Returns:
        boolean
      • isEnChar

        public static boolean isEnChar​(int c)
        Checks whether the specified character is a basic Latin, Russian, or Greek letter. Returns true if it is, false otherwise. This method also recognizes full-width characters and letters.
        Parameters:
        c -
        Returns:
        boolean
      • isLetterNumber

        public static boolean isLetterNumber​(int c)
        Checks whether the specified character is a letter number such as 'Ⅰ' or 'Ⅱ'. Returns true if it is, false otherwise.
        Parameters:
        c -
        Returns:
        boolean
      • isOtherNumber

        public static boolean isOtherNumber​(int c)
        Checks whether the specified character is an "other number" such as '①', '⑩', '⑽', or '㈩'. Returns true if it is, false otherwise.
        Parameters:
        c -
        Returns:
        boolean
      • isENKeepPunctuaton

        public static boolean isENKeepPunctuaton​(char c)
        Checks whether the given character is an English keep punctuation character.
        Parameters:
        c -
        Returns:
        boolean
      • isNoTailingPunctuation

        public static boolean isNoTailingPunctuation​(char c)
        Checks whether the given punctuation character is one that needs to be cleared.
        Parameters:
        c -
        Returns:
        boolean
      • isUpperCaseLetter

        public static boolean isUpperCaseLetter​(int u)
      • isLowerCaseLetter

        public static boolean isLowerCaseLetter​(int u)
      • toLowerCase

        public static int toLowerCase​(int u)
      • toUpperCase

        public static int toUpperCase​(int u)
      • isEnLetter

        public static boolean isEnLetter​(int u)
        Checks whether the specified character is an English letter, including both the full-width and half-width forms.
        Parameters:
        u -
        Returns:
        boolean
      • isEnNumeric

        public static boolean isEnNumeric​(int u)
        Checks whether the specified character is an English numeral (code points 48-57), including the full-width forms.
        Parameters:
        u -
        Returns:
        boolean
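
The range check described for isEnNumeric can be illustrated with a minimal sketch. This is not the StringUtil source: the class name EnNumericSketch is hypothetical, and the full-width digit range 65296-65305 (U+FF10 to U+FF19) is taken from Unicode rather than from this page.

```java
// Illustrative sketch only; EnNumericSketch is a hypothetical name, not part of the library.
final class EnNumericSketch {

    // Half-width digits occupy 48-57 (as documented above).
    // Full-width digits occupy 65296-65305 (U+FF10..U+FF19), i.e. 48-57 shifted by 65248.
    static boolean isEnNumeric(int u) {
        return (u >= 48 && u <= 57) || (u >= 65296 && u <= 65305);
    }

    public static void main(String[] args) {
        System.out.println(isEnNumeric('5'));      // half-width digit
        System.out.println(isEnNumeric(0xFF15));   // full-width digit five
        System.out.println(isEnNumeric('a'));      // a letter, not a digit
    }
}
```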
      • getEnCharType

        public static int getEnCharType​(int u)
        Gets the type of the given English character, as one of the constants defined in this class whose names start with EN_ (half-width characters only).
        Parameters:
        u - the character to identify
        Returns:
        int type keywords
      • isHWEnChar

        public static boolean isHWEnChar​(int c)

        Checks whether the given character is a half-width character. The half-width code points are grouped as follows:

        • 32 -> whitespace
        • 33-47 -> punctuation
        • 48-57 -> 0-9
        • 58-64 -> punctuation
        • 65-90 -> A-Z
        • 91-96 -> punctuation
        • 97-122 -> a-z
        • 123-126 -> punctuation
        Parameters:
        c -
        Returns:
        boolean
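
The code-point table listed for isHWEnChar can be written out as a small classifier. This is a sketch under stated assumptions, not the library's implementation: the class HalfWidthRangeSketch and its Kind enum are hypothetical names introduced only for illustration.

```java
// Illustrative sketch only; HalfWidthRangeSketch and Kind are hypothetical names.
final class HalfWidthRangeSketch {

    enum Kind { WHITESPACE, PUNCTUATION, DIGIT, UPPER, LOWER, NOT_HALF_WIDTH }

    // Maps a code point to the group given in the table above.
    static Kind classify(int c) {
        if (c == 32) return Kind.WHITESPACE;
        if (c >= 48 && c <= 57) return Kind.DIGIT;
        if (c >= 65 && c <= 90) return Kind.UPPER;
        if (c >= 97 && c <= 122) return Kind.LOWER;
        // What remains inside 33-126 is 33-47, 58-64, 91-96 and 123-126.
        if (c >= 33 && c <= 126) return Kind.PUNCTUATION;
        return Kind.NOT_HALF_WIDTH;
    }

    // The union of all listed ranges is simply 32-126.
    static boolean isHalfWidth(int c) {
        return c >= 32 && c <= 126;
    }

    public static void main(String[] args) {
        for (int c : new int[] {' ', '!', '7', 'Q', 'q', '~', 0xFF21}) {
            System.out.println(c + " -> " + classify(c));
        }
    }
}
```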
      • isFWEnChar

        public static boolean isFWEnChar​(int c)
        Checks whether the given character is a full-width character. Note: full-width punctuation is not included here.
        Parameters:
        c -
        Returns:
        boolean
      • isEnPunctuation

        public static boolean isEnPunctuation​(int c)
        check if the given char is half-width punctuation
        Parameters:
        c -
        Returns:
        boolean
      • isCnPunctuation

        public static boolean isCnPunctuation​(int c)
      • isPunctuation

        public static boolean isPunctuation​(int c)
        Checks whether the given character is a punctuation character.
      • isWhitespace

        public static boolean isWhitespace​(int c)
        Checks whether the given character is whitespace.
        Parameters:
        c -
        Returns:
        boolean
      • isDigit

        public static boolean isDigit​(String str,
                                      int beginIndex,
                                      int endIndex)
        Checks whether the specified string is made up of digits. Returns true if it is, false otherwise. This method also recognizes full-width characters.
        Parameters:
        str -
        beginIndex -
        endIndex -
        Returns:
        boolean
      • isDigit

        public static boolean isDigit​(String str)
      • isDecimal

        public static boolean isDecimal​(String str,
                                        int beginIndex,
                                        int endIndex)
        Checks whether the specified string is a decimal; full-width characters are also recognized.
        Parameters:
        str -
        beginIndex -
        endIndex -
        Returns:
        boolean
      • isDecimal

        public static boolean isDecimal​(String str)
      • isLatin

        public static boolean isLatin​(String str,
                                      int beginIndex,
                                      int endIndex)
        check if the specified string is all Latin chars
        Parameters:
        str -
        beginIndex -
        endIndex -
        Returns:
        boolean
      • isLatin

        public static boolean isLatin​(String str)
      • isCJK

        public static boolean isCJK​(String str,
                                    int beginIndex,
                                    int endIndex)
        check if the specified string is all CJK chars
        Parameters:
        str -
        beginIndex -
        endIndex -
        Returns:
        boolean
      • isCJK

        public static boolean isCJK​(String str)
      • isLetterOrNumeric

        public static boolean isLetterOrNumeric​(String str,
                                                int beginIndex,
                                                int endIndex)
        Checks whether the specified string consists of Latin numerals or letters.
        Parameters:
        str -
        beginIndex -
        endIndex -
        Returns:
        boolean
      • isLetterOrNumeric

        public static boolean isLetterOrNumeric​(String str)
      • isLetter

        public static boolean isLetter​(String str,
                                       int beginIndex,
                                       int endIndex)
        Checks whether the specified string consists of Latin letters.
        Parameters:
        str -
        beginIndex -
        endIndex -
        Returns:
        boolean
      • isLetter

        public static boolean isLetter​(String str)
      • isNumeric

        public static boolean isNumeric​(String str,
                                        int beginIndex,
                                        int endIndex)
        Checks whether the specified string consists of Latin numerals.
        Parameters:
        str -
        beginIndex -
        endIndex -
        Returns:
        boolean
      • isNumeric

        public static boolean isNumeric​(String str)
      • latinIndexOf

        public static int latinIndexOf​(String str,
                                       int offset)
        get the index of the first Latin char of the specified string
        Parameters:
        str -
        offset -
        Returns:
        integer
      • latinIndexOf

        public static int latinIndexOf​(String str)
      • CJKIndexOf

        public static int CJKIndexOf​(String str,
                                     int offset)
        get the index of the first CJK char of the specified string
        Parameters:
        str -
        offset -
        Returns:
        integer
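
A search of this kind can be sketched with the JDK's Character.UnicodeScript. Everything below is an assumption made for illustration: the class name CjkIndexSketch, the choice of scripts (Han, Hiragana, Katakana, Hangul, which is narrower than the "CJK, Thai..." set that isCJKChar describes), and the return value of -1 when nothing is found, which this page does not specify.

```java
// Illustrative sketch only; CjkIndexSketch is a hypothetical name and the script
// set below only approximates whatever StringUtil.isCJKChar actually accepts.
final class CjkIndexSketch {

    static boolean isCjkLike(int codePoint) {
        Character.UnicodeScript script = Character.UnicodeScript.of(codePoint);
        return script == Character.UnicodeScript.HAN
            || script == Character.UnicodeScript.HIRAGANA
            || script == Character.UnicodeScript.KATAKANA
            || script == Character.UnicodeScript.HANGUL;
    }

    // Scans from offset and returns the index of the first CJK-like character,
    // or -1 if there is none (the -1 convention is an assumption).
    static int firstCjkIndex(String str, int offset) {
        int i = Math.max(offset, 0);
        while (i < str.length()) {
            int cp = str.codePointAt(i);
            if (isCjkLike(cp)) {
                return i;
            }
            i += Character.charCount(cp); // step over surrogate pairs correctly
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(firstCjkIndex("abc\u4E2D\u6587", 0));
        System.out.println(firstCjkIndex("abc", 0));
    }
}
```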
      • CJKIndexOf

        public static int CJKIndexOf​(String str)
      • fwsTohws

        public static String fwsTohws​(String str)
        Replaces the full-width characters in the given string with their half-width equivalents (full-width characters occupy code points 65281-65374).
        Parameters:
        str -
        Returns:
        String the new string after the replacement.
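
The conversion can be sketched as a fixed offset: the full-width range 65281-65374 lines up with the half-width range 33-126, so subtracting 65248 maps one onto the other. This is a minimal sketch, not the library's code; the class name FullToHalfWidthSketch is hypothetical, and leaving every character outside the documented range untouched is an assumption.

```java
// Illustrative sketch only; FullToHalfWidthSketch is a hypothetical name.
final class FullToHalfWidthSketch {

    private static final int OFFSET = 65248; // 65281 - 33

    static String toHalfWidth(String str) {
        StringBuilder sb = new StringBuilder(str.length());
        for (int i = 0; i < str.length(); i++) {
            char ch = str.charAt(i);
            if (ch >= 65281 && ch <= 65374) {
                sb.append((char) (ch - OFFSET)); // shift into 33-126
            } else {
                sb.append(ch); // assumption: anything else is kept as-is
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Full-width "ABC123!" written with escapes to stay encoding-independent.
        System.out.println(toHalfWidth("\uFF21\uFF22\uFF23\uFF11\uFF12\uFF13\uFF01"));
    }
}
```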
      • hwsTofws

        public static String hwsTofws​(String str)
        Replaces the half-width characters in the given string with their full-width equivalents.
        Parameters:
        str -
        Returns:
        String the new string after the replacement.
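
The reverse direction can be sketched by adding the same offset of 65248 to each code point in 33-126. The class name HalfToFullWidthSketch is hypothetical, and this page does not say how hwsTofws treats the space character (32) or characters outside that range; the sketch assumes they are left unchanged.

```java
// Illustrative sketch only; HalfToFullWidthSketch is a hypothetical name.
final class HalfToFullWidthSketch {

    private static final int OFFSET = 65248; // 65281 - 33

    static String toFullWidth(String str) {
        StringBuilder sb = new StringBuilder(str.length());
        for (int i = 0; i < str.length(); i++) {
            char ch = str.charAt(i);
            if (ch >= 33 && ch <= 126) {
                sb.append((char) (ch + OFFSET)); // shift into 65281-65374
            } else {
                sb.append(ch); // assumption: space and everything else is kept as-is
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String wide = toFullWidth("A1!");
        System.out.println(wide.length());
        System.out.println((int) wide.charAt(0));
    }
}
```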
      • isPairPunctuation

        public static boolean isPairPunctuation​(char c)
        Checks whether the given character is a pair punctuation character.
        Parameters:
        c -
        Returns:
        boolean true if it is, false otherwise
      • getPunctuationPair

        public static char getPunctuationPair​(char c)
        Gets the matching counterpart of the given pair punctuation character.
        Parameters:
        c -
        Returns:
        char
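
A pair lookup of this kind might be backed by a small table. The sketch below is purely hypothetical: this page does not list which characters StringUtil treats as pair punctuation, nor what getPunctuationPair returns for other input, so the table contents, the '\0' fallback, and the class name PunctuationPairSketch are all assumptions. It uses Map.of, which requires Java 9 or later.

```java
import java.util.Map;

// Illustrative sketch only; the pairs below are hypothetical examples,
// not the set recognized by StringUtil.
final class PunctuationPairSketch {

    private static final Map<Character, Character> PAIRS = Map.of(
        '(', ')',
        '[', ']',
        '{', '}',
        '<', '>'
    );

    static boolean isPair(char c) {
        return PAIRS.containsKey(c);
    }

    // Returns the closing counterpart, or '\0' when c opens no pair (an assumption).
    static char pairOf(char c) {
        Character match = PAIRS.get(c);
        return match == null ? '\0' : match.charValue();
    }

    public static void main(String[] args) {
        System.out.println(pairOf('('));
        System.out.println(isPair('a'));
    }
}
```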
      • isPunctuation

        public static boolean isPunctuation​(String str,
                                            int beginIndex,
                                            int endIndex)
        Check if the specified string is all punctuation chars (English and Chinese punctuation)
        Parameters:
        str -
        beginIndex -
        endIndex -
        Returns:
        boolean
      • isPunctuation

        public static boolean isPunctuation​(String str)