Class MultiCharacterCaseFolding

java.lang.Object
com.oracle.truffle.regex.tregex.parser.MultiCharacterCaseFolding

public class MultiCharacterCaseFolding extends Object
  • Constructor Details

    • MultiCharacterCaseFolding

      public MultiCharacterCaseFolding()
  • Method Details

    • caseFoldUnfoldString

      public static OracleDBCharClassTrieNode caseFoldUnfoldString(CaseFoldData.CaseFoldAlgorithm algorithm, int[] codepoints, CodePointSet encodingRange, boolean dropAsciiOnStart, boolean transitiveEquivalence, RegexASTBuilder astBuilder, OracleDBCharClassTrieNode root, CompilationBuffer compilationBuffer)
      Appends to the astBuilder a matcher that matches all case variants of the input string.
      Parameters:
      codepoints - the input string as an array of Unicode codepoints
      encodingRange - the range of code points to which the generated variants are restricted
      dropAsciiOnStart - whether to forbid ASCII characters at the start of the variants
      transitiveEquivalence - whether to unconditionally include the case-folded version of every character in the generated expression. If this is set to false, case-folded characters that themselves fold to another character are not included in the expression. Example: suppose a character x case-folds to y, but y folds to z. If transitiveEquivalence is set to false, the generated expression will not include y.
      astBuilder - where to append the matcher
      root - add all matching strings to the given prefix tree
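      The transitiveEquivalence rule can be illustrated with a small standalone sketch. It does not use the TRegex API: the fold table, the class name TransitiveEquivalenceSketch and the helper variants are hypothetical, and the code models only the rule as worded in the parameter description (x folds to y, y folds to z), not the actual implementation.

```java
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

public class TransitiveEquivalenceSketch {

    // Hypothetical one-step fold table, for illustration only: x -> y, y -> z.
    static final Map<Integer, Integer> FOLD = Map.of((int) 'x', (int) 'y', (int) 'y', (int) 'z');

    // One folding step; code points without an entry fold to themselves.
    static int fold(int cp) {
        return FOLD.getOrDefault(cp, cp);
    }

    // Collects cp plus its folded form, applying the documented rule:
    // with transitiveEquivalence == false, a folded form that itself
    // folds to yet another character is left out.
    static Set<Integer> variants(int cp, boolean transitiveEquivalence) {
        Set<Integer> out = new LinkedHashSet<>();
        out.add(cp);
        int folded = fold(cp);
        boolean foldsFurther = fold(folded) != folded;
        if (transitiveEquivalence || !foldsFurther) {
            out.add(folded);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(variants('x', true));   // prints [120, 121], i.e. x and y
        System.out.println(variants('x', false));  // prints [120], y is dropped
    }
}
```

      In this toy model, y is dropped for x when the flag is false because y is not the end of its own folding chain.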
    • caseFold

      public static int[] caseFold(CaseFoldData.CaseFoldAlgorithm algorithm, int codePoint)
    • caseClosure

      public static void caseClosure(CaseFoldData.CaseFoldAlgorithm algorithm, CodePointSetAccumulator charClass, CodePointSetAccumulator tmp, BiPredicate<Integer,Integer> filter, CodePointSet allowedCodePoints, boolean transitiveEquivalence)
      Modifies charClass in place so that it contains its closure under case mapping.
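      As a rough illustration of what a closure under case mapping means, the sketch below expands a set of code points until no new case variants appear. It is an assumption-laden stand-in: it uses the JDK's simple mappings Character.toLowerCase(int) and Character.toUpperCase(int) instead of a CaseFoldData.CaseFoldAlgorithm table, a plain Set<Integer> instead of CodePointSetAccumulator, and it ignores the filter, allowedCodePoints and transitiveEquivalence parameters. The class name CaseClosureSketch is hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Set;
import java.util.TreeSet;

public class CaseClosureSketch {

    // Expands charClass in place until mapping any member to lower or
    // upper case yields no code point that is not already in the set.
    static void caseClosure(Set<Integer> charClass) {
        Deque<Integer> work = new ArrayDeque<>(charClass);
        while (!work.isEmpty()) {
            int cp = work.pop();
            int[] mapped = {Character.toLowerCase(cp), Character.toUpperCase(cp)};
            for (int m : mapped) {
                // add() returns true only for new members, which then
                // need to be expanded in turn.
                if (charClass.add(m)) {
                    work.push(m);
                }
            }
        }
    }

    public static void main(String[] args) {
        Set<Integer> charClass = new TreeSet<>(Set.of((int) 'a', (int) 'K'));
        caseClosure(charClass);
        System.out.println(charClass); // prints [65, 75, 97, 107], i.e. A, K, a, k
    }
}
```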
    • caseClosureMultiCodePoint

      public static List<org.graalvm.collections.Pair<Integer,int[]>> caseClosureMultiCodePoint(CaseFoldData.CaseFoldAlgorithm algorithm, CodePointSetAccumulator charClass)
      Finds any characters in charClass that have multi-codepoint expansions.
      Returns:
      a list of pairs, with the first element being the expanded codepoint and the second element the expansion
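      The shape of the result can be sketched without the TRegex classes. The example below is only an analogy: it uses the JDK's full uppercase mapping (String.toUpperCase with Locale.ROOT) as a stand-in for the multi-codepoint data of a CaseFoldData.CaseFoldAlgorithm, an int[] instead of CodePointSetAccumulator, and Map.Entry instead of org.graalvm.collections.Pair. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.Map;

public class MultiCodePointSketch {

    // Returns (code point, expansion) pairs for every member of charClass
    // whose mapped form is longer than one code point.
    static List<Map.Entry<Integer, int[]>> multiCodePointExpansions(int[] charClass) {
        List<Map.Entry<Integer, int[]>> result = new ArrayList<>();
        for (int cp : charClass) {
            String upper = new String(Character.toChars(cp)).toUpperCase(Locale.ROOT);
            int[] expansion = upper.codePoints().toArray();
            if (expansion.length > 1) {
                result.add(Map.entry(cp, expansion));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // 'a' maps to a single code point; U+00DF (sharp s) maps to "SS".
        int[] charClass = {'a', 0x00DF};
        for (Map.Entry<Integer, int[]> e : multiCodePointExpansions(charClass)) {
            // prints: U+00DF -> [83, 83]
            System.out.printf("U+%04X -> %s%n", e.getKey(), Arrays.toString(e.getValue()));
        }
    }
}
```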
    • equalsIgnoreCase

      public static boolean equalsIgnoreCase(CaseFoldData.CaseFoldAlgorithm algorithm, int codePointA, int codePointB)