Class CharacterClass

All Implemented Interfaces:
AbstractState<Term,ASTTransition>, JsonConvertible

public class CharacterClass extends QuantifiableTerm
A Term that matches characters belonging to a specified set of characters.

Corresponds to the right-hand sides PatternCharacter, . and CharacterClass of the goal symbol Atom, and to the right-hand sides CharacterClassEscape and CharacterEscape of the goal symbol AtomEscape, in the ECMAScript RegExp syntax.

Note that CharacterClass nodes and the CodePointSets that they rely on can only match characters from the Basic Multilingual Plane, that is, characters whose code points fit into 16-bit integers. Any term that matches characters outside the Basic Multilingual Plane is expanded by JSRegexParser into a more complex expression that matches the individual code units making up the UTF-16 encoding of those characters.
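
To illustrate what "the individual code units that would make up the UTF-16 encoding" means, the following minimal sketch uses only java.lang.Character to split a code point into its UTF-16 code units. It does not reproduce the expansion that JSRegexParser performs; the class name Utf16Expansion and its helper methods are hypothetical and exist only for this example.

```java
public class Utf16Expansion {

    /** Returns true if the code point lies in the Basic Multilingual Plane (U+0000..U+FFFF). */
    static boolean isBmp(int codePoint) {
        return Character.isBmpCodePoint(codePoint);
    }

    /**
     * Returns the UTF-16 code units that encode the given code point:
     * one unit for BMP characters, a high/low surrogate pair otherwise.
     */
    static char[] codeUnits(int codePoint) {
        return Character.toChars(codePoint);
    }

    public static void main(String[] args) {
        // U+0041 is inside the BMP, U+1F600 is outside of it.
        int[] samples = {0x0041, 0x1F600};
        for (int cp : samples) {
            char[] units = codeUnits(cp);
            StringBuilder sb = new StringBuilder();
            for (char u : units) {
                sb.append(String.format(" 0x%04X", (int) u));
            }
            System.out.printf("U+%04X bmp=%b units:%s%n", cp, isBmp(cp), sb);
        }
    }
}
```

A code point outside the Basic Multilingual Plane always yields two code units, each of which fits into 16 bits on its own, so an expression over individual code units can be matched with terms that are restricted to 16-bit values.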

  • Method Details

    • copy

      public CharacterClass copy(RegexAST ast)
      Description copied from class: RegexASTNode
      Copy this node only, without any child nodes. The ID and minPath of the copied node are left unset.
      Specified by:
      copy in class QuantifiableTerm
      Parameters:
      ast - RegexAST the node should belong to.
      Returns:
      A shallow copy of this node.
    • copyRecursive

      public CharacterClass copyRecursive(RegexAST ast, CompilationBuffer compilationBuffer)
      Description copied from class: RegexASTNode
      Recursively copy this subtree. This method should be used instead of CopyVisitor if the copying process is required to be thread-safe. The ID and minPath of the copied nodes are left unset.
      Specified by:
      copyRecursive in class Term
      Parameters:
      ast - RegexAST the new nodes should belong to.
      Returns:
      A deep copy of this node.
    • getParent

      public Sequence getParent()
      Description copied from class: RegexASTNode
      Gets the syntactic parent of this AST node.
      Overrides:
      getParent in class RegexASTNode
    • getCharSet

      public CodePointSet getCharSet()
      Returns the CodePointSet representing the set of characters that can be matched by this CharacterClass.
    • setCharSet

      public void setCharSet(CodePointSet charSet)
    • wasSingleChar

      public boolean wasSingleChar()
    • setWasSingleChar

      public void setWasSingleChar()
    • setWasSingleChar

      public void setWasSingleChar(boolean value)
    • isUnrollingCandidate

      public boolean isUnrollingCandidate()
      Description copied from class: QuantifiableTerm
      Returns true iff the parser should try to unroll this term's quantifier.
      Specified by:
      isUnrollingCandidate in class QuantifiableTerm
    • addLookBehindEntry

      public void addLookBehindEntry(RegexAST ast, LookBehindAssertion lookBehindEntry)
    • hasLookBehindEntries

      public boolean hasLookBehindEntries()
    • getLookBehindEntries

      public Set<LookBehindAssertion> getLookBehindEntries()
      Returns the (fixed-length) look-behind assertions whose first characters can match the same character as this node. Note that the set contains the Group bodies of the LookBehindAssertion nodes, not the LookBehindAssertion nodes themselves.
    • extractSingleChar

      public void extractSingleChar(AbstractStringBuffer literal, AbstractStringBuffer mask)
    • equalsSemantic

      public boolean equalsSemantic(RegexASTNode obj, boolean ignoreQuantifier)
      Specified by:
      equalsSemantic in class QuantifiableTerm
    • toString

      public String toString()
      Overrides:
      toString in class Object
    • toJson

      public JsonValue toJson()