Package com.ibm.icu.impl
Class PatternTokenizer
java.lang.Object
com.ibm.icu.impl.PatternTokenizer
A simple parsing class for patterns and rules. Handles '...' quotations, \uxxxx and \Uxxxxxxxx escapes, and simple syntax.
The '' (two quotes) is treated as a single quote, inside or outside a quote.
- Any ignorable characters are ignored in parsing.
- Any syntax characters are broken into separate tokens.
- Quote characters can be specified: '...', "...", and \x.
- Other characters are treated as literals.

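The parsing rules above can be illustrated with a small standalone sketch. It is not the ICU implementation: TokenizerSketch, tokenize, and the "SYNTAX:"/"LITERAL:" string format are hypothetical names invented for illustration. Whitespace stands in for the ignorable set, and backslash escapes and broken-quote detection are left out.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified model of the rules above; not the ICU code.
final class TokenizerSketch {
    // Splits a pattern into tokens, each labelled "SYNTAX:" or "LITERAL:".
    static List<String> tokenize(String pattern, String syntaxChars) {
        List<String> tokens = new ArrayList<>();
        StringBuilder literal = new StringBuilder();
        boolean inQuote = false;
        for (int i = 0; i < pattern.length(); i++) {
            char c = pattern.charAt(i);
            if (c == '\'') {
                if (i + 1 < pattern.length() && pattern.charAt(i + 1) == '\'') {
                    literal.append('\''); // two quotes: one literal quote
                    i++;
                } else {
                    inQuote = !inQuote;   // opens or closes a quotation
                }
                continue;
            }
            if (!inQuote && Character.isWhitespace(c)) {
                continue;                 // ignorable outside quotes (assumption: whitespace)
            }
            if (!inQuote && syntaxChars.indexOf(c) >= 0) {
                if (literal.length() > 0) {
                    tokens.add("LITERAL:" + literal);
                    literal.setLength(0);
                }
                tokens.add("SYNTAX:" + c); // each syntax character is its own token
                continue;
            }
            literal.append(c);            // everything else is literal text
        }
        if (literal.length() > 0) {
            tokens.add("LITERAL:" + literal);
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("a = 'b c' ''x;", "=;"));
    }
}
```

In this sketch the quoted space in 'b c' survives as literal text, while the unquoted spaces are dropped and = and ; come out as separate syntax tokens.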
Field Summary
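Fields
Modifier and Type         Field
private static final int  AFTER_QUOTE
static final char         BACK_SLASH
static final int          BROKEN_ESCAPE
static final int          BROKEN_QUOTE
static final int          DONE
private UnicodeSet        escapeCharacters
private UnicodeSet        extraQuotingCharacters
private static final int  HEX
private UnicodeSet        ignorableCharacters
private static int        IN_QUOTE
private int               limit
static final int          LITERAL
private UnicodeSet        needingQuoteCharacters
private static int        NO_QUOTE
private static final int  NONE
private static final int  NORMAL_QUOTE
private String            pattern
static final char         SINGLE_QUOTE
private static final int  SLASH_START
private int               start
private static final int  START_QUOTE
static final int          SYNTAX
private UnicodeSet        syntaxCharacters
static final int          UNKNOWN
private boolean           usingQuote
private boolean           usingSlash
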
Constructor Summary
Constructors
PatternTokenizer()

Method Summary
Modifier and Type, Method, Description
private void appendEscaped(StringBuffer result, int cp)
UnicodeSet getEscapeCharacters()
UnicodeSet getExtraQuotingCharacters()
UnicodeSet getIgnorableCharacters()
int getLimit()
int getStart()
UnicodeSet getSyntaxCharacters()
boolean isUsingQuote()
boolean isUsingSlash()
int next(StringBuffer buffer)
String normalize()
String quoteLiteral(CharSequence string)
String quoteLiteral(String string)
    Quote a literal string, using the available settings.
PatternTokenizer setEscapeCharacters(UnicodeSet escapeCharacters)
    Sets the characters to be escaped in literals, in quoteLiteral and normalize, e.g. new UnicodeSet("[^\\u0020-\\u007E]").
PatternTokenizer setExtraQuotingCharacters(UnicodeSet syntaxCharacters)
    Sets the extra characters to be quoted in literals.
PatternTokenizer setIgnorableCharacters(UnicodeSet ignorableCharacters)
    Sets the characters to be ignored in parsing, e.g. new UnicodeSet("[:pattern_whitespace:]").
PatternTokenizer setLimit(int limit)
PatternTokenizer setPattern(CharSequence pattern)
PatternTokenizer setPattern(String pattern)
PatternTokenizer setStart(int start)
PatternTokenizer setSyntaxCharacters(UnicodeSet syntaxCharacters)
    Sets the characters to be interpreted as syntax characters in parsing, e.g. new UnicodeSet("[:pattern_syntax:]").
PatternTokenizer setUsingQuote(boolean usingQuote)
PatternTokenizer setUsingSlash(boolean usingSlash)

Field Details
ignorableCharacters
private UnicodeSet ignorableCharacters

syntaxCharacters
private UnicodeSet syntaxCharacters

extraQuotingCharacters
private UnicodeSet extraQuotingCharacters

escapeCharacters
private UnicodeSet escapeCharacters

usingSlash
private boolean usingSlash

usingQuote
private boolean usingQuote

needingQuoteCharacters
private UnicodeSet needingQuoteCharacters

start
private int start

limit
private int limit

pattern
private String pattern

SINGLE_QUOTE
public static final char SINGLE_QUOTE

BACK_SLASH
public static final char BACK_SLASH

NO_QUOTE
private static int NO_QUOTE

IN_QUOTE
private static int IN_QUOTE

DONE
public static final int DONE

SYNTAX
public static final int SYNTAX

LITERAL
public static final int LITERAL

BROKEN_QUOTE
public static final int BROKEN_QUOTE

BROKEN_ESCAPE
public static final int BROKEN_ESCAPE

UNKNOWN
public static final int UNKNOWN

AFTER_QUOTE
private static final int AFTER_QUOTE

NONE
private static final int NONE

START_QUOTE
private static final int START_QUOTE

NORMAL_QUOTE
private static final int NORMAL_QUOTE

SLASH_START
private static final int SLASH_START

HEX
private static final int HEX

Constructor Details
PatternTokenizer
public PatternTokenizer()

Method Details
getIgnorableCharacters
public UnicodeSet getIgnorableCharacters()

setIgnorableCharacters
public PatternTokenizer setIgnorableCharacters(UnicodeSet ignorableCharacters)
Sets the characters to be ignored in parsing, e.g. new UnicodeSet("[:pattern_whitespace:]").
Parameters:
ignorableCharacters - Characters to be ignored.
Returns:
A PatternTokenizer object in which the given characters are set as ignorable characters.

getSyntaxCharacters
public UnicodeSet getSyntaxCharacters()

getExtraQuotingCharacters
public UnicodeSet getExtraQuotingCharacters()

setSyntaxCharacters
public PatternTokenizer setSyntaxCharacters(UnicodeSet syntaxCharacters)
Sets the characters to be interpreted as syntax characters in parsing, e.g. new UnicodeSet("[:pattern_syntax:]").
Parameters:
syntaxCharacters - Characters to be set as syntax characters.
Returns:
A PatternTokenizer object in which the given characters are set as syntax characters.

setExtraQuotingCharacters
public PatternTokenizer setExtraQuotingCharacters(UnicodeSet syntaxCharacters)
Sets the extra characters to be quoted in literals.
Parameters:
syntaxCharacters - Characters to be set as extra quoting characters.
Returns:
A PatternTokenizer object in which the given characters are set as extra quoting characters.

getEscapeCharacters
public UnicodeSet getEscapeCharacters()

setEscapeCharacters
public PatternTokenizer setEscapeCharacters(UnicodeSet escapeCharacters)
Sets the characters to be escaped in literals, in quoteLiteral and normalize, e.g. new UnicodeSet("[^\\u0020-\\u007E]").
Parameters:
escapeCharacters - Characters to be set as escape characters.
Returns:
A PatternTokenizer object in which the given characters are set as escape characters.

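As a rough illustration of the \uxxxx and \Uxxxxxxxx escape forms, the sketch below escapes every code point outside the printable ASCII range U+0020 to U+007E, which is the set described by the example "[^\\u0020-\\u007E]". EscapeSketch and escapeNonAscii are hypothetical names. The exact output format of the real class (for instance the case of the hex digits) is an assumption here, not something this page states.

```java
// Hypothetical standalone sketch; it does not call the ICU class.
final class EscapeSketch {
    // Escapes code points outside U+0020..U+007E with a 4-digit or 8-digit hex escape.
    static String escapeNonAscii(String s) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            i += Character.charCount(cp);
            if (cp >= 0x20 && cp <= 0x7E) {
                out.appendCodePoint(cp);                    // printable ASCII stays as-is
            } else if (cp <= 0xFFFF) {
                out.append(String.format("\\u%04X", cp));   // BMP: backslash, u, 4 hex digits
            } else {
                out.append(String.format("\\U%08X", cp));   // supplementary: backslash, U, 8 hex digits
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(escapeNonAscii("caf\u00E9"));
    }
}
```

The two branches correspond to the two escape lengths: four hex digits are enough for code points up to U+FFFF, and eight are used above that.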
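isUsingQuote
public boolean isUsingQuote()

setUsingQuote
public PatternTokenizer setUsingQuote(boolean usingQuote)

isUsingSlash
public boolean isUsingSlash()

setUsingSlash
public PatternTokenizer setUsingSlash(boolean usingSlash)

getLimit
public int getLimit()

setLimit
public PatternTokenizer setLimit(int limit)

getStart
public int getStart()

setStart
public PatternTokenizer setStart(int start)

setPattern
public PatternTokenizer setPattern(CharSequence pattern)

setPattern
public PatternTokenizer setPattern(String pattern)
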
quoteLiteral
public String quoteLiteral(CharSequence string)

quoteLiteral
public String quoteLiteral(String string)
Quote a literal string, using the available settings. Thus syntax characters, quote characters, and ignorable characters will be put into quotes.
Parameters:
string - The literal string to quote.
Returns:
The string with any syntax, quote, or ignorable characters placed in quotes, according to the current settings.

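The quoting behavior described for quoteLiteral can be pictured with a minimal standalone sketch. It assumes single-quote quoting only and takes the characters that need quoting as a plain string. QuoteSketch and quote are hypothetical names, and the real method may group or escape characters differently.

```java
// Hypothetical, simplified model of quoting a literal; not the ICU code.
final class QuoteSketch {
    // Wraps runs of characters found in needsQuote in '...' and doubles single quotes.
    static String quote(String s, String needsQuote) {
        StringBuilder out = new StringBuilder();
        boolean inQuote = false;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '\'') {
                out.append("''");          // a quote is doubled, inside or outside a quotation
                continue;
            }
            boolean special = needsQuote.indexOf(c) >= 0;
            if (special != inQuote) {
                out.append('\'');          // open or close the quotation at a run boundary
                inQuote = special;
            }
            out.append(c);
        }
        if (inQuote) {
            out.append('\'');              // close a quotation left open at the end
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(quote("a = b", "= "));
    }
}
```

Quoting whole runs rather than single characters keeps the output short, and doubling the quote character matches the rule that '' stands for a single quote.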
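appendEscaped
private void appendEscaped(StringBuffer result, int cp)

normalize
public String normalize()

next
public int next(StringBuffer buffer)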