9.1.1 Regexp Patterns
The portion of an rx or rx_in form within '…' is a pattern that is written with regexp pattern operators. Some pattern operators overlap with expression operators, but they have different meanings and precedence in a pattern. For example, the pattern operator * creates a repetition pattern, instead of multiplying like the expression * operator.
RXMatch("hello", [], {})
#false
RXMatch(Bytes.copy(#"a"), [], {})
RXMatch("hello world", [], {})
RXMatch("hello world", [], {})
> rx'"hello"
++ " "
RXMatch("hello world", [], {})
RXMatch("a", [], {})
RXMatch("b", [], {})
#false
#false
RXMatch("ac", [], {})
See Regexp Character Sets for character set forms that can be used in charset.
RXMatch("m", [], {})
#false
RXMatch("abc", [], {})
RXMatch("", [], {})
By default, the match uses ~greedy mode, where a larger number of matches is tried first.
RXMatch("abc", ["abc", ""], {#'head: 1, #'tail: 2})
RXMatch("abc", ["", "abc"], {#'head: 1, #'tail: 2})
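The two results above, where head and tail trade the whole input, are consistent with how greedy and nongreedy repetition divide input between adjacent groups. As a rough analogy only, here is the same effect in Python's re module rather than Rhombus rx syntax; the group names head and tail are taken from the results above, and the patterns themselves are assumptions.

```python
import re

# Two adjacent "match anything" groups compete for the same input.
# Greedy: the first group tries the largest number of characters first.
greedy = re.fullmatch(r"(?P<head>.*)(?P<tail>.*)", "abc")
# Nongreedy: the first group tries the smallest number of characters first.
nongreedy = re.fullmatch(r"(?P<head>.*?)(?P<tail>.*)", "abc")

greedy_split = (greedy.group("head"), greedy.group("tail"))
nongreedy_split = (nongreedy.group("head"), nongreedy.group("tail"))

print(greedy_split)     # ('abc', '')
print(nongreedy_split)  # ('', 'abc')
```

Both patterns match the whole input; only the division between the groups changes.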
RXMatch("abcz", [], {})
#false
RXMatch("abc", [], {})
#false
RXMatch("a", [], {})
RXMatch("", [], {})
#false
RXMatch("aa", [], {})
RXMatch("aaa", [], {})
#false
RXMatch("a", [], {})
#false
> rx'enable_newline: .'.match("\n")
RXMatch("\n", [], {})
RXMatch("abc", [], {})
A regexp created with rx (as opposed to rx_in) is implicitly prefixed with bof for use with methods like Regexp.match (as opposed to Regexp.match_in).
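The difference between a match anchored at the beginning of input and a match found anywhere in the input can be sketched with Python's re module. This is an analogy, not the Rhombus API: re.match stands in for the implicit bof prefix, re.search stands in for a match_in-style search, and only the start-anchoring aspect is illustrated.

```python
import re

pattern = re.compile(r"a")

# Anchored: the match must begin at position 0 of the input.
anchored = pattern.match("ba")
# Unanchored: the match may begin at any position.
searched = pattern.search("ba")

print(anchored)          # None
print(searched.start())  # 1
```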
RXMatch("a", [], {})
#false
RXMatch("a", [], {})
#false
> rx'enable_newline: ^ "a"'.match_in("x\na")
#false
When not followed by anything, $ matches the end of input or the position before a newline, analogous to the way that ^ matches the start of input or the position after a newline. The eof operator matches the end of input.
RXMatch("a", [], {})
#false
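The distinction between an end-of-line position and the end of input can be illustrated with Python's re module, used here only as an analogy. The assumption is that re.MULTILINE gives "$" the line-sensitive behavior described above, and that "\Z" plays the role of eof.

```python
import re

text = "a\nb"

# With MULTILINE, "$" also matches just before a newline.
line_end = re.search(r"a$", text, re.MULTILINE)
# "\Z" matches only at the very end of the input, so "a" does not qualify.
input_end = re.search(r"a\Z", text)
# The last character does sit at the end of the input.
last_line = re.search(r"b\Z", text)

print(line_end is not None)   # True
print(input_end is not None)  # False
print(last_line is not None)  # True
```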
When followed by an identifier and a : for a block containing pat, $ creates a capture group. The portion of input that is matched against pat is recorded and associated with the name identifier. If the enclosing pattern uses pat zero or multiple times, then identifier is associated with #false if the pattern is used zero times, or with the latest match if it is used multiple times.
RXMatch("ab", ["b"], {#'m: 1})
"b"
RXMatch("a", [#false], {#'m: 1})
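The zero-times and multiple-times cases can be seen in a small sketch using Python's re module as an analogy. Python reports an unused group as None where the results above show #false; the group name m follows the results above, and the patterns are assumptions made for illustration.

```python
import re

# The group is inside a repetition and runs twice: the latest match wins.
many = re.fullmatch(r"(?P<m>[ab])*", "ab")
# The group is optional and runs zero times: no match is recorded.
zero = re.fullmatch(r"a(?P<m>b)?", "a")

print(many.group("m"))  # b
print(zero.group("m"))  # None
```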
When followed by an identifier and no subsequent block, then $ is either a backreference to a named capture group, or it is a splice of a regexp that is bound to identifier.
The use of $ forms a backreference if identifier is associated with a capture group anywhere in the enclosing pattern; the backreference matches input that is the same as the most recent match for the capture group (and never matches if the capture group does not yet have a match).
RXMatch("abb", ["b"], {#'m: 1})
#false
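A named backreference behaves much like the following sketch in Python's re module, which is offered as an analogy rather than Rhombus syntax. The group name m follows the results above; the pattern is an assumption.

```python
import re

# (?P<m>...) captures one letter; (?P=m) must then see the same text again.
pat = re.compile(r"a(?P<m>[a-z])(?P=m)")

doubled = pat.fullmatch("abb")   # "b" is captured, then repeated
mismatch = pat.fullmatch("abc")  # "c" differs from the captured "b"

print(doubled.group("m"))  # b
print(mismatch)            # None
```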
When $ forms a splice, then a regular expression is formed dynamically by merging the referenced regexp into the enclosing pattern. (A limitation: both the merged regexp and enclosing pattern must be free of backreferences, because backreferences need to be converted from names to absolute positions eagerly.)
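The idea of building one regexp by merging another into it can be sketched in Python, where the closest equivalent is reusing a compiled pattern's source text. This is only an analogy under that assumption; the names digits and version are hypothetical.

```python
import re

digits = re.compile(r"[0-9]+")

# Merge the existing pattern's source into a larger pattern. The
# non-capturing group (?:...) keeps each spliced part a single unit.
version = re.compile(r"v(?:%s)\.(?:%s)" % (digits.pattern, digits.pattern))

complete = version.fullmatch("v12.345")
incomplete = version.fullmatch("v12.")

print(complete is not None)    # True
print(incomplete is not None)  # False
```

Neither pattern in this sketch contains a backreference, which mirrors the limitation noted above.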
When followed by a literal integer, then $ forms a backreference that refers to a capture group by index instead of by name. Capture groups are numbered from 1, since 0 is reserved to refer to the entire match.
RXMatch("abb", ["b"], {#'m: 1})
#false
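Index-based backreferences follow the same numbering convention in Python's re module, which makes for a convenient analogy: group 0 is the entire match and capture groups count from 1. The pattern below is an assumption for illustration.

```python
import re

# \1 refers to the first capture group by position instead of by name.
pat = re.compile(r"a([a-z])\1")

hit = pat.fullmatch("abb")
miss = pat.fullmatch("abc")

print(hit.group(0))  # abb  (index 0 is the whole match)
print(hit.group(1))  # b    (index 1 is the first capture group)
print(miss)          # None
```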
When followed by an expression other than an identifier or literal integer, then $ always forms a splice.
"b"
RXMatch("abb", ["b"], {})
RXMatch("na", [], {})
RXMatch("na", [], {})
RXMatch("ap", [], {})
RXMatch("ap", [], {})
> rx'any+ ~nongreedy word_boundary'.match_in("cat nap")
RXMatch("cat", [], {})
> rx'any+ ~nongreedy word_continue'.match_in("cat nap")
RXMatch("c", [], {})
RXMatch("xxxs", ["x"], {#'x: 1})
RXMatch(".", [#false], {#'x: 1})
In the case of an rx_in pattern or use of RX.match_in, cut applies only to a match attempt at a given input position. It does not prevent trying the match at a later position.
#false
RXMatch("ax", [], {})
RXMatch("a", [], {})
RXMatch(Bytes.copy(#"a"), [], {})
#false
RXMatch(Bytes.copy(#"\200"), [], {})
#false
> rx'case_insensitive: "hello"'.match("HELLO")
RXMatch("HELLO", [], {})
enable_newline allows any to match newlines and causes ^ and $ to match only at the beginning and end of the input.
disable_newline prevents any from matching newlines and allows ^ and $ to match just after and just before newlines, respectively. This is the default mode.
#false
> rx'enable_newline: "x" any "y"'.match("x\ny")
RXMatch("x\ny", [], {})
RXMatch("x", [], {})
> rx'enable_newline: ^ "x" $'.match_in("a\nx\nz")
#false
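The two newline modes correspond loosely to flag combinations in Python's re module, which can serve as an analogy. The assumption is that re.MULTILINE without re.DOTALL approximates disable_newline, and re.DOTALL without re.MULTILINE approximates enable_newline.

```python
import re

text = "x\ny"

# Analogue of disable_newline (the default mode described above):
# "." stops at newlines, while "^" and "$" are line-sensitive.
default_any = re.fullmatch(r"x.y", text, re.MULTILINE)
default_line = re.search(r"^y$", text, re.MULTILINE)

# Analogue of enable_newline: "." crosses newlines, and "^" matches
# only at the beginning of the input.
enabled_any = re.fullmatch(r"x.y", text, re.DOTALL)
enabled_line = re.search(r"^y", text, re.DOTALL)

print(default_any is not None)   # False
print(default_line is not None)  # True
print(enabled_any is not None)   # True
print(enabled_line is not None)  # False
```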
RXMatch("m", [], {})
#false