This is a quick reference guide to using regular expressions in R.
Metacharacter | Name | Matches |
---|---|---|
. | dot | any one character |
[...] | character class | any character listed |
[^...] | negated character class | any character not listed |
^ | caret | the position at the start of the line |
$ | dollar | the position at the end of the line |
\< | backslash less-than | the position at the start of a word |
\> | backslash greater-than | the position at the end of a word |
\| | or, bar | matches either expression it separates |
(...) | parentheses | used to limit the scope of \|, plus additional uses |
\\s | whitespace | any one whitespace character (\\s is how it is written inside an R string); \\s+ matches one or more, e.g. to remove them with gsub() |
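
A minimal sketch of the dot, character class, and negated class from the table above, using base R's `grepl()` and `gsub()`. The sample strings and variable names are made up for illustration.

```r
# Made-up sample strings
x <- c("cat", "cot", "cut", "ct", "c9t")

# . matches any one character between "c" and "t"
dot_hits <- grepl("c.t", x)          # TRUE TRUE TRUE FALSE TRUE

# [...] matches only the characters listed
class_hits <- grepl("c[ao]t", x)     # TRUE TRUE FALSE FALSE FALSE

# [^...] matches any one character that is not listed
negated_hits <- grepl("c[^ao]t", x)  # FALSE FALSE TRUE FALSE TRUE

# \\s+ only matches a run of whitespace; gsub() does the replacing
squeezed <- gsub("\\s+", " ", "too    many   spaces")  # "too many spaces"
```

Note that `"ct"` never matches here: the dot and both classes each require exactly one character.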

Metacharacter | Minimum Required | Maximum to Try | Meaning |
---|---|---|---|
? | none | 1 | one allowed; none required ("one optional") |
* | none | no limit | unlimited allowed; none required ("any amount OK") |
+ | 1 | no limit | unlimited allowed; one required ("at least one") |
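
The three quantifiers above can be compared side by side. This is a small sketch with made-up spellings; each quantifier applies to the `u` just before it, and `^`/`$` force the whole string to match.

```r
# Made-up sample strings
x <- c("color", "colour", "colouur", "colr")

# ? : zero or one "u"
optional_u <- grepl("^colou?r$", x)   # TRUE TRUE FALSE FALSE

# * : zero or more "u"
any_u <- grepl("^colou*r$", x)        # TRUE TRUE TRUE FALSE

# + : one or more "u"
at_least_one_u <- grepl("^colou+r$", x)  # FALSE TRUE TRUE FALSE
```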

Metacharacter | Name | Matches |
---|---|---|
^ | caret | Matches the position at the start of the line |
$ | dollar | Matches the position at the end of the line |
\< | word boundary: beginning of word | Matches the position at the start of a word |
\> | word boundary: end of word | Matches the position at the end of a word |
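
A short sketch of the anchors above with R's default regex engine. The sample strings are made up. Inside an R string each backslash is doubled, so `\<` is written `"\\<"`.

```r
# Made-up sample strings
x <- c("cat", "concat", "cat food", "the cat sat", "category")

# ^ : "cat" at the start of the string
starts_with <- grepl("^cat", x)       # TRUE FALSE TRUE FALSE TRUE

# $ : "cat" at the end of the string
ends_with <- grepl("cat$", x)         # TRUE TRUE FALSE FALSE FALSE

# \< and \> : "cat" as a whole word
whole_word <- grepl("\\<cat\\>", x)   # TRUE FALSE TRUE TRUE FALSE
```

With `perl = TRUE`, the usual spelling for a word boundary is `\\b` rather than `\\<` and `\\>`.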

Metacharacter | Name | Matches |
---|---|---|
\| | alternation (bar): e.g. either or | Matches either expression it separates |
(...) | parentheses | Limits scope of alternation, provides grouping for the quantifiers, and "captures" for backreferences |
\1,\2,... | backreference | Matches text previously matched within the first, second, etc., set of parentheses |
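
A sketch of alternation, grouping, and backreferences as listed above. The inputs are made up. The doubled-word check uses `perl = TRUE` and `\\b` as one possible choice; the replacement example uses the default engine.

```r
# Alternation limited by parentheses: "gray" or "grey"
either <- grepl("^gr(a|e)y$", c("gray", "grey", "griy"))  # TRUE TRUE FALSE

# Backreference inside the pattern: \\1 must repeat what group 1 matched
doubled <- grepl("\\b(\\w+) \\1\\b", c("the the cat", "the cat"), perl = TRUE)
# TRUE FALSE

# Backreferences in the replacement: swap the two captured words
swapped <- sub("(\\w+) (\\w+)", "\\2 \\1", "hello world")  # "world hello"
```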

Metacharacter | Description | Meaning |
---|---|---|
\t | tab | a tab character |
\n | newline | a newline character |
\r | carriage-return | a carriage-return character |
\s | whitespace | matches any "whitespace" character (space, tab, newline, formfeed, and such) |
\S | anything not \s | matches any character that is not whitespace |
\w | [a-zA-Z0-9_] | useful as in \w+ to ostensibly match a word |
\W | anything not [a-zA-Z0-9_] | anything that is not a letter, digit, or underscore |
\d | [0-9] | i.e., a digit |
\D | anything not \d | i.e., [^0-9] |
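
A brief sketch of the shorthand classes above, pulling matches out with `gregexpr()` and `regmatches()`. The input string is made up. In the R string, `"\t"` and `"\n"` are a literal tab and newline, while regex escapes such as `\d` need a doubled backslash (`"\\d"`).

```r
# Made-up input containing a space, a tab, and a trailing newline
x <- "Order 66:\tshipped on 2024-01-15\n"

# \d+ : every run of digits
digit_runs <- regmatches(x, gregexpr("\\d+", x))[[1]]
# "66" "2024" "01" "15"

# \w+ : every run of word characters
word_runs <- regmatches(x, gregexpr("\\w+", x))[[1]]
# "Order" "66" "shipped" "on" "2024" "01" "15"

# \s : delete every whitespace character
no_whitespace <- gsub("\\s", "", x)   # "Order66:shippedon2024-01-15"

# \D : delete everything that is not a digit
digits_only <- gsub("\\D", "", x)     # "6620240115"
```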

Type | Regex | Successful if the enclosed subexpression... |
---|---|---|
Positive Lookbehind | (?<=.....) | can match to the left |
Negative Lookbehind | (?<!.....) | cannot match to the left |
Positive Lookahead | (?=.....) | can match to the right |
Negative Lookahead | (?!.....) | cannot match to the right |
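
A minimal sketch of the four lookarounds. In R they need `perl = TRUE`, since the default engine does not support them. The sample strings are made up.

```r
# Made-up sample strings
x <- c("USD100", "EUR100", "100USD", "100EUR")

# Positive lookbehind: "100" with "USD" immediately to its left
pos_behind <- grepl("(?<=USD)100", x, perl = TRUE)  # TRUE FALSE FALSE FALSE

# Negative lookbehind: "100" without "USD" immediately to its left
neg_behind <- grepl("(?<!USD)100", x, perl = TRUE)  # FALSE TRUE TRUE TRUE

# Positive lookahead: "100" with "USD" immediately to its right
pos_ahead <- grepl("100(?=USD)", x, perl = TRUE)    # FALSE FALSE TRUE FALSE

# Negative lookahead: "100" without "USD" immediately to its right
neg_ahead <- grepl("100(?!USD)", x, perl = TRUE)    # TRUE TRUE FALSE TRUE

# Lookarounds match a position, not text, so "$" is left out of the result
price <- "price: $100"
amount <- regmatches(price, regexpr("(?<=\\$)\\d+", price, perl = TRUE))  # "100"
```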