String Manipulation
String Manipulation: Introduction to stringr and regular expressions
A primer on why regex is useful
Case Manipulation
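A minimal sketch of case manipulation with stringr. The example sentence is an assumption made for illustration and is not taken from the original material.

```r
library(stringr)

# Hypothetical example sentence (an assumption, not from the original)
sentence <- "The quick brown fox jumps over the lazy dog"

upper <- str_to_upper(sentence)  # every letter in upper case
lower <- str_to_lower(sentence)  # every letter in lower case
title <- str_to_title(sentence)  # first letter of each word capitalised

print(upper)
print(lower)
print(title)
```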
It can be useful to measure how long a sentence is in terms of the number of letters in each word.
Once the sentence has been split into a character vector of words, the str_length
function is vectorized over it and returns one length per word.
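As a rough illustration, the sketch below splits a hypothetical sentence (an assumption, not the original data) into words and measures each one with str_length.

```r
library(stringr)

# Hypothetical example sentence (an assumption, not from the original)
sentence <- "The quick brown fox jumps over the lazy dog"

# str_split returns a list with one element per input string,
# so [[1]] pulls out the character vector of words
words <- str_split(sentence, " ")[[1]]

# str_length is vectorized: one length per word
word_lengths <- str_length(words)
print(word_lengths)  # 3 5 5 3 5 4 3 4 3
```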
Regular Expression Glossary:
Looking for numbers
\\d and [0-9]
Example:
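A minimal sketch, using hypothetical strings, of how the two digit patterns might be used with str_detect and str_extract.

```r
library(stringr)

# Hypothetical input (an assumption for illustration)
x <- c("room 101", "no digits here", "call 555-0199")

# Does each string contain at least one digit?
has_digit <- str_detect(x, "\\d")
print(has_digit)      # TRUE FALSE TRUE

# First run of digits in each string; NA where there is none
first_number <- str_extract(x, "[0-9]+")
print(first_number)   # "101" NA "555"
```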
Looking for word boundaries (the position between a word character and a non-word character)
\\b
Example:
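A small sketch, on a hypothetical string, of how \\b restricts a match to whole words.

```r
library(stringr)

# Hypothetical input (an assumption for illustration)
x <- "the cat concatenated the catalog"

# Without boundaries, "cat" is also found inside longer words
loose <- str_count(x, "cat")
print(loose)   # 3

# With boundaries on both sides, only the standalone word matches
strict <- str_count(x, "\\bcat\\b")
print(strict)  # 1
```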
Looking for word characters
\\w
Example:
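A brief sketch on a hypothetical string. Word characters are letters, digits and the underscore, so punctuation and spaces separate the matches.

```r
library(stringr)

# Hypothetical input (an assumption for illustration)
x <- "Hello, big_world 42!"

# Each run of word characters becomes one token
tokens <- str_extract_all(x, "\\w+")[[1]]
print(tokens)  # "Hello" "big_world" "42"
```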
Look for characters in the range of a-z (case-sensitive)
[a-z]
Example:
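A short sketch on a hypothetical string, showing that [a-z] picks up lower-case letters only.

```r
library(stringr)

# Hypothetical input (an assumption for illustration)
x <- "Hello World"

# Upper-case H and W are skipped because the range is case-sensitive
lower_letters <- str_extract_all(x, "[a-z]")[[1]]
print(lower_letters)  # "e" "l" "l" "o" "o" "r" "l" "d"
```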
Look for characters in the range of A-Z (case-sensitive)
[A-Z]
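Look for characters in the range of A-Z or a-z (letters of either case). Write the two ranges separately, because a single range such as A-z also matches the punctuation characters that sit between Z and a
[A-Za-z]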
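The sketch below, on a hypothetical string, contrasts the upper-case range with a class that combines both ranges. It also shows why a single range running from A to z is best avoided: the underscore lies between Z and a in code-point order.

```r
library(stringr)

# Hypothetical input (an assumption for illustration)
x <- "Hello World"

upper_letters <- str_extract_all(x, "[A-Z]")[[1]]
print(upper_letters)  # "H" "W"

# Two ranges in one class match letters of either case
letter_runs <- str_extract_all(x, "[A-Za-z]+")[[1]]
print(letter_runs)    # "Hello" "World"

# Pitfall: the range A-z also covers [ \ ] ^ _ and the backtick
underscore_wide  <- str_detect("_", "[A-z]")
underscore_exact <- str_detect("_", "[A-Za-z]")
print(c(underscore_wide, underscore_exact))  # TRUE FALSE
```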
Match your pattern exactly n times
{n}
Match your pattern at least n times
{n,}
Match your pattern between n and k times
{n,k}
Example:
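A minimal sketch of the three quantifier forms on a hypothetical string of five letters. Quantifiers are greedy by default, so each takes as many repetitions as its upper limit allows.

```r
library(stringr)

# Hypothetical input (an assumption for illustration)
x <- "aaaaa"

exactly_two  <- str_extract(x, "a{2}")    # exactly 2
two_or_more  <- str_extract(x, "a{2,}")   # at least 2, no upper limit
two_to_three <- str_extract(x, "a{2,3}")  # between 2 and 3

print(exactly_two)   # "aa"
print(two_or_more)   # "aaaaa"
print(two_to_three)  # "aaa"
```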
Match one or more times (keep matching until a character that does not fit the pattern is encountered)
+
Match any single character except a line break. Combined with a quantifier such as + or *, it is useful when you don’t know which characters, or how many, appear in the pattern
.
Match zero or more times
*
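The sketch below compares +, * and . on a few hypothetical strings, to make the difference between "one or more", "zero or more" and "any single character" concrete.

```r
library(stringr)

# Hypothetical input (an assumption for illustration)
x <- c("ct", "cat", "caat", "cut")

# "a+" needs at least one "a" between "c" and "t"
one_or_more <- str_detect(x, "ca+t")
print(one_or_more)   # FALSE TRUE TRUE FALSE

# "a*" also accepts no "a" at all
zero_or_more <- str_detect(x, "ca*t")
print(zero_or_more)  # TRUE TRUE TRUE FALSE

# "." stands for exactly one character of any kind
any_single <- str_detect(x, "c.t")
print(any_single)    # FALSE TRUE FALSE TRUE
```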
Match start of a string
^
Example:
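A minimal sketch of the start anchor on hypothetical strings. One frequent source of confusion is that ^ only anchors the start of a string when it appears outside square brackets; inside brackets it negates the character class instead.

```r
library(stringr)

# Hypothetical input (an assumption for illustration)
x <- c("apple pie", "green apple")

# Outside brackets: the string must begin with "apple"
starts_with_apple <- str_detect(x, "^apple")
print(starts_with_apple)  # TRUE FALSE

# Inside brackets: "[^a]" means any character that is NOT "a",
# so this extracts the first run of non-"a" characters
not_a <- str_extract(x, "[^a]+")
print(not_a)              # "pple pie" "green "
```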
Match end of a string
$
Example:
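A short sketch of the end anchor, using hypothetical file names. The literal dot is escaped so that it is not read as "any character".

```r
library(stringr)

# Hypothetical input (an assumption for illustration)
files <- c("report.csv", "csv_notes.txt")

# Only strings that end in ".csv" match
is_csv <- str_detect(files, "\\.csv$")
print(is_csv)  # TRUE FALSE
```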
Regular Expressions
Replace a word with something else, then collapse the words back into a single sentence
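One way this could look, assuming a hypothetical sentence that has been split into words: replace the matching word with str_replace, then rejoin with str_c.

```r
library(stringr)

# Hypothetical example sentence (an assumption, not from the original)
sentence <- "The quick brown fox jumps over the lazy dog"
words <- str_split(sentence, " ")[[1]]

# Replace the word "fox"; anchors keep longer words untouched
words <- str_replace(words, "^fox$", "cat")

# Collapse the vector of words back into one string
new_sentence <- str_c(words, collapse = " ")
print(new_sentence)  # "The quick brown cat jumps over the lazy dog"
```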
Replace any three letter word with “cake”
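A possible sketch on a hypothetical sentence: word boundaries around \\w{3} restrict the match to words of exactly three characters, and str_replace_all replaces every occurrence.

```r
library(stringr)

# Hypothetical example sentence (an assumption, not from the original)
sentence <- "The quick brown fox jumps over the lazy dog"

# \\b on both sides stops the pattern matching inside longer words
caked <- str_replace_all(sentence, "\\b\\w{3}\\b", "cake")
print(caked)  # "cake quick brown cake jumps over cake lazy cake"
```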
Remove a pattern
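A minimal sketch on a hypothetical sentence: str_remove drops the first match only, while str_remove_all drops every match.

```r
library(stringr)

# Hypothetical example sentence (an assumption, not from the original)
sentence <- "The quick brown fox jumps over the lazy dog"

# Remove the first occurrence of a literal pattern
no_quick <- str_remove(sentence, "quick ")
print(no_quick)   # "The brown fox jumps over the lazy dog"

# Remove every lower-case vowel
no_vowels <- str_remove_all(sentence, "[aeiou]")
print(no_vowels)  # "Th qck brwn fx jmps vr th lzy dg"
```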
Detect
Does anything in your string match this pattern?
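A brief sketch on a hypothetical vector of words: str_detect returns one TRUE or FALSE per element, and any() reduces that to a single answer.

```r
library(stringr)

# Hypothetical words (an assumption for illustration)
words <- c("The", "quick", "brown", "fox", "jumps",
           "over", "the", "lazy", "dog")

# One logical value per word: does it contain the letter "o"?
has_o <- str_detect(words, "o")
print(has_o)  # FALSE FALSE TRUE TRUE FALSE TRUE FALSE FALSE TRUE

# Does anything in the vector match?
any_match <- any(has_o)
print(any_match)  # TRUE
```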
Using stringi to generate passwords
stri_rand_strings accepts the following arguments:
n: The number of strings you want to make
length: The length of each string you want
pattern: The character class that every generated character must match (for example [A-Za-z0-9])
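A minimal sketch of generating random strings with stri_rand_strings. The count, length and character class below are arbitrary choices for illustration, and the seed is set only to make the run repeatable.

```r
library(stringi)

# Seed chosen arbitrarily so the run can be repeated
set.seed(42)

# Five random strings, each 12 characters long,
# drawn from letters and digits
passwords <- stri_rand_strings(n = 5, length = 12, pattern = "[A-Za-z0-9]")
print(passwords)
```

Strings produced this way are fine for demonstrations; whether they are suitable as real passwords depends on the random number generator in use, which this sketch does not address.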
String Manipulation: Use Cases - String Extraction
Imagine that you were given the following dataset:
Your task is to extract just the numbers. You could do it one of two ways:
Both methods return the same values, but the second needs a shorter regular expression to do it
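The original dataset and the two solutions are not reproduced here, so the sketch below is only an assumption of what such a comparison might look like: a hypothetical vector of order strings, one pattern that spells out each digit, and one that uses the \\d shorthand with a quantifier.

```r
library(stringr)

# Hypothetical dataset (an assumption; the original data are not shown)
orders <- c("order 123 shipped", "order 456 pending", "order 789 delivered")

# Way 1: spell out every digit with its own character class
method_one <- str_extract(orders, "[0-9][0-9][0-9]")

# Way 2: shorthand class plus a quantifier
method_two <- str_extract(orders, "\\d+")

print(method_one)  # "123" "456" "789"
print(method_two)  # "123" "456" "789"

same_result <- identical(method_one, method_two)
print(same_result)  # TRUE
```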