Regex

We need to understand a bit about regular expressions, or "regex" for short. A regex is a largely programming-language-agnostic way of searching for patterns in text.

To get really good at using regex, we'd need a full course on the topic. For now, let's just cover the basics. In Python, we can use the re module to work with regex. It has a findall function that returns a list of all the matches in a string. See the examples below.

Regex for a Phone Number

import re

text = "My phone number is 555-555-5555 and my friend's number is 555-555-5556"
matches = re.findall(r"\d{3}-\d{3}-\d{4}", text)
print(matches) # ['555-555-5555', '555-555-5556']
  • \d matches any digit

  • {3} means "exactly three of the preceding character"

  • - is just a literal - that we want to match
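To see how these pieces behave before they are combined, here is a minimal sketch that tries each one on its own. The sample strings and variable names are made up for illustration.

```python
import re

# \d on its own matches one digit at a time
digits = re.findall(r"\d", "a1b22")
print(digits)  # ['1', '2', '2']

# \d{3} needs exactly three digits in a row for each match
triples = re.findall(r"\d{3}", "12 345 6789")
print(triples)  # ['345', '678']

# The literal - must actually appear in the text, so spaces do not match
no_dashes = re.findall(r"\d{3}-\d{3}-\d{4}", "555 555 5555")
print(no_dashes)  # []
```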

Regex for Text Between Parentheses

import re

text = "I have a (cat) and a (dog)"
matches = re.findall(r"\((.*?)\)", text)
print(matches) # ['cat', 'dog']
  • \( and \) are escaped parentheses that we want to match

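  • ( and ) form a capture group, meaning they group the matched text, allowing us to reference or extract it separately. Because the pattern contains a capture group, findall returns only the text inside the group, not the surrounding parentheses.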

  • .*? matches any number of characters (except for line terminators) between the parentheses, but as few as possible, so each match stops at the first closing parenthesis
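The ? after .* is what keeps the two animals separate. As a quick sketch of the difference, the snippet below runs the same text through a greedy pattern (.*) and the lazy one (.*?) used above. The variable names are just for illustration.

```python
import re

text = "I have a (cat) and a (dog)"

# Greedy: .* grabs as much as it can, so the match runs to the LAST ")"
greedy = re.findall(r"\((.*)\)", text)
print(greedy)  # ['cat) and a (dog']

# Lazy: .*? grabs as little as it can, so each match ends at the NEXT ")"
lazy = re.findall(r"\((.*?)\)", text)
print(lazy)  # ['cat', 'dog']
```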

Regex for Emails and Multiple Capture Groups

  • \w matches any word character (alphanumeric characters and underscores)

  • + means "one or more of the preceding character"

  • @ is just a literal @ symbol that we want to match

  • \. is a literal . that we want to match (The . is a special character in regex, so we escape it with a leading backslash)
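Putting those four pieces together gives a simple email pattern. The sketch below is an assumption about how they might be combined, not a complete email validator: the sample text and addresses are made up, and because \w does not match dots or hyphens, addresses such as first.last@example.com would only be partially matched. The second pattern adds two capture groups, in which case findall returns one tuple per match instead of a single string.

```python
import re

# Made-up sample text and addresses, for illustration only
text = "My email is ada@example.com and my friend's email is grace@example.org"

# No capture groups: findall returns each whole match as a string
emails = re.findall(r"\w+@\w+\.\w+", text)
print(emails)  # ['ada@example.com', 'grace@example.org']

# Two capture groups: findall returns one (username, domain) tuple per match
parts = re.findall(r"(\w+)@(\w+\.\w+)", text)
print(parts)  # [('ada', 'example.com'), ('grace', 'example.org')]
```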

Regex Examples

The findall function returns a list of all the matches in a string.

Testing Regexes

Use regexr.com for interactive regex testing. It breaks down each part of the pattern and explains what it does.
