REGEX Functions in Excel

If you work with messy data in Excel — mixed-up phone numbers, email addresses buried inside long text, inconsistent date formats — you already know how painful it is to clean that stuff up.

Until now, you had two options. Either build a monster formula using LEFT, MID, RIGHT, FIND, and SUBSTITUTE all nested together. Or write a VBA macro. Both work, but neither is fun.

Now, Excel 365 has three built-in REGEX functions — REGEXTEST, REGEXEXTRACT, and REGEXREPLACE — and they change everything. You write one pattern, and Excel does the rest. No VBA. No complicated nesting. Just a clean, readable formula.

In this guide, I want to walk you through all three functions from the ground up. We'll cover the syntax, real examples, and I'll also show you how to use REGEX inside XLOOKUP and XMATCH. By the end, you'll have a full pattern library you can copy and use right away.

Excel Version: REGEXTEST, REGEXEXTRACT, and REGEXREPLACE are available in Microsoft 365 (Windows, Mac, and Web). They are now generally available — no Insider or Beta channel needed. Just make sure your Excel is fully updated. These functions use the PCRE2 (Perl Compatible Regular Expressions) flavor of regex, which is well documented and widely used. You can test your patterns at regex101.com — just select the PCRE2 flavor.

What Are REGEX Patterns?

REGEX stands for Regular Expressions. It's a way to describe a text pattern using a special set of characters — so you can search, match, extract, or replace text based on what it looks like rather than where it is.

Regex isn't something Microsoft invented. It's been around since the 1950s, and developers, data analysts, and programmers have used it for decades in languages like Python, JavaScript, SQL, and tools like Google Sheets. It's one of the most widely used text-processing tools in the world.

In late 2024, Microsoft finally brought regex into Excel with three new functions. And that changed everything for people who work with messy text data in spreadsheets.

In Plain English A regex pattern is a set of instructions that says: "Find text that looks like this." You describe the shape of the data — how many digits, what characters, in what order — and the regex engine finds everything that matches.

Pattern \d{3}-\d{3}-\d{4} Three digits, a dash, three digits, a dash, four digits. → Matches: 415-555-1234

Pattern [\w\.]+@[\w\.]+\.\w{2,} Word characters and dots, then @, then domain, then extension. → Matches: [email protected]

Pattern [^0-9] Any character that is NOT a digit. → Matches: letters, dashes, spaces, symbols

Each pattern is built from small pieces — character classes, quantifiers, and anchors. You'll learn all of these in the cheat sheet section below. For now, just understand the concept: you describe what you're looking for, and regex finds it.

A Simple Way to Think About It

📍 Traditional Excel Functions Work by position. "Get characters 5 through 10." "Find the 2nd dash." They need to know where the data is.

🔍 REGEX Patterns Work by shape. "Find anything that looks like a phone number." They don't care where it is — just what it looks like.

That's why regex is so powerful for messy data. When your phone numbers are in different positions in every cell, or your dates are in three different formats, traditional functions struggle. Regex doesn't care — it just looks for the pattern.

The Three REGEX Functions in Excel

Excel gives you three functions that use regex patterns. Each one does a different job with the pattern you give it.

REGEXTEST — Checks if the text matches a pattern and returns TRUE or FALSE. Use it when you need to validate data or filter rows.
REGEXEXTRACT — Finds the matching text and pulls it out. Use it when you need to grab a phone number, email, or date from inside a longer text.
REGEXREPLACE — Finds the matching text and replaces it with something else. Use it when you need to clean, reformat, or mask data.

All three functions take the same kind of regex pattern. So once you learn how to build a pattern, you can use it in any of them.

✓ Tip

Bonus: XLOOKUP and XMATCH also support regex now. You don't need a separate function — just set the match_mode argument to 3, and your lookup value becomes a regex pattern. We'll cover this in detail later in the guide.

Why REGEX Patterns Are a Big Deal in Excel

Let me be honest with you. For years, extracting or cleaning text in Excel felt like solving a puzzle with the wrong pieces. You needed four, five, sometimes six functions nested together just to pull out an email address or validate a phone number.

And the worst part? Those formulas broke the moment your data changed even slightly.

REGEX solves that. Instead of telling Excel where the data is (character 5 through character 12), you tell it what the data looks like. That's a completely different approach — and it's way more reliable.

The Real Problems REGEX Solves

Here are the everyday situations where REGEX turns something painful into something simple.

📧 Extracting emails from messy text Your data has email addresses buried inside notes, comments, and paragraphs. With REGEX, one pattern pulls them all out — no matter where they sit in the text.

📞 Validating phone numbers People enter phone numbers in 10 different formats. Parentheses, dashes, spaces, dots — REGEX can check all of them with a single pattern.

🧹 Cleaning inconsistent data Extra spaces, random symbols, mixed-case text — REGEX can find and replace patterns that SUBSTITUTE and CLEAN simply can't handle.

🔀 Rearranging text "First Last" to "Last, First." Date from MM/DD/YYYY to YYYY-MM-DD. REGEX captures groups and lets you reorder them in one formula.

🔒 Masking sensitive data Need to redact the first few digits of account numbers, SSNs, or phone numbers? REGEX replaces specific parts while keeping the rest.

🔍 Smart lookups on messy data With XLOOKUP match mode 3, you can now search using patterns — not just exact values. "USA" or "United States"? One formula handles both.

See the Difference — Before and After REGEX

The best way to understand why REGEX matters is to see it side by side. Let's take a simple task: extract a phone number (format: 415-555-1234) from a cell that has a bunch of text in it.

Before — Without REGEX Nested formula approach =MID(A1,FIND("-",A1)-3,12) This assumes the phone number always has a dash at the same position. If the text changes, if there's more than one dash, or if the phone number moves — the formula breaks. And if there's no phone number at all? You get an error.

After — With REGEX One pattern, done =REGEXEXTRACT(A1,"\d{3}-\d{3}-\d{4}") This says: find 3 digits, a dash, 3 digits, a dash, 4 digits. It doesn't care where the phone number sits. It doesn't care what's around it. It just finds the pattern and pulls it out.

Now let's take another one: check if a cell contains a valid email address.

Before — Without REGEX Multi-condition formula =AND(ISERROR(FIND(" ",A1)),LEN(A1)-LEN(SUBSTITUTE(A1,"@",""))=1,FIND("@",A1)>1,FIND(".",A1,FIND("@",A1))>FIND("@",A1)+1) Four conditions nested inside AND. It checks for spaces, counts the @ symbol, finds the dot after @ — and it still misses edge cases. Good luck explaining this to someone else.

After — With REGEX Clean and readable =REGEXTEST(A1,"^[\w\.\-]+@[\w\.\-]+\.\w{2,}$") One formula. One pattern. It checks the entire structure — characters before @, the domain name, and the extension. Returns TRUE or FALSE. That's it.

REGEX = DESCRIBE THE PATTERN, NOT THE POSITION

★Note

The key mindset shift: Traditional Excel text functions ask "where is the data?" — character 5, position of the second dash, length minus 3. REGEX asks "what does the data look like?" — three digits, then a dash, then three more digits. That's why REGEX doesn't break when your data is inconsistent. It finds the pattern no matter where it appears.

Now that you know why REGEX is worth learning, let's look at which function to use for what.

Which REGEX Function Do You Need?

Before we get into the details, here's a quick table to help you pick the right function for the job.

What You Want to Do	Function	Example Pattern	What It Returns
Check if a cell contains a number	REGEXTEST	[0-9]	TRUE or FALSE
Validate an email address format	REGEXTEST	^[\w\.\-]+@[\w\.\-]+\.\w{2,}$	TRUE or FALSE
Extract a phone number from text	REGEXEXTRACT	\d{3}-\d{3}-\d{4}	The matched phone number
Pull out all hashtags from a post	REGEXEXTRACT	#\w+	All matching hashtags (spills)
Mask the first 3 digits of a number	REGEXREPLACE	^\d{3}	***-456-7890
Swap "First Last" → "Last, First"	REGEXREPLACE	(\w+)\s(\w+)	Last, First
Remove all non-numeric characters	REGEXREPLACE	[^0-9]	Only the digits remain
Look up messy data like "USA" or "United States"	XLOOKUP (mode 3)	USA\|United States	First matching row's value

✓Tip

Quick rule: Need a YES/NO answer? Use REGEXTEST. Need to pull data out? Use REGEXEXTRACT. Need to change or clean data? Use REGEXREPLACE.

Regex Basics You Need to Know Before We Start

If you've never used regular expressions before, don't worry. You don't need to memorize the entire regex language. You just need a handful of building blocks, and you can combine them to do almost anything.

Think of it this way — regex is just a way to describe what text looks like, not where it is. Instead of saying "grab characters 5 through 10," you say "grab anything that looks like a phone number."

Here are the pieces you'll use the most in Excel. Click each tab to explore, and then try your own patterns in the live tester below.

A character class tells regex which characters are allowed at a specific position.

\d Any digit (0–9) \d\d\d → matches "415"

\D Any character that is NOT a digit \D → matches "A" but not "7"

\w Any word character (letter, digit, underscore) \w+ → matches "hello_99"

\W Any character that is NOT a word character \W → matches "@" or a space

\s Any whitespace (space, tab, newline) \s → matches the space in "A B"

\S Any character that is NOT whitespace \S+ → matches "Hello"

. Any character except a newline a.c → matches "abc", "a1c", "a-c"

[A-Z] Any uppercase letter from A to Z [A-Z]{3} → matches "ABC"

[a-z] Any lowercase letter from a to z [a-z]+ → matches "hello"

[0-9] Any digit (same as \d) [0-9]{5} → matches "10001"

[^0-9] Any character that is NOT a digit [^0-9]+ → matches "abc"

✓ The ^ symbol inside square brackets means "NOT." So [^A-Z] matches anything that is not an uppercase letter. Outside square brackets, ^ means "start of text" — see the Anchors tab.

Quantifiers go right after a character class and tell regex how many times that character should appear.

+ One or more \d+ → matches "1", "42", "9999"

* Zero or more \d* → matches "" or "123"

? Zero or one (makes it optional) -? → matches "-" or nothing

{n} Exactly n times \d{3} → matches "415" but not "41"

{n,} n or more times \w{5,} → matches words with 5+ characters

{n,m} Between n and m times \d{2,4} → matches "12", "123", "1234"

Anchors don't match a character. They match a position in the text. Use them when you need to control where the match happens.

^ Start of the text ^ABC → matches "ABC" only at the beginning

$ End of the text xyz$ → matches "xyz" only at the end

\b Word boundary \bcat\b → matches "cat" but not "catalog"

✓ When to use anchors: If you want to validate an entire cell value (like checking if the full string is a valid email), use ^ at the start and $ at the end. Without them, REGEXTEST returns TRUE even if only part of the cell matches.

Groups let you capture parts of a match. The OR operator lets you match one thing or another.

( ) Capture group — groups characters and captures the match (\d{3})-(\d{4}) → captures "415" and "1234" separately

| OR — matches either side of the pipe cat|dog → matches "cat" or "dog"

$1, $2 Backreference — refers to captured groups in REGEXREPLACE $2, $1 → swaps "First Last" to "Last, First"

! Important: Excel uses $1, $2 for backreferences in REGEXREPLACE — not \1, \2 like most other regex tools. This trips people up all the time.

Some characters have special meaning in regex — like . (matches anything) and ( (starts a group). To match the actual character, put a backslash before it.

\. Matches a literal dot Without escape, . matches any character

 Matches literal parentheses $hello$ → matches "(hello)"

\$ Matches a literal dollar sign \$\d+ → matches "$500"

\\ Matches a literal backslash \\\\ → matches "\\"

\* Matches a literal asterisk \*+ → matches "***"

✓ Rule of thumb: If a character does something special in regex and you want the actual character, just add a \ before it. The common ones to escape: . * + ? ^ $ ( ) [ ] { } | \

By default, all three REGEX functions in Excel are case-sensitive. That means [A-Z] will NOT match a lowercase "a". Here's how to change that.

case_sensitivity = 1 Use the function's built-in argument. Set it to 1 to ignore case. =REGEXTEST(A1,"hello",1) → matches "Hello", "HELLO"

(?i) Inline flag — add at start of pattern to ignore case (?i)hello → matches "Hello", "HELLO", "hello"

✓ The (?i) flag is handy when using REGEX inside XLOOKUP (match mode 3), where there's no separate case_sensitivity argument. Just add it to the front of your pattern.

REGEX TESTER — Try It Live

Your Pattern

Test Text

Matches Highlighted

Type a pattern and some text above to see matches.

✓Tip

My recommendation: For complex patterns, also test at regex101.com with the flavor set to PCRE2 — that's the exact same engine Excel uses. The tester above uses your browser's regex engine, which covers 95% of cases but may differ on a few edge cases.

That's all you need to get started. Now let's use these building blocks inside the actual Excel functions.

Function 1

REGEXTEST — Check If Text Matches a Pattern

REGEXTEST is the simplest of the three REGEX functions. You give it some text and a pattern, and it tells you TRUE or FALSE. That's it.

This is the function you use when you need to answer one question: does this cell match a pattern or not?

Syntax

=REGEXTEST(text, pattern, [case_sensitivity])

text The cell or text string you want to test. Can also be a range — if you pass a range, it spills TRUE/FALSE for every cell.
pattern The regex pattern to match against. Must be enclosed in double quotes. Uses PCRE2 syntax.
case_sensitivity Optional. 0 = case-sensitive (this is the default). 1 = ignore case. You can also add (?i) at the start of your pattern instead of using this argument.

★Note

REGEXTEST checks if any part of the text matches the pattern. It doesn't need to match the entire cell. So if your cell has "Hello 123" and your pattern is [0-9], it returns TRUE — because there are digits somewhere in there.

Identify Product Codes That Contain Numbers

Let's say you have a list of product codes in column A. Some codes have numbers in them, some are only letters. You want to quickly flag which ones contain at least one digit.

This is where we use our first pattern from the crash course — [0-9] — which means "any single digit from 0 to 9."

Below are the steps you need to follow.

Click on cell B2 and enter the following formula.

=REGEXTEST(A2,"[0-9]")

In the end, hit Enter. It returns TRUE because cell A2 ("ABC123") has digits in it.
Now drag the formula down from B2 to B7. Excel checks each cell one by one.

Look at the results. "ABC123" has digits — TRUE. "HELLO" doesn't — FALSE. "XY-99-MN" has 99 in it — TRUE. "WIDGET" is only letters — FALSE. Simple.

How This Pattern Works

REGEXTEST(A2, "[0-9]")

A2 This is the text argument — the cell we're testing. REGEXTEST looks at whatever text is inside this cell.

[0-9] This is the pattern. The square brackets create a character class. The 0-9 inside means "any digit from 0 to 9." So this pattern says: is there at least one digit anywhere in this text?

→ Returns TRUE if any digit exists, FALSE if not

Now, I want to point out something important. This pattern [0-9] doesn't check how many digits there are. It doesn't check where the digit appears. It just asks: is there at least one?

That's because REGEXTEST checks if any part of the text matches. It found a digit somewhere in "ABC123" and that was enough to return TRUE.

✓Tip

Quick tip: You could also write this pattern as \d instead of [0-9] — they mean the exact same thing. I used [0-9] here because it's easier to read when you're just getting started.

If you wanted to check that the entire cell is only digits — nothing else — you would use anchors and a quantifier like this:

=REGEXTEST(A2,"^[0-9]+$")

Here ^ means start of text, [0-9]+ means one or more digits, and $ means end of text. Together, they say: the cell must contain only digits from start to finish. "ABC123" would return FALSE with this pattern because it has letters too.

★Note

See how we just used three pieces from the crash course — a character class [0-9], an anchor ^...$, and a quantifier +? That's all regex is. You combine small building blocks into a pattern that describes what you're looking for.

That's REGEXTEST. One function, one pattern, TRUE or FALSE. In the later sections, we'll use REGEXTEST inside IF, FILTER, and XLOOKUP to do more powerful things with it. But for now, let's move to the next function — REGEXEXTRACT.

Function 2

REGEXEXTRACT — Pull Matching Text Out of a Cell

REGEXTEST tells you if a pattern exists. REGEXEXTRACT goes one step further — it actually pulls out the matching text and gives it back to you.

This is the function you use when you need to extract something specific from messy data. A phone number buried in a paragraph. A product ID mixed in with text. A date sitting inside a long string. REGEXEXTRACT finds the pattern and hands it to you.

Syntax

=REGEXEXTRACT(text, pattern, [return_mode], [case_sensitivity])

text The cell or text string you want to extract from. Works with a single cell or a range.
pattern The regex pattern that describes what you want to find. Must be in double quotes.
return_mode Optional. Controls what the function returns. This is the argument that makes REGEXEXTRACT really powerful — see the explanation below.
case_sensitivity Optional. 0 = case-sensitive (default). 1 = ignore case.

Understanding return_mode

The return_mode argument is what sets REGEXEXTRACT apart. It controls how much the function gives back to you.

First Match

Returns only the first piece of text that matches the pattern. This is the default.

All Matches

Returns every match found in the cell. Results spill into multiple cells.

Capture Groups

Returns the parts captured by ( ) groups in your pattern. Each group spills into its own cell.

★Note

For now, we'll use the default (return_mode 0 — first match). We'll explore return_mode 1 and 2 in the real-world examples section later, where they really shine.

Extract the Numeric Part from Product Codes

Remember the product code example from REGEXTEST? We checked if a code had a number. Now let's take it further — let's actually pull out the numeric part.

Our data looks like this: "ABC123", "XY-99-MN", "PART-7A". We want just the numbers — "123", "99", "7".

Below are the steps.

Click on cell B2 and enter this formula.

=REGEXEXTRACT(A2,"[0-9]+")

Hit Enter. It returns "123" — the numeric part of "ABC123".
Drag the formula down to B7.

Look at the results. "ABC123" gave us "123". "XY-99-MN" gave us "99". "PART-7A" gave us "7". And the cells with no numbers — "HELLO", "WIDGET", "STEEL" — returned a #N/A error because there was nothing to extract.

How This Pattern Works

REGEXEXTRACT(A2, "[0-9]+")

[0-9] A character class — matches any single digit from 0 to 9. Same pattern we used in REGEXTEST.

+ A quantifier — means "one or more." So [0-9]+ matches one digit, two digits, three digits… as many consecutive digits as it can find.

→ Finds the first group of consecutive digits and returns them

Notice the difference from REGEXTEST. In the previous section, we used [0-9] without the +. That was fine for REGEXTEST because it only needed to find one digit to say TRUE.

But here, we need REGEXEXTRACT to grab all consecutive digits, not just the first one. That's why we add the + quantifier. Without it, "ABC123" would return just "1" instead of "123".

✓Tip

This is how regex patterns grow. You start with a character class to say what kind of character, then add a quantifier to say how many. That's the rhythm: what + how many. You'll use this combo in almost every pattern you write.

Handling the #N/A Error

You probably noticed that "HELLO", "WIDGET", and "STEEL" returned #N/A. That's because REGEXEXTRACT has nothing to extract — there are no digits in those cells.

To handle this cleanly, wrap the formula in IFERROR.

=IFERROR(REGEXEXTRACT(A2,"[0-9]+"),"")

Now, instead of a #N/A error, those cells will just be blank. You can replace the "" with any text you want — like "None" or "No number found".

!Warning

Important: REGEXEXTRACT always returns text, even when it extracts numbers. So "123" comes back as the text "123", not the number 123. If you need to do math with the result, wrap it in the VALUE function: =VALUE(REGEXEXTRACT(A2,"[0-9]+"))

That's REGEXEXTRACT. Find a pattern, pull it out. In the later sections, we'll use return_mode 1 to extract all matches and return_mode 2 to break a match into separate pieces using capture groups. But the foundation is always the same — describe the pattern, and let the function do the rest.

Function 3

REGEXREPLACE — Find a Pattern and Replace It

REGEXTEST checks for a pattern. REGEXEXTRACT pulls it out. REGEXREPLACE does the third thing — it finds a pattern and swaps it with something else.

This is the function you use when you need to clean data, remove unwanted characters, reformat text, or mask sensitive information. You tell it what to find and what to put in its place.

Syntax

=REGEXREPLACE(text, pattern, replacement, [occurrence], [case_sensitivity])

text The cell or text string you want to modify. Works with a single cell or a range.
pattern The regex pattern that describes what you want to find and replace. Must be in double quotes.
replacement What you want to put in place of the matched text. Use "" (empty quotes) to delete the match entirely. Use $1, $2 to reference capture groups.
occurrence Optional. 0 = replace all occurrences (default). Any other number replaces only that specific occurrence (1 = first, 2 = second, etc.).
case_sensitivity Optional. 0 = case-sensitive (default). 1 = ignore case.

Strip Everything Except Numbers from Product Codes

Let's continue with our product codes. This time, instead of extracting the numbers, we want to remove everything that's not a number — all the letters, dashes, and special characters — so we're left with just the digits.

Below are the steps.

Click on cell B2 and enter this formula.

=REGEXREPLACE(A2,"[^0-9]","")

Hit Enter. It returns "123" — all the non-digit characters have been removed from "ABC123".
Drag the formula down to B7.

Look at the difference from REGEXEXTRACT. "HELLO" doesn't throw a #N/A error this time — it just returns an empty cell. That's because REGEXREPLACE removes every non-digit character, and when there are no digits left, you get an empty string. No error handling needed.

How This Pattern Works

REGEXREPLACE(A2, "[^0-9]", "")

[^0-9] A negated character class. The ^ inside the square brackets flips the meaning — this matches any character that is NOT a digit. Letters, dashes, spaces, symbols — all matched.

"" The replacement text. Empty quotes mean "replace with nothing." So every non-digit character gets deleted.

Default: all We didn't set the occurrence argument, so it defaults to 0 — replace all matches, not just the first one.

→ Every non-digit character gets removed, only numbers stay

Here's a visual of what's happening to each cell:

ABC123 ➜ 123

XY-99-MN ➜ 99

PART-7A ➜ 7

The letters and dashes (shown in red, strikethrough) get removed. Only the digits (shown in green) survive.

★Note

REGEXEXTRACT vs. REGEXREPLACE — when to use which? If you want to keep the matching part and throw away the rest, use REGEXEXTRACT. If you want to remove the matching part and keep the rest, use REGEXREPLACE. In this example, both give us the digits — but the approach is opposite. REGEXEXTRACT says "find digits." REGEXREPLACE says "find everything that's not a digit and delete it."

A Quick Preview: Capture Groups and $1, $2

REGEXREPLACE has one more trick that makes it really powerful — capture groups. You can use parentheses ( ) in your pattern to capture parts of the match, and then rearrange them in the replacement using $1, $2, etc.

For example, if you have names in "First Last" format and want "Last, First":

=REGEXREPLACE(A2,"(\w+)\s(\w+)","$2, $1")

Puneet Gogia ➜ Gogia, Puneet

Here (\w+) captures the first name, \s matches the space, and the second (\w+) captures the last name. In the replacement, $2 puts the last name first, then a comma, then $1 puts the first name.

!Warning

Remember: Excel uses $1, $2 for backreferences — not \1, \2. This is different from most regex tools and programming languages. If your replacement isn't working, this is probably why.

We'll use capture groups in more detail in the real-world examples section. For now, just know that this is possible — and it's one of the things that makes REGEXREPLACE incredibly useful for reformatting data.

✓Tip

Quick summary of all three functions: REGEXTEST answers "does it match?" REGEXEXTRACT answers "what matched?" REGEXREPLACE answers "what does it look like after I change the match?" Three questions, three functions — that's the entire REGEX toolkit in Excel.

Bonus

Use REGEX Inside XLOOKUP and XMATCH

Here's something a lot of people don't know yet. Along with the three REGEX functions, Microsoft also added regex support to two functions you probably already use — XLOOKUP and XMATCH.

You don't need a separate REGEX function. You just change one argument — the match_mode — and suddenly your lookup can search using patterns instead of exact values.

The match_mode Argument

Both XLOOKUP and XMATCH have a match_mode argument that controls how they search. Here are the options:

0 Exact Default. Must be an exact match.

-1 Exact or Smaller Finds exact match or next smallest.

1 Exact or Larger Finds exact match or next largest.

2 Wildcard Uses * and ? wildcards.

3 Regex Uses regex patterns. This is the new one.

Setting match_mode to 3 tells XLOOKUP or XMATCH: "The lookup value I'm giving you is a regex pattern. Use it to search." That's it — one number changes everything.

Availability: Regex match mode (3) in XLOOKUP and XMATCH requires Microsoft 365 (Windows, Mac, or Web). Make sure your Excel is fully updated.

Look Up a Country That's Spelled Differently

This is a very common problem. You have a data table where the country column is messy — some rows say "USA", others say "United States", maybe one says "U.S.A." You need to look up a value, but you're not sure which spelling appears first.

With a regular XLOOKUP, you'd need an exact match. With regex match mode, you can use the OR operator | from the crash course to match any of them at once.

Below are the steps.

Let's say your data is in columns A and B — country names in A, sales figures in B.
In a blank cell, enter this formula.

=XLOOKUP("USA|United States",A2:A10,B2:B10,"Not found",3)

Hit Enter. XLOOKUP finds the first row in column A that contains either "USA" or "United States" and returns the matching value from column B.

Look at what happened. The formula scanned column A from top to bottom. Row 5 has "United States" — that matches the second option in our USA|United States pattern. So XLOOKUP returns 82,500 from column B.

Notice that row 8 also has "USA" which would match too. But XLOOKUP returns the first match it finds, which is row 5.

How This Formula Works

XLOOKUP("USA|United States", A2:A10, B2:B10, "Not found", 3)

"USA|United States" The lookup value — but since we set match_mode to 3, this is treated as a regex pattern. The | means OR. So it matches "USA" or "United States".

3 The match_mode — this is the key. Setting it to 3 tells XLOOKUP to treat the lookup value as a regex pattern instead of a literal value.

→ Finds the first row matching either "USA" or "United States"

★Note

The simple power here: With match_mode 3, even typing just "USA" as the lookup value does a partial match. XLOOKUP checks if the pattern exists anywhere in the cell. So "United States of America" would also match "States" if that was your pattern. If you want an exact cell match, use anchors: "^USA$".

Same Thing with XMATCH

XMATCH works the same way — match_mode 3 enables regex. The difference is that XMATCH returns the position (row number) of the match instead of a value from another column.

=XMATCH("USA|United States",A2:A10,3)

This returns 4 — because "United States" is the 4th item in the range A2:A10. You can use this position number inside INDEX or other functions when you need more control over what gets returned.

Making It Case-Insensitive

One thing to know — regex match_mode in XLOOKUP and XMATCH is case-sensitive by default. So "usa" wouldn't match "USA".

Unlike the three REGEX functions, XLOOKUP doesn't have a separate case_sensitivity argument. So you use the inline flag from the crash course — (?i) — at the start of your pattern.

=XLOOKUP("(?i)usa|united states",A2:A10,B2:B10,"Not found",3)

Now it matches "USA", "usa", "United States", "united states" — any combination of upper and lowercase.

!Warning

Performance tip: Regex matching is slower than exact matching. On large datasets, avoid using whole-column references like A:A. Use specific ranges like A2:A1000 or, even better, convert your data to an Excel Table and use structured references — they expand automatically without scanning empty rows.

✓Tip

Quick recap: You don't need a separate REGEX function for lookups. Just set the 5th argument of XLOOKUP (or the 3rd argument of XMATCH) to 3, and your lookup value becomes a regex pattern. Use | for OR matching, (?i) for case-insensitive, and anchors ^...$ when you need an exact cell match.

Building Real-World Patterns — Step by Step

You now know all three REGEX functions and the basic building blocks. But when you look at a real-world pattern — like one that validates an email address — it can look like a mess of random characters.

It's not random. Every pattern is just small pieces put together in order. Let me walk you through the most common ones, piece by piece, so you can actually read them and even modify them yourself.

Pattern 1 — Phone Number Format: 415-555-1234

Anchor Character Class Quantifier Literal

^ \d {3} - \d {3} - \d {4} $

^ Start of text. Makes sure nothing comes before the phone number. Without this, "Call 415-555-1234 now" would also match.
\d{3} Three digits. \d means any digit (0–9). {3} means exactly three. This is the area code — like "415".
- A literal dash. The dash between the area code and the next group. Since - has no special meaning outside square brackets, no escape needed.
\d{3} Three more digits. The middle group — like "555".
- Another literal dash.
\d{4} Four digits. The last group — like "1234".
$ End of text. Makes sure nothing comes after the phone number. Together with ^, this ensures the entire cell is just a phone number.

What it matches 415-555-1234 800-999-0000 4155551234 (415) 555-1234 Call 415-555-1234

✓Tip

Want to extract a phone number from inside longer text? Just remove the ^ and $ anchors. Without anchors, the pattern finds the phone number no matter where it sits: =REGEXEXTRACT(A2,"\d{3}-\d{3}-\d{4}")

Pattern 2 — Phone Number with Parentheses Format: (415) 555-1234

^ $ \d {3} $ \s \d {3} - \d {4} $

 Escaped parentheses. In regex, ( and ) normally create a capture group. Adding \ before them tells regex: "I mean the actual parenthesis character." So $\d{3}$ matches "(415)".
\s One space. The space between the closing parenthesis and the next three digits. \s matches any whitespace character.

What it matches (415) 555-1234 (800) 999-0000 415-555-1234 (415)555-1234

See how we built on Pattern 1? We just added escaped parentheses and a space. That's the key idea — you're not memorizing patterns, you're combining pieces.

Pattern 3 — Email Address Validate

Anchor Character Class Quantifier Literal Escaped

^ [\w\.\-] + @ [\w\.\-] + \. \w {2,} $

^ Start of text. We want to validate the whole cell, not find an email inside longer text.
[\w\.\-]+ The username part (before the @). The square brackets define a set of allowed characters: \w = any letter, digit, or underscore. \. = a literal dot. \- = a literal dash. The + means one or more of these. So "puneet", "jane.doe", and "first-last" all match.
@ The @ symbol. Just a literal character. Nothing special in regex — it matches itself.
[\w\.\-]+ The domain name (after the @, before the last dot). Same set of characters as the username. Matches "excelchamps", "company.co", "my-domain".
\. A literal dot. The dot before the extension. We escape it with \ because without the escape, . means "any character" in regex.
\w{2,} The extension — two or more word characters. Matches "com", "org", "co", "info". The {2,} means "at least 2" so single-letter extensions are rejected.
$ End of text. Nothing allowed after the extension.

What it matches [email protected] [email protected] [email protected] not-an-email hello@ @domain.com

★Note

To validate vs. to extract — same pattern, different approach. Use this pattern inside REGEXTEST to get TRUE/FALSE. Remove the ^ and $ anchors, then use it inside REGEXEXTRACT to pull email addresses out of longer text.

Pattern 4 — Date Format: 2025-01-15 or 2025/01/15

\d {4} [-/] \d {1,2} [-/] \d {1,2}

\d{4} Four digits. The year — like "2025".
[-/] A dash or slash. The character class [-/] matches either character. This lets the pattern handle both "2025-01-15" and "2025/01/15".
\d{1,2} One or two digits. The month — matches both "1" and "01". Same logic for the day.

What it matches 2025-01-15 2025/1/5 2024-12-31 01-15-2025 Jan 15, 2025

Notice this pattern doesn't use anchors, so it can find dates inside longer text. If you want to validate that the entire cell is only a date, add ^ at the start and $ at the end.

✓Tip

Split a date into year, month, and day. Use capture groups with REGEXEXTRACT return_mode 2: =REGEXEXTRACT(A2,"(\d{4})[-/](\d{1,2})[-/](\d{1,2})",2) — this spills the year, month, and day into three separate cells.

Pattern 5 — Structured Code / ID Format: ABC-1234 or XX-999

^ [A-Z] {2,4} - \d {3,5} $

[A-Z]{2,4} Two to four uppercase letters. The prefix — like "ABC", "XY", or "PROD".
- A literal dash. The separator between the letters and numbers.
\d{3,5} Three to five digits. The numeric part — like "1234" or "99001".

What it matches ABC-1234 XY-999 PROD-55001 abc-1234 A-12 ABCDE-1234

This pattern is case-sensitive by default. "abc-1234" doesn't match because [A-Z] only allows uppercase. To accept both cases, add (?i) at the start of the pattern, or change it to [A-Za-z].

Pattern 6 — Dollar Amount Format: $1,250.00

\$ \d [\d,] * \. \d {2}

\$ A literal dollar sign. We escape it because $ normally means "end of text" in regex. The backslash says "I mean the actual $ character."
\d At least one digit. This ensures there's a number after the dollar sign — so "$" alone doesn't match.
[\d,]* More digits or commas, zero or more times. This handles the optional thousands separator. "1,250" and "50" both work.
\. A literal dot. The decimal point. Escaped because . matches anything in regex.
\d{2} Exactly two digits. The cents — like "00" or "99".

What it matches $1,250.00 $49.99 $0.50 1250.00 $49 $1,250

How to Build Any Pattern — The 3-Step Process

Every pattern in this section was built the same way. Here's the process I use.

Look at the data and describe the format in plain English. For a phone number: "three digits, a dash, three digits, a dash, four digits." That sentence IS your pattern — you just need to translate each word.
Translate each piece into regex. "Three digits" = \d{3}. "A dash" = -. "Four digits" = \d{4}. Connect them in order.
Decide: validate or extract? If you're validating a whole cell, add ^ and $ anchors. If you're extracting from longer text, leave them out.

★Note

You don't need to memorize patterns. Bookmark this section and come back whenever you need one. Or describe what you're looking for in plain English to an AI tool — it can generate the regex pattern for you. Then test it at regex101.com (PCRE2 flavor) before putting it into Excel.

Quick Reference — Common Patterns

Here are patterns for the most common tasks. Copy them directly into your REGEX formulas.

What You're Looking For	Pattern	Notes
Contains any digit	\d	Use with REGEXTEST
Only digits (whole cell)	^\d+$	Anchors ensure full match
Only letters (whole cell)	^[A-Za-z]+$	No digits, spaces, or symbols
Phone: 415-555-1234	\d{3}-\d{3}-\d{4}	Add anchors for validation
Phone: (415) 555-1234	$\d{3}$\s\d{3}-\d{4}	Escape the parentheses
Email address	[\w\.\-]+@[\w\.\-]+\.\w{2,}	Add anchors for validation
Date: YYYY-MM-DD	\d{4}[-/]\d{1,2}[-/]\d{1,2}	Handles both - and /
Dollar amount: $1,250.00	\$\d[\d,]*\.\d{2}	Escape the $ and .
ZIP code (US, 5-digit)	^\d{5}$	Add `(-\d{4})?` for ZIP+4
Hashtag	#\w+	Use return_mode 1 for all matches
URL (http/https)	https?://[\w\.\-/]+	The `?` makes the "s" optional
Remove extra spaces	\s{2,}	Replace with single space in REGEXREPLACE
Non-numeric characters	[^0-9]	Replace with "" to keep only digits
First and Last name	(\w+)\s(\w+)	Use $2, $1 in REGEXREPLACE to swap

Real-World Scenarios — REGEX Combined with Other Functions

Now you know the patterns and the functions. In this section, we put them together to solve real data problems — the kind of stuff you actually deal with at work.

Each case shows you the scenario, the formula, and how it works step by step.

CASE 1 Validate Emails and Flag Invalid Rows with IF

📋 The problem: You have a list of email addresses in column A. Some are valid, some are garbage. You need a "Valid" or "Invalid" label in column B so you can clean up the bad ones.

We'll combine REGEXTEST with IF. REGEXTEST gives us TRUE or FALSE, and IF converts that into a readable label.

Click on cell B2 and enter this formula.

=IF(REGEXTEST(A2,"^[\w\.\-]+@[\w\.\-]+\.\w{2,}$"),"Valid","Invalid")

Hit Enter and drag the formula down.

REGEXTEST checks the email pattern (we built this one in the pattern anatomy section). IF takes the TRUE/FALSE and converts it to "Valid" or "Invalid." That's all there is to it.

✓Tip

Take it further: You can replace "Valid" and "Invalid" with anything — emojis like ✅ and ❌, numbers like 1 and 0 for counting, or even another formula.

CASE 2 Filter a Table to Show Only Valid Phone Numbers

📞 The problem: You have a customer list with names in column A and phone numbers in column B. Some phone numbers are in the right format (###-###-####), some are not. You want to filter the table to show only the rows with valid phone numbers.

We'll combine REGEXTEST with FILTER. REGEXTEST tests each phone number, and FILTER uses the results to show only matching rows.

Click on a blank cell where you want the filtered results to appear and enter this formula.

=FILTER(A2:B8,REGEXTEST(B2:B8,"^\d{3}-\d{3}-\d{4}$"),"No valid numbers")

Hit Enter. Excel spills only the rows where the phone number matches the pattern.

REGEXTEST runs against every cell in B2:B8 and creates an array of TRUE/FALSE values. FILTER uses that array and returns only the rows where the result is TRUE. Amit and Ravi got filtered out because their phone numbers don't match the ###-###-#### format.

★Note

AND / OR logic: Use * for AND and + for OR when combining multiple REGEXTEST conditions inside FILTER. For example: =FILTER(A2:C10, REGEXTEST(B2:B10,"\d{3}-\d{3}-\d{4}") * REGEXTEST(C2:C10,"@")) returns rows where column B has a valid phone AND column C has an @ symbol.

CASE 3 Extract and Clean Phone Numbers with LET

🧹 The problem: You have phone numbers in all kinds of messy formats — "(415) 555-1234", "415.555.1234", "415 555 1234", "4155551234". You need all of them in one clean format: 415-555-1234.

This is a two-step job. First, strip everything that's not a digit. Second, insert the dashes in the right places. We'll use LET to keep it readable.

Click on cell B2 and enter this formula.

=LET(digits,REGEXREPLACE(A2,"[^0-9]",""),REGEXREPLACE(digits,"(\d{3})(\d{3})(\d{4})","$1-$2-$3"))

Hit Enter and drag it down.

(415) 555-1234 ➜ 415-555-1234

415.555.1234 ➜ 415-555-1234

4155551234 ➜ 415-555-1234

Let me walk you through what's happening inside this formula.

LET creates a variable called digits. It stores the result of the first REGEXREPLACE — which strips all non-digit characters. So "(415) 555-1234" becomes "4155551234".
The second REGEXREPLACE uses capture groups. The pattern (\d{3})(\d{3})(\d{4}) splits the 10 digits into three groups: first 3, next 3, last 4.
The replacement "$1-$2-$3" puts dashes between the groups. So "4155551234" becomes "415-555-1234".

✓Tip

Why use LET? Without LET, you'd have to nest one REGEXREPLACE inside another, and the formula becomes hard to read. LET gives the intermediate result a name (digits), so the formula reads like a recipe: first clean, then format.

CASE 4 Extract All Email Addresses from a Cell

📧 The problem: You have cells with long text — meeting notes, support tickets, comments — and some of them contain one or more email addresses buried inside. You need to pull all the emails out.

This is where REGEXEXTRACT's return_mode 1 comes in. We've been using the default (mode 0 — first match only). Mode 1 returns all matches.

Enter this formula in the cell next to your text.

=REGEXEXTRACT(A2,"[\w\.\-]+@[\w\.\-]+\.\w{2,}",1)

Hit Enter. If the cell has multiple emails, they spill into separate cells below.

Notice we removed the ^ and $ anchors from the email pattern. That's because we're not validating — we're extracting from inside longer text. The pattern needs to find matches anywhere in the cell.

!Warning

Watch out for #SPILL errors. When return_mode 1 finds multiple matches, the results spill downward. If you drag the formula down for multiple rows, the spill areas overlap and you get a #SPILL error. To fix this, wrap the formula in TEXTJOIN to combine all emails into one cell: =TEXTJOIN(", ",TRUE,REGEXEXTRACT(A2,"[\w\.\-]+@[\w\.\-]+\.\w{2,}",1))

CASE 5 Split a Date into Year, Month, and Day

📅 The problem: You have dates in "2025-01-15" format in column A. You need the year, month, and day in separate columns — B, C, and D.

This is where REGEXEXTRACT's return_mode 2 comes in. Mode 2 returns the capture groups separately. We put parentheses around each part of the date we want to capture.

Enter this formula in cell B2.

=REGEXEXTRACT(A2,"(\d{4})-(\d{1,2})-(\d{1,2})",2)

Hit Enter. The year, month, and day spill into B2, C2, and D2 automatically.

The three pairs of parentheses in the pattern create three capture groups. (\d{4}) captures the year. (\d{1,2}) captures the month. The second (\d{1,2}) captures the day. Return_mode 2 tells REGEXEXTRACT to return each captured group separately — and they spill across columns.

★Note

Remember: REGEXEXTRACT returns text, not numbers. If you need to do math with the year, month, or day, wrap each result in VALUE(). Or wrap the entire formula: =VALUE(REGEXEXTRACT(...)).

CASE 6 Build a Reusable Phone Formatter with LAMBDA

⚡ The problem: You use the phone number cleaning formula from Case 3 all the time. Writing that LET formula every time is annoying. You want to create a custom function you can call like =CLEANPHONE(A2) and have it just work.

LAMBDA lets you turn any formula into a reusable named function. Here's how to set it up.

Go to Formulas Tab ➜ Name Manager ➜ New.
Set the Name to CLEANPHONE.
In the Refers to field, paste this formula.

=LAMBDA(phone,LET(digits,REGEXREPLACE(phone,"[^0-9]",""),REGEXREPLACE(digits,"(\d{3})(\d{3})(\d{4})","$1-$2-$3")))

Click OK to save it.
Now you can use it anywhere in the workbook like this:

=CLEANPHONE(A2)

(415) 555-1234 ➜ 415-555-1234

And that is the moment of joy. You've built your own custom function — one that cleans any phone number format into a consistent output. Share the workbook, and anyone can use =CLEANPHONE() without knowing how it works inside.

✓Tip

The best thing is — this is a one-time setup. Build it once, use it forever. You can create LAMBDA functions for any regex pattern — email validation, date formatting, ID cleanup — and turn your workbook into a toolbox.

Common Mistakes to Avoid with REGEX in Excel

Regex is powerful, but it has a few traps that catch almost everyone — especially when you're starting out. I've listed the most common ones here. Click on each mistake to see what goes wrong and how to fix it.

1 Forgetting to escape special characters +

This is the #1 beginner mistake. In regex, characters like . ( ) $ * + ? have special meaning. If you want to match the actual character, you need a backslash before it.

The most common one: using . to match a literal dot. Without escaping, . matches any character at all — so "a.b" matches "a1b", "a-b", "axb", everything.

✗ Wrong [\w.]+@[\w.]+.\w{2,}

✓ Right [\w\.\-]+@[\w\.\-]+\.\w{2,}

The wrong pattern uses unescaped dots. It might seem to work on most data, but it's technically matching more than it should. The right pattern escapes every dot with \..

2 Using \1 instead of $1 in REGEXREPLACE +

In most regex tools and programming languages, you reference capture groups in the replacement using \1, \2, etc. Excel does it differently. Excel uses $1, $2.

✗ Wrong =REGEXREPLACE(A2,"(\w+)\s(\w+)","\2, \1")

✓ Right =REGEXREPLACE(A2,"(\w+)\s(\w+)","$2, $1")

If your REGEXREPLACE replacement isn't working, this is almost always the reason. Check for \1 and replace it with $1.

3 Not wrapping REGEXEXTRACT in IFERROR +

REGEXEXTRACT throws a #N/A error when it can't find a match. If your data has even one cell without a match, you'll see an ugly error in your results.

✗ Risky =REGEXEXTRACT(A2,"\d+")

✓ Safe =IFERROR(REGEXEXTRACT(A2,"\d+"),"")

Always wrap REGEXEXTRACT in IFERROR unless you're 100% sure every cell will have a match. Replace the "" with whatever fallback you want — "None", 0, or even another formula.

4 Getting #SPILL errors with return_mode 1 +

When you use REGEXEXTRACT with return_mode 1 (all matches), the results spill into multiple cells. If you drag this formula down for multiple rows, the spill areas overlap each other — and Excel gives you a #SPILL error.

The fix: Wrap the formula in TEXTJOIN to combine all matches into a single cell.

✗ Spill problem =REGEXEXTRACT(A2,"#\w+",1)

✓ One cell =TEXTJOIN(", ",TRUE,REGEXEXTRACT(A2,"#\w+",1))

Now all matches go into one cell, separated by commas. No spill, no errors when dragging down.

5 Expecting numbers from REGEXEXTRACT +

REGEXEXTRACT always returns text — even when it extracts digits. So "123" comes back as the text string "123", not the number 123. If you try to do math with it, you'll get unexpected results or errors.

✗ Returns text =REGEXEXTRACT(A2,"\d+") + 10

✓ Converts to number =VALUE(REGEXEXTRACT(A2,"\d+")) + 10

Wrap the result in VALUE() before doing any math.

6 Forgetting that REGEX is case-sensitive by default +

All three REGEX functions and XLOOKUP/XMATCH with match_mode 3 are case-sensitive by default. [A-Z] won't match lowercase letters. "hello" won't match a pattern looking for "Hello".

Two ways to fix it:

Option 1 — Argument =REGEXTEST(A2,"hello",1)
Set case_sensitivity to 1

Option 2 — Inline flag =REGEXTEST(A2,"(?i)hello")
Add (?i) at the start of pattern

Use the (?i) flag for XLOOKUP/XMATCH since they don't have a separate case_sensitivity argument.

7 Using whole-column references like A:A +

Regex matching is more computationally expensive than regular Excel operations. When you use a whole-column reference like A:A, Excel runs the regex engine against over a million cells — most of them empty.

✗ Slow =FILTER(A:B,REGEXTEST(A:A,"\d+"))

✓ Fast =FILTER(A2:B500,REGEXTEST(A2:A500,"\d+"))

Always use specific ranges. Even better — convert your data to an Excel Table and use structured references. They expand automatically as you add rows without scanning empty cells.

8 Missing anchors when validating entire cell values +

REGEXTEST checks if any part of the text matches. If you want to validate that the entire cell is a valid email (not just that it contains one somewhere), you need anchors.

✗ Partial match =REGEXTEST(A2,"[\w\.]+@[\w\.]+\.\w{2,}")
"Some text [email protected] more text" → TRUE

✓ Full match =REGEXTEST(A2,"^[\w\.]+@[\w\.]+\.\w{2,}$")
"Some text [email protected] more text" → FALSE

Add ^ at the start and $ at the end when you're validating. Remove them when you're extracting from inside longer text.

⚠ The Golden Rule

Escape your dots: \. not .
Use $1 not \1 in replacements
Wrap REGEXEXTRACT in IFERROR
Use TEXTJOIN with return_mode 1
VALUE() to convert extracted text to numbers
Add (?i) or set case_sensitivity to 1 for case-insensitive matching
Avoid whole-column references — use specific ranges or Tables
Use ^...$ anchors when validating, remove them when extracting

When to Use REGEX vs. Traditional Functions

REGEX doesn't replace everything. For simple, predictable tasks, the old functions — FIND, LEFT, MID, RIGHT, SUBSTITUTE — still work perfectly fine and are easier to read.

But the moment your data gets messy or the pattern gets complex, REGEX pulls ahead. Here's a side-by-side comparison so you can see exactly where each approach wins.

Task	Old Way	REGEX Way	Winner
Get first 3 characters	=LEFT(A2,3)	Not needed — LEFT is perfect for this.	Old
Replace one word with another	=SUBSTITUTE(A2,"old","new")	=REGEXREPLACE(A2,"old","new")	Either
Check if cell contains a number	=SUMPRODUCT(--ISNUMBER( FIND({0,1,2,3,4,5,6,7,8,9},A2)))>0	=REGEXTEST(A2,"\d")	Regex
Extract a phone number from text	=MID(A2,FIND("-",A2)-3,12) Assumes position — breaks easily	=REGEXEXTRACT(A2, "\d{3}-\d{3}-\d{4}")	Regex
Validate email format	=AND(ISERROR(FIND(" ",A2)), LEN(A2)-LEN(SUBSTITUTE(A2, "@",""))=1, FIND("@",A2)>1, FIND(".",A2,FIND("@",A2)) >FIND("@",A2)+1) 4 conditions, still misses edge cases	=REGEXTEST(A2, "^[\w\.\-]+@[\w\.\-]+\.\w{2,}$")	Regex
Remove all non-numeric characters	Nested SUBSTITUTE for each character — (, ), -, space, dot… One SUBSTITUTE per character	=REGEXREPLACE(A2,"[^0-9]","")	Regex
Swap "First Last" → "Last, First"	=MID(A2,FIND(" ",A2)+1,LEN(A2)) &", "&LEFT(A2,FIND(" ",A2)-1) Breaks if there's no space or multiple spaces	=REGEXREPLACE(A2, "(\w+)\s(\w+)","$2, $1")	Regex
Find text between parentheses	=MID(A2,FIND("(",A2)+1, FIND(")",A2)-FIND("(",A2)-1) Errors if no parentheses exist	=REGEXEXTRACT(A2, "$([^)]+)$",2)	Regex
Replace extra spaces with one space	=TRIM(A2) TRIM handles this well	=REGEXREPLACE(A2,"\s{2,}"," ")	Old
Look up messy country names (USA / United States)	Multiple XLOOKUP calls combined with IFERROR One lookup per spelling variation	=XLOOKUP("USA\|United States", A2:A10,B2:B10,,3)	Regex

The Bottom Line

Here's how I think about it.

Use traditional functions when the task is simple and the data is predictable. Getting the first 5 characters? LEFT. Replacing one known word? SUBSTITUTE. Trimming spaces? TRIM. These are fast, readable, and everyone understands them.

Use REGEX when the pattern is complex, the data is inconsistent, or you'd need 3+ nested functions to do the same job. Validating formats, extracting from messy text, cleaning multiple character types at once, pattern-based lookups — that's REGEX territory.

✓Tip

My rule of thumb: If you're about to nest FIND inside MID inside SUBSTITUTE inside IFERROR — stop and ask yourself if a single REGEX formula could do the same thing. Nine times out of ten, it can. And it'll be easier to read six months later.

★Note

One thing to keep in mind: REGEX functions only work in Excel 365. If you share your workbook with someone on Excel 2021 or earlier, the REGEX formulas will show errors on their end. For shared workbooks where compatibility matters, stick with the traditional functions — or convert your REGEX results to static values before sharing (Copy ➜ Paste Values).

Frequently Asked Questions

Do REGEX functions work in Excel 2021 or Excel 2019? +

No. REGEXTEST, REGEXEXTRACT, and REGEXREPLACE are only available in Microsoft 365 (Excel for Windows, Mac, and Web). They are not available in Excel 2021, 2019, or any earlier version. If you're on an older version and need regex, your only option is VBA with the Microsoft VBScript Regular Expressions 5.5 library — but that's a completely different setup.

What regex flavor does Excel use? +

Excel uses PCRE2 (Perl Compatible Regular Expressions, version 2). This is the same flavor used by many popular tools and programming languages. When testing your patterns on sites like regex101.com, make sure to select PCRE2 from the flavor dropdown on the left — this ensures your pattern behaves exactly the same way in Excel.

Can I use REGEX in Google Sheets too? +

Yes, Google Sheets has had regex functions for years — REGEXMATCH, REGEXEXTRACT, and REGEXREPLACE. The function names and syntax are slightly different from Excel's, but the regex patterns themselves are cross-compatible. So if you learn a pattern for Excel, you can use the same pattern in Google Sheets — and vice versa. The biggest difference is that Google Sheets uses RE2 as its regex flavor, which doesn't support some advanced PCRE2 features like lookaheads and lookbehinds.

How do I make REGEX case-insensitive? +

Two ways. Option 1: Set the case_sensitivity argument to 1 in any of the three REGEX functions. For example, =REGEXTEST(A2,"hello",1) matches "hello", "Hello", and "HELLO". Option 2: Add (?i) at the start of your pattern. For example, =REGEXTEST(A2,"(?i)hello"). Use Option 2 when working with XLOOKUP or XMATCH (match_mode 3), since those functions don't have a separate case_sensitivity argument.

Why does REGEXEXTRACT return text instead of a number? +

REGEXEXTRACT always returns text values — this is by design. Even if it extracts "123", the result is the text string "123", not the number 123. To convert it to a number for calculations, wrap the result in the VALUE() function: =VALUE(REGEXEXTRACT(A2,"\d+")). This applies to all return modes.

Can I use REGEX in conditional formatting rules? +

Not directly — Excel's conditional formatting doesn't accept REGEX functions in its rule builder. But there's a workaround. Create a helper column with a REGEXTEST formula (which returns TRUE/FALSE), and then set your conditional formatting rule to reference that helper column. For example, put =REGEXTEST(A2,"^\d+$") in column B, then use conditional formatting with the formula =B2=TRUE to highlight rows.

What's the #SPILL error I keep getting with REGEXEXTRACT? +

This happens when you use return_mode 1 (all matches). REGEXEXTRACT spills multiple results into cells below the formula. When you drag the formula down for multiple rows, those spill areas overlap — and Excel throws a #SPILL error. The fix: Wrap the formula in TEXTJOIN to combine all matches into a single cell: =TEXTJOIN(", ",TRUE,REGEXEXTRACT(A2,"pattern",1)). You can also wrap the whole thing in IFERROR to handle cells with no matches.

Where can I test my regex pattern before using it in Excel? +

The best free tool is regex101.com. Make sure to set the flavor to PCRE2 (select it from the left sidebar). You can type your pattern and test text, and the site highlights matches in real time, explains each part of your pattern, and shows capture groups. It's the fastest way to debug a regex pattern before putting it into an Excel formula. You can also use the live tester we built earlier in this guide.

Can I use REGEX with XLOOKUP and XMATCH? +

Yes. Both XLOOKUP and XMATCH now support a match_mode of 3 for regex matching. Set the 5th argument of XLOOKUP (or the 3rd argument of XMATCH) to 3, and your lookup value is treated as a regex pattern. For example, =XLOOKUP("USA|United States",A2:A10,B2:B10,"Not found",3) matches either "USA" or "United States" in the lookup range. This is available in Microsoft 365 only.

Is there a performance issue with REGEX on large datasets? +

Regex is more computationally expensive than standard Excel functions like FIND or LEFT. On small to medium datasets (a few thousand rows), you won't notice any difference. On very large datasets (50,000+ rows), complex regex patterns can slow down calculation. To keep things fast: avoid whole-column references like A:A — always use specific ranges like A2:A5000. Convert your data to an Excel Table so structured references expand automatically. And avoid running the same complex regex pattern across dozens of cells if a helper column can do the job once.

In the End

That's everything you need to get started with REGEX in Excel. Three functions, REGEXTEST, REGEXEXTRACT, and REGEXREPLACE, and a handful of building blocks that you can combine to handle almost any text problem.

The patterns might look intimidating at first. But once you understand that every pattern is just a description of what the text looks like, character classes, quantifiers, anchors, it clicks. And once it clicks, you'll wonder how you ever lived without it.

Start small. Try REGEXTEST with a simple [0-9] pattern on your own data. Then move to REGEXEXTRACT. Then try a REGEXREPLACE. Build from there. You don't need to memorize anything, that's what the cheat sheet and pattern library in this guide are for.

Now I want to hear from you, which REGEX function are you going to try first? And what kind of messy data are you dealing with? Let me know in the comments below. And please, don't forget to share this guide with your friends, I'm sure they'll appreciate it.