What Is a Regular Expression? Regex Explained with Examples

Updated May 5, 2026 · 10 min read

Written by Muhammad Abdullah Rauf · Founder, EverydayTools.pro

A regular expression (regex) is a pattern that describes a set of strings. Regexes are used in most programming languages for search, validation, and text extraction.

If you have ever written code that checks whether an email address looks valid, extracted all phone numbers from a document, or replaced every occurrence of a word in a file — you were doing what regex does, just manually. A regular expression lets you write that logic as a compact pattern that the language engine executes for you.

Regex support is built into JavaScript, Python, Java, and Ruby, ships in Go's standard library, and is available for Rust (through the regex crate) and most SQL databases. Once you understand the syntax, you carry that skill across nearly every tool and language you use.

Try it now

Test every example in this article with the free regex tester — real-time highlighting, capture group breakdown, and flag controls.


What Is a Regular Expression?

A regular expression is a sequence of characters that forms a search pattern. The pattern can be used to:

  • Test — does this string match the pattern?
  • Search — find all substrings matching the pattern
  • Extract — pull out specific parts of a match (capture groups)
  • Replace — substitute matching text with something else
  • Split — break a string wherever the pattern matches
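The five operations above can be sketched with Python's standard re module. The log line and variable names are made up for illustration; only the re functions themselves are real.

```python
import re

# Hypothetical sample input
log_line = "2026-05-05 ERROR disk full; 2026-05-06 WARN disk at 91%"
date_pattern = re.compile(r"(\d{4})-(\d{2})-(\d{2})")

# Test: does the string contain a match at all?
has_date = date_pattern.search(log_line) is not None

# Search: every non-overlapping match, as full matched strings
all_dates = [m.group(0) for m in date_pattern.finditer(log_line)]

# Extract: capture groups from the first match
year, month, day = date_pattern.search(log_line).groups()

# Replace: reorder the captured parts with backreferences
reordered = date_pattern.sub(r"\3/\2/\1", log_line)

# Split: break the string on a semicolon plus optional whitespace
entries = re.split(r";\s*", log_line)

print(has_date, all_dates, (year, month, day))
print(reordered)
print(entries)
```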

In JavaScript, Perl, and Ruby, a regex literal is written between two forward slashes: /pattern/flags. Most other languages take the pattern as a string. In Python, for example, you pass a raw string to the re functions: re.search(r"pattern", text).

// Match a 4-digit year anywhere in a string
/\d{4}/

// Match a full ISO 8601 date (YYYY-MM-DD) like 2026-05-05
/^\d{4}-(?:0[1-9]|1[0-2])-(?:0[1-9]|[12]\d|3[01])$/

Regex Syntax: Quick-Reference Cheat Sheet

These are the building blocks of every regex pattern. Learn these and you can read or write the vast majority of regular expressions you will encounter.

Token    Meaning                                         Example                 Matches
.        Any character except newline                    h.t                     hat, hot, hit
^        Start of string (or line in multiline mode)     ^Hello                  "Hello world" but not "Say Hello"
$        End of string (or line in multiline mode)       world$                  "Hello world" but not "worldwide"
*        0 or more of the preceding token                go*gle                  ggle, gogle, google, gooogle
+        1 or more of the preceding token                go+gle                  gogle, google, but not ggle
?        0 or 1 of the preceding token (optional)        colou?r                 color, colour
{n}      Exactly n repetitions                           \d{4}                   2026, 1999, 0000
{n,m}    Between n and m repetitions                     \d{2,4}                 12, 123, 1234
[abc]    Character class: matches a, b, or c             [aeiou]                 any lowercase vowel
[^abc]   Negated class: matches anything except a, b, c  [^0-9]                  any non-digit
[a-z]    Range: matches any lowercase letter             [A-Z]                   any uppercase letter
(abc)    Capturing group                                 (\d{4})-(\d{2})         captures year and month separately
(?:abc)  Non-capturing group                             (?:www\.)?example\.com  example.com or www.example.com
a|b      Alternation: matches a or b                     cat|dog                 cat or dog
\d       Any digit [0-9]                                 \d+                     42, 100, 9
\D       Any non-digit                                   \D+                     hello, abc
\w       Word character [a-zA-Z0-9_]                     \w+                     hello, foo_bar, test123
\W       Non-word character                              \W+                     spaces, punctuation
\s       Whitespace (space, tab, newline)                \s+                     spaces between words
\S       Non-whitespace character                        \S+                     any continuous word/number
\b       Word boundary                                   \bcat\b                 "cat" in "the cat sat" but not "concatenate"

Regex Flags

Flags modify how the engine interprets the pattern. They go after the closing slash in JavaScript (/pattern/gi) or in the flags argument in Python (re.findall(pattern, text, re.I)).

Flag  Name              Effect
g     Global            Find all matches, not just the first (JavaScript; in Python use re.findall or re.finditer)
i     Case-insensitive  Match regardless of upper/lowercase
m     Multiline         ^ and $ match start/end of each line, not the whole string
s     Dotall            . matches newline characters too (not in all flavors)
u     Unicode           Enable full Unicode matching (recommended in JS)
x     Verbose (Python)  Allow whitespace and comments inside pattern for readability
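As a rough illustration of the flags above in Python, the sketch below uses re.I, re.M, re.S, and re.X on a made-up three-line string. Python has no g flag; re.findall already returns every match. The sample text and the time pattern are assumptions chosen for the example.

```python
import re

# Hypothetical sample input with three lines
text = "Error: disk full\nerror: retrying\nOK"

# re.I: case-insensitive, so both "Error" and "error" match
any_case = re.findall(r"error", text, re.I)

# re.M: ^ matches at the start of every line, not only the start of the string
line_starts = re.findall(r"^\w+", text, re.M)

# re.S: "." also matches "\n", so ".+" can run across the line break
across_lines = re.search(r"disk.+retrying", text, re.S) is not None
single_line = re.search(r"disk.+retrying", text) is not None

# re.X: whitespace and "#" comments inside the pattern are ignored
time_pattern = re.compile(
    r"""
    (?P<hour>[01]\d|2[0-3])  # hour, 00-23
    :
    (?P<minute>[0-5]\d)      # minute, 00-59
    """,
    re.X,
)

print(any_case, line_starts, across_lines, single_line)
print(time_pattern.search("at 14:30").groupdict())
```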

8 Commonly Used Regex Patterns

These patterns cover the validation tasks developers reach for most often. Copy any pattern into the regex tester to see live matches and explanations.

Email address

/^[^\s@]+@[^\s@]+\.[^\s@]+$/

Matches: strings shaped like local@domain.tld, such as user@example.com

Catches most invalid formats. Not RFC 5322 complete — use a library for strict validation.

US phone number

/^(\+1)?[\s.-]?\(?\d{3}\)?[\s.-]?\d{3}[\s.-]?\d{4}$/

Matches: (555) 867-5309, 555.867.5309, +1 555 867 5309

Flexible formatting. Adjust for international numbers.

URL

/^https?:\/\/[\w\-.]+(:\d+)?(\/[\w\-./?%&=]*)?$/

Matches: http and https URLs with an optional port and path

Does not handle all edge cases — use URL() constructor for robust parsing.

IPv4 address

/^(\d{1,3}\.){3}\d{1,3}$/

Matches: 192.168.1.1, 0.0.0.0, 255.255.255.255

Checks only the shape, so it also accepts out-of-range values such as 999.999.999.999. Add a bounds check for each octet (0–255) in code logic.
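One way to add that octet bounds check is sketched below in Python. The function name is_valid_ipv4 is hypothetical, and the sketch assumes leading zeros such as 01.2.3.4 are acceptable; tighten it if your use case disagrees.

```python
import re

# Shape check only: four groups of 1-3 ASCII digits separated by dots.
# re.ASCII keeps \d from matching non-ASCII digit characters.
IPV4_SHAPE = re.compile(r"(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})", re.ASCII)

def is_valid_ipv4(candidate: str) -> bool:
    """Return True when candidate has the IPv4 shape and every octet is 0-255."""
    match = IPV4_SHAPE.fullmatch(candidate)
    if match is None:
        return False
    # The regex guarantees digits, so int() is safe here
    return all(0 <= int(octet) <= 255 for octet in match.groups())

print(is_valid_ipv4("192.168.1.1"))
print(is_valid_ipv4("999.999.999.999"))
```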

Hex color code

/^#([A-Fa-f0-9]{6}|[A-Fa-f0-9]{3})$/

Matches: #FFF, #ffffff, #1a2b3c

Covers both 3-digit shorthand and 6-digit hex.

Date (YYYY-MM-DD)

/^\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])$/

Matches: 2026-05-05, 1999-12-31

Validates month and day ranges independently, so it still accepts impossible dates such as 2026-02-31 and does not check leap years. Validate those in code.
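A minimal sketch of that in-code validation, assuming Python and its standard datetime module: the regex checks the shape, and date() rejects impossible calendar days. The helper name parse_iso_date is made up for this example.

```python
import re
from datetime import date
from typing import Optional

# Shape check only: YYYY-MM-DD with ASCII digits
ISO_DATE = re.compile(r"(\d{4})-(\d{2})-(\d{2})", re.ASCII)

def parse_iso_date(candidate: str) -> Optional[date]:
    """Return a date for a real calendar day, or None if the input is invalid."""
    match = ISO_DATE.fullmatch(candidate)
    if match is None:
        return None
    year, month, day = (int(part) for part in match.groups())
    try:
        # date() raises ValueError for days that do not exist, e.g. Feb 30
        return date(year, month, day)
    except ValueError:
        return None

print(parse_iso_date("2024-02-29"))
print(parse_iso_date("2026-02-29"))
```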

Slug (URL-friendly)

/^[a-z0-9]+(?:-[a-z0-9]+)*$/

Matches: my-blog-post, tool-name-2026

Lowercase letters, digits, hyphens only. No leading/trailing hyphens.

Password strength

/^(?=.*[A-Z])(?=.*[a-z])(?=.*\d)(?=.*[\W_]).{8,}$/

Matches: strings of at least 8 characters containing an uppercase letter, a lowercase letter, a digit, and a special character

Uses lookaheads. Adjust minimum length and requirements as needed.
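Each lookahead in the pattern asserts one requirement without consuming any characters, which is why four of them can be stacked at the start. The Python sketch below runs the same pattern and, as an alternative that is easier to adjust, one small check per rule so the caller can report what is missing. The RULES table and missing_requirements helper are hypothetical names.

```python
import re

PASSWORD = re.compile(r"^(?=.*[A-Z])(?=.*[a-z])(?=.*\d)(?=.*[\W_]).{8,}$")

# The same requirements, one readable check per rule
RULES = {
    "uppercase letter": re.compile(r"[A-Z]"),
    "lowercase letter": re.compile(r"[a-z]"),
    "digit": re.compile(r"\d"),
    "special character": re.compile(r"[\W_]"),
}

def missing_requirements(password: str) -> list:
    """List the rules a password fails; an empty list means it passes."""
    missing = [name for name, rule in RULES.items() if not rule.search(password)]
    if len(password) < 8:
        missing.append("at least 8 characters")
    return missing

print(PASSWORD.search("Sup3r$ecret") is not None)
print(missing_requirements("password"))
```

Splitting the rules costs a few more lines but makes it straightforward to tell the user which requirement failed.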

Code Examples in JavaScript, Python, and TypeScript

The regex syntax is nearly identical across languages. The main differences are in how you call the function — not the pattern itself.

JavaScript
// Test if a string matches a pattern
const email = "user@example.com";
const emailRegex = /^[^\s@]+@[^\s@]+\.[^\s@]+$/;
console.log(emailRegex.test(email)); // true

// Extract all matches
const text = "Call 555-1234 or 555-5678 for support";
const phones = text.match(/\d{3}-\d{4}/g);
console.log(phones); // ["555-1234", "555-5678"]

// Replace with a function
const result = "2026-05-05".replace(
  /(\d{4})-(\d{2})-(\d{2})/,
  (_, y, m, d) => `${d}/${m}/${y}`
);
console.log(result); // "05/05/2026"
Python
import re

# Search for a pattern
text = "Order placed on 2026-05-05 at 14:30"
match = re.search(r"\d{4}-\d{2}-\d{2}", text)
if match:
    print(match.group())  # "2026-05-05"

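# Find all matches (the addresses are placeholder examples)
contacts = "Write to ana@example.com or bo@example.org"
emails = re.findall(r"[^\s@]+@[^\s@]+\.[^\s@]+", contacts)
print(emails)  # ['ana@example.com', 'bo@example.org']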

# Replace using regex
clean = re.sub(r"\s+", " ", "too   many    spaces")
print(clean)  # "too many spaces"

# Compile for reuse (faster when used many times)
pattern = re.compile(r"\d{3}-\d{4}")
print(pattern.findall("555-1234 and 555-5678"))
TypeScript
// Named capture groups (ES2018+)
const dateRegex = /(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/;
const match = "2026-05-05".match(dateRegex);

if (match?.groups) {
  const { year, month, day } = match.groups;
  console.log(year, month, day); // "2026" "05" "05"
}

// Validate input before processing
function isValidSlug(s: string): boolean {
  return /^[a-z0-9]+(?:-[a-z0-9]+)*$/.test(s);
}

4 Common Regex Pitfalls

Most regex bugs fall into a small number of categories. Knowing these saves hours of debugging.

1. Catastrophic backtracking

Patterns like (a+)+ on a long non-matching string can hang the engine for seconds or minutes. Avoid nested quantifiers on overlapping patterns. Use atomic groups or possessive quantifiers when available.

// Dangerous: (a+)+ on "aaaaaaaaaaaab"
// Safe alternative: use non-backtracking possessive quantifier or atomic group
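To see the effect on a small scale, the Python sketch below times a failed match for the nested pattern and for a flat rewrite that accepts the same strings. The input lengths are kept short on purpose; on a backtracking engine the nested version is expected to slow down sharply as the run of "a" characters grows, while the flat one should stay fast. Exact timings depend on the machine, so none are shown. The helper name seconds_to_reject is hypothetical.

```python
import re
import time

nested = re.compile(r"^(a+)+$")  # nested quantifiers: many ways to split a run of "a"s
flat = re.compile(r"^a+$")       # accepts the same strings, with only one way to match

def seconds_to_reject(pattern, length):
    """Time a failed match against 'a' * length + 'b'."""
    subject = "a" * length + "b"
    start = time.perf_counter()
    result = pattern.search(subject)
    elapsed = time.perf_counter() - start
    assert result is None  # the trailing "b" means neither pattern can match
    return elapsed

# Keep lengths small: the nested pattern's work grows very quickly
for length in (8, 12, 16):
    print(length, seconds_to_reject(nested, length), seconds_to_reject(flat, length))
```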

2. Forgetting to escape special characters

Characters . * + ? ^ $ { } [ ] | ( ) \ have special meaning. To match a literal dot in a file extension, write \. — not just .

// Wrong: /image.png/  — the dot matches ANY character
// Correct: /image\.png/
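When the literal text comes from a variable, escaping by hand is error-prone. In Python, re.escape does it for you; the sketch below compares the escaped and unescaped versions of the same file name.

```python
import re

filename = "image.png"

unescaped = re.compile(filename)           # "." matches any character
escaped = re.compile(re.escape(filename))  # "." matches only a literal dot

print(bool(unescaped.fullmatch("imageXpng")))  # True
print(bool(escaped.fullmatch("imageXpng")))    # False
print(bool(escaped.fullmatch("image.png")))    # True
print(re.escape(filename))                     # image\.png
```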

3. Anchoring only one end

Use both ^ and $ for full-string validation. Without them, a pattern like /\d{4}/ matches "abc1234xyz" because it finds 4 digits anywhere inside. Anchoring only one end is not enough either: /^\d{4}/ still accepts "1234xyz".

// Wrong: /\d{4}/.test("abc1234xyz") // true — found 4 digits inside
// Correct: /^\d{4}$/.test("1234")     // true — entire string is 4 digits
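The same pitfall in Python, as a short sketch: search() looks anywhere, match() anchors only the start, and fullmatch() requires the whole string to match. It also shows a Python-specific detail: without re.M, $ still matches just before a single trailing newline, so fullmatch() is the stricter choice for validation.

```python
import re

four_digits = re.compile(r"\d{4}")

# search(): finds four digits anywhere in the string
contains = four_digits.search("abc1234xyz") is not None

# match(): anchored at the start only, so trailing text is accepted
prefix_only = four_digits.match("1234xyz") is not None

# fullmatch(): the entire string must be four digits
whole = four_digits.fullmatch("1234xyz") is not None

# "$" also matches just before a final newline; fullmatch() does not allow it
dollar_with_newline = re.search(r"^\d{4}$", "1234\n") is not None
strict_with_newline = four_digits.fullmatch("1234\n") is not None

print(contains, prefix_only, whole, dollar_with_newline, strict_with_newline)
```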

4. Assuming regex dialects are the same

JavaScript, Python, Java, PCRE, and .NET each have dialect differences. Lookbehinds with variable length, atomic groups, and possessive quantifiers are not universally supported. Always test in the target language.

// JS supports lookbehind in modern engines: /(?<=@)\w+/
// But older JS engines don't — check compatibility
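Python's re module is a useful contrast here: it supports lookbehind, but only when the lookbehind has a fixed width. The sketch below runs the /(?<=@)\w+/ example from above and then shows a variable-length lookbehind being rejected when the pattern is compiled.

```python
import re

# Fixed-width lookbehind: allowed in Python's re module
domain_label = re.search(r"(?<=@)\w+", "user@example.com").group()

# Variable-width lookbehind: rejected at compile time by re
try:
    re.compile(r"(?<=a+)b")
    variable_width_accepted = True
except re.error:
    variable_width_accepted = False

print(domain_label, variable_width_accepted)
```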

When to Use Regex (and When Not To)

Good use cases for regex

  • ✓ Validating input format (email, phone, zip code)
  • ✓ Extracting tokens from log lines
  • ✓ Find-and-replace in text editors
  • ✓ Parsing flat structured data (key=value pairs, simple CSV fields without quoting)
  • ✓ Searching large text files for patterns
  • ✓ Checking if a string matches a known format

Avoid regex for these

  • ✗ Parsing HTML (use DOMParser or cheerio)
  • ✗ Parsing JSON (use JSON.parse())
  • ✗ Parsing nested structures (use a proper parser)
  • ✗ Complex date arithmetic (use date-fns or Temporal)
  • ✗ Full RFC 5322 email validation
  • ✗ Security-critical input sanitization on its own

Regex Flavors: Language Differences

Most regex patterns are portable across languages, but there are dialect differences worth knowing:

Feature                 JavaScript        Python (re)                                       Java / PCRE
Lookbehind              ✓ (ES2018+)       ✓ (fixed width only)                              ✓
Named groups            ✓ (?<name>...)    ✓ (?P<name>...)                                   ✓ (?<name>...)
Atomic groups           ✗                 ✓ in 3.11+ (earlier versions: use regex module)   ✓
Possessive quantifiers  ✗                 ✓ in 3.11+ (earlier versions: use regex module)   ✓
Verbose mode            ✗                 ✓ re.X                                            ✓
Multiline default       No                No                                                No
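The named-group row is the difference people hit most often when porting a pattern. As a small Python sketch: the (?P<name>...) spelling works, while the JavaScript spelling (?<name>...) is rejected by the re module when the pattern is compiled.

```python
import re

# Python spelling of named groups
date_pattern = re.compile(r"(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})")
parts = date_pattern.search("Released 2026-05-05").groupdict()

# The JavaScript spelling fails to compile in Python's re module
try:
    re.compile(r"(?<year>\d{4})")
    js_spelling_accepted = True
except re.error:
    js_spelling_accepted = False

print(parts, js_spelling_accepted)
```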


Frequently Asked Questions

What does regex stand for?

Regex is short for regular expression. The term comes from formal language theory — a regular expression describes a regular language (a specific class of patterns that can be matched by a finite automaton). In practice, modern regex engines (like PCRE and JavaScript) support features beyond formal regular languages, but the name stuck.

Is regex case-sensitive by default?

Yes. /hello/ does not match "Hello" unless you add the case-insensitive flag. In JavaScript use /hello/i, and in Python pass re.IGNORECASE (or re.I). In SQL the mechanism depends on the database; PostgreSQL, for example, uses the ~* operator for case-insensitive regex matching.

What is the difference between .match() and .test() in JavaScript?

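regex.test(string) returns a boolean: true if the pattern matches, false otherwise. It's fast for simple checks. string.match(regex) returns null when nothing matches. Otherwise it returns the first match with its capture groups or, when the g flag is set, an array of all matched substrings without capture groups. Use .test() for validation; use .match(), .matchAll(), or .exec() when you need the matched text or capture groups.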

What does the \b word boundary mean?

\b asserts a position where a word character (\w) meets a non-word character (\W), or the start/end of the string. /\bcat\b/ matches the word "cat" in "the cat sat" but does NOT match "cat" inside "concatenate". It's a zero-width assertion — it matches a position, not a character.
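A quick Python check of that behavior is sketched below. Note the last case: because the underscore counts as a word character, there is no boundary between "cat" and "_1".

```python
import re

word = re.compile(r"\bcat\b")

standalone = word.findall("the cat sat")           # whole word
inside_word = word.search("concatenate")           # "cat" is surrounded by letters
with_apostrophe = word.search("the cat's toy")     # "'" is a non-word character
with_underscore = word.search("cat_1")             # "_" is a word character

print(standalone, inside_word, with_apostrophe is not None, with_underscore)
```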

Why does my regex match too much (greedy matching)?

By default, quantifiers like * and + are greedy: they match as many characters as possible. /<.+>/ on "<b>bold</b>" matches the entire string, not just <b>. Make the quantifier lazy by adding a ? after it, as in /<.+?>/. This tells the engine to match as few characters as possible while still satisfying the pattern.
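The difference is easy to confirm in Python. The sketch below also includes a third option that often works well for this kind of input: a negated character class, which cannot run past the closing bracket in the first place.

```python
import re

html = "<b>bold</b>"

greedy = re.search(r"<.+>", html).group()  # "<b>bold</b>"
lazy = re.search(r"<.+?>", html).group()   # "<b>"
negated = re.findall(r"<[^>]+>", html)     # ["<b>", "</b>"]

print(greedy, lazy, negated)
```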

Should I use regex for parsing HTML or JSON?

No. HTML and JSON are not regular languages — they have nested, recursive structure that regex cannot reliably handle. Use an HTML parser (like DOMParser in the browser, BeautifulSoup in Python) or JSON.parse() instead. Regex is great for flat patterns like emails, phone numbers, and log lines.
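As a minimal sketch of the parser route in Python, the example below uses only the standard library: html.parser in place of BeautifulSoup (which is a third-party package) and json.loads as the counterpart of JSON.parse(). The LinkCollector class and the sample markup are made up for illustration.

```python
import json
from html.parser import HTMLParser

class LinkCollector(HTMLParser):
    """Collect href values from <a> tags, however deeply they are nested."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.links.extend(value for name, value in attrs if name == "href")

collector = LinkCollector()
collector.feed('<div><a href="/docs">Docs <b>v2</b></a><a href="/faq">FAQ</a></div>')

# JSON: let the parser deal with nesting, escapes, and types
settings = json.loads('{"retries": 3, "hosts": ["a", "b"]}')

print(collector.links)
print(settings["hosts"])
```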
