Regex for HTML Tag

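This regex matches opening, closing, and self-closing HTML tags, with or without attributes, and captures the tag name in group 1. Attribute values can be double-quoted, single-quoted, or unquoted, because [^>]* accepts everything up to the next >. For the same reason, a literal > inside a quoted attribute value ends the match early. The name group accepts only letters and digits, so for a hyphenated custom element such as <my-element> it captures only my. The pattern is useful for simple HTML extraction and analysis, but regex should not be used for full HTML parsing, because real-world HTML is nested and often irregular.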

Pattern flags: g
<\/?([a-zA-Z][a-zA-Z0-9]*)\b[^>]*>

What is the regex pattern for HTML Tag?

The regex pattern for HTML Tag is <\/?([a-zA-Z][a-zA-Z0-9]*)\b[^>]*> with the g flag, so every tag in the input is matched rather than only the first. It is commonly used for HTML tag extraction and tag stripping.

Test Examples

Match
<div class="container">
Matches: <div class="container">
Match
</p>
Matches: </p>
Match
<img src='photo.jpg' />
Matches: <img src='photo.jpg' />
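The three test examples can be reproduced with a short script. This is a minimal sketch using Python's standard re module, which is an assumption since the page does not name a language. finditer stands in for the g flag, and the helper name match_tags is hypothetical.

```python
import re

# The page's pattern. The "\/" escape is only needed inside a JavaScript
# regex literal, so a plain "/" is used in this Python raw string.
HTML_TAG = re.compile(r"</?([a-zA-Z][a-zA-Z0-9]*)\b[^>]*>")

samples = ['<div class="container">', "</p>", "<img src='photo.jpg' />"]


def match_tags(text):
    """Return (full_match, tag_name) pairs for every tag found in text."""
    return [(m.group(0), m.group(1)) for m in HTML_TAG.finditer(text)]


results = [match_tags(s) for s in samples]
for sample, found in zip(samples, results):
    print(sample, "->", found)
```

Each sample yields exactly one match, and group 1 holds the bare tag name (div, p, img).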

Variations

Opening tags only

<([a-zA-Z][a-zA-Z0-9]*)\b[^>]*>

Drops the optional \/ so closing tags such as </p> are excluded. Self-closing tags such as <br /> still match.
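As a rough illustration of the opening-tags-only variation, the sketch below runs it over a small made-up snippet with Python's re module. The sample markup and variable names are assumptions.

```python
import re

# Opening-tags-only variation: there is no optional "/" after "<".
OPENING_TAG = re.compile(r"<([a-zA-Z][a-zA-Z0-9]*)\b[^>]*>")

html = '<ul id="nav"><li>Home</li><li>About</li></ul>'

# With a single capture group, findall returns just the captured tag names.
opening_names = OPENING_TAG.findall(html)
print(opening_names)  # ['ul', 'li', 'li']
```

The closing </li> and </ul> tags are skipped because the character after < must be a letter.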

Specific tag

<div\b[^>]*>(.*?)<\/div>

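Matches a div element together with its content, which is captured in group 1. The non-greedy (.*?) stops at the first </div>, so nested div elements are cut short, and without the s (dotAll) flag the dot does not match line breaks.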
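The behaviour of the specific-tag variation on flat and nested input can be sketched as follows. This assumes Python's re module, where re.DOTALL plays the role of the s flag. The sample strings are made up for illustration.

```python
import re

# DOTALL lets "." cross line breaks, so multi-line content is captured.
DIV_BLOCK = re.compile(r"<div\b[^>]*>(.*?)</div>", re.DOTALL)

# Flat (non-nested) divs: each element's content is captured cleanly.
flat = '<div class="a">one</div><div>two\nlines</div>'
flat_contents = DIV_BLOCK.findall(flat)
print(flat_contents)  # ['one', 'two\nlines']

# Nested divs: the lazy quantifier stops at the FIRST </div>, so the
# captured content is cut off inside the outer element.
nested = "<div>outer <div>inner</div> tail</div>"
nested_contents = DIV_BLOCK.findall(nested)
print(nested_contents)  # ['outer <div>inner']
```

The second result shows why this variation is only suitable when the target tag is known not to nest.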

Strip all tags

<[^>]+>

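Matches everything from a < to the next >, so that each match can be replaced with an empty string. Text that merely contains both characters, such as a comparison written with < and >, is removed as well.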
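A minimal sketch of tag stripping with this variation, assuming Python's re.sub. The helper name strip_tags is hypothetical.

```python
import re

ANY_TAG = re.compile(r"<[^>]+>")


def strip_tags(html):
    """Replace every <...> run with an empty string."""
    return ANY_TAG.sub("", html)


stripped = strip_tags("<p>Hello, <b>world</b>!</p>")
print(stripped)  # Hello, world!

# Plain text that happens to contain "<" and ">" is treated as a tag too.
comparison = strip_tags("if a < b and b > c")
print(repr(comparison))  # 'if a  c'
```

The second call shows the main caveat: the pattern has no notion of what a tag is, so it is best applied to input that is known to be markup.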

Frequently Asked Questions

Can I parse HTML with regex?

For simple tasks like tag extraction or stripping on known, well-formed input, regex works fine. For anything involving nested elements, attribute values, or malformed markup, use a proper HTML parser like DOMParser, cheerio, or Beautiful Soup. A pattern like this one cannot track nesting depth, so it cannot reliably pair an opening tag with its matching closing tag.
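To make the difference concrete, the sketch below feeds the same snippet to the regex and to a parser. It uses Python's standard-library html.parser in place of the parsers named above, purely to keep the example dependency-free. The class name TagCollector and the sample markup are assumptions.

```python
import re
from html.parser import HTMLParser

HTML_TAG = re.compile(r"</?([a-zA-Z][a-zA-Z0-9]*)\b[^>]*>")


class TagCollector(HTMLParser):
    """Record start tags (with attributes) and end tags as they are parsed."""

    def __init__(self):
        super().__init__()
        self.opened = []
        self.closed = []

    def handle_starttag(self, tag, attrs):
        self.opened.append((tag, dict(attrs)))

    def handle_endtag(self, tag):
        self.closed.append(tag)


snippet = '<a title="1 > 0" href="/x">link</a>'

# The regex stops at the ">" inside the quoted title attribute.
regex_first = HTML_TAG.search(snippet).group(0)
print(regex_first)  # <a title="1 >

# The parser understands quoting and returns the complete attribute set.
collector = TagCollector()
collector.feed(snippet)
collector.close()
print(collector.opened)
print(collector.closed)
```

The parser reports one a element with both attributes intact, while the regex match ends in the middle of the title value.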

Does this match self-closing tags?

Yes, tags like <br />, <img />, and <input /> are matched because [^>]* consumes everything after the tag name, including the trailing space and slash, up to the closing >.
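Because the trailing slash ends up inside the match rather than in its own group, telling self-closing tags apart takes an extra check. One possible sketch in Python follows. The classify helper and its string-based heuristic are assumptions, not part of the pattern itself.

```python
import re

HTML_TAG = re.compile(r"</?([a-zA-Z][a-zA-Z0-9]*)\b[^>]*>")


def classify(tag_text):
    """Label a single tag as closing, self-closing, or opening (heuristic)."""
    if HTML_TAG.fullmatch(tag_text) is None:
        return None
    if tag_text.startswith("</"):
        return "closing"
    if tag_text.endswith("/>"):
        return "self-closing"
    return "opening"


tags = ["<br />", "<img />", "<input />", "<br>", "</p>"]
kinds = [classify(t) for t in tags]
print(kinds)
```

The endswith check is only a heuristic: an unquoted attribute value that ends in a slash would also be labelled self-closing.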

Why is the tag name captured in a group?

The first capture group extracts the tag name (div, p, span, etc.), which is useful for filtering specific tags or building a tag frequency analysis.
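A tag frequency analysis built on the first capture group might look like the sketch below. It assumes Python's re module and collections.Counter, counts opening tags only, and lowercases names so that <p> and <P> are tallied together. The sample markup is made up.

```python
import re
from collections import Counter

HTML_TAG = re.compile(r"</?([a-zA-Z][a-zA-Z0-9]*)\b[^>]*>")

html = "<div><p>One</p><P>Two</P><br /></div>"

# Count each element once by skipping closing tags, and normalise case.
counts = Counter(
    m.group(1).lower()
    for m in HTML_TAG.finditer(html)
    if not m.group(0).startswith("</")
)
print(counts.most_common())
```

Filtering for one tag works the same way: compare m.group(1).lower() against the name of interest instead of counting.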

Related Patterns

Markdown Link

\[([^\]]+)\]\(([^)]+)\)

Markdown Heading

^(#{1,6})\s+(.+)$

URL

https?:\/\/(www\.)?[-a-zA-Z0-9@:%._+~#=]...