Regular Expression Detection#
Regular expressions find data that follows specific patterns, like phone numbers (555-123-4567) or Social Security numbers (123-45-6789). They are fast, and they are accurate when the data is structured and consistently formatted.
When to Use Regex Detection#
Regex detection works best when:
You have structured data like CSV files or databases
Your data has consistent formatting patterns
You need fast processing of large datasets
You want to find specific patterns like phone numbers, credit cards, or Social Security numbers
Supported Patterns#
Regex detection automatically finds these types of structured data:
Personal Identifiers#
Social Security Numbers: 123-45-6789, 123456789
ZIP Codes: 12345, 12345-6789
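As an illustration of what patterns for the two formats above could look like, here is a minimal sketch. The names `SSN_RE` and `ZIP_RE` and the expressions themselves are assumptions made for this example; the patterns the detector actually uses are not shown in this document.

```python
import re

# Hypothetical patterns, written only to cover the formats listed above.
# SSN: either 3-2-4 digits separated by hyphens, or 9 digits in a row.
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b|\b\d{9}\b")
# ZIP: 5 digits, optionally followed by a hyphen and 4 more (ZIP+4).
ZIP_RE = re.compile(r"\b\d{5}(?:-\d{4})?\b")

print(SSN_RE.findall("123-45-6789 and 123456789"))
print(ZIP_RE.findall("12345 or 12345-6789"))
```

Note that a bare 9-digit run is ambiguous: it could be an SSN or any other 9-digit identifier, so a pattern like this one would likely need help from surrounding context to avoid false positives.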
Contact Information#
Email Addresses: user@example.com
US Phone Numbers: (555) 123-4567, 555-123-4567, 555.123.4567
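The contact formats above can be approximated with two short expressions. This is a simplified sketch, not the detector's own patterns: `EMAIL_RE` and `PHONE_RE` are hypothetical names, and the email expression deliberately covers only common addresses rather than everything the email standards allow.

```python
import re

# Hypothetical, simplified patterns for illustration.
EMAIL_RE = re.compile(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b")
# Area code either in parentheses "(555) " or followed by "-" or ".",
# then 3 digits, a separator, and 4 digits.
PHONE_RE = re.compile(r"(?:\(\d{3}\)\s?|\b\d{3}[-.])\d{3}[-.]\d{4}\b")

text = "Write to user@example.com or call (555) 123-4567, 555-123-4567 or 555.123.4567"
print(EMAIL_RE.findall(text))
print(PHONE_RE.findall(text))
```

The phone expression requires a 3-digit middle group, so a hyphenated SSN such as 123-45-6789 does not match it.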
Financial Data#
Credit Card Numbers: 4111-1111-1111-1111, 4111111111111111
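A digit pattern alone matches any 16-digit number, so card detectors commonly add a checksum test. The sketch below pairs a hypothetical pattern (`CARD_RE`) with the Luhn checksum. Whether this detector applies Luhn is an assumption; the document does not say. The pattern also covers only 16-digit numbers, matching the two examples above.

```python
import re

# Hypothetical pattern: four groups of 4 digits, hyphens optional.
CARD_RE = re.compile(r"\b\d{4}(?:-?\d{4}){3}\b")

def luhn_ok(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    total = 0
    # Walk from the rightmost digit, doubling every second one.
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0

def find_cards(text: str) -> list[str]:
    """Return pattern matches that also pass the checksum."""
    return [m.group() for m in CARD_RE.finditer(text) if luhn_ok(m.group())]

print(find_cards("Card 4111-1111-1111-1111 or 4111111111111111, not 4111111111111112"))
# prints ['4111-1111-1111-1111', '4111111111111111']
```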
Technical Identifiers#
IP Addresses (IPv4 and IPv6): 192.168.1.1, 2001:db8::1
URLs: https://example.com, http://website.org
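IPv6 addresses are awkward to describe with a single regular expression, so one practical option is to split the text into tokens and let Python's standard `ipaddress` module decide which tokens are valid. The sketch below takes that approach for IP addresses and uses a simple expression for URLs. It swaps in library validation for illustration and is not a description of how the detector itself works; `find_ips`, `find_urls`, and `URL_RE` are hypothetical names.

```python
import ipaddress
import re

# Hypothetical URL pattern: scheme followed by any non-whitespace run.
URL_RE = re.compile(r"https?://\S+")

def find_ips(text: str) -> list[str]:
    """Return tokens that parse as IPv4 or IPv6 addresses."""
    found = []
    for token in re.split(r"[\s,;]+", text):
        token = token.strip(".()[]")  # drop sentence punctuation and brackets
        try:
            ipaddress.ip_address(token)
        except ValueError:
            continue  # not a valid address (for example, an octet above 255)
        found.append(token)
    return found

def find_urls(text: str) -> list[str]:
    """Return URL matches with trailing punctuation removed."""
    return [m.group().rstrip(".,;:!?)") for m in URL_RE.finditer(text)]

print(find_ips("Hosts 192.168.1.1, 2001:db8::1 and 999.1.1.1."))
print(find_urls("See https://example.com, or http://website.org."))
```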
How Regex Detection Works#
Pattern Matching: Scans text using predefined regular expression patterns
Format Validation: Verifies that detected patterns meet format requirements
Context Filtering: Removes false positives based on surrounding context
Entity Classification: Assigns appropriate entity types to detected patterns
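The four stages above can be sketched as a small pipeline. Everything in this example is hypothetical: the `Entity` class, the two patterns, the SSN area-number rule used for format validation, and the keyword list used for context filtering are all assumptions chosen to make each stage concrete, not the detector's actual rules.

```python
import re
from dataclasses import dataclass

@dataclass
class Entity:
    text: str
    label: str
    start: int
    end: int

# Stage 1 input: hypothetical patterns keyed by entity type.
PATTERNS = {
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE_NUMBER": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
}

# Stage 3 input: assumed words that suggest a match is not personal data.
NEGATIVE_CONTEXT = ("order", "invoice", "tracking")

def is_valid_format(label: str, value: str) -> bool:
    """Stage 2: reject SSNs whose area number is 000, 666, or 900-999."""
    if label == "SSN":
        area = value[:3]
        return area not in ("000", "666") and not area.startswith("9")
    return True

def passes_context(text: str, start: int) -> bool:
    """Stage 3: inspect the 20 characters before the match."""
    window = text[max(0, start - 20):start].lower()
    return not any(word in window for word in NEGATIVE_CONTEXT)

def detect(text: str) -> list[Entity]:
    entities = []
    for label, pattern in PATTERNS.items():
        for m in pattern.finditer(text):                 # 1. pattern matching
            if not is_valid_format(label, m.group()):    # 2. format validation
                continue
            if not passes_context(text, m.start()):      # 3. context filtering
                continue
            # 4. entity classification: attach the entity type to the span
            entities.append(Entity(m.group(), label, m.start(), m.end()))
    return sorted(entities, key=lambda e: e.start)

for e in detect("SSN 123-45-6789, phone 555-123-4567"):
    print(e.label, e.text)
```

Keeping validation and context checks as separate functions means each rule can be tuned or replaced without touching the patterns themselves.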
Examples#
# Input CSV data
data = "John,john@email.com,555-123-4567,123-45-6789"
# Regex will detect:
# - "john@email.com" as EMAIL
# - "555-123-4567" as PHONE_NUMBER
# - "123-45-6789" as SSN
# Input text with multiple patterns
text = "Contact us at support@company.com or call (800) 555-0123. Our IP is 192.168.1.100."
# Regex will detect:
# - "support@company.com" as EMAIL
# - "(800) 555-0123" as PHONE_NUMBER
# - "192.168.1.100" as IP_ADDRESS
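For CSV input like the first example, one option is to parse the row and test each field as a whole, which avoids partial matches inside longer values. The sketch below is an assumption about how that could be done, using hypothetical names (`FIELD_PATTERNS`, `classify_row`) and simplified patterns; it is not the detector's API.

```python
import csv
import io
import re

# Hypothetical patterns; each must match an entire field.
FIELD_PATTERNS = {
    "EMAIL": re.compile(r"[^@\s]+@[^@\s]+\.[A-Za-z]{2,}"),
    "PHONE_NUMBER": re.compile(r"(?:\(\d{3}\)\s?|\d{3}[-.])\d{3}[-.]\d{4}"),
    "SSN": re.compile(r"\d{3}-\d{2}-\d{4}"),
}

def classify_row(line: str) -> list[tuple[str, str]]:
    """Parse one CSV line and label every field that fully matches a pattern."""
    row = next(csv.reader(io.StringIO(line)))
    results = []
    for field in row:
        value = field.strip()
        for label, pattern in FIELD_PATTERNS.items():
            if pattern.fullmatch(value):
                results.append((value, label))
                break  # first matching type wins
    return results

data = "John,john@email.com,555-123-4567,123-45-6789"
print(classify_row(data))
# prints [('john@email.com', 'EMAIL'), ('555-123-4567', 'PHONE_NUMBER'), ('123-45-6789', 'SSN')]
```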