Supported Entity Types#

Review the available entity types that NeMo Safe Synthesizer can detect and redact in your data.

NeMo Safe Synthesizer uses machine learning models, regular expressions, and custom patterns to detect personally identifiable information (PII) and sensitive data.
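To make the regular-expression part of that detection stack concrete, here is a minimal sketch of pattern-based detection and redaction for three of the entity types on this page. The patterns, the `PATTERNS` table, and the `find_entities` and `redact` helpers are hypothetical and deliberately simplified; they are not the patterns or API that NeMo Safe Synthesizer uses.

```python
import re

# Hypothetical, simplified patterns for illustration only.
# These are NOT the patterns NeMo Safe Synthesizer ships with.
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "ipv4": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
}


def find_entities(text):
    """Return (entity_type, matched_text, start, end) tuples sorted by position."""
    hits = []
    for entity_type, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            hits.append((entity_type, match.group(), match.start(), match.end()))
    return sorted(hits, key=lambda hit: hit[2])


def redact(text):
    """Replace each detected span with an <entity_type> placeholder.

    Spans are replaced from right to left so earlier offsets stay valid.
    Overlapping matches are not handled in this sketch.
    """
    redacted = text
    for entity_type, _, start, end in reversed(find_entities(text)):
        redacted = redacted[:start] + f"<{entity_type}>" + redacted[end:]
    return redacted


sample = "Contact john@company.com from 192.168.1.1, SSN 123-45-6789"
print(redact(sample))
```

Regular expressions like these work well for rigidly formatted values, but they cannot recognize free-form entities such as names or occupations, which is where model-based detection comes in.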

Entity Names#

For best results, use the entity names shown in the tables below when configuring PII detection and replacement. These are the standardized names used across all configuration files, API calls, and transformations. The system has been fine-tuned on these entity types, although the PII replacement component will attempt to classify any arbitrary entity type you specify.
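Because standardized names give the best results, it can help to check a list of requested entity types before using it in a configuration. The sketch below is one way to do that, under stated assumptions: `SUPPORTED_ENTITIES` holds only a subset of the names on this page, and `normalize` and `split_entities` are hypothetical helpers, not part of any NeMo Safe Synthesizer API.

```python
# A subset of the standardized entity names listed on this page.
# Extend this set with the remaining names as needed.
SUPPORTED_ENTITIES = {
    "name", "first_name", "last_name",
    "email", "phone_number", "fax_number",
    "ssn", "credit_debit_card", "date_of_birth",
}


def normalize(entity_name):
    """Lowercase a name and convert spaces and hyphens to underscores."""
    return entity_name.strip().lower().replace(" ", "_").replace("-", "_")


def split_entities(requested):
    """Split requested names into (standardized, custom) lists.

    Standardized names match the documented entity types. Custom names
    are arbitrary types that the system was not fine-tuned on.
    """
    normalized = [normalize(name) for name in requested]
    standardized = [n for n in normalized if n in SUPPORTED_ENTITIES]
    custom = [n for n in normalized if n not in SUPPORTED_ENTITIES]
    return standardized, custom


standardized, custom = split_entities(["email", "Phone Number", "passport_number"])
print("standardized:", standardized)
print("custom:", custom)
```

Names that land in the `custom` list are not necessarily errors, since arbitrary entity types are accepted, but they are worth reviewing for typos such as `phone` in place of `phone_number`.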

Core Entity Types#

The following entity types are available by default:

Personal Identifiers#

| Entity Type | Description | Example Values |
| --- | --- | --- |
| name | Full person names | “John Smith”, “Maria García” |
| first_name | Given names | “John”, “Maria”, “Alice” |
| last_name | Family names | “Smith”, “García”, “Johnson” |

Contact Information#

| Entity Type | Description | Example Values |
| --- | --- | --- |
| email | Email addresses | “john@company.com”, “user@domain.org” |
| phone_number | Phone numbers | “555-123-4567”, “+1-800-555-0199” |
| fax_number | Fax numbers | “555-123-4568” |

Address Information#

| Entity Type | Description | Example Values |
| --- | --- | --- |
| address | Complete addresses | “123 Main St, Anytown, CA 90210” |
| street_address | Street addresses | “123 Main St”, “456 Oak Avenue” |
| city | City names | “New York”, “Los Angeles” |
| county | County names | “Harris”, “Maricopa”, “Orange” |
| state | State/province names | “California”, “NY” |
| postcode | Postal/ZIP codes | “90210”, “12345” |
| country | Country names | “United States”, “Canada” |

Government Identifiers#

| Entity Type | Description | Example Values |
| --- | --- | --- |
| ssn | Social Security Numbers | “123-45-6789” |
| national_id | National ID numbers | “AB123456C” |
| tax_id | Tax identification numbers | “12-3456789” |
| certificate_license_number | Certificate/license numbers | “LIC123456”, “CERT-789012” |

Financial Information#

| Entity Type | Description | Example Values |
| --- | --- | --- |
| credit_debit_card | Payment card numbers | “4111-1111-1111-1111” |
| cvv | Card verification values | “123”, “4567” |
| pin | Personal identification numbers | “1234”, “5678” |
| account_number | Bank account numbers | “1234567890” |
| bank_routing_number | Bank routing numbers | “123456789” |
| swift_bic | SWIFT/BIC codes | “CHASUS33”, “DEUTDEFF” |
| iban | International Bank Account Numbers | “GB29 NWBK 6016 1331 9268 19” |

Technical Identifiers#

| Entity Type | Description | Example Values |
| --- | --- | --- |
| url | Web URLs | “https://example.com” |
| ipv4 | IPv4 addresses | “192.168.1.1” |
| ipv6 | IPv6 addresses | “2001:db8::1” |
| mac_address | Hardware MAC addresses | “00:1B:44:11:3A:B7” |
| api_key | API keys and tokens | “sk_test_123abc…” |
| user_name | Usernames | “jsmith”, “user123” |
| password | Passwords | “MyP@ssw0rd!”, “secret123” |
| http_cookie | HTTP cookies | “sessionId=abc123” |
| device_identifier | Device IDs | “iPhone12,1”, “SM-G975F” |

Vehicle Identifiers#

| Entity Type | Description | Example Values |
| --- | --- | --- |
| vehicle_identifier | Vehicle identification numbers (VINs) | “1HGCM82633A123456” |
| license_plate | License plates | “ABC-1234”, “CA 1ABC123” |

Medical Information#

| Entity Type | Description | Example Values |
| --- | --- | --- |
| medical_record_number | Medical record numbers | “MRN123456”, “H123456789” |
| health_plan_beneficiary_number | Insurance IDs | “INS-123456789”, “BCBS-987654321” |
| biometric_identifier | Biometric data references | “FP-123456”, “DNA-SAMPLE-789” |

Geographic and Temporal Information#

| Entity Type | Description | Example Values |
| --- | --- | --- |
| latitude | Latitude coordinates | “37.7749”, “40.7128” |
| longitude | Longitude coordinates | “-122.4194”, “-74.0060” |
| coordinate | Coordinate pairs | “(37.7749, -122.4194)” |

Other Identifiers#

| Entity Type | Description | Example Values |
| --- | --- | --- |
| unique_identifier | Generic unique IDs | “ID123456”, “UUID-abc-def” |
| customer_id | Customer identifiers | “CUST001”, “C-123456” |
| employee_id | Employee identifiers | “EMP001”, “E-789012” |

Quasi-Identifiers#

Quasi-identifiers are attributes that may not directly identify individuals but can be combined with other data for identification:

| Entity Type | Description | Example Values |
| --- | --- | --- |
| date | Date values | “2023-01-15”, “01/15/2023” |
| date_time | Date and time values | “2023-01-15 14:30:00”, “01/15/2023 2:30 PM” |
| date_of_birth | Birth dates | “1985-03-15”, “March 15, 1985” |
| time | Time values | “14:30:00”, “2:30 PM” |
| age | Ages | “18”, “72” |
| blood_type | Blood type information | “A+”, “O-”, “AB+” |
| gender | Gender information | “male”, “female”, “non-binary” |
| sexuality | Sexual orientation | “heterosexual”, “gay”, “lesbian” |
| political_view | Political affiliations | “Democrat”, “Republican”, “Independent” |
| race_ethnicity | Race and ethnicity information | “Asian”, “Caucasian”, “Hispanic” |
| religious_belief | Religious affiliations | “Christian”, “Muslim”, “Jewish” |
| language | Language preferences | “English”, “Spanish”, “Mandarin” |
| education_level | Education level | “Bachelor’s Degree”, “High School”, “PhD” |
| occupation | Professional titles | “Software Engineer”, “Manager”, “Director” |
| employment_status | Employment information | “Full-time”, “Part-time”, “Unemployed” |
| company_name | Organization names | “ACME Corp”, “Tech Solutions Inc” |
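To illustrate why quasi-identifiers matter in combination, the following sketch counts how many records share each combination of quasi-identifier values. A smallest group size of 1 means at least one record is unique on those attributes and could be singled out. The records are made-up sample data, and `smallest_group_size` is a hypothetical helper, not a NeMo Safe Synthesizer function.

```python
from collections import Counter


def smallest_group_size(records, quasi_identifiers):
    """Return the size of the smallest group of records that share the
    same values for the given quasi-identifier columns (0 if no records)."""
    if not records:
        return 0
    groups = Counter(
        tuple(record.get(column) for column in quasi_identifiers)
        for record in records
    )
    return min(groups.values())


# Made-up sample data for illustration only.
records = [
    {"age": "34", "gender": "female", "occupation": "Manager"},
    {"age": "34", "gender": "male", "occupation": "Director"},
    {"age": "72", "gender": "female", "occupation": "Director"},
    {"age": "72", "gender": "male", "occupation": "Manager"},
]

# Each attribute alone is shared by two records...
print(smallest_group_size(records, ["age"]))
# ...but the combination of age and gender is unique to every record.
print(smallest_group_size(records, ["age", "gender"]))
```

In this sample, no single column isolates anyone, yet two columns together identify every record, which is why these attributes are worth detecting even though none is a direct identifier.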