Olivier van der Toorn

Below are the regular expressions we have used to categorize TXT records in the
paper “On the Pitfalls of Finding Security Issues in DNS TXT Records”. This
paper is currently under submission.

A label can have multiple regular expressions attached to it; the count for a
label is the sum of the number of records matched by each of its expressions.
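
The counting rule can be illustrated with a minimal Python sketch. It assumes that IREGEXP behaves like a case-insensitive, unanchored regular-expression search, which Python's re.search with re.IGNORECASE approximates. The helper names (iregexp, label_counts) and the sample records are hypothetical and not part of the paper's tooling. Under a literal reading of the rule, a record that matches two expressions of the same label is counted twice.

```python
import re

# A few expressions from the table, grouped per label (illustrative subset).
LABEL_PATTERNS = {
    "Mail Keywords": [
        r'.*(mailconf=|forward-email=|autodiscover[.]|[.]_domainkey[".]).*',
        r'^"Alt e-mail retur med fejl',
    ],
    "SenderID": [r'.*spf2[.]0.*'],
}


def iregexp(txt_text, pattern):
    """Assumed stand-in for Impala's IREGEXP: case-insensitive search."""
    return re.search(pattern, txt_text, re.IGNORECASE) is not None


def label_counts(records):
    """Sum, per label, the number of records matched by each expression."""
    counts = {}
    for label, patterns in LABEL_PATTERNS.items():
        counts[label] = sum(
            sum(1 for record in records if iregexp(record, pattern))
            for pattern in patterns
        )
    return counts


# Hypothetical TXT records, kept with their surrounding double quotes
# because many expressions in the table anchor on ^" and "$.
records = [
    '"mailconf=https://example.com/autoconfig"',
    '"Alt e-mail retur med fejl"',
    '"spf2.0/pra include:example.com -all"',
    '"v=spf1 -all"',
]
counts = label_counts(records)
print(counts)
```
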

The regular expressions are in a form that can be copied and pasted directly
into OpenINTEL’s Impala interface. The txt_text column holds the TXT record of
a domain, if any.

The “Unclassified” category is omitted from the table, since records are
counted as “unclassified” only when they match none of the regular expressions.
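
A minimal Python sketch of this fallback, under the assumption that IREGEXP can be approximated by a case-insensitive re.search. The classify function and the two-entry pattern table are illustrative only; the full set of expressions is given in the table below.

```python
import re

# Illustrative subset of the table; the real list is much longer.
PATTERNS = {
    "SenderID": [r'.*spf2[.]0.*'],
    "Empty": [r'^""$'],
}


def classify(txt_text):
    """Return every label with a matching expression, else 'Unclassified'."""
    labels = [
        label
        for label, patterns in PATTERNS.items()
        if any(re.search(p, txt_text, re.IGNORECASE) for p in patterns)
    ]
    # A record is unclassified only when no expression matched at all.
    return labels or ["Unclassified"]


print(classify('""'))
print(classify('"SPF2.0/pra"'))
print(classify('"hello world"'))
```
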

Category Label Comment Regular Expression
Email SenderID txt_text IREGEXP('.*spf2[.]0.*')
Email SPF txt_text IREGEXP('.*(spf[v13.: "=-]|redirect=_spf[.]yandex[.]net|include:servers[.]mcsv[.]net).*')
Email DMARC txt_text IREGEXP('.*(_dmarc|p=|v=DMARC1).*')
Email DKIM txt_text IREGEXP('^.*(v[=-]dkim(1){0,1}[ ;"])|(k=rsa[ ;"])|(o=[-~])|(a mx [-~]all)|(dkim=(all){0,1}).*')
Email Mail Keywords txt_text IREGEXP('.*(mailconf=|forward-email=|autodiscover[.]|[.]_domainkey[".]).*')
Email Mail Keywords txt_text IREGEXP('^"Alt e-mail retur med fejl')
Verification Verification Keywords txt_text IREGEXP('.*(verification|validation|verify|certification|enroll).*')
Verification Verification Keywords txt_text IREGEXP('^"(uvmid=|wp-noop://|heroku-rileygrey)')
Verification Verification Keywords txt_text IREGEXP('.*(domain=|ms=|v=msv1|mscid|idcf-dns-token=|zoho[.]_domain|m1-shop@|emktownership=|$id: |amazonses| TXT |bhosted).*')
Verification Verification Keywords txt_text IREGEXP('.*(sendinblue-code:|(v=){0,1}ppe-[a-z0-9]{20}|ppkey-|mailru|flexbe([.]com){0,1}:|pax8validate|inst=|loaderio=|This is a Vistaprint website).*')
Verification Verification Keywords txt_text IREGEXP('^"shopify"$')
Patterns Pattern Keywords IPv4 addresses txt_text IREGEXP('^"[0-9]{1,4}[.][0-9]{1,4}[.][0-9]{1,4}[.][0-9]{1,4}"$')
Patterns Pattern Keywords Dates txt_text IREGEXP('^"[\t\n\r ]*20[0-9]{10}')
Patterns Pattern Keywords txt_text IREGEXP('.*(bio=[a-z0-9]{40}|i=).*')
Patterns Pattern Keywords txt_text IREGEXP('.*[0-9a-z]\|.*') AND NOT txt_text IREGEXP('.*[.]exe.*')
Patterns Pattern Keywords txt_text IREGEXP('.*DYNAMIC [0-9]{1,4}[.][0-9]{1,4}[.][0-9]{1,4}[.][0-9]{1,4}.*')
Patterns Pattern Keywords txt_text IREGEXP('.*(NETORGFT|EXPIRED|ALIAS for|@a99|Tel:|Fax:|Email:|0eamtc|U0P1O89566).*')
Patterns Pattern Keywords Begin txt_text IREGEXP('^"(mgrtl|S0J0A87263)')
Patterns Pattern Keywords Begin txt_text IREGEXP('^"[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}')
Patterns Pattern Keywords Begin txt_text IREGEXP('^"[a-z0-9]{4}_[a-z0-9]{3}_[a-z0-9]{3,4}_.*')
Encoded Base N txt_text IREGEXP('^"[a-z0-9]{8}"$') AND NOT txt_text IREGEXP('^"MAl.*')
Encoded Base N txt_text IREGEXP('^"[a-z0-9]{10}"$') AND NOT txt_text IREGEXP('^"MAl.*')
Encoded Base N txt_text IREGEXP('^"[a-z0-9]{26}"$') AND NOT txt_text IREGEXP('^"MAl.*')
Encoded Base N txt_text IREGEXP('^"[a-z0-9]{27}"$') AND NOT txt_text IREGEXP('^"MAl.*')
Encoded Base N txt_text IREGEXP('^"[a-z0-9]{32}"$') AND NOT txt_text IREGEXP('^"MAl.*')
Encoded Base N txt_text IREGEXP('^"[a-z0-9=/+]{43}"$') AND NOT txt_text IREGEXP('^"MAl.*')
Encoded Base N txt_text IREGEXP('^"[a-z0-9=/+-]{44}"$') AND NOT txt_text IREGEXP('^"MAl.*')
Encoded Base N txt_text IREGEXP('^"[a-z0-9=/+-]{88}"$') AND NOT txt_text IREGEXP('^"MAl.*')
Encoded Base N txt_text IREGEXP('^"[a-z0-9]{32}') AND NOT txt_text IREGEXP('^"MAl.*')
Encoded Hash txt_text IREGEXP('^"[\t\n\r ]*ca3-')
Encoded Account-hash txt_text IREGEXP('ed6"$')
Crypto Coins OA1 OpenAlias records txt_text IREGEXP('.*oa1:.*')
Crypto Coins JWT JSON Web Token txt_text IREGEXP('^"[\t\n\r ]*eyJ')
Miscellaneous HTTP txt_text IREGEXP('^"p\|https')
Miscellaneous HTTP txt_text IREGEXP('.*http[a-z0-9.:/]+.(com|nl|net|org).*')
Miscellaneous Google txt_text IREGEXP('google.com"$')
Miscellaneous Google txt_text IREGEXP('^"google.*')
Miscellaneous Misc Keywords txt_text IREGEXP('^"{0,1}&quot')
Miscellaneous Misc Keywords txt_text IREGEXP('^"(test|3600|tsdomain)"$')
Miscellaneous Misc Keywords txt_text IREGEXP('^"[\t\n\r ]*(email=|Contact:|BDO|DBO|Posthuset|Pohjois-Karjalan Tietotekniikkakeskus Oy|ipc|skat|Digimediatoimisto Fox Oy|abss|suomen|Valtion Teknillinen|smfundet|rgin -|bpv kraatz oy|System:|Terveystalo|3x34|hello|onninen|coolsoft[.]name|Landsorganisationen i Danmark|[.]tlb|esnord|Territory and related)')
Miscellaneous Misc Keywords txt_text IREGEXP('^"(mail.|include:|hello|international|domain|last |hes=|dzc=|as=|www|mailkey=|main-host|Zone|client).*')
Miscellaneous Misc Keywords txt_text IREGEXP('.*(Survey of Finland|Second Level Domain Registry|vacan|bmandsskole).*')
Miscellaneous Misc Keywords txt_text IREGEXP('.*(docusign|citrix|firebase|pardot|bhosted|abuse|hosting|registered|disallow|hostmaster|gv-|www=|.tkw|.tsm|in-addr.arpa|ID Prof BV|_acme-challenge|@=).*')
Miscellaneous Hosting txt_text IREGEXP('^"Subdomains, mail addresses, and other services are available')
Miscellaneous Hosting txt_text IREGEXP('^"kasserver.com"')
Miscellaneous Hosting txt_text IREGEXP('.*(hosted by|hosted at|hostby|powered by|generated by|managed by).*')
Miscellaneous Hosting txt_text IREGEXP('.*DNSTool.*')
Miscellaneous Hosting txt_text IREGEXP('^"www.poutapilvi.fi')
Miscellaneous Hosting txt_text IREGEXP('^"Webomstilling via shili.danavl.net')
Miscellaneous Hosting txt_text IREGEXP('.*DomainQuadrat.*')
Miscellaneous Hosting txt_text IREGEXP('^"DNS-Service by D-N-S.DK')
Miscellaneous Hosting txt_text IREGEXP('.*RegistrationTek[.]com.*')
Miscellaneous Hosting txt_text IREGEXP('.*ibm[.]com.*')
Miscellaneous Hosting txt_text IREGEXP('^"SPF records generated')
Miscellaneous Hosting txt_text IREGEXP('^"Fujitsu Services')
Miscellaneous Hosting txt_text IREGEXP('^"DDNS')
Miscellaneous Hosting txt_text IREGEXP('^"[\t\n\r ]*Zone hosted on')
Miscellaneous Hosting txt_text IREGEXP('^"Kundenserver')
Miscellaneous Domain Status txt_text IREGEXP('^"[\r\t\n ]*This domain name uses a disabled')
Miscellaneous Domain Status txt_text IREGEXP('^"Domain For Sale')
Miscellaneous Domain Status txt_text IREGEXP('^"Last Installed:')
Miscellaneous Domain Status txt_text IREGEXP('^"RCS Revision:')
Miscellaneous Domain Status txt_text IREGEXP(' certified quality ')
Miscellaneous Domain Status txt_text IREGEXP('^"domain gesperrt')
Miscellaneous Domain Status txt_text IREGEXP('^"No data"$')
Miscellaneous Domain Status txt_text IREGEXP('^"notokenfound')
Miscellaneous Domain Status txt_text IREGEXP('^"Missing entries:')
Miscellaneous Domain Status txt_text IREGEXP('.*zone has been disabled.*')
Miscellaneous Advertising txt_text IREGEXP('^"BSH Hausgerate GmbH')
Miscellaneous Advertising txt_text IREGEXP('^"Breum Data')
Miscellaneous Advertising txt_text IREGEXP('^"Free DNS from Registration Technologies')
Miscellaneous Advertising txt_text IREGEXP('^"Zoner -')
Miscellaneous Advertising txt_text IREGEXP('Vex Net')
Miscellaneous Advertising txt_text IREGEXP('.*DNS service.*')
Miscellaneous Advertising txt_text IREGEXP('.*admin contacts.*')
Miscellaneous Advertising txt_text IREGEXP('.*Kevin Krkosska.*')
Miscellaneous Advertising txt_text IREGEXP('^"Liaison:.*')
Miscellaneous Advertising txt_text IREGEXP('^"DigiCert.*')
Miscellaneous Advertising txt_text IREGEXP('.*Werbung und Technik.*')
Miscellaneous Advertising txt_text IREGEXP('^"brandstudio')
Miscellaneous Advertising txt_text IREGEXP('^"SOL4 IT')
Miscellaneous Advertising txt_text IREGEXP('^"MCI Canada')
Miscellaneous Advertising txt_text IREGEXP('microsoftonline[.]com"$')
Miscellaneous Advertising txt_text IREGEXP('.*SwissSign.*')
Miscellaneous Advertising txt_text IREGEXP('.*Code Consulting.*')
Miscellaneous Advertising txt_text IREGEXP('^"Please check contact details')
Miscellaneous Advertising txt_text IREGEXP('^"Sanoma')
Miscellaneous Advertising txt_text IREGEXP('^"[\t\n\r ]*WMTECH')
Miscellaneous Advertising txt_text IREGEXP('^"[\t\n\r ]*X-NIC')
Other Executables txt_text IREGEXP('[.]exe[^a-z]')
Other No mail txt_text IREGEXP('^"this domain has no mail')
Other Private keys txt_text IREGEXP('.*BEGIN.*PRIVATE.*')
Other Single char txt_text IREGEXP('^"."$')
Other Empty txt_text IREGEXP('^""$')
Other Cmd txt_text IREGEXP('.*(wget |curl |{ :;}).*')
Other Base64 mail txt_text IREGEXP('^"MAl.*')
Other JavaScript txt_text IREGEXP('.*<script.*')
Other BEGIN txt_text IREGEXP('.*--BEGIN.*')
Other BEGIN txt_text IREGEXP('.*BEGIN.*CERTIFICATE.*')
Other BEGIN txt_text IREGEXP('.*BEGIN.*PUBLIC.*')
Other BEGIN txt_text IREGEXP('.*BEGIN.*PGP.*')
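
Some rows combine two predicates with AND NOT. The Python sketch below shows how such a row could be evaluated outside Impala, using the "Pattern Keywords" row that excludes .exe records, next to the "Executables" row. It assumes that IREGEXP is a case-insensitive, unanchored search and that \| reaches the regular-expression engine as a literal pipe character; how backslashes are unescaped may differ depending on how the query is submitted. The function names are hypothetical.

```python
import re


def iregexp(txt_text, pattern):
    """Assumed stand-in for Impala's IREGEXP: case-insensitive search."""
    return re.search(pattern, txt_text, re.IGNORECASE) is not None


def is_pipe_pattern(txt_text):
    # txt_text IREGEXP('.*[0-9a-z]\|.*') AND NOT txt_text IREGEXP('.*[.]exe.*')
    return (iregexp(txt_text, r'.*[0-9a-z]\|.*')
            and not iregexp(txt_text, r'.*[.]exe.*'))


def is_executable(txt_text):
    # txt_text IREGEXP('[.]exe[^a-z]')
    return iregexp(txt_text, r'[.]exe[^a-z]')


# Hypothetical records, kept with their surrounding double quotes.
for record in ['"abc|def"', '"run.exe|arg"', '"no pipe here"']:
    print(record, is_pipe_pattern(record), is_executable(record))
```

The exclusion keeps a record that mentions an executable out of the generic pipe-separated pattern, so that it is left to the "Executables" expression instead.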