How spam filters generally work

AREA TESTED  LOCALE  DESCRIPTION OF TEST  TEST NAME  DEFAULT SCORES (local, net, with bayes, with bayes+net)  MORE INFO (additional wiki docs)
body Generic Test for Unsolicited Bulk Email GTUBE 1000.000
full Listed in Razor2 (http://razor.sf.net/) RAZOR2_CHECK 0 0.150 0 1.511
body Razor2 gives confidence level above 50% RAZOR2_CF_RANGE_51_100 0 1.485 0 0.056
full Listed in DCC (http://rhyolite.com/anti-spam/dcc/) DCC_CHECK 0 1.373 0 2.169
full Listed in Pyzor (http://pyzor.sf.net/) PYZOR_CHECK 0 2.041 0 3.451
body Incorporates a tracking ID number TRACKER_ID 1.825 1.064 1.818 0.555
body Weird repeated double-quotation marks WEIRD_QUOTING 1.353 1.966 1.774 2.000
rawbody Extra blank lines in base64 encoding MIME_BASE64_BLANKS 0.693 0.819 1.391 1.469
rawbody base64 attachment does not have a file name MIME_BASE64_NO_NAME 0.022 0 0.017 0.000
rawbody Message text disguised using base64 encoding MIME_BASE64_TEXT 1.780 0.110 1.403 0.298
rawbody MIME section missing boundary MIME_MISSING_BOUNDARY 0 0.247 0.224 0
body Multipart message mostly text/html MIME MIME_HTML_MOSTLY 1.540 0.285 0.713 1.023
body Message only has text/html MIME parts MIME_HTML_ONLY 1.204 1.158 1.156 0.177
rawbody Quoted-printable line longer than 76 chars MIME_QP_LONG_LINE 0 0.000 0.105 0.039
rawbody MIME filename does not match content MIME_SUSPECT_NAME 0.100
body HTML and text parts are different MPART_ALT_DIFF 1.837 1.505 1.823 0.066
body Character set indicates a foreign language CHARSET_FARAWAY 3.200
body Message written in an undesired language UNWANTED_LANGUAGE_BODY 2.800
body Body includes 8 consecutive 8-bit characters BODY_8BITS 1.500
body Body contains a ROT13-encoded email address EMAIL_ROT13 2.720 1.474 2.934 3.105
body Message body has 70-80% blank lines BLANK_LINES_70_80 1.668 1.127 0.745 1.515
body Message body has 80-90% blank lines BLANK_LINES_80_90 0.046 0 0.216 0
body Message body has 90-100% blank lines BLANK_LINES_90_100 1.490 1.750 1.877 1.996
body Message body has many words used only once UNIQUE_WORDS 3.109 2.549 1.639 2.273
body Message body mentions many internet domains DOMAIN_RATIO 2.552 1.360 2.534 3.176
header Did not pass through any untrusted hosts ALL_TRUSTED -2.400 -2.820 -2.867 -3.300
header NJABL: sender is confirmed open relay RCVD_IN_NJABL_RELAY 0 0.934 0 1.397
header NJABL: dialup sender did non-local SMTP RCVD_IN_NJABL_DUL 0 1.655 0 0.088
header NJABL: sender is confirmed spam source RCVD_IN_NJABL_SPAM 0 1.051 0 1.841
header NJABL: sent through multi-stage open relay RCVD_IN_NJABL_MULTI 1
header NJABL: sender is an open formmail RCVD_IN_NJABL_CGI 1
header NJABL: sender is an open proxy RCVD_IN_NJABL_PROXY 0 1.026 0 0.438
header SORBS: sender is open HTTP proxy server RCVD_IN_SORBS_HTTP 0 0 0 0.043
header SORBS: sender is open proxy server RCVD_IN_SORBS_MISC 0 0 0 0.338
header SORBS: sender is open SMTP relay RCVD_IN_SORBS_SMTP 0 1.597 0 2.493
header SORBS: sender is open SOCKS proxy server RCVD_IN_SORBS_SOCKS 0 1.847 0 2.054
header SORBS: sender is an abusable web server RCVD_IN_SORBS_WEB 0 0 0 0.007
header SORBS: sender demands to never be tested RCVD_IN_SORBS_BLOCK 1
header SORBS: sender is on a hijacked network RCVD_IN_SORBS_ZOMBIE 0 0.819 0 0
header SORBS: sent directly from dynamic IP address RCVD_IN_SORBS_DUL 0 0.137 0 1.987
header Received via a relay in Spamhaus SBL RCVD_IN_SBL 0 1.050 0 0.107
header Received via a relay in Spamhaus XBL RCVD_IN_XBL 0 2.511 0 3.076
header Envelope sender in dsn.rfc-ignorant.org DNS_FROM_RFC_DSN 1
header Envelope sender in postmaster.rfc-ignorant.org DNS_FROM_RFC_POST 0 1.376 0 1.614
header Envelope sender in abuse.rfc-ignorant.org DNS_FROM_RFC_ABUSE 0 0.374 0 0
header Envelope sender in whois.rfc-ignorant.org DNS_FROM_RFC_WHOIS 0 0.492 0 0.296
header Envelope sender in bogusmx.rfc-ignorant.org DNS_FROM_RFC_BOGUSMX 0 1.463 0 2.630
header Received via a relay in list.dsbl.org RCVD_IN_DSBL 0 2.765 0 3.805
header From: sender listed in dnsbl.ahbl.org DNS_FROM_AHBL_RHSBL 0 0.070 0 0.295
header Has Habeas warrant mark and on Infringer List HABEAS_INFRINGER 0 16.0 0 16.0
header Has Habeas warrant mark and on User List HABEAS_USER 0 -8.0 0 -8.0
header Sender is in Bonded Sender Program (trusted relay) RCVD_IN_BSP_TRUSTED 0 -4.3 0 -4.3
header Sender is in Bonded Sender Program (other relay) RCVD_IN_BSP_OTHER 0 -0.1 0 -0.1
header Sender domain is new and very high volume SB_NEW_BULK 1
header Sender IP hosted at NSP has a volume spike SB_NSP_VOLUME_SPIKE 1
header Received via a relay in bl.spamcop.net RCVD_IN_BL_SPAMCOP_NET 0 1.832 0 1.216
header Received via a relay in RSL RCVD_IN_RSL 0 0.677 0 1.720
header Relay in RBL, http://www.mail-abuse.org/rbl/ RCVD_IN_MAPS_RBL 1
header Relay in DUL, http://www.mail-abuse.org/dul/ RCVD_IN_MAPS_DUL 1
header Relay in RSS, http://www.mail-abuse.org/rss/ RCVD_IN_MAPS_RSS 1
header Relay in NML, http://www.mail-abuse.org/nml/ RCVD_IN_MAPS_NML 1
header Envelope sender has no MX or A DNS records NO_DNS_FOR_FROM 0 1.1 0 1.6
header Subject contains a gappy version of ‘cialis’ SUBJECT_DRUG_GAP_C 1.993 1.917 2.501 1.325
header Subject contains a gappy version of ‘levitra’ SUBJECT_DRUG_GAP_L 2.117 2.726 2.181 2.456
header Subject contains a gappy version of ‘phentermine’ SUBJECT_DRUG_GAP_P 0.621 0.765 0.698 1.425
header Subject contains a gappy version of ‘soma’ SUBJECT_DRUG_GAP_S 2.005 0.277 2.920 2.041
header Subject contains a gappy version of ‘valium’ SUBJECT_DRUG_GAP_VA 2.005 1.922 2.934 3.680
header Subject contains a gappy version of ‘viagra’ SUBJECT_DRUG_GAP_VIA 2.659 1.770 3.158 0.253
header Subject contains a gappy version of ‘vicodin’ SUBJECT_DRUG_GAP_VIC 2.560 2.961 2.691 2.868
header Subject contains a gappy version of ‘xanax’ SUBJECT_DRUG_GAP_X 2.538 2.282 2.945 2.512
body Talks about price per dose DRUG_DOSAGE 0.342 0.608 0.405 0.862
body Mentions an E.D. drug DRUG_ED_CAPS 0.122 1.535 0 0.185
body Viagra and other drugs DRUG_ED_COMBO 1.000 0.183 1.415 1.636
body Talks about an E.D. drug using its chemical name DRUG_ED_SILD 1.856 0.421 1.597 1.666
body Mentions Generic Viagra DRUG_ED_GENERIC 1.933 1.181 0 1.128
body Fast Viagra Delivery DRUG_ED_ONLINE 0.553 1.820 1.097 2.300
body Deep discount medications DEEP_DISC_MEDS 2.480 1.211 2.573 2.626
body Online Pharmacy ONLINE_PHARMACY 2.730 0 2.895 0.000
body Attempts to disguise the word ‘viagra’ VIA_GAP_GRA 2.800 3.171 2.886 3.005
body Two or more drugs crammed together into one word DRUGS_SMEAR1 0.515 1.522 0.475 2.351
header Host HELO did not match rDNS: msn.com FAKE_HELO_MSN 1.773 1.456 2.069 2.645
header Host HELO did not match rDNS: mail.com FAKE_HELO_MAIL_COM 1.303 1.972 0.111 0.000
header Host HELO did not match rDNS: email.com FAKE_HELO_EMAIL_COM 0 0 0 1.537
header Host HELO did not match rDNS: eudoramail.com FAKE_HELO_EUDORAMAIL 1.520 0.907 0 0
header Host HELO did not match rDNS: excite.com FAKE_HELO_EXCITE 1.840 2.127 2.127 2.074
header Host HELO did not match rDNS: lycos.com FAKE_HELO_LYCOS 1.410 1.645 0 0.988
header Host HELO did not match rDNS: yahoo.ca FAKE_HELO_YAHOO_CA 1.166 0 0.171 1.116
header Relay HELO’d with suspicious hostname (mail.com) FAKE_HELO_MAIL_COM_DOM 1.920 2.173 2.312 2.108
header Relay HELO’d using suspicious hostname (IP addr 1) HELO_DYNAMIC_IPADDR 3.520 2.754 4.070 4.400
header Relay HELO’d using suspicious hostname (DHCP) HELO_DYNAMIC_DHCP 2.791 0.087 0.958 1.248
header Relay HELO’d using suspicious hostname (HCC) HELO_DYNAMIC_HCC 3.360 1.540 2.451 3.741
header Relay HELO’d using suspicious hostname (ATTBI.com) HELO_DYNAMIC_ATTBI 3.200 3.662 2.760 3.147
header Relay HELO’d using suspicious hostname (Rogers) HELO_DYNAMIC_ROGERS 1.677 0.793 1.888 2.094
header Relay HELO’d using suspicious hostname (Adelphia) HELO_DYNAMIC_ADELPHIA 2.320 1.829 2.389 2.199
header Relay HELO’d using suspicious hostname (T-Dialin) HELO_DYNAMIC_DIALIN 2.320 0.443 2.429 1.755
header Relay HELO’d using suspicious hostname (Hex IP) HELO_DYNAMIC_HEXIP 1.826 1.320 1.453 1.522
header Relay HELO’d using suspicious hostname (Split IP) HELO_DYNAMIC_SPLIT_IP 2.869 0.887 0.992 0.775
header Relay HELO’d using suspicious hostname (YahooBB) HELO_DYNAMIC_YAHOOBB 2.800 2.776 2.572 3.000
header Relay HELO’d using suspicious hostname (OptOnline) HELO_DYNAMIC_OOL 3.120 2.508 3.065 3.182
header Relay HELO’d using suspicious hostname (IP addr 2) HELO_DYNAMIC_IPADDR2 3.271 0.805 2.554 3.496
header Relay HELO’d using suspicious hostname (RR 2) HELO_DYNAMIC_RR2 2.080 1.015 1.678 2.200
header Relay HELO’d using suspicious hostname (Comcast) HELO_DYNAMIC_COMCAST 3.040 3.533 3.217 3.700
header Relay HELO’d using suspicious hostname (Telia) HELO_DYNAMIC_TELIA 0 0 1.216 1.515
header Relay HELO’d using suspicious hostname (VTR) HELO_DYNAMIC_VTR 1.916 0.805 2.013 1.960
header Relay HELO’d using suspicious hostname (Chello.no) HELO_DYNAMIC_CHELLO_NO 1.388 0.226 1.409 1.570
header Relay HELO’d using suspicious hostname (Chello.nl) HELO_DYNAMIC_CHELLO_NL 1.762 0 0.542 0.244
header Relay HELO’d using suspicious hostname (Veloxzone) HELO_DYNAMIC_VELOX 1.680 1.877 1.803 2.003
header Relay HELO’d using suspicious hostname (NTL) HELO_DYNAMIC_NTL 1.340 0.187 1.445 1.732
header Relay HELO’d using suspicious hostname (Home.nl) HELO_DYNAMIC_HOME_NL 1.737 0.635 1.660 1.878
header Message headers are very long HEAD_LONG 2.5
header From: does not include a real name NO_REAL_NAME 0.124 0.178 0.336 0.007
header From: ends in numbers FROM_ENDS_IN_NUMS 0.177 0.516 0.517 0.000
header From: starts with nums FROM_STARTS_WITH_NUMS 1.218 1.492 1.441 0.300
header From: contains numbers mixed in with letters FROM_HAS_MIXED_NUMS 0.107 0.298 0.024 0.000
header From: contains numbers mixed in with letters FROM_HAS_MIXED_NUMS3 1.132 1.113 1.513 1.614
header Uses an address with lots of numbers, at a big ISP ADDR_NUMS_AT_BIGSITE 0.072 0.748 0.112 0.081
header From address is “at something-offers” FROM_OFFERS 1.822 0.861 2.243 1.491
header From: has no local-part before @ sign FROM_NO_USER 1.358 0.344 1.460 0.983
header To: has no local-part before @ sign TO_NO_USER 0.332 0.116 1.615 0.128
header To: is empty TO_EMPTY 0 0 0.164 0.097
header Reply-To: is empty REPLY_TO_EMPTY 1.274 1.410 1.568 1.643
header To: repeats address as real name TO_ADDRESS_EQ_REAL 0 0.470 0.131 0.026
header Valid-looking To “undisclosed-recipients” UNDISC_RECIPS 0.966 1.391 1.295 1.302
header Faked To “Undisclosed-Recipients” FAKED_UNDISC_RECIPS 1.287 0.565 1.431 1.602
header Subject has exclamation mark and question mark PLING_QUERY 0.201 0.857 0.906 0.368
header Subject contains a unique ID SUBJ_HAS_UNIQ_ID 0.899 1.122 0.809 1.339
header Subject contains lots of white space SUBJ_HAS_SPACES 2.240 0.637 1.899 1.175
header Subject is all capitals SUBJ_ALL_CAPS 0.763 0.365 0.257 0.665
header Spam tool Message-Id: (99x9xx99 variant) MSGID_SPAM_99X9XX99 0.500 0.864 1.576 1.442
header Spam tool Message-Id: (alpha-numeric variant) MSGID_SPAM_ALPHA_NUM 2.640 3.004 3.330 3.228
header Spam tool Message-Id: (caps variant) MSGID_SPAM_CAPS 3.500 3.221 3.545 3.791
header Spam tool Message-Id: (letters variant) MSGID_SPAM_LETTERS 2.960 3.151 3.052 2.709
header Spam tool Message-Id: (12-zeroes variant) MSGID_SPAM_ZEROES 1.584 1.763 1.783 1.859
header Message-Id has no hostname MSGID_NO_HOST 0.087 0 0.816 0.140
header Message-Id is fake (in Outlook Express format) MSGID_OUTLOOK_INVALID 2.000 2.290 2.498 2.700
header Message-ID has ALLCAPS@yahoo.com MSGID_YAHOO_CAPS 2.425 0.702 2.442 3.800
header Message-Id for external message added locally MSGID_FROM_MTA_ID 1.440 1.704 1.756 1.723
header Message-Id was added by a hotmail.com relay MSGID_FROM_MTA_HOTMAIL 1.600 1.858 1.987 2.144
header Date header uses unusual Y2K formatting DATE_SPAMWARE_Y2K 2.958 2.888 3.384 3.911
header Invalid Date: header (not RFC 2822) INVALID_DATE 0.011 0.235 0 0.236
header Invalid Date: header (timezone does not exist) INVALID_DATE_TZ_ABSURD 0 0 0.664 0.960
header Invalid date in header (wrong CST timezone) INVALID_TZ_CST 2.044 0.066 0.598 2.873
header Invalid date in header (wrong EST timezone) INVALID_TZ_EST 1.492 2.326 1.672 3.582
header Invalid date in header (wrong GMT/UTC timezone) INVALID_TZ_GMT 1.708 0.636 1.549 0.198
header Date: is 3 to 6 hours before Received: date DATE_IN_PAST_03_06 0.025 0 0.127 0
header Date: is 6 to 12 hours before Received: date DATE_IN_PAST_06_12 0.301 0.211 0.918 0
header Date: is 12 to 24 hours before Received: date DATE_IN_PAST_12_24 0.374 0 0.571 0.703
header Date: is 24 to 48 hours before Received: date DATE_IN_PAST_24_48 0 0.302 0.133 0.089
header Date: is 48 to 96 hours before Received: date DATE_IN_PAST_48_96 0.034 0.257 0.222 0
header Date: is 96 hours or more before Received: date DATE_IN_PAST_96_XX 0.505 1.082 0.979 1.360
header Date: is 3 to 6 hours after Received: date DATE_IN_FUTURE_03_06 1.288 0.072 2.052 0.847
header Date: is 6 to 12 hours after Received: date DATE_IN_FUTURE_06_12 1.040 1.202 1.153 1.300
header Date: is 12 to 24 hours after Received: date DATE_IN_FUTURE_12_24 2.118 2.329 2.863 3.031
header Date: is 24 to 48 hours after Received: date DATE_IN_FUTURE_24_48 2.023 2.046 2.301 2.314
header Date: is 48 to 96 hours after Received: date DATE_IN_FUTURE_48_96 2.080 2.296 2.498 2.689
header Date: is 96 hours or more after Received: date DATE_IN_FUTURE_96_XX 1.393 1.428 1.930 1.962
header Headers contain an unresolved template UNRESOLVED_TEMPLATE 1.324 0.618 1.369 2.866
header Subject contains too many raw illegal characters SUBJ_ILLEGAL_CHARS 2.880 2.854 3.459 2.854
header From contains too many raw illegal characters FROM_ILLEGAL_CHARS 0.861 0.046 0 0.008
header Header contains too many raw illegal characters HEAD_ILLEGAL_CHARS 0.539 2.018 0.961 2.125
header Subject contains an English UCE tag ENGLISH_UCE_SUBJECT 2.080 0.336 2.127 0.110
header Subject contains a Japanese UCE tag JAPANESE_UCE_SUBJECT 0 0 1.665 1.800
header Subject: contains Korean unsolicited email tag KOREAN_UCE_SUBJECT 2.400 2.703 2.469 3.081
header From and To are the same, but not exactly FROM_AND_TO_SAME 0 0.198 0 0
header Received: contains a forged HELO FORGED_RCVD_HELO 0 0.050 0.266 0.000
header Received: HELO and IP do not match, but should RCVD_HELO_IP_MISMATCH 2.799 0.618 1.647 2.178
header Received: contains an IP address used for HELO RCVD_NUMERIC_HELO 0.636 1.531 1.348 1.248
header Received: contains illegal IP address RCVD_ILLEGAL_IP 1.335 1.370 1.588 0.944
header Received by mail server with no name RCVD_BY_IP 0 0.024 0.051 0.067
header Received forged, contains fake AOL relays FORGED_AOL_RCVD 0 0 1.451 0
header Contains forged hostname for a DSL IP in Brazil FORGED_TELESP_RCVD 1.595 0.669 1.468 1.532
header Forged hotmail.com ‘Received:’ header found FORGED_HOTMAIL_RCVD 2.614 2.132 2.150 2.536
header hotmail.com ‘From’ address, but no ‘Received:’ FORGED_HOTMAIL_RCVD2 0.787 1.079 1.415 1.177
header Forged eudoramail.com ‘Received:’ header found FORGED_EUDORAMAIL_RCVD 1.657 0.653 1.130 0.290
header ‘From’ yahoo.com does not match ‘Received’ headers FORGED_YAHOO_RCVD 1.668 2.174 2.095 2.700
header ‘From’ juno.com does not match ‘Received’ headers FORGED_JUNO_RCVD 1.644 1.722 2.018 0.792
header Forged ‘by gw05’ ‘Received:’ header found FORGED_GW05_RCVD 0 0 1.495 1.697
header Character set doesn’t exist NONEXISTENT_CHARSET 0 0 1.411 1.418
header A foreign language charset used in headers CHARSET_FARAWAY_HEADER 3.200
header Sent with ‘X-Priority’ set to high X_PRIORITY_HIGH 0.125 0.093 0.077 0.000
header Sent with ‘X-Msmail-Priority’ set to high X_MSMAIL_PRIORITY_HIGH 0 0.267 0.021 0.000
header Received: says mail sent around the world (HELO) ROUND_THE_WORLD_LOCAL 1.347 0.464 2.351 0.213
header Received: says mail sent around the world (DNS) ROUND_THE_WORLD 0 1.741 0 1.958
header Missing Date: header MISSING_DATE 0 0.019 0.647 0.000
header Missing To: header MISSING_HEADERS 0 0 0.087 0.119
header Similar addresses in recipient list SUSPICIOUS_RECIPS 1.473 1.459 0.820 1.915
header Recipient list is sorted by address SORTED_RECIPS 0.879 1.155 1.759 0.887
header Subject: contains G.a.p.p.y-T.e.x.t GAPPY_SUBJECT 1.365 1.319 2.084 1.343
header Message has X-Library header X_LIBRARY 2.105 1.369 1.863 2.755
header Subject contains “As Seen” SUBJ_AS_SEEN 0.995 1.691 1.214 0.000
header Subject starts with dollar amount SUBJ_DOLLARS 2.449 0.973 1.935 0.054
header Subject contains “For Only” SUBJ_FOR_ONLY 0.646 1.100 1.726 0.044
header Subject contains “FREE” in CAPS SUBJ_FREE_CAP 0.011 0 0.146 0.000
header Subject starts with “Free” SUB_FREE_OFFER 0.055 0.034 0.103 0.000
header Subject GUARANTEED SUBJ_GUARANTEED 1.749 1.302 0.081 0.452
header Subject starts with “Hello” SUB_HELLO 1.405 1.358 0.954 0.007
header Subject includes “life insurance” SUBJ_LIFE_INSURANCE 1.840 2.068 2.184 2.020
header Subject contains “Your Bills” or similar SUBJ_YOUR_DEBT 1.760 2.068 2.035 1.261
header Subject contains “Your Family” SUBJ_YOUR_FAMILY 1.647 0 2.033 0.011
header Subject contains “Your Own” SUBJ_YOUR_OWN 0.872 1.294 1.371 0.000
header Received contains a faked HELO hostname RCVD_FAKE_HELO_DOTCOM 0.899 0.034 0.969 0.424
header To: address appears in Subject ADDRESS_IN_SUBJECT 1.296 1.409 1.866 1.804
header Subject talks about losing pounds SUBJECT_DIET 1.355 0.723 0.059 0.266
header Header has extraneous Content-type:…type= entry EXTRA_MPART_TYPE 0 0.222 0 0
header To header contains ‘recipient’ marker TO_RECIP_MARKER 0 0 1.370 1.539
header Spam tool pattern in MIME boundary MIME_BOUND_DD_DIGITS 3.600 4.230 4.162 4.139
header Spam tool pattern in MIME boundary MIME_BOUND_DIGITS_7 0 0 1.460 0.893
header Spam tool pattern in MIME boundary MIME_BOUND_DIGITS_15 2.674 3.286 3.120 3.400
header Spam tool pattern in MIME boundary MIME_BOUND_MANY_HEX 1.920 2.255 2.590 2.700
header Spam tool pattern in MIME boundary (rfkindy) MIME_BOUND_RKFINDY 2.080 2.347 2.590 2.671
header To: has a malformed address TO_MALFORMED 0.895 2.253 0.455 2.187
header From address is webmail, but starts with a number FROM_NUM_AT_WEBMAIL 1.389 0.258 1.901 1.617
header From webmail service and address ends in numbers FROM_WEBMAIL_END_NUMS6 0.178 0.046 0.389 0.000
header From Address contains FREE ADDR_FREE 0.194 0.078 1.038 1.832
header Sent to a text file TO_TXT 0 0 1.362 1.580
header Involves ‘china.com’ CHINA_HEADER 1.840 1.911 2.312 2.386
header Received line contains spam-sign (lowercase smtp) WITH_LC_SMTP 1.600 0.235 1.862 2.200
header From address has no lower-case characters FROM_NO_LOWER 1.010 1.307 1.650 0.377
header Subject line starts with Buy or Buying SUBJ_BUY 0.565 0.490 0.414 0.000
header Subject is indicative of a Nigerian spam NIGERIAN_SUBJECT1 0 0 0.270 0
header Subject is indicative of a Nigerian spam NIGERIAN_SUBJECT2 1.235 1.765 1.935 2.090
header Message would have been caught by accessdb ACCESSDB 1
header Received headers forged (AM/PM) RCVD_AM_PM 1.558 0.091 1.802 1.927
header Multiple Content-Type headers found HEADER_COUNT_CTYPE 1.198 1.676 1.482 1.771
header Host HELO’d as a big ISP, but had no rDNS NO_RDNS_DOTCOM_HELO 0.025 0.024 0.601 0.016
header X-Originating-IP doesn’t look like IPv4 address X_ORIG_IP_NOT_IPV4 0 1.006 0.081 2.582
header X-Authentication-Warning header looks faked X_AUTH_WARN_FAKED 2.094 2.599 1.654 3.105
header Received header contains faked ‘mr.outblaze.com’ FAKE_OUTBLAZE_RCVD 2.400 2.726 2.867 3.100
header Message is from domain that never sends email FROM_NONSENDING_DOMAIN 1.486 0.308 1.678 0.000
header Subject contains common spam sign (2 numbers) SUBJ_2_NUM_PARENS 1.472 0.276 1.672 2.102
body HTML included in message HTML_MESSAGE 0.001
body Message is 0% to 10% HTML HTML_00_10 0.985 0.138 1.070 1.068
body Message is 10% to 20% HTML HTML_10_20 1.050 0.295 1.350 0.246
body Message is 20% to 30% HTML HTML_20_30 1.241 0.504 0.567 0.226
body Message is 30% to 40% HTML HTML_30_40 0.879 0.056 0.437 0.021
body Message is 40% to 50% HTML HTML_40_50 0.527 0.086 0.052 0.035
body Message is 50% to 60% HTML HTML_50_60 1.053 0.095 0.539 0.087
body Message is 60% to 70% HTML HTML_60_70 0.516 0.027 0 0
body Message is 70% to 80% HTML HTML_70_80 0.151 0 0.039 0
body Message is 80% to 90% HTML HTML_80_90 0.027 0 0.036 0.146
body Message is 90% to 100% HTML HTML_90_100 0.346 0.189 0.043 0.022
body HTML has very strong “shouting” markup HTML_SHOUTING3 0.266 0 0.012 0.019
body HTML has very strong “shouting” markup HTML_SHOUTING4 0.076 0 0.052 0
body HTML has very strong “shouting” markup HTML_SHOUTING5 0.026 0 0.030 0.019
body HTML has very strong “shouting” markup HTML_SHOUTING6 0 0.004 0 0.000
body HTML has very strong “shouting” markup HTML_SHOUTING7 0.450 0.472 0 0.646
body HTML contains text after HTML close tag HTML_TEXT_AFTER_HTML 0.312 0.205 0.032 0.031
body HTML contains text after BODY close tag HTML_TEXT_AFTER_BODY 0.263 0.151 0.752 0.061
body HTML comment is very short HTML_COMMENT_SHORT 0.014 0.625 0 0.000
body HTML message is a saved web page HTML_COMMENT_SAVED_URL 0.528 0.130 0.470 0.146
body HTML conversion tool used by spam HTML_CONVERTED 0 1.204 0.402 1.605
body HTML with embedded plugin object HTML_EMBEDS 0 0.084 0.108 0.207
body HTML contains unsafe auto-executing code HTML_EVENT_UNSAFE 0 0 0.022 0.515
body HTML font size is tiny HTML_FONT_SIZE_TINY 0 0.419 0 0.533
body HTML font size is negative HTML_FONT_SIZE_NONE 0 0.455 1.119 0.033
body HTML font size is large HTML_FONT_SIZE_LARGE 1.387 0.712 0.496 0.153
body HTML font size is huge HTML_FONT_SIZE_HUGE 1.796 1.278 2.265 2.594
body HTML tag for a big font size HTML_FONT_BIG 0 0.232 0 0.142
body HTML tag for a tiny font size HTML_FONT_TINY 2.141 0.471 0.521 0.964
body HTML font color is same as background HTML_FONT_INVISIBLE 0 0.065 0 0.036
body HTML font color similar to background HTML_FONT_LOW_CONTRAST 1.011 0.955 1.017 0.788
body HTML font face is not a word HTML_FONT_FACE_BAD 0 0 0.044 0.037
body HTML font face has excess capital characters HTML_FONT_FACE_CAPS 0 0.804 0.281 0.247
body HTML includes a form which sends mail HTML_FORMACTION_MAILTO 1.840 2.162 1.907 2.353
body HTML: images with 0-400 bytes of words HTML_IMAGE_ONLY_04 3.120 3.094 3.482 3.304
body HTML: images with 400-800 bytes of words HTML_IMAGE_ONLY_08 2.881 1.970 2.730 3.036
body HTML: images with 800-1200 bytes of words HTML_IMAGE_ONLY_12 2.360 1.473 2.741 2.942
body HTML: images with 1200-1600 bytes of words HTML_IMAGE_ONLY_16 1.352 1.279 1.990 1.047
body HTML: images with 1600-2000 bytes of words HTML_IMAGE_ONLY_20 1.567 0.843 1.023 0.446
body HTML: images with 2000-2400 bytes of words HTML_IMAGE_ONLY_24 1.088 1.003 0.787 0.502
body HTML has a low ratio of text to image area HTML_IMAGE_RATIO_02 1.729 0 1.125 0.018
body HTML has a low ratio of text to image area HTML_IMAGE_RATIO_04 1.038 0.184 0.515 0.105
body HTML has a low ratio of text to image area HTML_IMAGE_RATIO_06 0.072 0 0.342 0.131
body HTML has a low ratio of text to image area HTML_IMAGE_RATIO_08 0 0.000 0 0.032
body HTML link text says “push here” or similar HTML_LINK_PUSH_HERE 1.627 0.409 1.843 0.873
body Message is 5% to 10% HTML obfuscation HTML_OBFUSCATE_05_10 0.428 0.483 0.563 0.257
body Message is 10% to 20% HTML obfuscation HTML_OBFUSCATE_10_20 0.931 0.732 0.796 0.865
body Message is 20% to 30% HTML obfuscation HTML_OBFUSCATE_20_30 0.997 0.597 0.014 0.000
body Message is 30% to 40% HTML obfuscation HTML_OBFUSCATE_30_40 2.517 1.933 3.005 3.445
body Message is 40% to 50% HTML obfuscation HTML_OBFUSCATE_40_50 2.641 1.746 2.739 3.089
body Message is 50% to 60% HTML obfuscation HTML_OBFUSCATE_50_60 2.635 1.339 2.882 3.325
body Message is 60% to 70% HTML obfuscation HTML_OBFUSCATE_60_70 2.257 0.971 2.432 2.805
body Message is 70% to 80% HTML obfuscation HTML_OBFUSCATE_70_80 2.308 1.334 2.256 2.689
body Message is 80% to 90% HTML obfuscation HTML_OBFUSCATE_80_90 1.600 0.489 1.656 1.939
body Message is 90% to 100% HTML obfuscation HTML_OBFUSCATE_90_100 1.405 0.203 1.657 1.775
body HTML tags used to obfuscate words HTML_BACKHAIR_2 0.144 0 0.032 0
body HTML tags used to obfuscate words HTML_BACKHAIR_4 0 0 0.138 0.058
body HTML tags used to obfuscate words HTML_BACKHAIR_8 1.075 0.569 1.137 0.727
body HTML has many bad attributes in tags HTML_ATTR_BAD 0 0.101 0.609 2.354
body HTML appears to have random attributes in tags HTML_ATTR_UNIQUE 0.441 1.165 1.097 0.000
body Image tag intended to identify you HTML_WEB_BUGS 0.166 0.013 0.311 0.035
body HTML has unbalanced “body” tags HTML_TAG_BALANCE_BODY 0.043 0.389 0.096 0.000
body HTML has unbalanced “head” tags HTML_TAG_BALANCE_HEAD 0.061 0.860 0.033 0.000
body HTML has “marquee” tag HTML_TAG_EXIST_MARQUEE 2.160 1.758 1.840 2.034
body HTML has “tbody” tag HTML_TAG_EXIST_TBODY 1.014 0.233 0.079 0.114
body HTML message is 0% to 10% bad tags HTML_BADTAG_00_10 0 0 0.001 0.000
body HTML message is 10% to 20% bad tags HTML_BADTAG_10_20 0.236 0 0 0
body HTML message is 20% to 30% bad tags HTML_BADTAG_20_30 0 0.169 0.035 0
body HTML message is 30% to 40% bad tags HTML_BADTAG_30_40 0 0.103 0.017 0
body HTML message is 40% to 50% bad tags HTML_BADTAG_40_50 0.002 0 0.000 0.010
body HTML message is 50% to 60% bad tags HTML_BADTAG_50_60 0.864 0.430 1.035 0.153
body HTML message is 60% to 70% bad tags HTML_BADTAG_60_70 1.726 1.127 2.314 1.356
body HTML message is 70% to 80% bad tags HTML_BADTAG_70_80 1.657 0.075 2.087 2.280
body HTML message is 80% to 90% bad tags HTML_BADTAG_80_90 1.861 1.309 1.831 1.911
body HTML message is 90% to 100% bad tags HTML_BADTAG_90_100 0.746 1.192 2.688 2.804
body 0% to 10% of HTML elements are non-standard HTML_NONELEMENT_00_10 0 0 0.001 0.001
body 10% to 20% of HTML elements are non-standard HTML_NONELEMENT_10_20 0.045 0 0.000 0.000
body 20% to 30% of HTML elements are non-standard HTML_NONELEMENT_20_30 0.346 0.070 0 0
body 30% to 40% of HTML elements are non-standard HTML_NONELEMENT_30_40 0 0.012 0.010 0.000
body 40% to 50% of HTML elements are non-standard HTML_NONELEMENT_40_50 0.000
body 50% to 60% of HTML elements are non-standard HTML_NONELEMENT_50_60 1
body 60% to 70% of HTML elements are non-standard HTML_NONELEMENT_60_70 0.237 1.138 0.083 0.001
body 70% to 80% of HTML elements are non-standard HTML_NONELEMENT_70_80 0.488 0.803 1.169 0.000
body 80% to 90% of HTML elements are non-standard HTML_NONELEMENT_80_90 0.016 0.492 0.023 0.000
body 90% to 100% of HTML elements are non-standard HTML_NONELEMENT_90_100 0.011 1.582 0 2.963
body HTML is extremely short HTML_SHORT_LENGTH 0.601 0.713 0.068 0.389
body HTML title contains no text HTML_TITLE_EMPTY 0.022 0.045 0.036 0.004
body HTML title contains “Untitled” HTML_TITLE_UNTITLED 0.222 0.259 0.792 0.000
rawbody Javascript to hide URLs in browser HIDE_WIN_STATUS 0.032 0 0 0.063
rawbody HTML contains needlessly encoded characters ENTITY_DEC_ALPHANUM 0.012 0 2.686 2.716
body List removal information MULTI_REMOVAL_1WORD 1.005 0 0.916 0.802
body Send real mail to be unsubscribed REMOVE_POSTAL 1.520 1.362 1.757 1.900
body Asks you to click below (in capital letters) CLICK_BELOW_CAPS 0.135 0 0 0.112
body Click to be removed CLICK_TO_REMOVE_1 0.050 0 0.192 0.791
body Claims compliance with spam regulations SENT_IN_COMPLIANCE 1.520 1.786 1.850 2.000
body Possible mention of bill 1618 (anti-spam bill) BILL_1618 0.994 1.692 1.798 1.895
body Doesn’t ask any questions NO_QS_ASKED 0 1.196 0 0.000
body Offers a full refund FULL_REFUND 0.853 1.114 0.079 1.272
body No such thing as a free lunch (2) COMPLETELY_FREE 0.086 0 0.840 0.026
body No such thing as a free lunch (3) NO_COST 0.078 0 0.335 0.000
body One hundred percent guaranteed GUARANTEED_100_PERCENT 0.615 0.435 0.669 0.000
body Dear Friend? That’s not very dear! DEAR_FRIEND 0.542 0.766 1.288 0.070
body Contains ‘Dear (something)’ DEAR_SOMETHING 1.059 0.803 1.577 1.578
body Talks about lots of money BILLION_DOLLARS 0.193 1.185 0.407 0.134
body Talks about opting out (lowercase version) OPTING_OUT 0.157 0.494 0.030 0.479
body Talks about opting out (capitalized version) OPTING_OUT_CAPS 0.067 0.026 0.483 0.000
body Get a million email addresses MILLION_EMAIL 0.093 0.417 0.937 0.000
body Gives a lame excuse about why spam was sent EXCUSE_1 0 0 0.074 0.132
body Claims you can be removed from the list EXCUSE_3 0 0.098 0.015 0.116
body Claims you can be removed from the list EXCUSE_4 1.145 1.775 1.443 1.119
body Claims you can be removed from the list EXCUSE_6 1.444 0.734 1.782 1.696
body Claims you can be removed from the list EXCUSE_7 0 0.152 0.010 0.018
body “if you do not wish to receive any more” EXCUSE_10 0.071 0.380 0.039 0.024
body Nobody’s perfect EXCUSE_12 0.153 0 0.354 0.197
body Claims you opted-in or registered EXCUSE_19 0.056 0.357 0.021 0.000
body Claims you have provided permission EXCUSE_23 1.840 2.088 2.312 2.400
body Claims you wanted this ad EXCUSE_24 1.440 1.272 1.874 2.080
body Talks about how to be removed from mailings EXCUSE_REMOVE 0.043 0 0.513 0.310
body Targeted Traffic / Email Addresses TARGETED 0 0.692 1.471 0.480
body Tells you about a strong buy STRONG_BUY 2.880 3.384 3.018 3.117
body Claims to honor removal requests WE_HONOR_ALL 2.063 2.365 1.789 2.029
body Offers a picked stock STOCK_PICK 0.106 0.150 0.041 1.470
body Offers an alert about a stock STOCK_ALERT 2.362 1.782 2.378 2.385
body SEC-mandated penny-stock warning MICRO_CAP_WARNING 1.440 0.760 1.803 1.828
body Not registered investment advisor NOT_ADVISOR 2.160 2.444 2.590 2.700
body Describes some sort of breakthrough SOME_BREAKTHROUGH 0.232 1.921 0.907 1.610
body They have selected you for something SELECTED_YOU 1.485 1.865 1.841 1.897
body Contains mail-in order form MAIL_IN_ORDER_FORM 1.440 0.351 0 0
body University Diplomas UNIVERSITY_DIPLOMAS 2.242 0.523 0 0
body ‘Prestigious Non-Accredited Universities’ PREST_NON_ACCREDITED 1.520 1.394 1.607 1.901
body Claims “cannot be considered spam” CANNOT_BE_SPAM 0 0 1.546 1.769
body Information on growing body parts BODY_ENHANCEMENT 0.151 0.481 0.070 0
body Information on getting larger body parts BODY_ENHANCEMENT2 0.814 0.845 0.109 0
body Impotence cure IMPOTENCE 0.095 0.751 0 0.094
body Information on how to work at home (1) WORK_AT_HOME 0 0 0.325 0.030
body Information on mortgages MORTGAGE_BEST 0.948 0.923 0 0.144
body Looks like mortgage pitch MORTGAGE_PITCH 0.297 0 0.065 0
body Information on mortgage rates MORTGAGE_RATES 0 0.689 0.174 0.202
body Order a report from someone ORDER_REPORT 0 0 1.230 0
rawbody mailto URI includes removal text MAILTO_SUBJ_REMOVE 1.023 0 2.064 0.542
body Includes a link for AOL users to click AOL_USERS_LINK 0 0 0.034 0.109
body Talks about a million North American dollars NA_DOLLARS 2.078 2.193 2.485 2.611
body Mentions millions of (dollar) ((dollar) NN,NNN,NNN.NN) US_DOLLARS_3 0.331 0.411 0.010 0.354
body Talks about millions of dollars MILLION_USD 1.594 1.290 1.535 2.796
rawbody Frontpage used to create the message FRONTPAGE 0.510 0.529 0.595 2.080
body Contains “My wife, Jody” testimonial JODY 0 0 1.326 0
body Doing something with my income YOUR_INCOME 0.674 0.892 0.372 1.092
body Resistance to this spam is futile RESISTANCE_IS_FUTILE 1.520 1.786 1.850 0
body Contains ‘subject to credit approval’ SUBJ_2_CREDIT 0 0.500 0 0.076
body Contains urgent matter URG_BIZ 0.288 0.030 1.064 1.808
body Contains ‘earn (dollar) something per week’ EARN_PER_WEEK 1.360 0.856 1.757 1.896
body Spam is 100% natural?! ALL_NATURAL 2.640 1.828 2.246 1.061
body Money back guarantee MONEY_BACK 2.051 0.037 0.217 0.095
body There is no catch NO_CATCH 0 0 0.127 0
body There is no obligation NO_OBLIGATION 0.905 0.565 1.157 0.830
body You won’t be “disappointed” NO_DISAPPOINTMENT 0 1.498 1.609 0.410
body Serious Enquiries Only SERIOUS_ONLY 0 0 1.664 1.748
body Risk free. Suuurreeee…. RISK_FREE 0.036 0.247 0.135 0.230
body As seen on national TV! AS_SEEN_ON 0.393 0.320 0.613 0.020
body Common pyramid scheme phrase (1) COPY_ACCURATELY 0 0 1.324 0
body Off Shore Scams OFFSHORE_SCAM 0 0.337 0.127 0.144
body Why Pay More? WHY_PAY_MORE 1.249 0 1.713 1.978
body Congratulations – you’ve been scammed? CONGRATULATIONS 0 0 0.486 0.272
body Talks about free mobile phones CELL_PHONE_FREE 1.280 1.476 1.571 0.922
body Talks about cell-phone signal improvement CELL_PHONE_IMPROVE 0.771 0.812 1.655 1.031
body Receive a special offer RECEIVE_OFFER 1.125 0.955 1.446 0.793
body Free express or no-obligation quote FREE_QUOTE_INSTANT 0.211 1.736 0.051 0.001
body Free Membership FREE_MEMBERSHIP 0.492 1.182 1.587 0.873
body Credit Card Offers CREDIT_CARD 0.030 0.896 0.032 0.310
body Without a credit check NO_CREDIT_CHECK 0 0 1.990 0.037
body Avoiding bankruptcy BANKRUPTCY 0.249 1.088 1.112 0.489
body Accepting credit cards ACCEPT_CREDIT_CARDS 0.360 0 1.332 0.399
body Eliminate Bad Credit BAD_CREDIT 1.161 0.252 0.817 0
body Non-secured Credit/Debt NONSECURED_CREDIT 0 0 1.074 0
body Consolidate debt, credit, or bills CONSOLIDATE_DEBT 0.886 0.653 0 0.245
body Home refinancing REFINANCE_YOUR_HOME 1.321 0.394 0.917 0.340
body Home refinancing REFINANCE_NOW 1.611 0 1.191 0.029
body No Purchase Necessary NO_PURCHASE 0 0 0.107 0
body No Medical Exams NO_MEDICAL 1.440 1.656 1.665 0
body No Claim Forms NO_FORMS 1.622 0.973 0.912 0.011
body Requires Initial Investment INITIAL_INVEST 0.433 0.450 1.026 1.230
body Buy Direct BUY_DIRECT 1.502 1.779 1.757 1.663
body Do it Today DO_IT_TODAY 0.036 0.047 0 0
body What are you waiting for WHY_WAIT 2.240 2.060 0.796 0.764
body You can search for anyone YOU_CAN_SEARCH 1.370 0.444 1.246 1.630
body Score with babes! SEDUCTION 1.560 1.356 1.415 1.054
body Invaluable marketing information INVALUABLE_MARKETING 0 0 1.201 0
body Guaranteed Stuff GUARANTEED_STUFF 0.100 0.238 0.403 0.000
body Potential Earnings EARNINGS 0 0 1.642 1.675
body The best Rates THE_BEST_RATE 0 0.550 0 0.000
body Amazing Stuff AMAZING_STUFF 0.949 1.269 0.069 0.102
body Lose Weight Spam DIET_1 0.671 0.365 0.274 0
body Describes weight loss DIET_2 0.545 0 1.034 0.316
body Describes body fat loss DIET_3 1.794 1.061 1.835 2.073
body Reverses Aging REVERSE_AGING 1.919 1.403 2.057 2.150
body Cures Baldness HAIR_LOSS 1.381 2.371 1.428 1.738
body Removes Wrinkles WRINKLES 1.730 2.097 1.917 2.091
body While you Sleep WHILE_YOU_SLEEP 0.858 0.605 1.786 0.000
body If only it were that easy RICH 0 0.451 0 0.000
body Who really wins? YOU_WON 0.144 0.269 0 0.579
body Talks about Hidden Charges HIDDEN_CHARGES 0.046 0.961 0 0.000
body Freedom of a financial nature FIN_FREE 1.365 0.015 1.865 0.788
body Stock Disclaimer Statement FORWARD_LOOKING 1.840 2.162 2.120 2.200
body Mail guarantees satisfaction SATIS_GUAR 0.884 0 0.825 0.081
body Offers Extra Cash EXTRA_CASH 0.117 0.987 0.629 0.447
body Get Paid GET_PAID 1.390 1.764 1.466 0.862
body Have you been turned down? BEEN_TURNED_DOWN 1.336 1.266 1.682 1.890
body One Time Rip Off ONE_TIME 0.044 0 0.036 0.619
body Compete for your business COMPETE 1.600 1.791 1.804 2.050
body Meet Singles MEET_SINGLES 1.600 0 1.076 1.172
body Join Millions of Americans JOIN_MILLIONS 0.036 0.640 0.999 0.448
body Be your own boss BE_BOSS 1.512 0.145 1.847 1.648
body Multi Level Marketing mentioned ML_MARKETING 0.049 0 0.103 0
body Claims to be Legal ITS_LEGAL 0.186 1.109 0.432 0.264
body Confidentiality on all orders CONFIDENTIAL_ORDER 1.920 1.196 1.889 1.266
body Save big money SAVE_THOUSANDS 0.929 1.889 0.717 0.031
body Claims you registered with a partner MARKETING_PARTNERS 2.025 0.718 2.405 1.401
body Free Preview FREE_PREVIEW 1.612 0.376 1.887 1.851
body Domain name containing a “4u” variant DOMAIN_4U2 1.508 1.783 1.935 1.588
body Contains ‘free access’ with capitals FREE_ACCESS 0 0 0.253 0
body Contains ‘free sample’ with capitals FREE_SAMPLE 0.089 0.168 0.223 0.941
body Lowest Price LOW_PRICE 0.885 0 0.206 0
body People just leave money lying around UNCLAIMED_MONEY 1.263 1.703 1.945 1.584
body Message seems to contain rot13ed address OBSCURED_EMAIL 2.720 3.194 3.186 3.132
body Mentions their affiliate partners OUR_AFFILIATE_PARTNERS 0 0 0.041 1.443
body Talks about exercise with an exclamation! BANG_EXERCISE 1.450 1.993 1.662 1.442
body Talks about more with an exclamation! BANG_MORE 0.287 0 0.294 0
body Talks about Oprah with an exclamation! BANG_OPRAH 0.666 0.212 1.717 1.975
body Talks about quotes with an exclamation! BANG_QUOTE 1.680 1.880 1.942 1.964
body Talks about ‘acting now’ with capitals ACT_NOW_CAPS 0.222 0 0.426 0.093
body Talks about ‘starting now’ with capitals START_NOW_CAPS 1.280 1.499 1.124 0.857
body Talks about a bigger drive for sex MORE_SEX 2.240 1.762 2.287 2.422
body Something is emphatically guaranteed BANG_GUAR 0.297 0 0.254 0
body See for yourself SEE_FOR_YOURSELF 0.544 0.381 0.591 0.044
body Possible porn – Free Porn FREE_PORN 0.794 0.023 1.937 0.000
body Possible porn – Cum Shot CUM_SHOT 0.355 1.732 0.943 0
body Possible porn – Pay Site PAY_SITE 0 0 1.850 1.900
body Possible porn – Live Porn LIVE_PORN 0.040 0.360 0.019 0.000
body Possible porn – Hardcore Porn HARDCORE_PORN 1.520 0.665 1.850 0.684
body Possible porn – Hot, Nasty, Wild, Young HOT_NASTY 0.765 0.586 0.967 0.088
body Possible porn – Best, Largest, Most Porn BEST_PORN 0.566 0.263 0.044 0
body Possible porn – Nasty Girls NASTY_GIRLS 0.350 0.439 0.022 2.196
body Possible porn – Amateur Porn AMATEUR_PORN 1.397 0.769 1.615 1.744
body Possible porn – Celebrity Porn PORN_CELEBRITY 0.675 1.569 0.319 0.038
body Possible porn – Adult Web Sites SOMETHING_FOR_ADULTS 1.433 1.513 1.614 0.006
body Possible porn – various types of feline PORN_15 1.680 1.974 2.035 2.168
body Possible porn – nasty, dirty, little etc. PORN_16 0.907 0.462 1.305 0.017
body Thousands or millions of pictures, movies, etc. LOTS_OF_STUFF 0.839 0.029 0 0.000
body Attempts to disguise porn words DISGUISE_PORN 1.490 1.835 0.798 0.030
uri URL uses words/phrases which indicate porn (sex) PORN_URL_SEX 1.865 1.427 1.817 0.011
uri URL uses words/phrases which indicate porn (slut) PORN_URL_SLUT 0.941 1.022 0.194 0.094
uri URL uses words/phrases which indicate porn (misc) PORN_URL_MISC 1.728 0.573 1.767 1.620
header Subject indicates sexually-explicit content SUBJECT_SEXUAL 2.160 2.538 2.775 2.900
header Bulk email fingerprint (eGroups) found RATWARE_EGROUPS 2.180 2.701 2.552 2.805
header Bulk email fingerprint (hash 2) found RATWARE_HASH_2 0.039 0 0.085 0.037
header Bulk email fingerprint (hash 2 v2) found RATWARE_HASH_2_V2 1.798 1.319 1.767 0.980
header Bulk email fingerprint (jpfree) found RATWARE_JPFREE 0 0 1.942 2.100
uri Bulk email fingerprint (StormPost) found RATWARE_STORM_URI 1.920 1.518 2.405 2.295
header X-Mailer has malformed Outlook Express version RATWARE_OE_MALFORMED 2.160 2.407 2.522 2.588
header Bulk email fingerprint (‘esmtp’ Received) found RATWARE_RCVD_LC_ESMTP 1.745 1.474 2.122 2.083
header Bulk email fingerprint (Mozilla malformed) found RATWARE_MOZ_MALFORMED 1.594 0.990 1.752 0.558
rawbody Contains a hashbuster in Send-Safe format RATWARE_HASH_DASH 1.133 0.947 1.500 1.646
header Bulk email fingerprint (netIP) found RATWARE_NETIP 0.439 1.033 2.312 2.286
header Bulk email fingerprint (Gecko faked) found RATWARE_GECKO_BUILD 0 0.826 0.784 1.385
header Headers are in order found in spam (MTSRIX) HDR_ORDER_MTSRIX 0.417 0.391 0.192 1.057
header Headers are in order found in spam (TRIMRS) HDR_ORDER_TRIMRS 2.320 2.674 2.220 2.199
header Bulk email fingerprint (bonus space) found RCVD_BONUS_SPC_DATE 1.371 0.904 1.575 1.872
header Bulk email fingerprint (X-Message-Info) found X_MESSAGE_INFO 3.600 4.187 4.162 4.244
header Bulk email fingerprint (Received PF) found RATWARE_RCVD_PF 2.880 3.384 3.608 3.867
header Bulk email fingerprint (Received @) found RATWARE_RCVD_AT 2.550 1.011 2.691 3.415
uri Uses a numeric IP address in URL NUMERIC_HTTP_ADDR 1.565 1.572 1.872 2.135
uri Uses a dotted-decimal IP address in URL NORMAL_HTTP_TO_IP 0.104 0.080 0.830 0.028
uri Uses %-escapes inside a URL’s hostname HTTP_ESCAPED_HOST 0.034 0.094 0 0.477
uri Uses control sequences inside a URL hostname HTTP_CTRL_CHARS_HOST 1.440 1.670 1.757 1.900
uri Completely unnecessary %-escapes inside a URL HTTP_EXCESSIVE_ESCAPES 0 0.645 0 0.151
uri Dotted-decimal IP address followed by CGI IP_LINK_PLUS 0.211 0.024 0.192 0.232
uri URL of page called “remove” REMOVE_PAGE 0.081 0.604 0 0.191
uri Includes a link to a likely spammer email MAILTO_TO_SPAM_ADDR 0 0 0.106 0
uri Includes a ‘remove’ email address MAILTO_TO_REMOVE 0.886 0 0.065 0.116
uri Uses non-standard port number for HTTP WEIRD_PORT 0 0.507 0.228 0.109
uri URL contains username and (optional) password USERPASS 0.429 0.561 1.319 0.268
uri Filename is just a ‘#’; probably a JS trick URI_IS_POUND 0 0.333 0 0
uri Includes a link to a likely spammer domain BARGAIN_URL 1.503 1.520 1.686 1.833
uri Contains a URL in the BIZ top-level domain BIZ_TLD 2.167 0.527 2.434 2.288
uri Contains a URL in the INFO top-level domain INFO_TLD 1.717 0.481 1.686 0.000
uri Has Yahoo Redirect URI YAHOO_RD_REDIR 1.237 1.083 1.366 1.642
uri Has Yahoo Redirect URI YAHOO_DRS_REDIR 1.911 0.911 1.956 0.984
uri Message has link to company offers URI_OFFERS 1.328 0.252 1.460 0.770
uri Message has URI 4you URI_4YOU 1.027 1.812 0.898 1.966
uri Contains URI to a document hosted at ‘terra.es’ TERRA_ES 1.367 0.816 1.746 2.612
uri Contains a URL-encoded hostname (HTTP77) HTTP_77 1.514 0.605 1.812 1.981
uri Contains a URI with an affiliate ID code URI_AFFILIATE 2.243 0 1.808 2.052
header Message has HTTP redirector URI URI_REDIRECTOR 0 0 0.031 0.011
body Bayesian spam probability is 0 to 1% BAYES_00 0 0 -1.665 -2.599
body Bayesian spam probability is 1 to 5% BAYES_05 0 0 -0.925 -0.413
body Bayesian spam probability is 5 to 20% BAYES_20 0 0 -0.730 -1.951
body Bayesian spam probability is 20 to 40% BAYES_40 0 0 -0.276 -1.096
body Bayesian spam probability is 40 to 60% BAYES_50 0 0 1.567 0.001
body Bayesian spam probability is 60 to 80% BAYES_60 0 0 3.515 1.0
body Bayesian spam probability is 80 to 95% BAYES_80 0 0 3.608 2.0
body Bayesian spam probability is 95 to 99% BAYES_95 0 0 3.514 3.0
body Bayesian spam probability is 99 to 100% BAYES_99 0 0 4.070 3.5
body es Claims you can be removed in Spanish REMOVE_ES_01 1
body es Claims you can be removed in Spanish REMOVE_ES_02 1
body es Claims you can be removed in Spanish REMOVE_ES_03 1
body es Claims you can be removed in Spanish REMOVE_ES_04 1
body es If you send an email you will be opted out REMOVE_ES_05 1
body es Claims you can opt-out REMOVE_ES_06 1
body es Claims you can opt-out REMOVE_ES_07 1
body es Claims you can opt-out REMOVE_ES_08 1
body es If you want to subscribe… SUBSCRIBE_ES_01 1
body es Claims not to be spam in Spanish EXCUSE_ES_01 1
body es Someone felt free to send you a message in Spanish EXCUSE_ES_02 1
body es Someone asked a spammer to spam you in Spanish EXCUSE_ES_03 1
body es Email as a commercial alternative EXCUSE_ES_05 1
body es Message sent by mistake EXCUSE_ES_06 1
body es Cannot be considered spam EXCUSE_ES_07 1
body es To quit smoking DEJAR_DE_FUMAR_ES 1
body es THEY SHOUT AT US TO SAY IT IS FREE GRATIS_ES 1.4
body es Encourages us to reply if we are interested INTERESADO_ES 1
body es Says it complies with the law LEY_ORGANICA_ES 2.0
body es Claims to comply with spam regulations NORMATIVA_SPAM_ES 2.0
body es There is no legislation in Chile against spam LEY_CHILE_ES_01 1
body es Claims to comply with Chilean legislation LEY_CHILE_ES_02 1
body es Legal (?) immigration to the United States TARJETA_VERDE_ES 1
body es Special promotion. PROMOCION_ES 1
body es Registration in Hispanic search engines. ALTA_BUSCADORES_ES 1
body es IMPERATIVES/EXCLAMATIONS IN CAPITALS. EXCLAMACION_ES 1
body es Introduction of a new product. PRESENTAMOS_ES 1
body es Cash on delivery. CONTRA_REEMBOLSO_ES 1
body es To place your order. PEDIDO_ES 1
body es Click here. CLICK_ES 1
body es Gifts do not exist, except from our friends. REGALO_ES 1
body es You may be a winner. GANADORES_ES_01 1
body es You have been a winner. GANADORES_ES_02 1
body es Free porn. PORNO_GRATIS_ES 1
body es More information. MAS_INFORMACION_ES 1
body es Information and booking INFORMACION_RESERVA_ES 1
body es Become a spammer. REENVIA_ES 1
body es They won’t send us any more spam… sure they won’t. NO_MAS_MAIL_1_ES 1
body es You won’t receive this spam again… sure you won’t. NO_MAS_MAIL_2_ES 1
body es The addresses were obtained from the internet. COLECTOR_DE_MAILS_ES 1
header Contains valid Hashcash token (20 bits) HASHCASH_20 -0.500
header Contains valid Hashcash token (21 bits) HASHCASH_21 -0.700
header Contains valid Hashcash token (22 bits) HASHCASH_22 -1.000
header Contains valid Hashcash token (23 bits) HASHCASH_23 -2.000
header Contains valid Hashcash token (24 bits) HASHCASH_24 -3.000
header Contains valid Hashcash token (25 bits) HASHCASH_25 -4.000
header Contains valid Hashcash token (>25 bits) HASHCASH_HIGH -5.000
header Hashcash token already spent in another mail HASHCASH_2SPEND 0.100
header SPF: sender matches SPF record SPF_PASS -0.001
header SPF: sender does not match SPF record (fail) SPF_FAIL 0 0.001 0 0.875
header SPF: sender does not match SPF record (softfail) SPF_SOFTFAIL 0.500 0.842 0.500 0.500
header SPF: HELO matches SPF record SPF_HELO_PASS -0.001
header SPF: HELO does not match SPF record (fail) SPF_HELO_FAIL 0 0.405 0 0.001
header SPF: HELO does not match SPF record (softfail) SPF_HELO_SOFTFAIL 0 1.002 0 3.140
body Contains a URL listed in the SBL blocklist URIBL_SBL 0 0.629 0 0.996
body Contains a URL listed in the SC SURBL blocklist URIBL_SC_SURBL 0 3.897 0 4.263
body Contains a URL listed in the WS SURBL blocklist URIBL_WS_SURBL 0 0.539 0 1.462
body Contains a URL listed in the PH SURBL blocklist URIBL_PH_SURBL 0 0.839 0 2.000
body Contains a URL listed in the OB SURBL blocklist URIBL_OB_SURBL 0 1.996 0 3.213
body Contains a URL listed in the AB SURBL blocklist URIBL_AB_SURBL 0 2.007 0 0.417
header From: address is in the auto white-list AWL 1
header From: address is in the user’s black-list USER_IN_BLACKLIST 100.000
header From: address is in the user’s white-list USER_IN_WHITELIST -100.000
header From: address is in the default white-list USER_IN_DEF_WHITELIST -15.000
header User is listed in ‘blacklist_to’ USER_IN_BLACKLIST_TO 10.000
header User is listed in ‘whitelist_to’ USER_IN_WHITELIST_TO -6.000
header User is listed in ‘more_spam_to’ USER_IN_MORE_SPAM_TO -20.000
header User is listed in ‘all_spam_to’ USER_IN_ALL_SPAM_TO -100.000
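
The four score columns in the table (local, net, with bayes, with bayes+net) can be read as a lookup: the filter picks one column depending on which features are enabled, then adds up the scores of every test that matched. The Python sketch below illustrates that reading with a few rows copied from the table. It is not the filter's own code. The function names, the rule that a row with a single score applies to every column, and the 5.0 cutoff are all assumptions made for illustration; the table itself does not state a threshold.

```python
from typing import Dict, Iterable, Tuple

# A few rows copied from the table. A one-element tuple means the table lists
# a single score; this sketch assumes that score applies to every column.
RULE_SCORES: Dict[str, Tuple[float, ...]] = {
    "MILLION_USD": (1.594, 1.290, 1.535, 2.796),
    "BAYES_99": (0.0, 0.0, 4.070, 3.5),
    "URIBL_SC_SURBL": (0.0, 3.897, 0.0, 4.263),
    "SPF_PASS": (-0.001,),
    "USER_IN_WHITELIST": (-100.000,),
}

# Assumption: the cutoff is not given in the table; 5.0 is illustrative.
ASSUMED_THRESHOLD = 5.0


def score_column(use_bayes: bool, use_net: bool) -> int:
    """Map the two switches onto the column order local, net, bayes, bayes+net."""
    return (2 if use_bayes else 0) + (1 if use_net else 0)


def rule_score(name: str, use_bayes: bool, use_net: bool) -> float:
    """Return the score of one rule for the active column."""
    scores = RULE_SCORES[name]
    if len(scores) == 1:
        return scores[0]
    return scores[score_column(use_bayes, use_net)]


def total_score(hits: Iterable[str], use_bayes: bool, use_net: bool) -> float:
    """Sum the scores of every rule that matched the message."""
    return sum(rule_score(name, use_bayes, use_net) for name in hits)


def is_spam(hits: Iterable[str], use_bayes: bool, use_net: bool,
            threshold: float = ASSUMED_THRESHOLD) -> bool:
    """Flag the message when the summed score reaches the assumed cutoff."""
    return total_score(hits, use_bayes, use_net) >= threshold


if __name__ == "__main__":
    hits = ["MILLION_USD", "BAYES_99", "URIBL_SC_SURBL", "SPF_PASS"]
    for bayes, net in [(False, False), (False, True), (True, False), (True, True)]:
        total = total_score(hits, bayes, net)
        print(f"bayes={bayes} net={net} total={total:.3f} "
              f"spam={is_spam(hits, bayes, net)}")
```

With these sample hits the same message stays well under the assumed cutoff in the local-only column, because the Bayes and blocklist rules score 0 there, and lands well above it once both are enabled. Adding USER_IN_WHITELIST to the hits pushes the total far below zero in every column, which shows why the whitelist and blacklist rows carry such large magnitudes compared with the content tests.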