How spam filters generally work

AREA TESTED, LOCALE, DESCRIPTION OF TEST, TEST NAME, DEFAULT SCORES (local, net, with bayes, with bayes+net), MORE INFO (additional wiki docs)
body Generic Test for Unsolicited Bulk Email GTUBE 1000.000
full Listed in Razor2 (http://razor.sf.net/) RAZOR2_CHECK 0 0.150 0 1.511
body Razor2 gives confidence level above 50% RAZOR2_CF_RANGE_51_100 0 1.485 0 0.056
full Listed in DCC (http://rhyolite.com/anti-spam/dcc/) DCC_CHECK 0 1.373 0 2.169
full Listed in Pyzor (http://pyzor.sf.net/) PYZOR_CHECK 0 2.041 0 3.451
body Incorporates a tracking ID number TRACKER_ID 1.825 1.064 1.818 0.555
body Weird repeated double-quotation marks WEIRD_QUOTING 1.353 1.966 1.774 2.000
rawbody Extra blank lines in base64 encoding MIME_BASE64_BLANKS 0.693 0.819 1.391 1.469
rawbody base64 attachment does not have a file name MIME_BASE64_NO_NAME 0.022 0 0.017 0.000
rawbody Message text disguised using base64 encoding MIME_BASE64_TEXT 1.780 0.110 1.403 0.298
rawbody MIME section missing boundary MIME_MISSING_BOUNDARY 0 0.247 0.224 0
body Multipart message mostly text/html MIME MIME_HTML_MOSTLY 1.540 0.285 0.713 1.023
body Message only has text/html MIME parts MIME_HTML_ONLY 1.204 1.158 1.156 0.177
rawbody Quoted-printable line longer than 76 chars MIME_QP_LONG_LINE 0 0.000 0.105 0.039
rawbody MIME filename does not match content MIME_SUSPECT_NAME 0.100
body HTML and text parts are different MPART_ALT_DIFF 1.837 1.505 1.823 0.066
body Character set indicates a foreign language CHARSET_FARAWAY 3.200
body Message written in an undesired language UNWANTED_LANGUAGE_BODY 2.800
body Body includes 8 consecutive 8-bit characters BODY_8BITS 1.500
body Body contains a ROT13-encoded email address EMAIL_ROT13 2.720 1.474 2.934 3.105
body Message body has 70-80% blank lines BLANK_LINES_70_80 1.668 1.127 0.745 1.515
body Message body has 80-90% blank lines BLANK_LINES_80_90 0.046 0 0.216 0
body Message body has 90-100% blank lines BLANK_LINES_90_100 1.490 1.750 1.877 1.996
body Message body has many words used only once UNIQUE_WORDS 3.109 2.549 1.639 2.273
body Message body mentions many internet domains DOMAIN_RATIO 2.552 1.360 2.534 3.176
header Did not pass through any untrusted hosts ALL_TRUSTED -2.400 -2.820 -2.867 -3.300
header NJABL: sender is confirmed open relay RCVD_IN_NJABL_RELAY 0 0.934 0 1.397
header NJABL: dialup sender did non-local SMTP RCVD_IN_NJABL_DUL 0 1.655 0 0.088
header NJABL: sender is confirmed spam source RCVD_IN_NJABL_SPAM 0 1.051 0 1.841
header NJABL: sent through multi-stage open relay RCVD_IN_NJABL_MULTI 1
header NJABL: sender is an open formmail RCVD_IN_NJABL_CGI 1
header NJABL: sender is an open proxy RCVD_IN_NJABL_PROXY 0 1.026 0 0.438
header SORBS: sender is open HTTP proxy server RCVD_IN_SORBS_HTTP 0 0 0 0.043
header SORBS: sender is open proxy server RCVD_IN_SORBS_MISC 0 0 0 0.338
header SORBS: sender is open SMTP relay RCVD_IN_SORBS_SMTP 0 1.597 0 2.493
header SORBS: sender is open SOCKS proxy server RCVD_IN_SORBS_SOCKS 0 1.847 0 2.054
header SORBS: sender is an abusable web server RCVD_IN_SORBS_WEB 0 0 0 0.007
header SORBS: sender demands to never be tested RCVD_IN_SORBS_BLOCK 1
header SORBS: sender is on a hijacked network RCVD_IN_SORBS_ZOMBIE 0 0.819 0 0
header SORBS: sent directly from dynamic IP address RCVD_IN_SORBS_DUL 0 0.137 0 1.987
header Received via a relay in Spamhaus SBL RCVD_IN_SBL 0 1.050 0 0.107
header Received via a relay in Spamhaus XBL RCVD_IN_XBL 0 2.511 0 3.076
header Envelope sender in dsn.rfc-ignorant.org DNS_FROM_RFC_DSN 1
header Envelope sender in postmaster.rfc-ignorant.org DNS_FROM_RFC_POST 0 1.376 0 1.614
header Envelope sender in abuse.rfc-ignorant.org DNS_FROM_RFC_ABUSE 0 0.374 0 0
header Envelope sender in whois.rfc-ignorant.org DNS_FROM_RFC_WHOIS 0 0.492 0 0.296
header Envelope sender in bogusmx.rfc-ignorant.org DNS_FROM_RFC_BOGUSMX 0 1.463 0 2.630
header Received via a relay in list.dsbl.org RCVD_IN_DSBL 0 2.765 0 3.805
header From: sender listed in dnsbl.ahbl.org DNS_FROM_AHBL_RHSBL 0 0.070 0 0.295
header Has Habeas warrant mark and on Infringer List HABEAS_INFRINGER 0 16.0 0 16.0
header Has Habeas warrant mark and on User List HABEAS_USER 0 -8.0 0 -8.0
header Sender is in Bonded Sender Program (trusted relay) RCVD_IN_BSP_TRUSTED 0 -4.3 0 -4.3
header Sender is in Bonded Sender Program (other relay) RCVD_IN_BSP_OTHER 0 -0.1 0 -0.1
header Sender domain is new and very high volume SB_NEW_BULK 1
header Sender IP hosted at NSP has a volume spike SB_NSP_VOLUME_SPIKE 1
header Received via a relay in bl.spamcop.net RCVD_IN_BL_SPAMCOP_NET 0 1.832 0 1.216
header Received via a relay in RSL RCVD_IN_RSL 0 0.677 0 1.720
header Relay in RBL, http://www.mail-abuse.org/rbl/ RCVD_IN_MAPS_RBL 1
header Relay in DUL, http://www.mail-abuse.org/dul/ RCVD_IN_MAPS_DUL 1
header Relay in RSS, http://www.mail-abuse.org/rss/ RCVD_IN_MAPS_RSS 1
header Relay in NML, http://www.mail-abuse.org/nml/ RCVD_IN_MAPS_NML 1
header Envelope sender has no MX or A DNS records NO_DNS_FOR_FROM 0 1.1 0 1.6
header Subject contains a gappy version of ‘cialis’ SUBJECT_DRUG_GAP_C 1.993 1.917 2.501 1.325
header Subject contains a gappy version of ‘levitra’ SUBJECT_DRUG_GAP_L 2.117 2.726 2.181 2.456
header Subject contains a gappy version of ‘phentermine’ SUBJECT_DRUG_GAP_P 0.621 0.765 0.698 1.425
header Subject contains a gappy version of ‘soma’ SUBJECT_DRUG_GAP_S 2.005 0.277 2.920 2.041
header Subject contains a gappy version of ‘valium’ SUBJECT_DRUG_GAP_VA 2.005 1.922 2.934 3.680
header Subject contains a gappy version of ‘viagra’ SUBJECT_DRUG_GAP_VIA 2.659 1.770 3.158 0.253
header Subject contains a gappy version of ‘vicodin’ SUBJECT_DRUG_GAP_VIC 2.560 2.961 2.691 2.868
header Subject contains a gappy version of ‘xanax’ SUBJECT_DRUG_GAP_X 2.538 2.282 2.945 2.512
body Talks about price per dose DRUG_DOSAGE 0.342 0.608 0.405 0.862
body Mentions an E.D. drug DRUG_ED_CAPS 0.122 1.535 0 0.185
body Viagra and other drugs DRUG_ED_COMBO 1.000 0.183 1.415 1.636
body Talks about an E.D. drug using its chemical name DRUG_ED_SILD 1.856 0.421 1.597 1.666
body Mentions Generic Viagra DRUG_ED_GENERIC 1.933 1.181 0 1.128
body Fast Viagra Delivery DRUG_ED_ONLINE 0.553 1.820 1.097 2.300
body Deep discount medications DEEP_DISC_MEDS 2.480 1.211 2.573 2.626
body Online Pharmacy ONLINE_PHARMACY 2.730 0 2.895 0.000
body Attempts to disguise the word ‘viagra’ VIA_GAP_GRA 2.800 3.171 2.886 3.005
body Two or more drugs crammed together into one word DRUGS_SMEAR1 0.515 1.522 0.475 2.351
header Host HELO did not match rDNS: msn.com FAKE_HELO_MSN 1.773 1.456 2.069 2.645
header Host HELO did not match rDNS: mail.com FAKE_HELO_MAIL_COM 1.303 1.972 0.111 0.000
header Host HELO did not match rDNS: email.com FAKE_HELO_EMAIL_COM 0 0 0 1.537
header Host HELO did not match rDNS: eudoramail.com FAKE_HELO_EUDORAMAIL 1.520 0.907 0 0
header Host HELO did not match rDNS: excite.com FAKE_HELO_EXCITE 1.840 2.127 2.127 2.074
header Host HELO did not match rDNS: lycos.com FAKE_HELO_LYCOS 1.410 1.645 0 0.988
header Host HELO did not match rDNS: yahoo.ca FAKE_HELO_YAHOO_CA 1.166 0 0.171 1.116
header Relay HELO’d with suspicious hostname (mail.com) FAKE_HELO_MAIL_COM_DOM 1.920 2.173 2.312 2.108
header Relay HELO’d using suspicious hostname (IP addr 1) HELO_DYNAMIC_IPADDR 3.520 2.754 4.070 4.400
header Relay HELO’d using suspicious hostname (DHCP) HELO_DYNAMIC_DHCP 2.791 0.087 0.958 1.248
header Relay HELO’d using suspicious hostname (HCC) HELO_DYNAMIC_HCC 3.360 1.540 2.451 3.741
header Relay HELO’d using suspicious hostname (ATTBI.com) HELO_DYNAMIC_ATTBI 3.200 3.662 2.760 3.147
header Relay HELO’d using suspicious hostname (Rogers) HELO_DYNAMIC_ROGERS 1.677 0.793 1.888 2.094
header Relay HELO’d using suspicious hostname (Adelphia) HELO_DYNAMIC_ADELPHIA 2.320 1.829 2.389 2.199
header Relay HELO’d using suspicious hostname (T-Dialin) HELO_DYNAMIC_DIALIN 2.320 0.443 2.429 1.755
header Relay HELO’d using suspicious hostname (Hex IP) HELO_DYNAMIC_HEXIP 1.826 1.320 1.453 1.522
header Relay HELO’d using suspicious hostname (Split IP) HELO_DYNAMIC_SPLIT_IP 2.869 0.887 0.992 0.775
header Relay HELO’d using suspicious hostname (YahooBB) HELO_DYNAMIC_YAHOOBB 2.800 2.776 2.572 3.000
header Relay HELO’d using suspicious hostname (OptOnline) HELO_DYNAMIC_OOL 3.120 2.508 3.065 3.182
header Relay HELO’d using suspicious hostname (IP addr 2) HELO_DYNAMIC_IPADDR2 3.271 0.805 2.554 3.496
header Relay HELO’d using suspicious hostname (RR 2) HELO_DYNAMIC_RR2 2.080 1.015 1.678 2.200
header Relay HELO’d using suspicious hostname (Comcast) HELO_DYNAMIC_COMCAST 3.040 3.533 3.217 3.700
header Relay HELO’d using suspicious hostname (Telia) HELO_DYNAMIC_TELIA 0 0 1.216 1.515
header Relay HELO’d using suspicious hostname (VTR) HELO_DYNAMIC_VTR 1.916 0.805 2.013 1.960
header Relay HELO’d using suspicious hostname (Chello.no) HELO_DYNAMIC_CHELLO_NO 1.388 0.226 1.409 1.570
header Relay HELO’d using suspicious hostname (Chello.nl) HELO_DYNAMIC_CHELLO_NL 1.762 0 0.542 0.244
header Relay HELO’d using suspicious hostname (Veloxzone) HELO_DYNAMIC_VELOX 1.680 1.877 1.803 2.003
header Relay HELO’d using suspicious hostname (NTL) HELO_DYNAMIC_NTL 1.340 0.187 1.445 1.732
header Relay HELO’d using suspicious hostname (Home.nl) HELO_DYNAMIC_HOME_NL 1.737 0.635 1.660 1.878
header Message headers are very long HEAD_LONG 2.5
header From: does not include a real name NO_REAL_NAME 0.124 0.178 0.336 0.007
header From: ends in numbers FROM_ENDS_IN_NUMS 0.177 0.516 0.517 0.000
header From: starts with nums FROM_STARTS_WITH_NUMS 1.218 1.492 1.441 0.300
header From: contains numbers mixed in with letters FROM_HAS_MIXED_NUMS 0.107 0.298 0.024 0.000
header From: contains numbers mixed in with letters FROM_HAS_MIXED_NUMS3 1.132 1.113 1.513 1.614
header Uses an address with lots of numbers, at a big ISP ADDR_NUMS_AT_BIGSITE 0.072 0.748 0.112 0.081
header From address is “at something-offers” FROM_OFFERS 1.822 0.861 2.243 1.491
header From: has no local-part before @ sign FROM_NO_USER 1.358 0.344 1.460 0.983
header To: has no local-part before @ sign TO_NO_USER 0.332 0.116 1.615 0.128
header To: is empty TO_EMPTY 0 0 0.164 0.097
header Reply-To: is empty REPLY_TO_EMPTY 1.274 1.410 1.568 1.643
header To: repeats address as real name TO_ADDRESS_EQ_REAL 0 0.470 0.131 0.026
header Valid-looking To “undisclosed-recipients” UNDISC_RECIPS 0.966 1.391 1.295 1.302
header Faked To “Undisclosed-Recipients” FAKED_UNDISC_RECIPS 1.287 0.565 1.431 1.602
header Subject has exclamation mark and question mark PLING_QUERY 0.201 0.857 0.906 0.368
header Subject contains a unique ID SUBJ_HAS_UNIQ_ID 0.899 1.122 0.809 1.339
header Subject contains lots of white space SUBJ_HAS_SPACES 2.240 0.637 1.899 1.175
header Subject is all capitals SUBJ_ALL_CAPS 0.763 0.365 0.257 0.665
header Spam tool Message-Id: (99x9xx99 variant) MSGID_SPAM_99X9XX99 0.500 0.864 1.576 1.442
header Spam tool Message-Id: (alpha-numeric variant) MSGID_SPAM_ALPHA_NUM 2.640 3.004 3.330 3.228
header Spam tool Message-Id: (caps variant) MSGID_SPAM_CAPS 3.500 3.221 3.545 3.791
header Spam tool Message-Id: (letters variant) MSGID_SPAM_LETTERS 2.960 3.151 3.052 2.709
header Spam tool Message-Id: (12-zeroes variant) MSGID_SPAM_ZEROES 1.584 1.763 1.783 1.859
header Message-Id has no hostname MSGID_NO_HOST 0.087 0 0.816 0.140
header Message-Id is fake (in Outlook Express format) MSGID_OUTLOOK_INVALID 2.000 2.290 2.498 2.700
header Message-ID has ALLCAPS@yahoo.com MSGID_YAHOO_CAPS 2.425 0.702 2.442 3.800
header Message-Id for external message added locally MSGID_FROM_MTA_ID 1.440 1.704 1.756 1.723
header Message-Id was added by a hotmail.com relay MSGID_FROM_MTA_HOTMAIL 1.600 1.858 1.987 2.144
header Date header uses unusual Y2K formatting DATE_SPAMWARE_Y2K 2.958 2.888 3.384 3.911
header Invalid Date: header (not RFC 2822) INVALID_DATE 0.011 0.235 0 0.236
header Invalid Date: header (timezone does not exist) INVALID_DATE_TZ_ABSURD 0 0 0.664 0.960
header Invalid date in header (wrong CST timezone) INVALID_TZ_CST 2.044 0.066 0.598 2.873
header Invalid date in header (wrong EST timezone) INVALID_TZ_EST 1.492 2.326 1.672 3.582
header Invalid date in header (wrong GMT/UTC timezone) INVALID_TZ_GMT 1.708 0.636 1.549 0.198
header Date: is 3 to 6 hours before Received: date DATE_IN_PAST_03_06 0.025 0 0.127 0
header Date: is 6 to 12 hours before Received: date DATE_IN_PAST_06_12 0.301 0.211 0.918 0
header Date: is 12 to 24 hours before Received: date DATE_IN_PAST_12_24 0.374 0 0.571 0.703
header Date: is 24 to 48 hours before Received: date DATE_IN_PAST_24_48 0 0.302 0.133 0.089
header Date: is 48 to 96 hours before Received: date DATE_IN_PAST_48_96 0.034 0.257 0.222 0
header Date: is 96 hours or more before Received: date DATE_IN_PAST_96_XX 0.505 1.082 0.979 1.360
header Date: is 3 to 6 hours after Received: date DATE_IN_FUTURE_03_06 1.288 0.072 2.052 0.847
header Date: is 6 to 12 hours after Received: date DATE_IN_FUTURE_06_12 1.040 1.202 1.153 1.300
header Date: is 12 to 24 hours after Received: date DATE_IN_FUTURE_12_24 2.118 2.329 2.863 3.031
header Date: is 24 to 48 hours after Received: date DATE_IN_FUTURE_24_48 2.023 2.046 2.301 2.314
header Date: is 48 to 96 hours after Received: date DATE_IN_FUTURE_48_96 2.080 2.296 2.498 2.689
header Date: is 96 hours or more after Received: date DATE_IN_FUTURE_96_XX 1.393 1.428 1.930 1.962
header Headers contain an unresolved template UNRESOLVED_TEMPLATE 1.324 0.618 1.369 2.866
header Subject contains too many raw illegal characters SUBJ_ILLEGAL_CHARS 2.880 2.854 3.459 2.854
header From contains too many raw illegal characters FROM_ILLEGAL_CHARS 0.861 0.046 0 0.008
header Header contains too many raw illegal characters HEAD_ILLEGAL_CHARS 0.539 2.018 0.961 2.125
header Subject contains an English UCE tag ENGLISH_UCE_SUBJECT 2.080 0.336 2.127 0.110
header Subject contains a Japanese UCE tag JAPANESE_UCE_SUBJECT 0 0 1.665 1.800
header Subject: contains Korean unsolicited email tag KOREAN_UCE_SUBJECT 2.400 2.703 2.469 3.081
header From and To are the same, but not exactly FROM_AND_TO_SAME 0 0.198 0 0
header Received: contains a forged HELO FORGED_RCVD_HELO 0 0.050 0.266 0.000
header Received: HELO and IP do not match, but should RCVD_HELO_IP_MISMATCH 2.799 0.618 1.647 2.178
header Received: contains an IP address used for HELO RCVD_NUMERIC_HELO 0.636 1.531 1.348 1.248
header Received: contains illegal IP address RCVD_ILLEGAL_IP 1.335 1.370 1.588 0.944
header Received by mail server with no name RCVD_BY_IP 0 0.024 0.051 0.067
header Received forged, contains fake AOL relays FORGED_AOL_RCVD 0 0 1.451 0
header Contains forged hostname for a DSL IP in Brazil FORGED_TELESP_RCVD 1.595 0.669 1.468 1.532
header Forged hotmail.com ‘Received:’ header found FORGED_HOTMAIL_RCVD 2.614 2.132 2.150 2.536
header hotmail.com ‘From’ address, but no ‘Received:’ FORGED_HOTMAIL_RCVD2 0.787 1.079 1.415 1.177
header Forged eudoramail.com ‘Received:’ header found FORGED_EUDORAMAIL_RCVD 1.657 0.653 1.130 0.290
header ‘From’ yahoo.com does not match ‘Received’ headers FORGED_YAHOO_RCVD 1.668 2.174 2.095 2.700
header ‘From’ juno.com does not match ‘Received’ headers FORGED_JUNO_RCVD 1.644 1.722 2.018 0.792
header Forged ‘by gw05’ ‘Received:’ header found FORGED_GW05_RCVD 0 0 1.495 1.697
header Character set doesn’t exist NONEXISTENT_CHARSET 0 0 1.411 1.418
header A foreign language charset used in headers CHARSET_FARAWAY_HEADER 3.200
header Sent with ‘X-Priority’ set to high X_PRIORITY_HIGH 0.125 0.093 0.077 0.000
header Sent with ‘X-Msmail-Priority’ set to high X_MSMAIL_PRIORITY_HIGH 0 0.267 0.021 0.000
header Received: says mail sent around the world (HELO) ROUND_THE_WORLD_LOCAL 1.347 0.464 2.351 0.213
header Received: says mail sent around the world (DNS) ROUND_THE_WORLD 0 1.741 0 1.958
header Missing Date: header MISSING_DATE 0 0.019 0.647 0.000
header Missing To: header MISSING_HEADERS 0 0 0.087 0.119
header Similar addresses in recipient list SUSPICIOUS_RECIPS 1.473 1.459 0.820 1.915
header Recipient list is sorted by address SORTED_RECIPS 0.879 1.155 1.759 0.887
header Subject: contains G.a.p.p.y-T.e.x.t GAPPY_SUBJECT 1.365 1.319 2.084 1.343
header Message has X-Library header X_LIBRARY 2.105 1.369 1.863 2.755
header Subject contains “As Seen” SUBJ_AS_SEEN 0.995 1.691 1.214 0.000
header Subject starts with dollar amount SUBJ_DOLLARS 2.449 0.973 1.935 0.054
header Subject contains “For Only” SUBJ_FOR_ONLY 0.646 1.100 1.726 0.044
header Subject contains “FREE” in CAPS SUBJ_FREE_CAP 0.011 0 0.146 0.000
header Subject starts with “Free” SUB_FREE_OFFER 0.055 0.034 0.103 0.000
header Subject GUARANTEED SUBJ_GUARANTEED 1.749 1.302 0.081 0.452
header Subject starts with “Hello” SUB_HELLO 1.405 1.358 0.954 0.007
header Subject includes “life insurance” SUBJ_LIFE_INSURANCE 1.840 2.068 2.184 2.020
header Subject contains “Your Bills” or similar SUBJ_YOUR_DEBT 1.760 2.068 2.035 1.261
header Subject contains “Your Family” SUBJ_YOUR_FAMILY 1.647 0 2.033 0.011
header Subject contains “Your Own” SUBJ_YOUR_OWN 0.872 1.294 1.371 0.000
header Received contains a faked HELO hostname RCVD_FAKE_HELO_DOTCOM 0.899 0.034 0.969 0.424
header To: address appears in Subject ADDRESS_IN_SUBJECT 1.296 1.409 1.866 1.804
header Subject talks about losing pounds SUBJECT_DIET 1.355 0.723 0.059 0.266
header Header has extraneous Content-type:…type= entry EXTRA_MPART_TYPE 0 0.222 0 0
header To header contains ‘recipient’ marker TO_RECIP_MARKER 0 0 1.370 1.539
header Spam tool pattern in MIME boundary MIME_BOUND_DD_DIGITS 3.600 4.230 4.162 4.139
header Spam tool pattern in MIME boundary MIME_BOUND_DIGITS_7 0 0 1.460 0.893
header Spam tool pattern in MIME boundary MIME_BOUND_DIGITS_15 2.674 3.286 3.120 3.400
header Spam tool pattern in MIME boundary MIME_BOUND_MANY_HEX 1.920 2.255 2.590 2.700
header Spam tool pattern in MIME boundary (rfkindy) MIME_BOUND_RKFINDY 2.080 2.347 2.590 2.671
header To: has a malformed address TO_MALFORMED 0.895 2.253 0.455 2.187
header From address is webmail, but starts with a number FROM_NUM_AT_WEBMAIL 1.389 0.258 1.901 1.617
header From webmail service and address ends in numbers FROM_WEBMAIL_END_NUMS6 0.178 0.046 0.389 0.000
header From Address contains FREE ADDR_FREE 0.194 0.078 1.038 1.832
header Sent to a text file TO_TXT 0 0 1.362 1.580
header Involves ‘china.com’ CHINA_HEADER 1.840 1.911 2.312 2.386
header Received line contains spam-sign (lowercase smtp) WITH_LC_SMTP 1.600 0.235 1.862 2.200
header From address has no lower-case characters FROM_NO_LOWER 1.010 1.307 1.650 0.377
header Subject line starts with Buy or Buying SUBJ_BUY 0.565 0.490 0.414 0.000
header Subject is indicative of a Nigerian spam NIGERIAN_SUBJECT1 0 0 0.270 0
header Subject is indicative of a Nigerian spam NIGERIAN_SUBJECT2 1.235 1.765 1.935 2.090
header Message would have been caught by accessdb ACCESSDB 1
header Received headers forged (AM/PM) RCVD_AM_PM 1.558 0.091 1.802 1.927
header Multiple Content-Type headers found HEADER_COUNT_CTYPE 1.198 1.676 1.482 1.771
header Host HELO’d as a big ISP, but had no rDNS NO_RDNS_DOTCOM_HELO 0.025 0.024 0.601 0.016
header X-Originating-IP doesn’t look like IPv4 address X_ORIG_IP_NOT_IPV4 0 1.006 0.081 2.582
header X-Authentication-Warning header looks faked X_AUTH_WARN_FAKED 2.094 2.599 1.654 3.105
header Received header contains faked ‘mr.outblaze.com’ FAKE_OUTBLAZE_RCVD 2.400 2.726 2.867 3.100
header Message is from domain that never sends email FROM_NONSENDING_DOMAIN 1.486 0.308 1.678 0.000
header Subject contains common spam sign (2 numbers) SUBJ_2_NUM_PARENS 1.472 0.276 1.672 2.102
body HTML included in message HTML_MESSAGE 0.001
body Message is 0% to 10% HTML HTML_00_10 0.985 0.138 1.070 1.068
body Message is 10% to 20% HTML HTML_10_20 1.050 0.295 1.350 0.246
body Message is 20% to 30% HTML HTML_20_30 1.241 0.504 0.567 0.226
body Message is 30% to 40% HTML HTML_30_40 0.879 0.056 0.437 0.021
body Message is 40% to 50% HTML HTML_40_50 0.527 0.086 0.052 0.035
body Message is 50% to 60% HTML HTML_50_60 1.053 0.095 0.539 0.087
body Message is 60% to 70% HTML HTML_60_70 0.516 0.027 0 0
body Message is 70% to 80% HTML HTML_70_80 0.151 0 0.039 0
body Message is 80% to 90% HTML HTML_80_90 0.027 0 0.036 0.146
body Message is 90% to 100% HTML HTML_90_100 0.346 0.189 0.043 0.022
body HTML has very strong “shouting” markup HTML_SHOUTING3 0.266 0 0.012 0.019
body HTML has very strong “shouting” markup HTML_SHOUTING4 0.076 0 0.052 0
body HTML has very strong “shouting” markup HTML_SHOUTING5 0.026 0 0.030 0.019
body HTML has very strong “shouting” markup HTML_SHOUTING6 0 0.004 0 0.000
body HTML has very strong “shouting” markup HTML_SHOUTING7 0.450 0.472 0 0.646
body HTML contains text after HTML close tag HTML_TEXT_AFTER_HTML 0.312 0.205 0.032 0.031
body HTML contains text after BODY close tag HTML_TEXT_AFTER_BODY 0.263 0.151 0.752 0.061
body HTML comment is very short HTML_COMMENT_SHORT 0.014 0.625 0 0.000
body HTML message is a saved web page HTML_COMMENT_SAVED_URL 0.528 0.130 0.470 0.146
body HTML conversion tool used by spam HTML_CONVERTED 0 1.204 0.402 1.605
body HTML with embedded plugin object HTML_EMBEDS 0 0.084 0.108 0.207
body HTML contains unsafe auto-executing code HTML_EVENT_UNSAFE 0 0 0.022 0.515
body HTML font size is tiny HTML_FONT_SIZE_TINY 0 0.419 0 0.533
body HTML font size is negative HTML_FONT_SIZE_NONE 0 0.455 1.119 0.033
body HTML font size is large HTML_FONT_SIZE_LARGE 1.387 0.712 0.496 0.153
body HTML font size is huge HTML_FONT_SIZE_HUGE 1.796 1.278 2.265 2.594
body HTML tag for a big font size HTML_FONT_BIG 0 0.232 0 0.142
body HTML tag for a tiny font size HTML_FONT_TINY 2.141 0.471 0.521 0.964
body HTML font color is same as background HTML_FONT_INVISIBLE 0 0.065 0 0.036
body HTML font color similar to background HTML_FONT_LOW_CONTRAST 1.011 0.955 1.017 0.788
body HTML font face is not a word HTML_FONT_FACE_BAD 0 0 0.044 0.037
body HTML font face has excess capital characters HTML_FONT_FACE_CAPS 0 0.804 0.281 0.247
body HTML includes a form which sends mail HTML_FORMACTION_MAILTO 1.840 2.162 1.907 2.353
body HTML: images with 0-400 bytes of words HTML_IMAGE_ONLY_04 3.120 3.094 3.482 3.304
body HTML: images with 400-800 bytes of words HTML_IMAGE_ONLY_08 2.881 1.970 2.730 3.036
body HTML: images with 800-1200 bytes of words HTML_IMAGE_ONLY_12 2.360 1.473 2.741 2.942
body HTML: images with 1200-1600 bytes of words HTML_IMAGE_ONLY_16 1.352 1.279 1.990 1.047
body HTML: images with 1600-2000 bytes of words HTML_IMAGE_ONLY_20 1.567 0.843 1.023 0.446
body HTML: images with 2000-2400 bytes of words HTML_IMAGE_ONLY_24 1.088 1.003 0.787 0.502
body HTML has a low ratio of text to image area HTML_IMAGE_RATIO_02 1.729 0 1.125 0.018
body HTML has a low ratio of text to image area HTML_IMAGE_RATIO_04 1.038 0.184 0.515 0.105
body HTML has a low ratio of text to image area HTML_IMAGE_RATIO_06 0.072 0 0.342 0.131
body HTML has a low ratio of text to image area HTML_IMAGE_RATIO_08 0 0.000 0 0.032
body HTML link text says “push here” or similar HTML_LINK_PUSH_HERE 1.627 0.409 1.843 0.873
body Message is 5% to 10% HTML obfuscation HTML_OBFUSCATE_05_10 0.428 0.483 0.563 0.257
body Message is 10% to 20% HTML obfuscation HTML_OBFUSCATE_10_20 0.931 0.732 0.796 0.865
body Message is 20% to 30% HTML obfuscation HTML_OBFUSCATE_20_30 0.997 0.597 0.014 0.000
body Message is 30% to 40% HTML obfuscation HTML_OBFUSCATE_30_40 2.517 1.933 3.005 3.445
body Message is 40% to 50% HTML obfuscation HTML_OBFUSCATE_40_50 2.641 1.746 2.739 3.089
body Message is 50% to 60% HTML obfuscation HTML_OBFUSCATE_50_60 2.635 1.339 2.882 3.325
body Message is 60% to 70% HTML obfuscation HTML_OBFUSCATE_60_70 2.257 0.971 2.432 2.805
body Message is 70% to 80% HTML obfuscation HTML_OBFUSCATE_70_80 2.308 1.334 2.256 2.689
body Message is 80% to 90% HTML obfuscation HTML_OBFUSCATE_80_90 1.600 0.489 1.656 1.939
body Message is 90% to 100% HTML obfuscation HTML_OBFUSCATE_90_100 1.405 0.203 1.657 1.775
body HTML tags used to obfuscate words HTML_BACKHAIR_2 0.144 0 0.032 0
body HTML tags used to obfuscate words HTML_BACKHAIR_4 0 0 0.138 0.058
body HTML tags used to obfuscate words HTML_BACKHAIR_8 1.075 0.569 1.137 0.727
body HTML has many bad attributes in tags HTML_ATTR_BAD 0 0.101 0.609 2.354
body HTML appears to have random attributes in tags HTML_ATTR_UNIQUE 0.441 1.165 1.097 0.000
body Image tag intended to identify you HTML_WEB_BUGS 0.166 0.013 0.311 0.035
body HTML has unbalanced “body” tags HTML_TAG_BALANCE_BODY 0.043 0.389 0.096 0.000
body HTML has unbalanced “head” tags HTML_TAG_BALANCE_HEAD 0.061 0.860 0.033 0.000
body HTML has “marquee” tag HTML_TAG_EXIST_MARQUEE 2.160 1.758 1.840 2.034
body HTML has “tbody” tag HTML_TAG_EXIST_TBODY 1.014 0.233 0.079 0.114
body HTML message is 0% to 10% bad tags HTML_BADTAG_00_10 0 0 0.001 0.000
body HTML message is 10% to 20% bad tags HTML_BADTAG_10_20 0.236 0 0 0
body HTML message is 20% to 30% bad tags HTML_BADTAG_20_30 0 0.169 0.035 0
body HTML message is 30% to 40% bad tags HTML_BADTAG_30_40 0 0.103 0.017 0
body HTML message is 40% to 50% bad tags HTML_BADTAG_40_50 0.002 0 0.000 0.010
body HTML message is 50% to 60% bad tags HTML_BADTAG_50_60 0.864 0.430 1.035 0.153
body HTML message is 60% to 70% bad tags HTML_BADTAG_60_70 1.726 1.127 2.314 1.356
body HTML message is 70% to 80% bad tags HTML_BADTAG_70_80 1.657 0.075 2.087 2.280
body HTML message is 80% to 90% bad tags HTML_BADTAG_80_90 1.861 1.309 1.831 1.911
body HTML message is 90% to 100% bad tags HTML_BADTAG_90_100 0.746 1.192 2.688 2.804
body 0% to 10% of HTML elements are non-standard HTML_NONELEMENT_00_10 0 0 0.001 0.001
body 10% to 20% of HTML elements are non-standard HTML_NONELEMENT_10_20 0.045 0 0.000 0.000
body 20% to 30% of HTML elements are non-standard HTML_NONELEMENT_20_30 0.346 0.070 0 0
body 30% to 40% of HTML elements are non-standard HTML_NONELEMENT_30_40 0 0.012 0.010 0.000
body 40% to 50% of HTML elements are non-standard HTML_NONELEMENT_40_50 0.000
body 50% to 60% of HTML elements are non-standard HTML_NONELEMENT_50_60 1
body 60% to 70% of HTML elements are non-standard HTML_NONELEMENT_60_70 0.237 1.138 0.083 0.001
body 70% to 80% of HTML elements are non-standard HTML_NONELEMENT_70_80 0.488 0.803 1.169 0.000
body 80% to 90% of HTML elements are non-standard HTML_NONELEMENT_80_90 0.016 0.492 0.023 0.000
body 90% to 100% of HTML elements are non-standard HTML_NONELEMENT_90_100 0.011 1.582 0 2.963
body HTML is extremely short HTML_SHORT_LENGTH 0.601 0.713 0.068 0.389
body HTML title contains no text HTML_TITLE_EMPTY 0.022 0.045 0.036 0.004
body HTML title contains “Untitled” HTML_TITLE_UNTITLED 0.222 0.259 0.792 0.000
rawbody Javascript to hide URLs in browser HIDE_WIN_STATUS 0.032 0 0 0.063
rawbody HTML contains needlessly encoded characters ENTITY_DEC_ALPHANUM 0.012 0 2.686 2.716
body List removal information MULTI_REMOVAL_1WORD 1.005 0 0.916 0.802
body Send real mail to be unsubscribed REMOVE_POSTAL 1.520 1.362 1.757 1.900
body Asks you to click below (in capital letters) CLICK_BELOW_CAPS 0.135 0 0 0.112
body Click to be removed CLICK_TO_REMOVE_1 0.050 0 0.192 0.791
body Claims compliance with spam regulations SENT_IN_COMPLIANCE 1.520 1.786 1.850 2.000
body Possible mention of bill 1618 (anti-spam bill) BILL_1618 0.994 1.692 1.798 1.895
body Doesn’t ask any questions NO_QS_ASKED 0 1.196 0 0.000
body Offers a full refund FULL_REFUND 0.853 1.114 0.079 1.272
body No such thing as a free lunch (2) COMPLETELY_FREE 0.086 0 0.840 0.026
body No such thing as a free lunch (3) NO_COST 0.078 0 0.335 0.000
body One hundred percent guaranteed GUARANTEED_100_PERCENT 0.615 0.435 0.669 0.000
body Dear Friend? That’s not very dear! DEAR_FRIEND 0.542 0.766 1.288 0.070
body Contains ‘Dear (something)’ DEAR_SOMETHING 1.059 0.803 1.577 1.578
body Talks about lots of money BILLION_DOLLARS 0.193 1.185 0.407 0.134
body Talks about opting out (lowercase version) OPTING_OUT 0.157 0.494 0.030 0.479
body Talks about opting out (capitalized version) OPTING_OUT_CAPS 0.067 0.026 0.483 0.000
body Get a million email addresses MILLION_EMAIL 0.093 0.417 0.937 0.000
body Gives a lame excuse about why spam was sent EXCUSE_1 0 0 0.074 0.132
body Claims you can be removed from the list EXCUSE_3 0 0.098 0.015 0.116
body Claims you can be removed from the list EXCUSE_4 1.145 1.775 1.443 1.119
body Claims you can be removed from the list EXCUSE_6 1.444 0.734 1.782 1.696
body Claims you can be removed from the list EXCUSE_7 0 0.152 0.010 0.018
body “if you do not wish to receive any more” EXCUSE_10 0.071 0.380 0.039 0.024
body Nobody’s perfect EXCUSE_12 0.153 0 0.354 0.197
body Claims you opted-in or registered EXCUSE_19 0.056 0.357 0.021 0.000
body Claims you have provided permission EXCUSE_23 1.840 2.088 2.312 2.400
body Claims you wanted this ad EXCUSE_24 1.440 1.272 1.874 2.080
body Talks about how to be removed from mailings EXCUSE_REMOVE 0.043 0 0.513 0.310
body Targeted Traffic / Email Addresses TARGETED 0 0.692 1.471 0.480
body Tells you about a strong buy STRONG_BUY 2.880 3.384 3.018 3.117
body Claims to honor removal requests WE_HONOR_ALL 2.063 2.365 1.789 2.029
body Offers a picked stock STOCK_PICK 0.106 0.150 0.041 1.470
body Offers an alert about a stock STOCK_ALERT 2.362 1.782 2.378 2.385
body SEC-mandated penny-stock warning MICRO_CAP_WARNING 1.440 0.760 1.803 1.828
body Not registered investment advisor NOT_ADVISOR 2.160 2.444 2.590 2.700
body Describes some sort of breakthrough SOME_BREAKTHROUGH 0.232 1.921 0.907 1.610
body They have selected you for something SELECTED_YOU 1.485 1.865 1.841 1.897
body Contains mail-in order form MAIL_IN_ORDER_FORM 1.440 0.351 0 0
body University Diplomas UNIVERSITY_DIPLOMAS 2.242 0.523 0 0
body ‘Prestigious Non-Accredited Universities’ PREST_NON_ACCREDITED 1.520 1.394 1.607 1.901
body Claims “cannot be considered spam” CANNOT_BE_SPAM 0 0 1.546 1.769
body Information on growing body parts BODY_ENHANCEMENT 0.151 0.481 0.070 0
body Information on getting larger body parts BODY_ENHANCEMENT2 0.814 0.845 0.109 0
body Impotence cure IMPOTENCE 0.095 0.751 0 0.094
body Information on how to work at home (1) WORK_AT_HOME 0 0 0.325 0.030
body Information on mortgages MORTGAGE_BEST 0.948 0.923 0 0.144
body Looks like mortgage pitch MORTGAGE_PITCH 0.297 0 0.065 0
body Information on mortgage rates MORTGAGE_RATES 0 0.689 0.174 0.202
body Order a report from someone ORDER_REPORT 0 0 1.230 0
rawbody mailto URI includes removal text MAILTO_SUBJ_REMOVE 1.023 0 2.064 0.542
body Includes a link for AOL users to click AOL_USERS_LINK 0 0 0.034 0.109
body Talks about a million North American dollars NA_DOLLARS 2.078 2.193 2.485 2.611
body Mentions millions of $ ($NN,NNN,NNN.NN) US_DOLLARS_3 0.331 0.411 0.010 0.354
body Talks about millions of dollars MILLION_USD 1.594 1.290 1.535 2.796
rawbody Frontpage used to create the message FRONTPAGE 0.510 0.529 0.595 2.080
body Contains “My wife, Jody” testimonial JODY 0 0 1.326 0
body Doing something with my income YOUR_INCOME 0.674 0.892 0.372 1.092
body Resistance to this spam is futile RESISTANCE_IS_FUTILE 1.520 1.786 1.850 0
body Contains ‘subject to credit approval’ SUBJ_2_CREDIT 0 0.500 0 0.076
body Contains urgent matter URG_BIZ 0.288 0.030 1.064 1.808
body Contains ‘earn $something per week’ EARN_PER_WEEK 1.360 0.856 1.757 1.896
body Spam is 100% natural?! ALL_NATURAL 2.640 1.828 2.246 1.061
body Money back guarantee MONEY_BACK 2.051 0.037 0.217 0.095
body There is no catch NO_CATCH 0 0 0.127 0
body There is no obligation NO_OBLIGATION 0.905 0.565 1.157 0.830
body You won’t be “disappointed” NO_DISAPPOINTMENT 0 1.498 1.609 0.410
body Serious Enquiries Only SERIOUS_ONLY 0 0 1.664 1.748
body Risk free. Suuurreeee…. RISK_FREE 0.036 0.247 0.135 0.230
body As seen on national TV! AS_SEEN_ON 0.393 0.320 0.613 0.020
body Common pyramid scheme phrase (1) COPY_ACCURATELY 0 0 1.324 0
body Off Shore Scams OFFSHORE_SCAM 0 0.337 0.127 0.144
body Why Pay More? WHY_PAY_MORE 1.249 0 1.713 1.978
body Congratulations – you’ve been scammed? CONGRATULATIONS 0 0 0.486 0.272
body Talks about free mobile phones CELL_PHONE_FREE 1.280 1.476 1.571 0.922
body Talks about cell-phone signal improvement CELL_PHONE_IMPROVE 0.771 0.812 1.655 1.031
body Receive a special offer RECEIVE_OFFER 1.125 0.955 1.446 0.793
body Free express or no-obligation quote FREE_QUOTE_INSTANT 0.211 1.736 0.051 0.001
body Free Membership FREE_MEMBERSHIP 0.492 1.182 1.587 0.873
body Credit Card Offers CREDIT_CARD 0.030 0.896 0.032 0.310
body Without a credit check NO_CREDIT_CHECK 0 0 1.990 0.037
body Avoiding bankruptcy BANKRUPTCY 0.249 1.088 1.112 0.489
body Accepting credit cards ACCEPT_CREDIT_CARDS 0.360 0 1.332 0.399
body Eliminate Bad Credit BAD_CREDIT 1.161 0.252 0.817 0
body Non-secured Credit/Debt NONSECURED_CREDIT 0 0 1.074 0
body Consolidate debt, credit, or bills CONSOLIDATE_DEBT 0.886 0.653 0 0.245
body Home refinancing REFINANCE_YOUR_HOME 1.321 0.394 0.917 0.340
body Home refinancing REFINANCE_NOW 1.611 0 1.191 0.029
body No Purchase Necessary NO_PURCHASE 0 0 0.107 0
body No Medical Exams NO_MEDICAL 1.440 1.656 1.665 0
body No Claim Forms NO_FORMS 1.622 0.973 0.912 0.011
body Requires Initial Investment INITIAL_INVEST 0.433 0.450 1.026 1.230
body Buy Direct BUY_DIRECT 1.502 1.779 1.757 1.663
body Do it Today DO_IT_TODAY 0.036 0.047 0 0
body What are you waiting for WHY_WAIT 2.240 2.060 0.796 0.764
body You can search for anyone YOU_CAN_SEARCH 1.370 0.444 1.246 1.630
body Score with babes! SEDUCTION 1.560 1.356 1.415 1.054
body Invaluable marketing information INVALUABLE_MARKETING 0 0 1.201 0
body Guaranteed Stuff GUARANTEED_STUFF 0.100 0.238 0.403 0.000
body Potential Earnings EARNINGS 0 0 1.642 1.675
body The best Rates THE_BEST_RATE 0 0.550 0 0.000
body Amazing Stuff AMAZING_STUFF 0.949 1.269 0.069 0.102
body Lose Weight Spam DIET_1 0.671 0.365 0.274 0
body Describes weight loss DIET_2 0.545 0 1.034 0.316
body Describes body fat loss DIET_3 1.794 1.061 1.835 2.073
body Reverses Aging REVERSE_AGING 1.919 1.403 2.057 2.150
body Cures Baldness HAIR_LOSS 1.381 2.371 1.428 1.738
body Removes Wrinkles WRINKLES 1.730 2.097 1.917 2.091
body While you Sleep WHILE_YOU_SLEEP 0.858 0.605 1.786 0.000
body If only it were that easy RICH 0 0.451 0 0.000
body Who really wins? YOU_WON 0.144 0.269 0 0.579
body Talks about Hidden Charges HIDDEN_CHARGES 0.046 0.961 0 0.000
body Freedom of a financial nature FIN_FREE 1.365 0.015 1.865 0.788
body Stock Disclaimer Statement FORWARD_LOOKING 1.840 2.162 2.120 2.200
body Mail guarantees satisfaction SATIS_GUAR 0.884 0 0.825 0.081
body Offers Extra Cash EXTRA_CASH 0.117 0.987 0.629 0.447
body Get Paid GET_PAID 1.390 1.764 1.466 0.862
body Have you been turned down? BEEN_TURNED_DOWN 1.336 1.266 1.682 1.890
body One Time Rip Off ONE_TIME 0.044 0 0.036 0.619
body Compete for your business COMPETE 1.600 1.791 1.804 2.050
body Meet Singles MEET_SINGLES 1.600 0 1.076 1.172
body Join Millions of Americans JOIN_MILLIONS 0.036 0.640 0.999 0.448
body Be your own boss BE_BOSS 1.512 0.145 1.847 1.648
body Multi Level Marketing mentioned ML_MARKETING 0.049 0 0.103 0
body Claims to be Legal ITS_LEGAL 0.186 1.109 0.432 0.264
body Confidentiality on all orders CONFIDENTIAL_ORDER 1.920 1.196 1.889 1.266
body Save big money SAVE_THOUSANDS 0.929 1.889 0.717 0.031
body Claims you registered with a partner MARKETING_PARTNERS 2.025 0.718 2.405 1.401
body Free Preview FREE_PREVIEW 1.612 0.376 1.887 1.851
body Domain name containing a “4u” variant DOMAIN_4U2 1.508 1.783 1.935 1.588
body Contains ‘free access’ with capitals FREE_ACCESS 0 0 0.253 0
body Contains ‘free sample’ with capitals FREE_SAMPLE 0.089 0.168 0.223 0.941
body Lowest Price LOW_PRICE 0.885 0 0.206 0
body People just leave money lying around UNCLAIMED_MONEY 1.263 1.703 1.945 1.584
body Message seems to contain rot13ed address OBSCURED_EMAIL 2.720 3.194 3.186 3.132
body Mentions their affiliate partners OUR_AFFILIATE_PARTNERS 0 0 0.041 1.443
body Talks about exercise with an exclamation! BANG_EXERCISE 1.450 1.993 1.662 1.442
body Talks about more with an exclamation! BANG_MORE 0.287 0 0.294 0
body Talks about Oprah with an exclamation! BANG_OPRAH 0.666 0.212 1.717 1.975
body Talks about quotes with an exclamation! BANG_QUOTE 1.680 1.880 1.942 1.964
body Talks about ‘acting now’ with capitals ACT_NOW_CAPS 0.222 0 0.426 0.093
body Talks about ‘starting now’ with capitals START_NOW_CAPS 1.280 1.499 1.124 0.857
body Talks about a bigger drive for sex MORE_SEX 2.240 1.762 2.287 2.422
body Something is emphatically guaranteed BANG_GUAR 0.297 0 0.254 0
body See for yourself SEE_FOR_YOURSELF 0.544 0.381 0.591 0.044
body Possible porn – Free Porn FREE_PORN 0.794 0.023 1.937 0.000
body Possible porn – Cum Shot CUM_SHOT 0.355 1.732 0.943 0
body Possible porn – Pay Site PAY_SITE 0 0 1.850 1.900
body Possible porn – Live Porn LIVE_PORN 0.040 0.360 0.019 0.000
body Possible porn – Hardcore Porn HARDCORE_PORN 1.520 0.665 1.850 0.684
body Possible porn – Hot, Nasty, Wild, Young HOT_NASTY 0.765 0.586 0.967 0.088
body Possible porn – Best, Largest, Most Porn BEST_PORN 0.566 0.263 0.044 0
body Possible porn – Nasty Girls NASTY_GIRLS 0.350 0.439 0.022 2.196
body Possible porn – Amateur Porn AMATEUR_PORN 1.397 0.769 1.615 1.744
body Possible porn – Celebrity Porn PORN_CELEBRITY 0.675 1.569 0.319 0.038
body Possible porn – Adult Web Sites SOMETHING_FOR_ADULTS 1.433 1.513 1.614 0.006
body Possible porn – various types of feline PORN_15 1.680 1.974 2.035 2.168
body Possible porn – nasty, dirty, little etc. PORN_16 0.907 0.462 1.305 0.017
body Thousands or millions of pictures, movies, etc. LOTS_OF_STUFF 0.839 0.029 0 0.000
body Attempts to disguise porn words DISGUISE_PORN 1.490 1.835 0.798 0.030
uri URL uses words/phrases which indicate porn (sex) PORN_URL_SEX 1.865 1.427 1.817 0.011
uri URL uses words/phrases which indicate porn (slut) PORN_URL_SLUT 0.941 1.022 0.194 0.094
uri URL uses words/phrases which indicate porn (misc) PORN_URL_MISC 1.728 0.573 1.767 1.620
header Subject indicates sexually-explicit content SUBJECT_SEXUAL 2.160 2.538 2.775 2.900
header Bulk email fingerprint (eGroups) found RATWARE_EGROUPS 2.180 2.701 2.552 2.805
header Bulk email fingerprint (hash 2) found RATWARE_HASH_2 0.039 0 0.085 0.037
header Bulk email fingerprint (hash 2 v2) found RATWARE_HASH_2_V2 1.798 1.319 1.767 0.980
header Bulk email fingerprint (jpfree) found RATWARE_JPFREE 0 0 1.942 2.100
uri Bulk email fingerprint (StormPost) found RATWARE_STORM_URI 1.920 1.518 2.405 2.295
header X-Mailer has malformed Outlook Express version RATWARE_OE_MALFORMED 2.160 2.407 2.522 2.588
header Bulk email fingerprint (‘esmtp’ Received) found RATWARE_RCVD_LC_ESMTP 1.745 1.474 2.122 2.083
header Bulk email fingerprint (Mozilla malformed) found RATWARE_MOZ_MALFORMED 1.594 0.990 1.752 0.558
rawbody Contains a hashbuster in Send-Safe format RATWARE_HASH_DASH 1.133 0.947 1.500 1.646
header Bulk email fingerprint (netIP) found RATWARE_NETIP 0.439 1.033 2.312 2.286
header Bulk email fingerprint (Gecko faked) found RATWARE_GECKO_BUILD 0 0.826 0.784 1.385
header Headers are in order found in spam (MTSRIX) HDR_ORDER_MTSRIX 0.417 0.391 0.192 1.057
header Headers are in order found in spam (TRIMRS) HDR_ORDER_TRIMRS 2.320 2.674 2.220 2.199
header Bulk email fingerprint (bonus space) found RCVD_BONUS_SPC_DATE 1.371 0.904 1.575 1.872
header Bulk email fingerprint (X-Message-Info) found X_MESSAGE_INFO 3.600 4.187 4.162 4.244
header Bulk email fingerprint (Received PF) found RATWARE_RCVD_PF 2.880 3.384 3.608 3.867
header Bulk email fingerprint (Received @) found RATWARE_RCVD_AT 2.550 1.011 2.691 3.415
uri Uses a numeric IP address in URL NUMERIC_HTTP_ADDR 1.565 1.572 1.872 2.135
uri Uses a dotted-decimal IP address in URL NORMAL_HTTP_TO_IP 0.104 0.080 0.830 0.028
uri Uses %-escapes inside a URL’s hostname HTTP_ESCAPED_HOST 0.034 0.094 0 0.477
uri Uses control sequences inside a URL hostname HTTP_CTRL_CHARS_HOST 1.440 1.670 1.757 1.900
uri Completely unnecessary %-escapes inside a URL HTTP_EXCESSIVE_ESCAPES 0 0.645 0 0.151
uri Dotted-decimal IP address followed by CGI IP_LINK_PLUS 0.211 0.024 0.192 0.232
uri URL of page called “remove” REMOVE_PAGE 0.081 0.604 0 0.191
uri Includes a link to a likely spammer email MAILTO_TO_SPAM_ADDR 0 0 0.106 0
uri Includes a ‘remove’ email address MAILTO_TO_REMOVE 0.886 0 0.065 0.116
uri Uses non-standard port number for HTTP WEIRD_PORT 0 0.507 0.228 0.109
uri URL contains username and (optional) password USERPASS 0.429 0.561 1.319 0.268
uri Filename is just a ‘#’; probably a JS trick URI_IS_POUND 0 0.333 0 0
uri Includes a link to a likely spammer domain BARGAIN_URL 1.503 1.520 1.686 1.833
uri Contains a URL in the BIZ top-level domain BIZ_TLD 2.167 0.527 2.434 2.288
uri Contains a URL in the INFO top-level domain INFO_TLD 1.717 0.481 1.686 0.000
uri Has Yahoo Redirect URI YAHOO_RD_REDIR 1.237 1.083 1.366 1.642
uri Has Yahoo Redirect URI YAHOO_DRS_REDIR 1.911 0.911 1.956 0.984
uri Message has link to company offers URI_OFFERS 1.328 0.252 1.460 0.770
uri Message has URI 4you URI_4YOU 1.027 1.812 0.898 1.966
uri Contains a URI to a document hosted at ‘terra.es’ TERRA_ES 1.367 0.816 1.746 2.612
uri Contains a URL-encoded hostname (HTTP77) HTTP_77 1.514 0.605 1.812 1.981
uri Contains a URI with an affiliate ID code URI_AFFILIATE 2.243 0 1.808 2.052
header Message has HTTP redirector URI URI_REDIRECTOR 0 0 0.031 0.011
body Bayesian spam probability is 0 to 1% BAYES_00 0 0 -1.665 -2.599
body Bayesian spam probability is 1 to 5% BAYES_05 0 0 -0.925 -0.413
body Bayesian spam probability is 5 to 20% BAYES_20 0 0 -0.730 -1.951
body Bayesian spam probability is 20 to 40% BAYES_40 0 0 -0.276 -1.096
body Bayesian spam probability is 40 to 60% BAYES_50 0 0 1.567 0.001
body Bayesian spam probability is 60 to 80% BAYES_60 0 0 3.515 1.0
body Bayesian spam probability is 80 to 95% BAYES_80 0 0 3.608 2.0
body Bayesian spam probability is 95 to 99% BAYES_95 0 0 3.514 3.0
body Bayesian spam probability is 99 to 100% BAYES_99 0 0 4.070 3.5
body es Claims you can be removed in Spanish REMOVE_ES_01 1
body es Claims you can be removed in Spanish REMOVE_ES_02 1
body es Claims you can be removed in Spanish REMOVE_ES_03 1
body es Claims you can be removed in Spanish REMOVE_ES_04 1
body es Claims you will be opted out if you send an email REMOVE_ES_05 1
body es Claims you can opt-out REMOVE_ES_06 1
body es Claims you can opt-out REMOVE_ES_07 1
body es Claims you can opt-out REMOVE_ES_08 1
body es If you want to subscribe… SUBSCRIBE_ES_01 1
body es Claims not to be spam in Spanish EXCUSE_ES_01 1
body es Someone felt free to send you a message, in Spanish EXCUSE_ES_02 1
body es Someone asked a spammer to spam you, in Spanish EXCUSE_ES_03 1
body es Email as a commercial alternative EXCUSE_ES_05 1
body es Message sent by mistake EXCUSE_ES_06 1
body es Cannot be considered spam EXCUSE_ES_07 1
body es To quit smoking DEJAR_DE_FUMAR_ES 1
body es THEY SHOUT AT US TO SAY IT IS FREE GRATIS_ES 1.4
body es Encourages us to reply if we are interested INTERESADO_ES 1
body es Claims to comply with the law LEY_ORGANICA_ES 2.0
body es Claims to comply with spam regulations NORMATIVA_SPAM_ES 2.0
body es There is no legislation in Chile against spam LEY_CHILE_ES_01 1
body es Claims to comply with Chilean legislation LEY_CHILE_ES_02 1
body es Legal (?) immigration to the United States TARJETA_VERDE_ES 1
body es Special promotion. PROMOCION_ES 1
body es Registration with Spanish-language search engines. ALTA_BUSCADORES_ES 1
body es IMPERATIVES/EXCLAMATIONS IN CAPITALS. EXCLAMACION_ES 1
body es Introduction of a new product. PRESENTAMOS_ES 1
body es Cash on delivery. CONTRA_REEMBOLSO_ES 1
body es To place your order. PEDIDO_ES 1
body es Click here. CLICK_ES 1
body es Gifts do not exist, except from our friends. REGALO_ES 1
body es You may be a winner. GANADORES_ES_01 1
body es You have won. GANADORES_ES_02 1
body es Free porn. PORNO_GRATIS_ES 1
body es More information. MAS_INFORMACION_ES 1
body es Information and booking INFORMACION_RESERVA_ES 1
body es Become a spammer. REENVIA_ES 1
body es They won’t send us any more spam… sure they won’t. NO_MAS_MAIL_1_ES 1
body es You will not receive this spam again… sure you won’t. NO_MAS_MAIL_2_ES 1
body es The addresses were obtained from the internet. COLECTOR_DE_MAILS_ES 1
header Contains valid Hashcash token (20 bits) HASHCASH_20 -0.500
header Contains valid Hashcash token (21 bits) HASHCASH_21 -0.700
header Contains valid Hashcash token (22 bits) HASHCASH_22 -1.000
header Contains valid Hashcash token (23 bits) HASHCASH_23 -2.000
header Contains valid Hashcash token (24 bits) HASHCASH_24 -3.000
header Contains valid Hashcash token (25 bits) HASHCASH_25 -4.000
header Contains valid Hashcash token (>25 bits) HASHCASH_HIGH -5.000
header Hashcash token already spent in another mail HASHCASH_2SPEND 0.100
header SPF: sender matches SPF record SPF_PASS -0.001
header SPF: sender does not match SPF record (fail) SPF_FAIL 0 0.001 0 0.875
header SPF: sender does not match SPF record (softfail) SPF_SOFTFAIL 0.500 0.842 0.500 0.500
header SPF: HELO matches SPF record SPF_HELO_PASS -0.001
header SPF: HELO does not match SPF record (fail) SPF_HELO_FAIL 0 0.405 0 0.001
header SPF: HELO does not match SPF record (softfail) SPF_HELO_SOFTFAIL 0 1.002 0 3.140
body Contains a URL listed in the SBL blocklist URIBL_SBL 0 0.629 0 0.996
body Contains a URL listed in the SC SURBL blocklist URIBL_SC_SURBL 0 3.897 0 4.263
body Contains a URL listed in the WS SURBL blocklist URIBL_WS_SURBL 0 0.539 0 1.462
body Contains a URL listed in the PH SURBL blocklist URIBL_PH_SURBL 0 0.839 0 2.000
body Contains a URL listed in the OB SURBL blocklist URIBL_OB_SURBL 0 1.996 0 3.213
body Contains a URL listed in the AB SURBL blocklist URIBL_AB_SURBL 0 2.007 0 0.417
header From: address is in the auto white-list AWL 1
header From: address is in the user’s black-list USER_IN_BLACKLIST 100.000
header From: address is in the user’s white-list USER_IN_WHITELIST -100.000
header From: address is in the default white-list USER_IN_DEF_WHITELIST -15.000
header User is listed in ‘blacklist_to’ USER_IN_BLACKLIST_TO 10.000
header User is listed in ‘whitelist_to’ USER_IN_WHITELIST_TO -6.000
header User is listed in ‘more_spam_to’ USER_IN_MORE_SPAM_TO -20.000
header User is listed in ‘all_spam_to’ USER_IN_ALL_SPAM_TO -100.000
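
To make the four score columns concrete, here is a rough Python sketch of how a filter of this kind might turn rule hits into a verdict. It is an illustration only, not the filter's actual code. The rule names and scores are copied from rows of the table above. The helper names (`rule_score`, `total_score`, `is_spam`) and the threshold of 5.0 are assumptions made for this example. A row with a single score is treated as using that score in every column.

```python
from typing import Dict, Iterable, Sequence

# Column order from the table header: (local, net, with bayes, with bayes+net).
SCORE_SETS = ("local", "net", "bayes", "bayes+net")

# A handful of rows copied from the table above.
# A one-element tuple means the row lists a single score for all columns.
RULES: Dict[str, Sequence[float]] = {
    "MILLION_USD": (1.594, 1.290, 1.535, 2.796),
    "BAYES_99": (0, 0, 4.070, 3.5),
    "URIBL_SC_SURBL": (0, 3.897, 0, 4.263),
    "HASHCASH_20": (-0.500,),
    "USER_IN_WHITELIST": (-100.000,),
}


def rule_score(name: str, score_set: str) -> float:
    """Return the score of one rule for the chosen column (hypothetical helper)."""
    scores = RULES[name]
    if len(scores) == 1:
        return scores[0]
    return scores[SCORE_SETS.index(score_set)]


def total_score(hits: Iterable[str], score_set: str) -> float:
    """Add up the scores of every rule that matched the message."""
    return sum(rule_score(name, score_set) for name in hits)


def is_spam(hits: Iterable[str], score_set: str, threshold: float = 5.0) -> bool:
    """Flag the message when the total reaches the threshold (assumed to be 5.0)."""
    return total_score(hits, score_set) >= threshold


if __name__ == "__main__":
    hits = ["MILLION_USD", "BAYES_99"]
    for score_set in SCORE_SETS:
        print(score_set, round(total_score(hits, score_set), 3), is_spam(hits, score_set))
```

The same two hits give very different totals depending on which column is in use, which is why the table lists four scores per test. A single whitelist hit (-100) outweighs almost any combination of positive scores.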

 
