“Unsolicited” is Right

Gabe and I were flipping through our respective spam corpuses (corpii?) today and came across some interesting statistics.

The word "unsolicited" appeared in 144 spams and 0 real emails. The benign-looking "trust" has appeared in 47 spams and 0 real emails. "toner" is 181/0, but that one's pretty obvious if you've ever gotten those stupid inkjet refill spams. "success" is 0 for 98. "spacer.gif" is 0 for 317! "serif" has struck out 141 times and gotten no hits, but that pales in comparison to "font", which shows up in 13054 spams (but also in 288 valid emails, giving it a probability of 0.500). "cash" = 291 spams, 2 real. The best words, all with a 0.010 probability of being spam? "usb", "", "beta", "cocoa", "moreinfo.fcgi", "maildrop", "phil", and "stupidity" (among a few hundred others).
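
For anyone wondering how counts like these turn into a number like 0.010, here is a rough sketch of one common approach: a per-token probability in the style of Paul Graham's "A Plan for Spam". This is an assumption on my part, since the post doesn't say which filter produced these figures. The function name, the corpus sizes, and the defaults (real-mail counts doubled, results clamped to 0.01 and 0.99, tokens seen fewer than 5 times skipped) are illustrative choices, not values taken from the corpora above.

```python
def token_spam_probability(spam_count, ham_count, n_spam, n_ham,
                           ham_weight=2.0, min_occurrences=5,
                           floor=0.01, ceil=0.99):
    """Graham-style estimate of how spammy a single token is.

    spam_count / ham_count: messages containing the token in each corpus.
    n_spam / n_ham: total messages in each corpus (hypothetical here;
    the post does not give corpus sizes).
    Returns None when the token is too rare to score.
    """
    weighted_ham = ham_weight * ham_count
    # Skip tokens without enough evidence either way.
    if spam_count + weighted_ham < min_occurrences:
        return None
    # Frequency of the token in each corpus, capped at 1.
    b = min(1.0, spam_count / n_spam)
    g = min(1.0, weighted_ham / n_ham)
    p = b / (b + g)
    # Clamp so that no single token is ever treated as absolute proof.
    return max(floor, min(ceil, p))


# Hypothetical corpus sizes, for illustration only.
N_SPAM, N_HAM = 1000, 1000

spam_only = token_spam_probability(144, 0, N_SPAM, N_HAM)   # never seen in real mail
ham_only = token_spam_probability(0, 50, N_SPAM, N_HAM)     # never seen in spam
balanced = token_spam_probability(20, 10, N_SPAM, N_HAM)    # doubled ham count matches spam count
too_rare = token_spam_probability(1, 0, N_SPAM, N_HAM)      # not enough occurrences
```

Under these assumptions, a word that only ever shows up in spam gets pinned at the ceiling, and a word that only ever shows up in real mail gets pinned at the floor, which would be consistent with a whole batch of "best" words all sharing the same 0.010. A word's score also depends on how large each corpus is, not just on the raw counts, so a lopsided raw count does not by itself guarantee a lopsided probability.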
