Which of the following antispam filtering techniques would BEST prevent a valid, variable-length
e-mail message containing a heavily weighted spam keyword from being labeled as spam?

Heuristic (rule-based)


Pattern matching

Signature-based

Bayesian (statistical)

Bayesian filtering applies statistical modeling to messages by performing a frequency analysis on
each word within the message and then evaluating the message as a whole. Therefore, it can
ignore a suspicious keyword if the entire message is within normal bounds. Heuristic filtering is less
effective, since new exception rules may need to be defined each time a valid message is labeled as
spam. Signature-based filtering is useless against variable-length messages, because the
calculated MD5 hash differs whenever the message content changes. Finally, pattern matching is
actually a degraded rule-based technique, where the rules operate at the word level using
wildcards, and not at higher levels.
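
The difference between whole-message scoring and a single keyword rule can be illustrated with a minimal Python sketch. It is not taken from any particular product: the per-word probabilities in `WORD_SPAM_PROB`, the neutral value `UNKNOWN_PROB`, and the function names are all hypothetical, and the combination formula is the simple naive Bayes product, which real filters usually refine.

```python
import math

# Hypothetical per-word spam probabilities, as if learned from a training corpus.
WORD_SPAM_PROB = {
    "viagra": 0.99,        # heavily weighted spam keyword
    "prescription": 0.40,
    "pharmacy": 0.45,
    "doctor": 0.20,
    "tuesday": 0.15,
    "appointment": 0.10,
}
UNKNOWN_PROB = 0.5  # assumption: words never seen in training are neutral


def tokenize(message):
    """Lowercase the message and strip basic punctuation from each word."""
    return [w.strip(".,!?;:") for w in message.lower().split()]


def keyword_rule(message, keyword="viagra"):
    """Word-level rule: flag the message as soon as the keyword appears."""
    return keyword in tokenize(message)


def spam_probability(message):
    """Combine all word probabilities: prod(p) / (prod(p) + prod(1 - p))."""
    log_spam = 0.0
    log_ham = 0.0
    for word in tokenize(message):
        p = WORD_SPAM_PROB.get(word, UNKNOWN_PROB)
        log_spam += math.log(p)        # logs avoid underflow on long messages
        log_ham += math.log(1.0 - p)
    return 1.0 / (1.0 + math.exp(log_ham - log_spam))


valid = ("Your doctor renewed the viagra prescription, pick it up at the "
         "pharmacy after your Tuesday appointment.")

print(keyword_rule(valid))                # the rule fires on the keyword alone
print(round(spam_probability(valid), 2))  # whole-message score stays below 0.5
print(round(spam_probability("viagra"), 2))
```

Under these assumed weights, the keyword rule flags the valid message, while the combined score stays below 0.5 because the remaining words are typical of legitimate mail. The keyword on its own still scores as spam.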

