Mitigating Spam Using Spatio-Temporal Reputation
In this paper we present Preventive Spatio-Temporal Aggregation (PRESTA), a reputation model that combines spatial and temporal features to produce values that are predictive of behavior and useful in partial-knowledge situations. To evaluate its effectiveness, we applied PRESTA in the domain of spam detection. Studying the temporal properties of IP blacklists, we found that 25% of IP addresses once listed on a blacklist were re-listed within 10 days. Further, during our evaluation period, over 45% of de-listed IPs were re-listed. By using the IP address assignment hierarchy to define spatial groupings and leveraging these temporal statistics, PRESTA produces reputation values that correctly classify up to 50% of spam email not identified by blacklists alone, while maintaining low false-positive rates. When PRESTA is used in conjunction with blacklists, an average of 93% of spam emails are identified, and we find the system maintains this blockage rate consistently, even during periods of decreased blacklist performance. PRESTA spam filtering can be employed as an intermediate filter (perhaps in-network) prior to context-based analysis. Further, our spam detection system is scalable; computation can occur in near real-time, and over 500,000 emails can be scored per hour.
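To make the idea of combining spatial and temporal features concrete, the following is a minimal sketch and not the PRESTA model itself. It assumes that past blacklist listings are weighted by an exponential decay, that the spatial group is the enclosing /24 block, and that the two levels are merged with fixed weights. The half-life, the prefix length, the weights, and all function names are hypothetical choices made for illustration; the abstract does not specify them.

```python
import ipaddress
import math
from collections import defaultdict

# Assumption: an illustrative half-life in days; not a parameter from the paper.
HALF_LIFE_DAYS = 10.0


def decay(age_days, half_life=HALF_LIFE_DAYS):
    """Weight of a past listing that ended `age_days` ago (exponential decay)."""
    return 0.5 ** (max(age_days, 0.0) / half_life)


def block_of(ip, prefix_len=24):
    """Spatial group of an address: its enclosing block (a /24 is assumed here)."""
    return ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False)


def build_history(listings):
    """Index (ip, delist_day) blacklist events by address and by block."""
    by_ip, by_block = defaultdict(list), defaultdict(list)
    for ip, day in listings:
        by_ip[ip].append(day)
        by_block[block_of(ip)].append(day)
    return by_ip, by_block


def squash(x):
    """Map an unbounded non-negative sum into [0, 1)."""
    return 1.0 - math.exp(-x)


def suspicion(ip, now, by_ip, by_block, w_ip=0.6, w_block=0.4):
    """Score in [0, 1); higher means more spam-like. Weights are hypothetical."""
    ip_raw = sum(decay(now - d) for d in by_ip.get(ip, []) if d <= now)
    block_raw = sum(
        decay(now - d) for d in by_block.get(block_of(ip), []) if d <= now
    )
    return w_ip * squash(ip_raw) + w_block * squash(block_raw)


if __name__ == "__main__":
    # Documentation-range addresses with made-up de-listing days.
    events = [("192.0.2.10", 0), ("192.0.2.77", 5)]
    by_ip, by_block = build_history(events)
    for addr in ("192.0.2.10", "192.0.2.200", "198.51.100.1"):
        print(addr, round(suspicion(addr, 10, by_ip, by_block), 3))
```

In this sketch, an address with no listing history of its own still receives a nonzero score when its neighbors in the same block were listed, which is one way a spatial grouping can help in partial-knowledge situations. The decay term lets a score fade as listings age, reflecting the observation that re-listing tends to happen soon after de-listing.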