Detecting Wikipedia Vandalism via Spatio-Temporal Analysis of Revision Metadata

Penn collection
Departmental Papers (CIS)
Subject
CPS Internet of Things
Design
Measurement
Performance
Security
Wikipedia
spatio-temporal reputation
vandalism
collaborative software
content-based access control
Abstract

Blatantly unproductive edits undermine the quality of the collaboratively-edited encyclopedia, Wikipedia. They not only disseminate dishonest and offensive content, but also force editors to waste time undoing such acts of vandalism. Language processing has been applied to combat these malicious edits, but as with email spam, these filters are evadable and computationally complex. Meanwhile, recent research has shown spatial and temporal features to be effective in mitigating email spam, while being lightweight and robust. In this paper, we leverage the spatio-temporal properties of revision metadata to detect vandalism on Wikipedia. An administrative form of reversion called rollback enables the tagging of malicious edits, which are contrasted with non-offending edits in numerous dimensions. Crucially, none of these features requires inspection of the article or revision text. Ultimately, a classifier is produced which flags vandalism at performance comparable to the natural-language efforts we intend to complement (85% accuracy at 50% recall). The classifier is scalable (processing 100+ edits a second) and has been used to locate over 5,000 manually-confirmed incidents of vandalism outside our labeled set.
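
To make the reported operating point concrete, the following is a minimal sketch of how a figure such as "85% accuracy at 50% recall" could be computed from classifier scores. It assumes that "accuracy" is read as precision among flagged edits, which the abstract does not state explicitly. The function name `precision_at_recall` and the toy scores and labels are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def precision_at_recall(
    scores: Sequence[float],
    labels: Sequence[bool],
    target_recall: float,
) -> float:
    """Precision of the smallest top-scored set of edits whose recall
    reaches target_recall.

    scores: classifier output per edit (higher means more suspicious).
    labels: True if the edit is known vandalism (e.g. it was rolled back).
    Assumption: ties between scores are broken arbitrarily by the sort.
    """
    if not 0 < target_recall <= 1:
        raise ValueError("target_recall must be in (0, 1]")
    total_vandalism = sum(1 for is_vandalism in labels if is_vandalism)
    if total_vandalism == 0:
        raise ValueError("labels contain no vandalism examples")

    # Rank edits from most to least suspicious.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)

    caught = 0
    for flagged, (_, is_vandalism) in enumerate(ranked, start=1):
        if is_vandalism:
            caught += 1
        # Stop at the first cutoff that recovers the requested share of vandalism.
        if caught / total_vandalism >= target_recall:
            return caught / flagged
    raise AssertionError("unreachable: recall reaches 1.0 at the end of the ranking")


# Hypothetical toy data: six edits, four of them vandalism.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4]
toy_labels = [True, False, True, False, True, True]
print(precision_at_recall(toy_scores, toy_labels, 0.5))
```

On this toy ranking, half of the vandalism is recovered after flagging three edits, two of which are truly vandalism, so the sketch reports a precision of two thirds at 50% recall.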

Date of presentation
2010-04-13
Conference name
EUROSEC '10: Third European Workshop on System Security
Conference dates
2010-04-13
Conference location
Paris, France
Comments
EUROSEC '10: Proceedings of the Third European Workshop on System Security. Paris, France. April 13, 2010. (A preliminary version was also published as UPENN-MS-CIS-10-05).