Spatio-Temporal Analysis of Revision Metadata and the STiki Anti-Vandalism Tool
Andrew G. West (west.andrew.g)
westand [at] cis [dot] upenn [dot] edu
United States of America
University of Pennsylvania
STiki is an anti-vandalism tool for Wikipedia. Unlike similar tools in academia, STiki does not rely on natural language processing (NLP) over the article or diff text to locate vandalism. Instead, STiki leverages spatio-temporal properties of revision metadata. Our prior work demonstrated the feasibility of using such properties, finding that they perform comparably to NLP efforts while being more efficient, robust to evasion, and language independent. STiki is a real-time, on-Wikipedia implementation based on these properties. It consists of (1) a server-side processing engine that examines revisions and scores the likelihood that each is vandalism, and (2) a client-side GUI that presents likely vandalism to end users for definitive classification (and, if necessary, reversion on Wikipedia). Our presentation will introduce spatio-temporal properties, demonstrate the STiki software, and discuss alternative uses for the open-source code. (Note: Existing on-Wikipedia tools, e.g., Huggle, do utilize metadata. STiki adds a level of quantification and machine learning not seen in such software.)
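As a rough illustration of the two-part design described above, the sketch below scores revisions from metadata alone and hands the most suspicious one to a reviewer first. It is not the STiki implementation: the feature names (`is_anonymous`, `comment_length`, `editor_reputation`), the hand-picked weights, and the `ReviewQueue` class are all assumptions made for the example, and a real system would learn its scoring function from labeled edits.

```python
import heapq
from dataclasses import dataclass


@dataclass
class Revision:
    """Metadata-only view of an edit (all fields are hypothetical)."""
    rev_id: int
    is_anonymous: bool        # edit made without a registered account
    comment_length: int       # length of the edit summary, in characters
    editor_reputation: float  # 0.0 (clean history) to 1.0 (poor history)


def score(rev: Revision) -> float:
    """Toy weighted sum standing in for a learned classifier.

    The weights are illustrative assumptions, not learned values.
    """
    s = 0.0
    if rev.is_anonymous:
        s += 0.4
    if rev.comment_length == 0:
        s += 0.2
    s += 0.3 * min(max(rev.editor_reputation, 0.0), 1.0)
    return s


class ReviewQueue:
    """Server-side queue: highest-scoring revision is served first."""

    def __init__(self) -> None:
        self._heap = []

    def push(self, rev: Revision) -> None:
        # heapq is a min-heap, so store the negated score;
        # rev_id breaks ties without comparing Revision objects.
        heapq.heappush(self._heap, (-score(rev), rev.rev_id, rev))

    def pop(self):
        """Return (revision, score) for the most suspicious edit."""
        neg_score, _, rev = heapq.heappop(self._heap)
        return rev, -neg_score

    def __len__(self) -> int:
        return len(self._heap)
```

In this sketch the client-side GUI would simply call `pop()`, show the revision to a human, and record the human's verdict; the score only decides the order of review, while the person makes the final classification.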
People and Community (many of the edit classification features are actually reputations that encode the historical behavior of editors, articles, and spatial groupings thereof).
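To make the idea of a reputation concrete, here is a minimal sketch under stated assumptions. It keeps a history of past offenses for any entity, whether an editor, an article, or a spatial grouping such as a country, and reports a value in which older offenses count for less. The exponential decay, the `half_life_days` parameter, and the `ReputationStore` name are assumptions chosen for illustration; the abstract above does not specify how the reputations are computed.

```python
import math
from collections import defaultdict


class ReputationStore:
    """Hypothetical reputation tracker keyed by (entity kind, entity key)."""

    def __init__(self, half_life_days: float = 30.0) -> None:
        self.half_life_days = half_life_days
        self._offenses = defaultdict(list)  # (kind, key) -> list of offense days

    def record_offense(self, kind: str, key: str, day: float) -> None:
        """Note that an edit tied to this entity was judged to be vandalism."""
        self._offenses[(kind, key)].append(day)

    def reputation(self, kind: str, key: str, today: float) -> float:
        """Sum of past offenses, each weighted by exponential time decay.

        Higher values mean a worse recent history; unknown entities get 0.
        """
        decay = math.log(2) / self.half_life_days
        days = self._offenses.get((kind, key), [])
        return sum(math.exp(-decay * (today - d)) for d in days if d <= today)
```

The same store serves all three kinds of entity, for example `record_offense("editor", "ExampleUser", 0)` and `record_offense("country", "XX", 0)`, so one edit can contribute evidence to several reputations at once.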
A similar, more academic presentation is being given at WikiSym, immediately preceding Wikimania (see below). Thus, I will be in Gdansk and attending at least part of Wikimania.
  • For scheduling purposes, if this presentation is accepted, it would help me if it were scheduled early in the programme.

Spatio-Temporal Analysis of Revision Metadata and the STiki Anti-Vandalism Tool.pdf

