Finding Structure in Wikipedia Edit Activity

Ramine Tinati, Markus Luczak-Roesch, Wendy Hall
  • January 2016, ACM (Association for Computing Machinery)
  • DOI: 10.1145/2872518.2891110

Generating additional article links in Wikipedia from matching phrases in edit activities

What is it about?

We apply the temporal data mining approach Transcendental Information Cascades [1,2,3] to the Wikipedia edit stream. The method generates a temporally ordered network of information occurrence, which is then used to infer associations between Wikipedia articles.

[1] Luczak-Roesch, M., Tinati, R. and Shadbolt, N., 2015, May. When resources collide: Towards a theory of coincidence in information spaces. In Proceedings of the 24th International Conference on World Wide Web (pp. 1137-1142). ACM.
[2] Luczak-Roesch, M., Tinati, R., Van Kleek, M. and Shadbolt, N., 2015, August. From coincidence to purposeful flow? Properties of transcendental information cascades. In Proceedings of the 2015 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining 2015 (pp. 633-638). ACM.
[3] Luczak-Roesch, M., Tinati, R. and O'Hara, K., 2017. What an entangled Web we weave: An information-centric approach to socio-technical systems. PeerJ Preprints, 5, p.e2789v1.
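The idea of a temporally ordered network of information occurrence can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that each edit has already been reduced to a timestamp, an article title and a set of matched phrases, and that an edit is linked to the most recent earlier edit containing the same phrase. The function names `build_cascade` and `article_associations`, the tuple layout and the sample data are all hypothetical.

```python
from collections import defaultdict


def build_cascade(edits):
    """Build a temporally ordered network from edits.

    `edits` is an iterable of (timestamp, article, phrases) tuples
    (an assumed input format). Each edit is linked to the most recent
    earlier edit that contained the same phrase.
    """
    ordered = sorted(edits, key=lambda e: e[0])  # order edits by time
    last_seen = {}  # phrase -> index of the latest edit containing it
    edges = []      # (earlier_index, later_index, phrase)
    for idx, (_, _, phrases) in enumerate(ordered):
        for phrase in sorted(phrases):  # sorted for deterministic output
            if phrase in last_seen:
                edges.append((last_seen[phrase], idx, phrase))
            last_seen[phrase] = idx
    return ordered, edges


def article_associations(ordered, edges):
    """Collapse edit-level edges into associations between distinct articles."""
    assoc = defaultdict(set)
    for src, dst, phrase in edges:
        a, b = ordered[src][1], ordered[dst][1]
        if a != b:  # ignore links between edits of the same article
            assoc[frozenset((a, b))].add(phrase)
    return dict(assoc)


# Hypothetical sample data, for illustration only.
sample_edits = [
    (1, "Article A", {"climate summit"}),
    (2, "Article B", {"climate summit", "carbon tax"}),
    (3, "Article A", {"carbon tax"}),
    (4, "Article C", {"election debate"}),
]

ordered, edges = build_cascade(sample_edits)
associations = article_associations(ordered, edges)
for pair, phrases in associations.items():
    print(sorted(pair), sorted(phrases))
```

In this toy input, Article A and Article B become associated through two shared phrases, while Article C stays unconnected because its phrase never recurs. Comparing such inferred pairs against the existing article links would then show which associations are missing from Wikipedia's explicit link structure.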

Why is it important?

We find that a significant portion of these associations is not present in Wikipedia as links, so the method can be used to extend Wikipedia's explicit linking structure. Furthermore, these new associations can be related to real-world events (e.g. political debates), which makes them a suitable starting point for detecting potential biases being injected into Wikipedia content.

