Assessing the Bias in Communication Networks Sampled from Twitter
The authors collect and analyze messages exchanged on Twitter using two of the platform's publicly available APIs (the search and streaming APIs). They assess the differences between the two samples and compare the communication networks reconstructed from each. The empirical context is a set of political protests that took place in May 2012: the authors track online communication around these protests over a period of one month and reconstruct the networks of mentions and retweets from both samples. They find that the search API over-represents the more central users and does not offer an accurate picture of peripheral activity; they also find that the bias is greater for the network of mentions.
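
To make the comparison concrete, the following is a minimal sketch of how a mention network might be reconstructed from two tweet samples and how their overlap could be measured. It is an illustration only, not the authors' pipeline: the tweet layout (dictionaries with "user" and "text" keys), the function names, the coverage measure, and the toy data are all assumptions introduced here.

```python
import re
from collections import Counter

# Assumption: a mention is an "@" followed by word characters.
MENTION = re.compile(r"@(\w+)")


def mention_edges(tweets):
    """Count directed (author, mentioned_user) edges in a list of tweets.

    Each tweet is assumed to be a dict with "user" and "text" keys
    (a hypothetical layout, not the format returned by any Twitter API).
    """
    edges = Counter()
    for tweet in tweets:
        author = tweet["user"].lower()
        for target in MENTION.findall(tweet["text"]):
            target = target.lower()
            if target != author:  # ignore self-mentions
                edges[(author, target)] += 1
    return edges


def in_degree(edges):
    """Number of distinct users who mention each target."""
    degree = Counter()
    for _, target in edges:
        degree[target] += 1
    return degree


def coverage(sample_edges, reference_edges):
    """Fraction of distinct reference edges that also appear in the sample."""
    if not reference_edges:
        return 0.0
    shared = set(sample_edges) & set(reference_edges)
    return len(shared) / len(reference_edges)


# Toy data, invented for illustration; not taken from the study.
stream_sample = [
    {"user": "ana", "text": "@hub see you at the square"},
    {"user": "ben", "text": "@hub @ana on my way"},
    {"user": "cat", "text": "@dan bring water"},
]
search_sample = [
    {"user": "ana", "text": "@hub see you at the square"},
    {"user": "ben", "text": "@hub @ana on my way"},
]

stream_edges = mention_edges(stream_sample)
search_edges = mention_edges(search_sample)

print(coverage(search_edges, stream_edges))
print(in_degree(stream_edges))
print(in_degree(search_edges))
```

In this toy case the second sample keeps every edge pointing at the most-mentioned account but drops the single edge between two low-activity users, which is the kind of peripheral loss the comparison is meant to detect. A real analysis would use a graph library and richer centrality measures; in-degree is used here only to keep the sketch dependency-free.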