*****  To join INSNA, visit  *****

   Barry Wellman

   NetLab                        FRSC                      INSNA Founder
   Faculty of Information (iSchool)                 611 Bissell Building
   140 St. George St.    University of Toronto    Toronto Canada M5S 3G6          twitter: @barrywellman
                  NSA/CSEC: Canadian and American citizen
   NETWORKED: The New Social Operating System. Lee Rainie & Barry Wellman
   MIT Press        Print $14  Kindle $16

The Parable of Google Flu: Traps in Big Data Analysis

    In February 2013, Google Flu Trends (GFT) made headlines but not for a reason that Google executives or the creators of the flu tracking system would have hoped. Nature reported that GFT was predicting more than double the proportion of doctor visits for influenza-like illness (ILI) than the Centers for Disease Control and Prevention (CDC), which bases its estimates on surveillance reports from laboratories across the United States (1, 2). This happened despite the fact that GFT was built to predict CDC reports. Given that GFT is often held up as an exemplary use of big data (3, 4), what lessons can we draw from this error?

The Parable of Google Flu: Traps in Big Data Analysis
David Lazer, Ryan Kennedy, Gary King, Alessandro Vespignani

Science 14 March 2014:
Vol. 343 no. 6176 pp. 1203-1205


Sums of variables at the onset of chaos

    We explain how specific dynamical properties give rise to the limit distribution of sums of deterministic variables at the transition to chaos via the period-doubling route. We study the sums of successive positions generated by an ensemble of initial conditions uniformly distributed in the entire phase space of a unimodal map as represented by the logistic map. We find that these sums acquire their salient multiscale features from the repellor preimage structure that dominates the dynamics toward the attractors along the period-doubling cascade. And we explain how these properties transmit from the sums to their distribution. Specifically, we show how the stationary distribution of sums of positions at the Feigenbaum point is built up from those associated with the supercycle attractors forming a hierarchical structure with multifractal and discrete scale invariance properties.

Miguel Angel Fuentes, Alberto Robledo
"Sums of variables at the onset of chaos"
The European Physical Journal B, 87:32 (2014)
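
As a rough illustration of the quantity studied above, the following sketch sums successive positions of the logistic map for an ensemble of uniformly spaced initial conditions. It is not the authors' code. The map form x -> r*x*(1-x), the approximate control-parameter value used for the Feigenbaum point, and all function names are assumptions made for illustration, and no centering or rescaling of the sums is applied.

```python
# Approximate period-doubling accumulation (Feigenbaum) point for the
# logistic map written as x -> r * x * (1 - x). Assumed form and value.
R_FEIGENBAUM = 3.569945672


def logistic(x, r=R_FEIGENBAUM):
    """One iteration of the logistic map."""
    return r * x * (1.0 - x)


def trajectory_sum(x0, n_steps, r=R_FEIGENBAUM):
    """Sum of the first n_steps positions x_0, ..., x_{n_steps-1}."""
    total = 0.0
    x = x0
    for _ in range(n_steps):
        total += x
        x = logistic(x, r)
    return total


def ensemble_sums(n_initial, n_steps, r=R_FEIGENBAUM):
    """Sums for n_initial initial conditions spread uniformly over (0, 1)."""
    starts = [(i + 0.5) / n_initial for i in range(n_initial)]
    return [trajectory_sum(x0, n_steps, r) for x0 in starts]


def histogram(values, n_bins):
    """Equal-width histogram; returns (lowest value, highest value, counts)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # avoid zero width if all values agree
    counts = [0] * n_bins
    for v in values:
        idx = min(int((v - lo) / width), n_bins - 1)
        counts[idx] += 1
    return lo, hi, counts


if __name__ == "__main__":
    sums = ensemble_sums(n_initial=2000, n_steps=256)
    print(histogram(sums, n_bins=20))
```

Increasing n_steps in powers of two is one way to look at how the histogram changes with the number of summed terms.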


The Bursty Dynamics of the Twitter Information Network

    In online social media systems users are not only posting, consuming, and resharing content, but also creating new and destroying existing connections in the underlying social network. While each of these two types of dynamics has individually been studied in the past, much less is known about the connection between the two. How does user information posting and seeking behavior interact with the evolution of the underlying social network structure?
Here, we study ways in which network structure reacts to users posting and sharing content. We examine the complete dynamics of the Twitter information network, where users post and reshare information while they also create and destroy connections. We find that the dynamics of network structure can be characterized by steady rates of change, interrupted by sudden bursts. Information diffusion in the form of cascades of post re-sharing often creates such sudden bursts of new connections, which significantly change users' local network structure. These bursts transform users' networks of followers to become structurally more cohesive as well as more homogeneous in terms of follower interests. We also explore the effect of the information content on the dynamics of the network and find evidence that the appearance of new topics and real-world events can lead to significant changes in edge creations and deletions. Lastly, we develop a model that quantifies the dynamics of the network and the occurrence of these bursts as a function of the information spreading through the network. The model can successfully predict which information diffusion events will lead to bursts in network dynamics.

The Bursty Dynamics of the Twitter Information Network
Seth A. Myers, Jure Leskovec


Geo-located Twitter as proxy for global mobility patterns

    Pervasive presence of location-sharing services made it possible for researchers to gain an unprecedented access to the direct records of human activity in space and time. This article analyses geo-located Twitter messages in order to uncover global patterns of human mobility. Based on a dataset of almost a billion tweets recorded in 2012, we estimate the volume of international travelers by country of residence. Mobility profiles of different nations were examined based on such characteristics as mobility rate, radius of gyration, diversity of destinations, and inflow-outflow balance. Temporal patterns disclose the universally valid seasons of increased international mobility and the particular character of international travels of different nations. Our analysis of the community structure of the Twitter mobility network reveals spatially cohesive regions that follow the regional division of the world. We validate our result using global tourism statistics and mobility models provided by other authors and argue that Twitter is exceptionally useful for understanding and quantifying global mobility patterns.

Geo-located Twitter as proxy for global mobility patterns

Bartosz Hawelka*, Izabela Sitko, Euro Beinat, Stanislav Sobolevsky, Pavlos Kazakopoulos & Carlo Ratti

Cartography and Geographic Information Science
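
One of the mobility measures named in the abstract, the radius of gyration, can be sketched as follows. This assumes the common definition (root-mean-square distance of a user's locations from their center of mass) and great-circle distances on a spherical Earth. The paper's exact definition may differ, and the function names and the Earth-radius constant are illustrative choices, not taken from the article.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, assumed spherical


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def center_of_mass(points):
    """Center of (lat, lon) points, found by averaging 3D unit vectors."""
    x = y = z = 0.0
    for lat, lon in points:
        phi, lmb = math.radians(lat), math.radians(lon)
        x += math.cos(phi) * math.cos(lmb)
        y += math.cos(phi) * math.sin(lmb)
        z += math.sin(phi)
    n = len(points)
    x, y, z = x / n, y / n, z / n
    lat = math.degrees(math.atan2(z, math.hypot(x, y)))
    lon = math.degrees(math.atan2(y, x))
    return lat, lon


def radius_of_gyration_km(points):
    """Root-mean-square distance (km) of the points from their center."""
    if not points:
        raise ValueError("need at least one location")
    clat, clon = center_of_mass(points)
    mean_sq = sum(haversine_km(lat, lon, clat, clon) ** 2
                  for lat, lon in points) / len(points)
    return math.sqrt(mean_sq)


if __name__ == "__main__":
    # Hypothetical tweet locations for one user: (latitude, longitude).
    tweets = [(43.66, -79.39), (43.70, -79.40), (45.50, -73.57)]
    print(round(radius_of_gyration_km(tweets), 1), "km")
```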


Shock waves on complex networks

    Power grids, road maps, and river streams are examples of infrastructural networks which are highly vulnerable to external perturbations. An abrupt local change of load (voltage, traffic density, or water level) might propagate in a cascading way and affect a significant fraction of the network. Almost discontinuous perturbations can be modeled by shock waves which can eventually interfere constructively and endanger the normal functionality of the infrastructure. We study their dynamics by solving the Burgers equation under random perturbations on several real and artificial directed graphs. Even for graphs with a narrow distribution of node properties (e.g., degree or betweenness), a steady state is reached exhibiting a heterogeneous load distribution, having a difference of one order of magnitude between the highest and average loads. Unexpectedly we find for the European power grid and for finite Watts-Strogatz networks a broad pronounced bimodal distribution for the loads. To identify the most vulnerable nodes, we introduce the concept of node-basin size, a purely topological property which we show to be strongly correlated to the average load of a node.


Netconomics: Novel Forecasting Techniques from the Combination of Big Data, Network Science and Economics

    The combination of the network theoretic approach with recently available abundant economic data leads to the development of novel analytic and computational tools for modelling and forecasting key economic indicators. The main idea is to introduce a topological component into the analysis, taking into account consistently all higher-order interactions. We present three basic methodologies to demonstrate different approaches to harness the resulting network gain. First, a multiple linear regression optimisation algorithm is used to generate a relational network between individual components of national balance of payment accounts. This model describes annual statistics with a high accuracy and delivers good forecasts for the majority of indicators. Second, an early-warning mechanism for global financial crises is presented, which combines network measures with standard economic indicators. From the analysis of the cross-border portfolio investment network of long-term debt securities, the proliferation of a wide range of over-the-counter-traded financial derivative products, such as credit default swaps, can be described in terms of gross-market values and notional outstanding amounts, which are associated with increased levels of market interdependence and systemic risk. Third, considering the flow-network of goods traded between G-20 economies, network statistics provide better proxies for key economic measures than conventional indicators. For example, it is shown that a country's gate-keeping potential, as a measure for local power, projects its annual change of GDP generally far better than the volume of its imports or exports.

Netconomics: Novel Forecasting Techniques from the Combination of Big Data, Network Science and Economics
Andreas Joseph, Irena Vodenska, Eugene Stanley, Guanrong Chen


Predicting Scientific Success Based on Coauthorship Networks

    We address the question to what extent the success of scientific articles is due to social influence. Analyzing a data set of over 100000 publications from the field of Computer Science, we study how centrality in the coauthorship network differs between authors who have highly cited papers and those who do not. We further show that a machine learning classifier, based only on coauthorship network centrality measures at time of publication, is able to predict with high precision whether an article will be highly cited five years after publication. By this we provide quantitative insight into the social dimension of scientific publishing - challenging the perception of citations as an objective, socially unbiased measure of scientific success.

Predicting Scientific Success Based on Coauthorship Networks
Emre Sarigöl, Rene Pfitzner, Ingo Scholtes, Antonios Garas, Frank Schweitzer


Correlation of automorphism group size and topological properties with program-size complexity evaluations of graphs and complex networks

    We show that numerical approximations of Kolmogorov complexity (K) of graphs and networks capture some group-theoretic and topological
properties of empirical networks, ranging from metabolic to social
networks, and of small synthetic networks that we have produced. That
K and the size of the group of automorphisms of a graph are correlated
opens up interesting connections to problems in computational
geometry, and thus connects several measures and concepts from
complexity science. We derive these results via two different
Kolmogorov complexity approximation methods applied to the adjacency
matrices of the graphs and networks. The methods used are the
traditional lossless compression approach to Kolmogorov complexity,
and a normalized version of a Block Decomposition Method (BDM) based
on algorithmic probability theory.

Correlation of automorphism group size and topological properties with
program-size complexity evaluations of graphs and complex networks
H. Zenil et al.
Physica A: Statistical Mechanics and its Applications, 2014
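
The lossless compression approach mentioned in the abstract can be sketched roughly as follows: flatten a graph's adjacency matrix to a string and take the compressed length as a crude upper-bound proxy for its Kolmogorov complexity. This is only an illustration under assumptions: zlib is used as the compressor, the helper names are hypothetical, the result depends on the node labeling, and the Block Decomposition Method is not shown.

```python
import random
import zlib


def complete_graph(n):
    """Adjacency matrix of the complete graph on n nodes (no self-loops)."""
    return [[0 if i == j else 1 for j in range(n)] for i in range(n)]


def random_graph(n, p, seed=0):
    """Adjacency matrix of an undirected random graph with edge probability p."""
    rng = random.Random(seed)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                adj[i][j] = adj[j][i] = 1
    return adj


def compressed_size(adj):
    """Length in bytes of the zlib-compressed, row-by-row flattened matrix."""
    flat = "".join(str(bit) for row in adj for bit in row).encode("ascii")
    return len(zlib.compress(flat, 9))


if __name__ == "__main__":
    n = 64
    print("complete graph:", compressed_size(complete_graph(n)), "bytes")
    print("random graph:  ", compressed_size(random_graph(n, 0.5)), "bytes")
```

A highly symmetric graph such as the complete graph should compress to far fewer bytes than a random graph of the same size, which is the kind of relationship between symmetry and estimated complexity the abstract describes.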

Preprint available:


To manage subscriptions, please go to

You can contribute to Complexity Digest by selecting one of our topics and using the "Suggest" button.


SOCNET is a service of INSNA, the professional association for social
network researchers. To unsubscribe, send
an email message to [log in to unmask] containing the line
UNSUBSCRIBE SOCNET in the body of the message.