Print

Print


*****  To join INSNA, visit http://www.insna.org  *****

The basic idea of modularity, Q,  is that one compares an actual network to a network with the same degree distribution but otherwise random (Newman and Girvan 2004). Groups (or communities) should have higher densities than could be expected on the basis of random chance, and be separated by cleavages that are sparser than in the randomized network. When tie strength is taken into account, strongly connected nodes are more likely to be in the same group than weakly connected nodes, in line with Granovetter's (1973) well-known argument. Out of multiple possible assignments of nodes to groups that fulfill these requirements, the one with the maximal difference between actual and expected densities-maximal modularity-is arguably the best. 

In larger networks where the number of groups is not known at the start (which includes most empirical networks) there are in fact so many different assignments that comparing all of them is not feasible. There exist numerous algorithms that search heuristically for a good solution in ways that are efficient and effective (for an overview, Fortunato 2010). As most of these algorithms randomly re-assign nodes to other groups in their search for modularity improvement, after each run of community detection a slightly different assignment might be found. Salient communities are robust, but the less salient communities are, and the more the network resembles a random network, the less stable the assignment will be. This is not (only) a limitation of the software, because for non-salient communities with overlaps, different assignments can have (almost) the same maximal modularity value. To detect group overlaps and smaller (sub)groups initially overlooked, a parameter can be fine-tuned, and thereby overcome to some extent the resolution limit inherent to this method, although several challenges remain (Kumpala et al 2007, Fortunato 2010; Traag et al. 2011). 

Random graphs can sometimes exhibit surprisingly high modularity scores due to random fluctuations (Reichardt and Bornholdt 2007), contrary to claims one sometimes encounters that in a random network modularity would equal zero. This makes it difficult to compare modularity scores across different networks. It also implies that to establish significance, one would have to generate large numbers of simulated random networks with the same degree distribution, and compare them with a large number of community detection runs on the empirical network. This does not imply that the modularity approach is flawed, and in fact it has been validated in many test cases (Fortunato 2010), but when interpreting modularity results, one should take these considerations into account.

To study groups changing over time, an effective method was crafted by Mucha cum suis (2010), termed multi-slice modularity, where each slice represents a network for a single wave of observation. An advantage over earlier approaches is that its longitudinal modularity depends not only on the individual slices, like standard modularity discussed above, but also on the temporal stability of the communities.

To detect communities in signed graphs, Vincent Traag and I (2009, in Ph. Rev. E) started out from the assumption that negative ties are to be found in between contending groups rather than within them. To optimize modularity, densities of negative within-group ties are then to be minimized, while densities of positive ties are to be maximized, just as above. This conception generalizes social balance theory, by loosening up the latter's rigid community assignment wherein no negative ties were allowed to exist within communities. The underling theory we flesh out in our paper.

Several algorithms for community detection in graphs with positive ties are in the igraph package of R, like simulated annealing that works well for relatively small graphs. Another option is Louvain (Blondel et. al 2008), which is fast and accurate and can handle larger graphs, http://launchpad.net/Louvain/

I hope this helps,

Jeroen Bruggeman
University of Amsterdam


From: Social Networks Discussion Forum [mailto:[log in to unmask]] On Behalf Of Jesse Michael Fagan
Sent: donderdag 9 februari 2012 1:06
To: [log in to unmask]
Subject: Re: [SOCNET] Girvan-Newman's Q value in Netdraw

***** To join INSNA, visit http://www.insna.org ***** Hi Peyina,

The 'goodness of fit' is not a bad way to interpret Q.

You can fit any grouping of nodes to any network. You can arbitrarily assign nodes to an arbitrary number of groups. For any arbitrary grouping you can calculate a value of Q. Of course, as the size of a network increases attempting all possible combinations of groupings becomes intractable. So you have the Girvan-Newman method (and a great, huge host of other methods) which operate on good assumptions about network structure. The Girvan-Newman method assumes that edges in the network with high betweenness are edges tying communities together.

The Q (modularity) value is based on a null model of network connectivity.  In a random network you would expect the likelihood of two nodes   to be linked to be based on the degree of each of the nodes.

Pab = (degree(a) * degree(b)) / (2 * total edges in network)

Community detection algorithms, generally, attempt to find a way of grouping nodes together such that they are more likely to connect to other members of the group than would be expected based on this null model. Most community detection values attempt to find the highest value of Q possible (although a value greater than 0.7 would probably not be considered 'significant', but most algorithms won't approach that value). You would get a value of Q=0 if there was one community and all nodes belonged to it and a value of Q=1 if there were as many groups as there are nodes and there was a single node in each group. Both Q=0 and Q=1 are not significant. The best values for Q are between Q=0.3 and Q=0.8, but there could be considerable discussion concerning the 'best' values.

Check out Chris's recommendation. It will provide a good survey of methods and uses of modularity in networks. I hope this helps.

-Jesse Fagan


On Wed, Feb 8, 2012 at 4:26 PM, P. Lin <[log in to unmask]> wrote:
***** To join INSNA, visit http://www.insna.org *****
Hi,
 
I'm a PhD student at the University of Washington Information School studying high school social groups and the multiplexity of technology interactions.
 
I'm intrigued by the Q-value in Netdraw's Girvan-Newman subgroup function. According to Hanneman's online book on social network methods, this Q-value is a goodness of fit.
Does anyone know how to interpret the Q value (or goodness of fit) in Netdraw's Girvan-Newman subgroup function?
What is considered a good fit?
 
I got no responses on the UCINET group (posted 2 days ago), and emailed Dr. Hanneman, who kindly suggested I email this listserv.
 
To elaborate, Netdraw has an Analysis>Subgroup function that :
 
a) visualizes the subgroups partitioned by coloring the subgroups in different colors.
b) produces partition information, including a Q value.
 
I looked through the Girvan and Newman (2002) article cited for this algorithm, and couldn't see an explanation of the goodness of fit either.
 
UCINET does not produce this goodness of fit measurement either, which leads me to think that the goodness of fit is specific to Netdraw?
 
The Netdraw output makes sense to me (in the way I'm interpreting my data), and I'm just looking for a way to validate the clusters produced. So, perhaps the Q goodness of fit measure produced could help, though I need some help in knowing how that Q value is produced and what it means.
 
Thank you!
 
 
Peyina
-
Peyina Lin
PhD Candidate
The Information School
University of Washington
http://students.washington.edu/pl3
@peyinalin
 
_____________________________________________________________________ SOCNET is a service of INSNA, the professional association for social network researchers (http://www.insna.org). To unsubscribe, send an email message to [log in to unmask] containing the line UNSUBSCRIBE SOCNET in the body of the message.

On Wed, Feb 8, 2012 at 4:44 PM, Weare, Christopher <[log in to unmask]> wrote:
***** To join INSNA, visit http://www.insna.org ***** 
This may have more information than you want, but this paper covers the modularity index very well.
 
Porter, M. A., J.-P. Onnela, et al. (2009). "Communities in Networks." Notices of the American Mathematical Society, Vol. 56, No. 9, 2009.
 
Chris
 
From: Social Networks Discussion Forum [mailto:[log in to unmask]] On Behalf Of P. Lin
Sent: Wednesday, February 08, 2012 3:26 PM
To: [log in to unmask]
Subject: [SOCNET] Girvan-Newman's Q value in Netdraw
 
***** To join INSNA, visit http://www.insna.org ***** 
Hi, 
 
I'm a PhD student at the University of Washington Information School studying high school social groups and the multiplexity of technology interactions. 
 
I'm intrigued by the Q-value in Netdraw's Girvan-Newman subgroup function. According to Hanneman's online book on social network methods, this Q-value is a goodness of fit. 
Does anyone know how to interpret the Q value (or goodness of fit) in Netdraw's Girvan-Newman subgroup function?
What is considered a good fit? 
 
I got no responses on the UCINET group (posted 2 days ago), and emailed Dr. Hanneman, who kindly suggested I email this listserv. 
 
To elaborate, Netdraw has an Analysis>Subgroup function that : 
 
a) visualizes the subgroups partitioned by coloring the subgroups in different colors.
b) produces partition information, including a Q value. 
 
I looked through the Girvan and Newman (2002) article cited for this algorithm, and couldn't see an explanation of the goodness of fit either. 
 
UCINET does not produce this goodness of fit measurement either, which leads me to think that the goodness of fit is specific to Netdraw?
 
The Netdraw output makes sense to me (in the way I'm interpreting my data), and I'm just looking for a way to validate the clusters produced. So, perhaps the Q goodness of fit measure produced could help, though I need some help in knowing how that Q value is produced and what it means.
 
Thank you!
 
 
Peyina
-
Peyina Lin
PhD Candidate
The Information School
University of Washington
http://students.washington.edu/pl3
@peyinalin
 
_____________________________________________________________________ SOCNET is a service of INSNA, the professional association for social network researchers (http://www.insna.org). To unsubscribe, send an email message to [log in to unmask] containing the line UNSUBSCRIBE SOCNET in the body of the message. 
_____________________________________________________________________ SOCNET is a service of INSNA, the professional association for social network researchers (http://www.insna.org). To unsubscribe, send an email message to [log in to unmask] containing the line UNSUBSCRIBE SOCNET in the body of the message. 

_____________________________________________________________________ SOCNET is a service of INSNA, the professional association for social network researchers (http://www.insna.org). To unsubscribe, send an email message to [log in to unmask] containing the line UNSUBSCRIBE SOCNET in the body of the message. 

_____________________________________________________________________
SOCNET is a service of INSNA, the professional association for social
network researchers (http://www.insna.org). To unsubscribe, send
an email message to [log in to unmask] containing the line
UNSUBSCRIBE SOCNET in the body of the message.