Dear SOCNET community,
I am a research scientist at the University of Chicago, applying social/sexual network data to HIV prevention problems. I read with interest the recent discussion here, prompted by a question from Elly Powers, on modeling centrality scores as an outcome variable. I would like to talk through a similar question.
I am working with a longitudinal (two-wave) Facebook dataset of high-risk gay men in South Chicago. There are 294 participants in both waves of the study, with about 3200 friendships between them in Wave 1, and ~4000 in Wave 2. I have computed a number of graph centrality metrics on the two waves (betweenness/eigenvector centralities, bridging scores, constraint index and effective network size).
We are interested in looking more closely at the top 50 individuals selected by each score (say bridging, for the sake of argument), who will be invited to our center to receive additional training in HIV prevention methods. The big question is: what factors are associated with someone becoming a bridge, and what is associated with someone losing their bridging status? One way to answer these questions might be to consider the set of nodes that enter the top 50 in Wave 2 but were not in that group in Wave 1. Another might be to consider the set of nodes that were top-50 bridges in Wave 1 but did not retain that status in Wave 2. We could then model the determinants of that change in status with a regression, but this is problematic because of the dependency structures both within a network dataset such as this and among the outcome values themselves.
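To make the two comparison sets concrete, here is a minimal Python sketch. It assumes the bridging scores for each wave are already available as dictionaries keyed by participant ID; the function names `top_k` and `status_changes` and the toy scores are hypothetical and purely illustrative.

```python
def top_k(scores, k):
    """Return the set of the k highest-scoring node IDs.

    Ties are broken by node ID so the result is deterministic.
    """
    ranked = sorted(scores, key=lambda node: (-scores[node], node))
    return set(ranked[:k])


def status_changes(scores_w1, scores_w2, k):
    """Split nodes by how their top-k status changes between two waves."""
    top_w1 = top_k(scores_w1, k)
    top_w2 = top_k(scores_w2, k)
    return {
        "gained": top_w2 - top_w1,    # in the top k in Wave 2 only
        "lost": top_w1 - top_w2,      # in the top k in Wave 1 only
        "retained": top_w1 & top_w2,  # in the top k in both waves
    }


# Toy, made-up scores for five participants (not study data)
wave1 = {"a": 0.9, "b": 0.7, "c": 0.4, "d": 0.2, "e": 0.1}
wave2 = {"a": 0.8, "b": 0.3, "c": 0.6, "d": 0.5, "e": 0.1}
changes = status_changes(wave1, wave2, k=2)
```

With the real data, k would be 50 and the "gained" and "lost" sets would define the two binary outcomes described above. Note that how ties at the cutoff are broken is a design choice that can change who falls in each set.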
The question here is different from how Elly posed it: I am interested in predicting who is a top-50 bridge, which is a binary outcome. Still, the same problems of using a centrality (or bridging) score as an outcome on network data hold. I looked at the ‘tnam’ and ‘lnam’ functions that Philip Leifeld recommended, but I also saw that his last suggestion, seconded by Tom Snijders, was to use ERGMs to “explain the network”.
I could potentially go down the last route, but “explaining the network” itself might not be a direct answer to my question here. What might explaining the network mean in this context? (I have worked extensively with the statnet suite of packages, mostly as a tool to simulate dynamic network structure and HIV transmission.) As a first pass, I might want to use the temporal network autocorrelation models (‘tnam’), with top-50 bridging status as the outcome, and assess the significance of the determinants of that status. The various dependency structures, both within the network and among the outcomes, make this tricky, but is there a way to make this approach work? I am happy to provide more details about the problem if you think that would help.
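For concreteness, the kind of lagged peer term I have in mind as one determinant can be sketched in plain Python. This is not the ‘tnam’ API, only an illustration of the idea behind a network-lag covariate: for each participant, the share of their Wave 1 friends who were top-50 bridges in Wave 1. The function name `neighbor_exposure` and the toy edge list are hypothetical, and the sketch assumes an undirected friendship network in which every node appears in the status dictionary.

```python
def neighbor_exposure(edges, status):
    """Share of each node's friends whose status is 1 (0.0 for isolates).

    edges  : iterable of (u, v) undirected friendship pairs
    status : dict mapping node ID -> 0/1 top-k indicator
    """
    friends = {node: set() for node in status}
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        friends[u].add(v)
        friends[v].add(u)

    exposure = {}
    for node, nbrs in friends.items():
        if nbrs:
            exposure[node] = sum(status[n] for n in nbrs) / len(nbrs)
        else:
            exposure[node] = 0.0
    return exposure


# Toy, made-up Wave 1 network and bridging status (not study data)
edges_w1 = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d")]
bridge_w1 = {"a": 1, "b": 0, "c": 1, "d": 0, "e": 0}
exposure = neighbor_exposure(edges_w1, bridge_w1)
```

A term like this could enter a model for Wave 2 status alongside individual attributes, though a naive logistic regression on it would still not account for the dependence among outcomes, which is exactly the part I am unsure how to handle.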