Have a look at

Koskinen, J., Robins, G., Wang, P., & Pattison, P. (2013). Bayesian analysis for partially observed network data, missing ties, attributes and actors. Social Networks, 35, 514-527.

That paper provides an example of fitting an ERGM with 74% of tie variables unobserved, and another example of homophily inference with nearly 30% of the dyads affected by missingness.

Other important papers cited include Handcock and Gile (2010) and Huisman (2009). Smith and Moody’s recent work has been mentioned in other posts.

 

Traditional power analysis, as developed by Cohen, sits uncomfortably with a lot of network research. Classical null hypothesis significance testing is typically based on independent observations, whereas we network researchers have dependence, quite explicitly. The usual power analysis question is: are there sufficient observations? If the interest is in network structure (e.g. through an ERGM analysis), then the number of observations is not n but the number of tie variables, which is of the order of n-squared. I have found that this usually reassures (but sometimes confuses) most non-networkers asking for a power analysis. So when I am confronted by such requests, I advise that, although standard power analytic techniques are simply not available for most network methods, the 50 (say) actors in a small directed network amount to nearly 2500 observations (50 x 49 = 2450 tie variables). The focus on the 2500 rather than the 50 usually quietens the demand, and then I can get on with the network analysis.
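
As a quick check on that count, here is a minimal Python sketch. The function name is purely illustrative and does not come from any network package; it simply counts dyads, excluding self-ties.

```python
def n_tie_variables(n_actors, directed=True):
    """Count tie variables (dyads) among n_actors, excluding self-ties."""
    ordered = n_actors * (n_actors - 1)      # ordered pairs (i, j) with i != j
    return ordered if directed else ordered // 2

print(n_tie_variables(50))                   # 2450
print(n_tie_variables(50, directed=False))   # 1225
```

For an undirected network of the same size the count halves, so the "nearly 2500" figure applies to the directed case only.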

 

But if you actually want a solid answer to the question of power in the presence of missingness, you may need to do some simulation studies. If you have full data (say at one measurement point), fit a model to it, and then artificially create missingness by removing different proportions of nodes at random. Then refit the model (taking the missingness into account rather than ignoring it, using either the Koskinen et al. or the Handcock and Gile approach), and see how well you can recover the original estimates. Or, if you don’t have data, assume a plausible model from previous work or elsewhere, simulate multiple data sets from it, and apply the same procedure. This will give you some idea of how much missingness you can tolerate for the research question you have in mind.
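
The shape of such a simulation study can be sketched in a few lines of Python. This is only an illustration under strong assumptions, not the Koskinen et al. or Handcock and Gile procedure: the "model" is a hypothetical directed Bernoulli network with a homophily effect (ties are independent given group membership), the refit is replaced by a simple available-case statistic (the difference in tie density between same-group and different-group dyads), and a missing node is treated as a non-respondent whose outgoing ties are unobserved. All function names and parameter values are made up for the example. In a real study the statistic would be replaced by a proper ERGM fit that accounts for the missing tie variables.

```python
import random

def simulate_network(n, groups, p_same, p_diff, rng):
    """Directed Bernoulli network: tie probability depends on shared group."""
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                p = p_same if groups[i] == groups[j] else p_diff
                adj[i][j] = 1 if rng.random() < p else 0
    return adj

def homophily_gap(adj, groups, respondents):
    """Same-group minus different-group tie density, using respondents' rows only."""
    same_ties = same_n = diff_ties = diff_n = 0
    for i in respondents:
        for j in range(len(adj)):
            if i == j:
                continue
            if groups[i] == groups[j]:
                same_n += 1
                same_ties += adj[i][j]
            else:
                diff_n += 1
                diff_ties += adj[i][j]
    return same_ties / same_n - diff_ties / diff_n

def missingness_study(n=50, proportions=(0.1, 0.3, 0.5, 0.7), reps=200, seed=1):
    """RMSE of the available-case estimate around the full-data estimate."""
    rng = random.Random(seed)
    groups = [i % 2 for i in range(n)]           # two equal-sized groups (assumed)
    adj = simulate_network(n, groups, 0.15, 0.05, rng)
    full = homophily_gap(adj, groups, range(n))  # "original estimate" on full data
    rmse = {}
    for prop in proportions:
        n_missing = round(prop * n)
        sq_err = 0.0
        for _ in range(reps):
            missing = set(rng.sample(range(n), n_missing))  # nodes removed at random
            respondents = [i for i in range(n) if i not in missing]
            sq_err += (homophily_gap(adj, groups, respondents) - full) ** 2
        rmse[prop] = (sq_err / reps) ** 0.5
    return full, rmse

if __name__ == "__main__":
    full, rmse = missingness_study()
    print("full-data estimate:", round(full, 3))
    for prop, err in rmse.items():
        print(f"{prop:.0%} of nodes missing: RMSE = {err:.4f}")
```

Comparing the RMSE at each proportion with the size of the effect you care about gives a rough sense of how much missingness is tolerable. The available-case shortcut is defensible here only because the assumed model has independent ties and the nodes are removed completely at random.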

 

You won’t get the best inference if you pretend the missingness is not there, by simply removing all missing nodes from the dataset and proposing that the remainder constitutes the network. We often do this, and it may be OK in some circumstances, but when you are comparing the same network across time while deliberately not observing some actors at some timepoints to minimise respondent demands, it seems risky simply to ignore the missingness. But even the Handcock and Gile and the Koskinen et al. approaches assume ties are missing at random, and this will not be the case in your design, so the inference will not be perfect. The next step in network modelling in the presence of missingness is to develop methods whereby missingness by design can be tolerated. We know this is OK for e.g. snowball sampling, but there are other circumstances where it is problematic. For instance, police look closely at suspects in criminal networks, so the non-observation of non-suspects (among relevant persons) is hardly at random. For those cases, we need a model of the method of observation that can be integrated into the model of the missing data. That is not easy and we are not there yet.
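
A toy Python sketch can show why non-random observation matters. It rests entirely on hypothetical assumptions: actors differ in how active they are, only the outgoing ties of observed actors are recorded, and the "targeted" design observes the most active half of the actors (a crude stand-in for looking closely at suspects). It compares a naive density estimate under random and targeted observation; it is not a model of any real observation process.

```python
import random

def density_from_rows(adj, observed):
    """Tie density using only the outgoing-tie rows of observed actors."""
    n = len(adj)
    ties = sum(adj[i][j] for i in observed for j in range(n) if i != j)
    return ties / (len(observed) * (n - 1))

def compare_observation_designs(n=60, reps=500, seed=7):
    rng = random.Random(seed)
    # Hypothetical heterogeneity: each actor has its own tie-sending probability.
    activity = [rng.uniform(0.02, 0.30) for _ in range(n)]
    adj = [[1 if i != j and rng.random() < activity[i] else 0
            for j in range(n)] for i in range(n)]
    full = density_from_rows(adj, range(n))
    k = n // 2
    # Design A: observe a random half of the actors, averaged over many draws.
    random_mean = sum(
        density_from_rows(adj, rng.sample(range(n), k)) for _ in range(reps)
    ) / reps
    # Design B: observe only the k most active actors (observation depends on ties).
    out_degree = [sum(row) for row in adj]
    targeted = sorted(range(n), key=lambda i: out_degree[i], reverse=True)[:k]
    return full, random_mean, density_from_rows(adj, targeted)

if __name__ == "__main__":
    full, random_mean, targeted = compare_observation_designs()
    print("full network density:      ", round(full, 3))
    print("random half, mean estimate:", round(random_mean, 3))
    print("targeted half, estimate:   ", round(targeted, 3))
```

Under random observation the naive estimate centres on the full-network value, whereas under the targeted design it overstates density because who gets observed depends on the ties themselves. Correcting that requires a model of the observation process, which is exactly the difficulty described above.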

 

There’s a bit more on missing data in my book below.

 

Garry

 

 

Professor Garry Robins

Melbourne School of Psychological Sciences

The University of Melbourne

Victoria 3010

Australia

 

Website: http://psych.unimelb.edu.au/people/garry-robins

Melnet website: http://sna.unimelb.edu.au/

 

Check out my new book: Doing social network research: Network-based research design for social scientists. (http://www.uk.sagepub.com/textbooks/Book241817)

 

 

 

 

 

 

 

 

From: Social Networks Discussion Forum [mailto:[log in to unmask]] On Behalf Of Ian McCulloh
Sent: Tuesday, 24 March 2015 11:19 PM
To: [log in to unmask]
Subject: Power Analysis in SNA

 


SOCNET Members,

 

Can any of you provide me with a few good papers on handling missing data in network studies? Specifically, I am thinking of cases where I'm studying collaboration and some subject opts out of the research, where I'm conducting weekly surveys and some respondents fail to complete a survey for a given week, or where we choose to skip certain respondents in certain weeks to reduce the burden on subjects. I'm also looking for a possible power analysis for these research decisions/situations.

 

I am consulting on a project investigating how to improve collaboration and innovation by altering office space layout, building on a paper I published with Kerstin Sailer in the journal Social Networks. http://www.sciencedirect.com/science/journal/03788733/34/1

 

For the project, subjects will be evaluated in their current office configuration, then moved to an open plan, designed for greater collaboration.  There are several sources of network data that I will collect.  I will have billing data, which ties subjects to projects that they work on (co-project network).  I will have access to email meta-data (email network).  And I will be able to survey subjects on their interactions.

 

It does not appear that I will be able to survey all respondents each week. The project lead has asked me about power analysis to quantify the effect of missing respondents. In other words, how many people do I need to survey to have useful network data?

 

I have explained ergodicity to my research team and how inference can be a problem, given the dependent nature of networks.  I don't really need any more info on that angle.  

 

What I would appreciate is any papers that discuss the statistical effect of missing data in an ERGM, a stochastic actor-oriented model, or basic SNA.

 

As I recall, similar questions have been raised on this forum before.  I'm sorry I couldn't locate those posts.  

 

I'd also be interested in any papers on similar studies focused on increasing collaboration and innovation through office space design.  I'm already familiar with Kerstin Sailer's excellent work in this area, but would be interested in other efforts.

 

Thanks.  

 

Ian

 

Ian McCulloh, Ph.D.

Johns Hopkins University
