Dear Elly,

I support the last possibility mentioned by Philip. I think it is better to see this as trying to model the network than as merely trying to model centrality scores. You can then potentially also see how your variable of interest (religiosity) influences various positional network features, e.g., by differentiating between incoming and outgoing centrality, homophily, interaction with reciprocity or with other homophily indicators, etc.

=========================================

Tom A.B. Snijders
http://www.stats.ox.ac.uk/~snijders

On Tue, Mar 17, 2015 at 10:14 AM, Philip Leifeld <[log in to unmask]> wrote:

---------------------- Information from the mail header -----------------------

Sender: Social Networks Discussion Forum <[log in to unmask]>

Poster: Philip Leifeld <[log in to unmask]>

Subject: Re: Centrality as the Dependent Variable?

-------------------------------------------------------------------------------

***** To join INSNA, visit http://www.insna.org *****

Hi Elly,

The centrality values attached to the nodes are not independent of

each other because a high centrality value of one node may imply a high

centrality value of an adjacent node (see measures of degree

assortativity etc.) -- or merely because the network has a given

centralization and increasing the centrality of one node must decrease

the centrality of another node.

If you estimate a regression model, you should therefore specify the

channels of influence/dependence between the nodes as covariates. For

example, for each node you can include the cumulated and/or average

centrality of adjacent nodes (and possibly of indirect friends with path

length 2). Depending on the centrality measure (e.g., eigenvector

centrality), it may also make sense to include the aggregated centrality

scores of structurally similar nodes because being connected to the same

important other nodes makes both nodes in a dyad similarly important.

There may be many other ways in which a node's centrality value depends

on other nodes' values but also potentially just on features of the

network or the local neighborhood of a node in the network.
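
To make the idea of a neighbor-based covariate concrete, here is a minimal
sketch in Python. It is not taken from any package mentioned in this thread:
the toy network, the choice of degree centrality, and the function names
degree_centrality and neighbor_mean are all assumptions made for illustration.
It computes, for each node, the average centrality of its direct neighbors,
which could then enter a regression as one covariate among others.

```python
# Hypothetical toy network as an undirected adjacency list (illustrative data only).
adjacency = {
    "a": ["b", "c"],
    "b": ["a", "c"],
    "c": ["a", "b", "d"],
    "d": ["c"],
}


def degree_centrality(adj):
    """Normalized degree centrality: degree divided by (n - 1)."""
    n = len(adj)
    return {node: len(nbrs) / (n - 1) for node, nbrs in adj.items()}


def neighbor_mean(adj, values):
    """Average of `values` over each node's direct neighbors (0.0 for isolates)."""
    return {
        node: (sum(values[v] for v in nbrs) / len(nbrs)) if nbrs else 0.0
        for node, nbrs in adj.items()
    }


# Outcome variable: each node's own centrality.
centrality = degree_centrality(adjacency)

# Dependence covariate: mean centrality of each node's adjacent nodes.
lagged = neighbor_mean(adjacency, centrality)

for node in sorted(adjacency):
    print(node, round(centrality[node], 3), round(lagged[node], 3))
```

Replacing the mean with a sum gives the cumulated version. The same pattern
extends to path-length-2 neighbors by first building the set of nodes reachable
in two steps.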

Regression models where the dependencies are specified like this go by

several names:

- If you estimate linear or generalized linear models and include only

functions of direct neighbors, this is called a "spatial autocorrelation

model" or a "network autocorrelation model". See the work of Patrick

Doreian, for example. There is a nice article in Social Networks on

"Specifying the weight matrix" for these models by Roger Leenders

(http://dx.doi.org/10.1016/S0378-8733(01)00049-1). There is an R

implementation for the linear case in the sna package by Carter Butts

(the lnam function).

- If you include various other dependencies and estimate a binary (=

logit or probit) model, this was termed an "autologistic actor attribute

model" (ALAAM) by Galina Daraganova and Garry Robins in their chapter 9

of the ERGM book ("Exponential Random Graph Models for Social Networks")

by Dean Lusher, Johan Koskinen and Garry Robins. I think there is an

implementation in PNet or a related program, but others may know better

than I do. This is kind of a special case of the network autocorrelation

model but with additional dependencies.

- If you include only basic dependencies (= functions of direct

neighbors and their attributes), this is called a "multiparametric

spatiotemporal autoregressive model" (m-STAR) in the spatial

econometrics literature (see an article by Jude Hays, Aya Kachi and Rob

Franzese here: http://dx.doi.org/10.1016/j.stamet.2009.11.005).

- Finally, I have created a generalized version of all of this. I call

it "temporal network autocorrelation model" (TNAM) because it's a

spatial *autocorrelation model* like the one specified above (see first

bullet point), but it also includes various kinds of dependencies (hence

the word *"network"*, see second bullet point), and it's possible to

estimate it with *temporal* data/repeated observations of the network

and/or the outcome variable (see bullet point 3). More generally, you

can plug it into any model you would like, including tobit, survival,

linear mixed models etc. I have implemented this in the tnam function in

my R package xergm, along with a number of dependency terms to include.

See here (http://rpackages.ianhowson.com/rforge/xergm/man/tnam.html) and

here (http://rpackages.ianhowson.com/rforge/xergm/man/tnam-terms.html)

for a description. A paper will be available in a few weeks.
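
For orientation, the linear network autocorrelation model from the first
bullet point is commonly written in the form below. The symbols are the
conventional ones from that literature and are assumptions here, not notation
used elsewhere in this thread: y is the vector of node outcomes (here,
centrality scores), W is a weight matrix derived from the network (for
example, a row-normalized adjacency matrix), X holds the covariates, and rho
measures how strongly a node's outcome depends on its neighbors' outcomes.

```latex
y = \rho W y + X \beta + \varepsilon,
\qquad \varepsilon \sim \mathcal{N}(0, \sigma^2 I)
```

With a row-normalized W, the term W y is the average outcome of each node's
direct neighbors. The other models in the list can be read as adding further
dependency terms, other link functions, or a time dimension to this basic
form.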

There are three potential caveats here:

(1) It may be difficult to specify the dependencies appropriately if the

attributes you are explaining are centralities. It may require you to

think hard about what is causing them and to what extent you want them

to be explained by network/influence terms vs. covariates, but I think

technically it should be a subproblem of the more general models

outlined above.

(2) A cross-sectional analysis does not really allow you to infer

causality (this is tricky enough with longitudinal data). It's

relatively certain that some of your independent variables/model terms

will be partially caused by the centrality of the nodes.

And (3) centrality scores are usually fairly skewed and often also bounded

between 0 and 1, so it may be inappropriate to use a linear model. But

this is a general statistical problem, not one that is specific to

network analysis. You may want to consult the literature on beta

regression, Box-Cox transformations etc. TNAM should be able to deal

with this, but you have to find out what model (e.g., GLM with a beta

distribution) would be appropriate for your data.
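
As a small illustration of the third caveat, here is a minimal Python sketch
of one way to prepare bounded scores. It is one option among several, not a
recommendation from this thread. The squeeze formula is a compression that is
often applied before beta regression so that exact 0s and 1s move into the
open interval (0, 1). The logit step is an alternative if you prefer a linear
model on a transformed scale. The scores and the function names squeeze and
logit are hypothetical.

```python
import math


def squeeze(y, n):
    """Compress a value from [0, 1] into the open interval (0, 1).

    Uses (y * (n - 1) + 0.5) / n, where n is the number of observations.
    """
    return (y * (n - 1) + 0.5) / n


def logit(p):
    """Log-odds transform; defined only for 0 < p < 1."""
    return math.log(p / (1 - p))


# Hypothetical centrality scores, including the boundary values 0 and 1.
scores = [0.0, 0.1, 0.25, 1.0]
n = len(scores)

# Step 1: move boundary values strictly inside (0, 1).
squeezed = [squeeze(y, n) for y in scores]

# Step 2 (optional): map (0, 1) onto the whole real line.
transformed = [logit(p) for p in squeezed]

for y, p, t in zip(scores, squeezed, transformed):
    print(y, round(p, 4), round(t, 4))
```

Whether a transformation like this or a beta-distributed model is more
appropriate depends on the data and on how the coefficients should be
interpreted.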

As an alternative, you may want to consider modeling the network using

an exponential random graph model, rather than modeling centrality

scores, which are merely a function of the network with a huge loss of

information. By explaining the network, you basically explain the

structure including who is central.

Best regards

Philip

On 17.03.2015 at 02:16, Elly Power wrote:

> Hello all,

>

> I was hoping I could get some advice on how (or if) I could use

> centrality measures (e.g., eigenvector centrality) as the /dependent/

> variable in some analyses.

>

> I know that we usually think of centrality as an independent variable,

> but it seems reasonable that we might want to predict centrality.

> Personally, I work on religious practice, and I want to understand if

> the nature of someone's religious practice might influence his/her

> centrality.

>

> The issue, of course, is that centrality measures are not independent.

> Does anyone know of any ways to deal with this? Is there anyone who has

> tried to look at this? Any direction would be very much appreciated.

>

> Thanks in advance for all of your suggestions.

>

> - Elly Power

>

> --

> Eleanor A. Power, PhD Candidate

> Department of Anthropology

> Stanford University

> 450 Serra Mall, Bldg 50

> Stanford, CA 94305

> http://www.stanford.edu/~epower

> _____________________________________________________________________

> SOCNET is a service of INSNA, the professional association for social

> network researchers (http://www.insna.org). To unsubscribe, send an

> email message to [log in to unmask] containing the line UNSUBSCRIBE

> SOCNET in the body of the message.
