***** To join INSNA, visit http://www.insna.org *****
Dear Elly,

I support the last possibility Philip mentioned. I think it is better to treat this as modeling the network rather than merely modeling centrality scores. You can then also see how your variable of interest (religiosity) influences various positional network features, e.g., differentiating between incoming and outgoing centrality, homophily, and interactions with reciprocity or with other homophily indicators.

Best wishes,
Tom

=========================================
Tom A.B. Snijders
Professor of Statistics and Methodology, Dept of Sociology, University of Groningen
Emeritus Fellow, Nuffield College, University of Oxford
http://www.stats.ox.ac.uk/~snijders

On Tue, Mar 17, 2015 at 10:14 AM, Philip Leifeld wrote:
---------------------- Information from the mail header -----------------------
Subject:      Re: Centrality as the Dependent Variable?
-------------------------------------------------------------------------------

Hi Elly,

The centrality values attached to the nodes are not independent of
each other because a high centrality value of one node may imply a high
centrality value of an adjacent node (see measures of degree
assortativity etc.) -- or merely because the network has a given
centralization and increasing the centrality of one node must decrease
the centrality of another node.

If you estimate a regression model, you should therefore specify the
channels of influence/dependence between the nodes as covariates. For
example, for each node you can include the cumulated and/or average
centrality of adjacent nodes (and possibly of indirect friends with path
length 2). Depending on the centrality measure (e.g., eigenvector
centrality), it may also make sense to include the aggregated centrality
scores of structurally similar nodes because being connected to the same
important other nodes makes both nodes in a dyad similarly important.
There may be many other ways in which a node's centrality value depends
on other nodes' values but also potentially just on features of the
network or the local neighborhood of a node in the network.
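
A minimal sketch of how such neighbor-based covariates could be
computed, in plain Python. The function names, the toy path network,
and the centrality scores are all made up for illustration; the
convention that isolates get a mean of 0.0 is an assumption as well.

```python
def neighbor_sum_and_mean(adj, values):
    """Return (cumulated, average) value of each node's direct neighbors.

    adj is a symmetric 0/1 adjacency matrix given as a list of lists.
    Isolates get a mean of 0.0 (an assumption of this sketch).
    """
    sums, means = [], []
    for row in adj:
        neighbors = [j for j, tie in enumerate(row) if tie]
        total = sum(values[j] for j in neighbors)
        sums.append(float(total))
        means.append(total / len(neighbors) if neighbors else 0.0)
    return sums, means


def two_path_matrix(adj):
    """0/1 matrix linking nodes at path length exactly 2."""
    n = len(adj)
    return [
        [
            int(i != j and not adj[i][j]
                and any(adj[i][k] and adj[k][j] for k in range(n)))
            for j in range(n)
        ]
        for i in range(n)
    ]


# Hypothetical toy network: a path 0 - 1 - 2 - 3
adj = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
centrality = [1.0, 2.0, 2.0, 1.0]  # made-up scores (here: degree)

# Covariates based on direct neighbors
nb_sum, nb_mean = neighbor_sum_and_mean(adj, centrality)

# Covariate based on indirect friends (path length 2)
_, tp_mean = neighbor_sum_and_mean(two_path_matrix(adj), centrality)
```

Each resulting list has one entry per node and could be added as a
column to the regression data next to the substantive covariates.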

Regression models where the dependencies are specified like this go by
several names:

- If you estimate linear or generalized linear models and include only
functions of direct neighbors, this is called a "spatial autocorrelation
model" or a "network autocorrelation model". See the work of Patrick
Doreian, for example. There is a nice article in Social Networks on
"Specifying the weight matrix" for these models by Roger Leenders
(http://dx.doi.org/10.1016/S0378-8733(01)00049-1). There is an R
implementation for the linear case in the sna package by Carter Butts
(the lnam function).

- If you include various other dependencies and estimate a binary (=
logit or probit) model, this is termed an "autologistic actor attribute
model" (ALAAM) by Galina Daraganova and Garry Robins in chapter 9 of
the ERGM book ("Exponential Random Graph Models for Social Networks"),
edited by Dean Lusher, Johan Koskinen and Garry Robins. I think there is
an implementation in PNet or a related program, but others may know
better than I do. This is kind of a special case of the network
autocorrelation model.

- If you include only basic dependencies (= functions of direct
neighbors and their attributes), this is called a "multiparametric
spatiotemporal autoregressive model" (m-STAR) in the spatial
econometrics literature (see the article by Jude Hays, Aya Kachi and Rob
Franzese here: http://dx.doi.org/10.1016/j.stamet.2009.11.005).

- Finally, I have created a generalized version of all of this. I call
it "temporal network autocorrelation model" (TNAM) because it's a
spatial *autocorrelation model* like the one specified above (see first
bullet point), but it also includes various kinds of dependencies (hence
the word *"network"*, see second bullet point), and it's possible to
estimate it with *temporal* data/repeated observations of the network
and/or the outcome variable (see bullet point 3). More generally, you
can plug it into any model you would like, including tobit, survival,
linear mixed models etc. I have implemented this in the tnam function in
my R package xergm, along with a number of dependency terms to include.
See here (http://rpackages.ianhowson.com/rforge/xergm/man/tnam.html) and
here (http://rpackages.ianhowson.com/rforge/xergm/man/tnam-terms.html)
for a description. A paper will be available in a few weeks.
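
For orientation, the linear network autocorrelation model from the
first bullet point is commonly written roughly as follows. The symbols
are standard notation rather than anything defined in this message: y
is the vector of node outcomes (here, centrality scores), W is a weight
matrix (e.g., a row-normalized adjacency matrix), rho is the
autocorrelation parameter, and X holds the covariates (e.g.,
religiosity). This is only the basic lag form, given as a sketch.

```latex
y = \rho W y + X \beta + \varepsilon,
\qquad \varepsilon \sim N(0, \sigma^2 I)

% Reduced form, provided (I - \rho W) is invertible:
y = (I - \rho W)^{-1} (X \beta + \varepsilon)
```

The reduced form makes the dependence explicit: each node's outcome is
a function of the covariates and errors of all nodes it is connected to
through W, directly or indirectly.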

There are three potential caveats here:

(1) It may be difficult to specify the dependencies appropriately if the
attributes you are explaining are centralities. It may require you to
think hard about what is causing them and to what extent you want them
to be explained by network/influence terms vs. covariates, but I think
technically it should be a subproblem of the more general models
outlined above.

(2) A cross-sectional analysis does not really allow you to infer
causality (this is tricky enough with longitudinal data). It's
relatively certain that some of your independent variables/model terms
will be partially caused by the centrality of the nodes.

And (3) centrality scores are usually fairly skewed and often also
bounded between 0 and 1, so it may be inappropriate to use a linear
model. But
this is a general statistical problem, not one that is specific to
network analysis. You may want to consult the literature on beta
regression, Box-Cox transformations etc. TNAM should be able to deal
with this, but you have to find out what model (e.g., GLM with a beta
distribution) would be appropriate for your data.
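
As a rough illustration of the preprocessing this may involve, here is
a minimal Python sketch that squeezes scores from [0, 1] into the open
interval (0, 1), which a beta likelihood requires, and then applies a
logit transformation. The function names and the example scores are
made up; the squeeze (y * (n - 1) + 0.5) / n is one common convention,
assumed here, and whether any of this suits your data is an open
question.

```python
import math


def squeeze_unit_interval(scores):
    """Map scores in [0, 1] into (0, 1) so that exact 0s and 1s vanish."""
    n = len(scores)
    return [(s * (n - 1) + 0.5) / n for s in scores]


def logit(p):
    """Log-odds of a proportion strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


# Made-up centrality scores, including the boundary values 0 and 1
centrality = [0.0, 0.1, 0.25, 1.0]

squeezed = squeeze_unit_interval(centrality)
transformed = [logit(p) for p in squeezed]
```

The squeezed values could serve as the outcome in a beta regression,
while the logit-transformed values are unbounded and could be tried in
a linear model instead.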

As an alternative, you may want to consider modeling the network using
an exponential random graph model, rather than modeling centrality
scores, which are merely a function of the network with a huge loss of
information. By explaining the network, you basically explain the
structure including who is central.

Best regards

Philip

On 17.03.2015 at 02:16, Elly Power wrote:
> Hello all,
>
> I was hoping I could get some advice on how (or if) I could use
> centrality measures (e.g., eigenvector centrality) as the /dependent/
> variable in some analyses.
>
> I know that we usually think of centrality as an independent variable,
> but it seems reasonable that we might want to predict centrality.
> Personally, I work on religious practice, and I want to understand if
> the nature of someone's religious practice might influence his/her
> centrality.
>
> The issue, of course, is that centrality measures are not independent.
> Does anyone know of any ways to deal with this? Is there anyone who has
> tried to look at this? Any direction would be very much appreciated.
>
>
> - Elly Power
>
> --
> Eleanor A. Power, PhD Candidate
> Department of Anthropology
> Stanford University
> 450 Serra Mall, Bldg 50
> Stanford, CA 94305
> www.stanford.edu/~epower
> _____________________________________________________________________
> SOCNET is a service of INSNA, the professional association for social
> network researchers (http://www.insna.org). To unsubscribe, send an