*****  To join INSNA, visit http://www.insna.org  *****

Agreed, Tom and Michal.

As a minor supplement to Michal's discussion, we published the strategy in
a recent paper, and it can work well even in networks with 10,000 nodes (we
have some simulation studies with up to 10,000 nodes, and an application
with more than 10,000 nodes):

Babkin, Stewart, Long, and Schweinberger (2020+). Large-scale estimation of
random graph models with local dependence. Computational Statistics and
Data Analysis. In press.
https://www.sciencedirect.com/science/article/pii/S0167947320301201

It is the default method in the R package hergm (since 2019), although hergm
currently needs a small fix: it works on R versions below 4, but not on R 4.
We will release hergm for R version 4 later this summer.

That said, I do agree that it would be preferable to analyze the data as
two-mode network data.

Best,
Michael

On Wed, Jun 10, 2020 at 8:56 PM Michael Schweinberger <
[log in to unmask]> wrote:

> Agreed, Tom and Michal.
>
> Let me add that the strategy Michal mentioned is published in a recent
> paper, and can work well even in networks with 10,000 nodes (we have some
> simulation studies with up to 10,000 actors, and an application with more
> than 10,000 nodes):
>
> Babkin, Stewart, Long, and Schweinberger (2020+). Large-scale estimation
> of random graph models with local dependence. Computational Statistics and
> Data Analysis. In press.
> https://www.sciencedirect.com/science/article/pii/S0167947320301201
>
> In fact, it is the default method in R package hergm (since 2019),
> although hergm needs a small fix (works on R versions < 4, but not 4).
> We will release hergm for R version 4 later this summer.
>
> That said, I do agree that it would be preferable to analyze the data as
> two-mode network data.
>
> Best,
> Michael
>
> On Wed, Jun 10, 2020 at 9:57 AM Michał Bojanowski <[log in to unmask]>
> wrote:
>
>> Tom, Zak,
>>
>> I missed Zak's reply that it is a projection of a two-mode network.
>> Then indeed ERGMs (even valued ones) may be problematic for the
>> reasons Tom already mentioned.
>>
>> A strategy that I once tried (not published) with co-authorship data,
>> and that probably applies to a range of other co-something networks, is
>> to first run a community detection algorithm or a stochastic block
>> model to find the cliques or clusters, and then use cluster membership
>> as the attribute in a nodematch term of an ERGM. As a result, I was
>> able to get models that converged and had a pretty good GOF. What
>> becomes problematic is the interpretation of the parameters, so I
>> guess the approach is primarily useful if your goal is not so much
>> parameter interpretation as network generation/sampling.
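
The second step of the strategy above can be illustrated with a minimal sketch, assuming cluster labels have already been obtained in the first step. It only computes the quantity a nodematch term counts (edges whose endpoints share a label); it does not fit an ERGM. The toy edge list, the labels, and the function name `nodematch_count` are hypothetical.

```python
def nodematch_count(edges, membership):
    """Count edges whose two endpoints carry the same cluster label."""
    return sum(1 for u, v in edges if membership[u] == membership[v])


# Hypothetical toy network: two triangles joined by a single bridge edge.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]

# Step one is assumed to be done elsewhere: these labels stand in for the
# output of a community detection algorithm or a stochastic block model.
membership = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}

within = nodematch_count(edges, membership)
between = len(edges) - within
print(within, between)  # 6 within-cluster edges, 1 between-cluster edge
```

When nearly all edges fall within clusters, as here, the nodematch statistic absorbs most of the clustering, which is consistent with the caveat above that the resulting parameters are hard to interpret.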
>>
>> Best, Michal
>>
>> On Wed, Jun 10, 2020 at 4:39 PM Snijders, T.A.B. <[log in to unmask]>
>> wrote:
>> >
>> > Dear Zak,
>> >
>> > I think it will be very difficult indeed to model such a network by an
>> > ERGM. Related to this, I think it would be much better to analyze the
>> > original two-mode by year network, rather than some one-mode projection.
>> > Furthermore, I do not think that changes (from year to year) in
>> > estimated ERGM parameters for such a one-mode or two-mode network are an
>> > adequate reflection or operationalisation of "changes in homophily". I do
>> > not yet have a clear answer to the question "then what else would be an
>> > adequate reflection?".
>> >
>> > Cheers,
>> > Tom
>> >
>> > =========================================
>> > Tom A.B. Snijders
>> > Professor of Statistics and Methodology, Dept of Sociology, University
>> > of Groningen
>> > Emeritus Fellow, Nuffield College, University of Oxford
>> > http://www.stats.ox.ac.uk/~snijders
>> >
>> >
>> > On Wed, Jun 10, 2020 at 1:37 PM Neal, Zachary <[log in to unmask]> wrote:
>> >>
>> >> Dear Tom,
>> >>
>> >> Thanks for the feedback.
>> >>
>> >> The data are constructed via projection of two-mode data (in this
>> >> case, bill co-sponsorship). But we've used a null model to identify and
>> >> retain only significant edges in the projection, so the one-mode network
>> >> does not contain the kinds of artifacts normally generated by projection.
>> >>
>> >> That said, the network does still contain numerous large cliques,
>> >> which isn't surprising given the high transitivity. In this case, because
>> >> the network represents ties between US legislators, its structure is
>> >> primarily driven by clusters of Republicans and Democrats. The goal is to
>> >> estimate both party and gender homophily, examining changes in both over
>> >> time.
>> >>
>> >> Do you have any suggestions on how to estimate an ERGM on such a
>> >> network, or is it likely not possible?
>> >>
>> >> Best,
>> >> Zak
>> >>
>> >> –––
>> >> Zachary Neal, PhD
>> >> Associate Professor, Michigan State University
>> >> Web: http://www.zacharyneal.com
>> >> Twitter: @zpneal
>> >>
>> >> On Jun 8, 2020, at 11:33 AM, Snijders, T.A.B. <[log in to unmask]>
>> wrote:
>> >>
>> >> Dear Zachary,
>> >>
>> >> It is hard to say something meaningful without further information.
>> >> But a network with 450 nodes and density about 0.1 has average degree
>> >> 45. That is extremely large and dense for an ERGM to fit well.
>> >> If you say transitivity is in the order of 0.6 then a model without
>> >> gwesp (or similar) terms is sure to have a poor fit.
>> >> Just as a note, if the network was constructed as a one-mode
>> >> projection of a two-mode network, then it probably will contain many
>> >> cliques of order higher than 4, which is not in line with the idea of an
>> >> ERGM, and is bound to lead to problems in estimation. (I bring this up just
>> >> because I saw this issue earlier today.)
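
Two of the points above can be checked with a short sketch, assuming an undirected network with density defined as the share of possible ties that are present. It reproduces the average degree of about 45, and shows why an unfiltered one-mode projection contains large cliques: every second-mode item shared by k nodes links all k of them. The bill names, legislator labels, and function names are hypothetical, and the sketch does not apply the significance filtering mentioned elsewhere in this thread.

```python
from itertools import combinations


def mean_degree(n_nodes, density):
    """Mean degree of an undirected network: density * (n - 1)."""
    return density * (n_nodes - 1)


def project(memberships):
    """Unfiltered one-mode projection: link every pair sharing an item."""
    edges = set()
    for members in memberships.values():
        edges.update(combinations(sorted(members), 2))
    return edges


print(round(mean_degree(450, 0.1), 1))  # 44.9, i.e. roughly 45

# Hypothetical two-mode data: bill -> legislators who co-sponsored it.
bills = {"bill_1": ["a", "b", "c", "d", "e"], "bill_2": ["e", "f"]}
edges = project(bills)

# The five co-sponsors of bill_1 are all mutually tied: a clique of order 5.
clique = all(pair in edges for pair in combinations("abcde", 2))
print(len(edges), clique)  # 11 True
```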
>> >>
>> >> Cheers,
>> >> Tom
>> >>
>> >> =========================================
>> >> Tom A.B. Snijders
>> >> Professor of Statistics and Methodology, Dept of Sociology, University
>> >> of Groningen
>> >> Emeritus Fellow, Nuffield College, University of Oxford
>> >> http://www.stats.ox.ac.uk/~snijders
>> >>
>> >>
>> >> On Thu, Jun 4, 2020 at 10:06 PM Michał Bojanowski <
>> [log in to unmask]> wrote:
>> >>>
>> >>> ---------------------- Information from the mail header -----------------------
>> >>> Sender:       Social Networks Discussion Forum <[log in to unmask]>
>> >>> Poster:       Michał Bojanowski <[log in to unmask]>
>> >>> Subject:      Re: Help with EGRM non-convergence when using GWESP
>> >>> -------------------------------------------------------------------------------
>> >>>
>> >>> I should add that what I wrote before is not meant to explain the
>> >>> non-convergence per se, but rather to guide you towards identifying
>> >>> where the model specification is at odds with the data. Looking at GOF
>> >>> plots for the most complex model you fit that still converged should
>> >>> help you understand why it stops converging when you add GWESP.
>> >>>
>> >>> ~Michal
>> >>>
>> >>> On Thu, Jun 4, 2020 at 9:54 PM Michał Bojanowski <
>> [log in to unmask]> wrote:
>> >>> >
>> >>> > Zachary,
>> >>> >
>> >>> > > I haven't spent much time looking at model GOF since I don't have
>> >>> > > a good comparison. The models that include nodematch terms obviously fit
>> >>> > > better than a null model that only contains the edges term, but that didn't
>> >>> > > seem particularly informative. If a model without GWESP appears to fit
>> >>> > > well, would it be acceptable to simply use it and ignore any structural
>> >>> > > effects?
>> >>> >
>> >>> > I guess the most important question is whether the model without GWESP
>> >>> > accounts well for the ESP distribution. If it does, then you do not
>> >>> > need a GWESP term in the model. Folding this onto the Goodreau et al.
>> >>> > exposition, it would mean that the differential homophily you have in
>> >>> > your model accounts for higher density within groups, and that already
>> >>> > also accounts for the amount of transitivity in the network as a whole
>> >>> > (with higher density some transitivity will happen within groups "by
>> >>> > accident"). Consequently, there would be not much transitivity left to
>> >>> > "explain" by GWESP on top of the terms you already have in the model.
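
For readers unfamiliar with the ESP (edgewise shared partners) distribution discussed above, here is a minimal sketch of how it is tallied, assuming an undirected simple network with each edge listed once. Comparing this tally for the observed network against networks simulated from a fitted model is the kind of check described above; the sketch computes only the observed side. The toy edge list and the function name `esp_distribution` are hypothetical.

```python
from collections import Counter


def esp_distribution(edges):
    """Tally edges by their number of shared partners (common neighbours)."""
    neighbours = {}
    for u, v in edges:
        neighbours.setdefault(u, set()).add(v)
        neighbours.setdefault(v, set()).add(u)
    return Counter(len(neighbours[u] & neighbours[v]) for u, v in edges)


# Hypothetical toy network: a 4-clique (nodes 0-3) plus one pendant edge.
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3), (3, 4)]
dist = esp_distribution(edges)
print(sorted(dist.items()))  # [(0, 1), (2, 6)]
```

Each of the six clique edges has two shared partners and the pendant edge has none. A model with only homophily terms fits well in this respect if its simulated networks reproduce such counts without a GWESP term.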
>> >>> >
>> >>> > As for whether it is acceptable to go with a model without any
>> >>> > structural (i.e. network-endogenous) effects:
>> >>> >
>> >>> > This is of course a matter of whether it makes sense substantively.
>> >>> > From a purely data-driven standpoint, if a model with "demographic"
>> >>> > effects only (attribute-related terms such as dyadcov, nodecov,
>> >>> > nodefactor, nodematch, nodemix etc.) accounts for the network
>> >>> > structure well, in the sense of reproducing the important features of
>> >>> > the data (degree distribution, ESP distribution and so on), then I
>> >>> > would say yes.
>> >>> >
>> >>> > hth,
>> >>> > Michal
>> >>>
>> >>> _____________________________________________________________________
>> >>> SOCNET is a service of INSNA, the professional association for social
>> >>> network researchers (http://www.insna.org). To unsubscribe, send
>> >>> an email message to [log in to unmask] containing the line
>> >>> UNSUBSCRIBE SOCNET in the body of the message.
>> >>
>

_____________________________________________________________________
SOCNET is a service of INSNA, the professional association for social
network researchers (http://www.insna.org). To unsubscribe, send
an email message to [log in to unmask] containing the line
UNSUBSCRIBE SOCNET in the body of the message.