*****  To join INSNA, visit  *****

Hi Alexander,

Sorry not to have replied sooner - busy busy, as many of us are.  Many
thanks to Peter Hook @ IU for bringing this to my attention - I seem to
have fallen off of SOCNET some time ago.  Fixed now.

It all gets quite interesting when you start probing at the data and
the relationships between friend groups and their interests.  It is
also difficult to pin down whether we're seeing evidence of individual
groups - in some cases, we noted the repeated presence of individual
BARS in the Toronto area - or of larger social movements /
commonalities across individuals who happen to have very similar
interests, but don't know each other... and perhaps should!  I find
that I myself have come away with more detailed and more broadly
scoped questions as a result of this work, rather than any answers.
I find it much more fun that way!

Now, some concerns...

Something John and I have talked about at length, on several
occasions, is that this kind of research has the potential to be
rather dangerous to individuals.  I don't think I'm letting anything
'out of the bag' for folks on the SOCNET list - any research where you
quickly identify subcultural leanings, drug activity, etc., is likely
to have a severe impact on the future lives of young people.  In
aggregate - well, it 'feels' much nicer to present the data in as
anonymous a fashion as possible.

I am - to put it mildly - *extremely hesitant* to conjecture much
about the questions that you're asking.  :)  I would really enjoy,
however, seeing someone else take a stab at this question with
different, less potentially harmful data.  My concern for the
well-being of others and my intellectual curiosity are a bit at odds
here.

The work done here was largely done, I think, on a moderately
configured G5 system of several years ago.  [It has been a while - I
don't recall us using one of the UltraSPARC machines for this, except
perhaps for some initial database munging, so most likely we just used
a desktop.  The SWI preprocessing, surely, was done on John's desktop
of that day.]  It should be fairly reproducible on moderate-grade
hardware of the present day without huge issue; having plenty of
memory and available storage will of course be very helpful.

I think you had asked, too, about an implementation of PCA -- or
perhaps it was SVD?  For PCA, there have been numerous implementations
developed for R users recently - Google to the rescue there.  There
has also been some recent research linking PCA and k-means explicitly,
which I have only just now found and which seems quite fascinating.
[The underlying matrix operations are quite similar, so it is good to
have someone discussing that in a formalized way....]  That stuff is
right up my alley, and I'm going to want to dig back into that
myself.  ;)
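
For anyone who wants to poke at the PCA / SVD / k-means connection
directly, here is a minimal sketch in Python with NumPy.  To be clear,
this is not the code used for the LiveJournal work; the function names
(pca_via_svd, two_means), the seeding rule, and the toy two-blob data
are all assumptions made up for illustration.  It computes PCA from the
SVD of the centered data matrix, runs a bare-bones 2-means, and checks
how often the sign of the first principal component score agrees with
the cluster labels, which is one commonly cited form of the PCA /
k-means link for the two-cluster case.

```python
import numpy as np


def pca_via_svd(X, n_components):
    """PCA computed from the SVD of the column-centered data matrix."""
    Xc = X - X.mean(axis=0)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]   # projected data
    components = Vt[:n_components]                    # principal axes (rows)
    explained_variance = s[:n_components] ** 2 / (X.shape[0] - 1)
    return scores, components, explained_variance


def two_means(X, n_iter=50):
    """Plain Lloyd's algorithm for k = 2 (hypothetical helper).

    Seeded deterministically with the first row and the row farthest
    from it; this seeding is an illustrative choice, nothing more.
    """
    far = np.argmax(np.linalg.norm(X - X[0], axis=1))
    centers = np.stack([X[0], X[far]])
    for _ in range(n_iter):
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        new_centers = np.stack([X[labels == j].mean(axis=0) for j in range(2)])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels


# Toy data (assumed): two well-separated Gaussian blobs in 5 dimensions.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(loc=-3.0, scale=1.0, size=(50, 5)),
    rng.normal(loc=3.0, scale=1.0, size=(50, 5)),
])

scores, components, explained_variance = pca_via_svd(X, n_components=2)
labels = two_means(X)

# The sign of a principal component is arbitrary, so compare the two
# partitions up to a swap of the cluster labels.
same = np.mean((scores[:, 0] > 0) == (labels == 1))
agreement = max(same, 1.0 - same)
print("fraction of points where PC1 sign matches 2-means:", agreement)
```

On well-separated toy data like this the two partitions should
coincide; on real, messy interest data one would expect the agreement
to be much looser.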

I hope this helps?  I am happy to discuss offline or backchannel, so
as not to clutter the list up with discussion of past implementation
choices for getting work done ;)  [Similarly, I am sure that John or
Sarah would be quite pleased to talk about these issues.  There is a
lot of ground involved... and we're glad to see someone else thinking
about those same points of interest.]



> From: Social Networks Discussion Forum [mailto:[log in to unmask]] On
> Behalf Of Semenov Alexander
> Sent: Sunday, October 03, 2010 12:34 PM
> To: [log in to unmask]
> Subject: PCA for testing homophily hypothesis?
> Hello everybody,
> I have a question regarding the following article: "The Social Semantics of
> LiveJournal FOAF: Structure and Change from 2004 to 2005".
> I've already addressed it to the authors, but, unfortunately, they didn't
> respond.
> In short, they used Principal Component Analysis (PCA) to correlate top500
> users with top500 interests from a sample of LJ-users. Their results
> demonstrated that there is no correlation between them. As far as I
> understand, it means that people's choice of friends is independent of
> their interests. So, can we say that it means a lack of homophily in this set
> of bloggers? And, more generally, can we use PCA to make such an inference?
> This summer I was at the Essex Summer School and took 2 advanced courses in
> SNA, but as far as I understood it is resource demanding to run any
> homophily tests on a set of 1200 nodes (not talking about ergm). Am I right?
> Thanks in advance.
> Best regards,
> Alexander.
> --
> Alexander Semenov.
> MA student
> Faculty of Sociology
> Moscow School of Social and Economic Sciences (MSSES)
> Graduate Student in Sociology at
> State University - Higher School of Economics
> _____________________________________________________________________ SOCNET
> is a service of INSNA, the professional association for social network
> researchers ( To unsubscribe, send an email message to
> [log in to unmask] containing the line UNSUBSCRIBE SOCNET in the body of
> the message.
