***** To join INSNA, visit http://www.insna.org *****

Hi Alexander,

Sorry to have not replied sooner - busy busy, as many of us are. Many thanks to Peter Hook @ IU for bringing this to my attention - I seem to have fallen off of SOCNET some time ago. Fixed, now.

It all gets quite interesting when you start probing at the data and relationships between friend groups and their interests. And it is difficult to pin down whether we're seeing evidence of individual groups - in some cases, we noted the repeated presence of individual BARS in the Toronto area - or of larger social movements / commonalities across individuals who happen to have very similar interests, but don't know each other... and perhaps should! I find that I myself have more detailed and more broadly scoped questions as a result of this work, rather than any answers. I find it much more fun, that way!

Now, some concerns...

Something John and I have talked about at length, on several occasions, is that this kind of research has the potential to be rather dangerous to individuals. I don't think I'm letting anything 'out of the bag' for folks on the SOCNET list - any research where you quickly identify subcultural leanings, drug activity, etc, is likely to have severe impact on the future lives of young people. In aggregate - well, it 'feels' much nicer to present the data in as anonymous a fashion as possible.

I am - to put it mildly - *extremely hesitant* to conjecture much about the questions that you're asking. :) I would really enjoy, however, seeing someone else take a stab at this question with different, less potentially harmful data. My concern for the well-being of others and my intellectual curiosity are a bit at odds, here.

The work here was largely done, I think, on a moderately configured G5 system of several years ago.
[It has been a while - I don't recall us using one of the UltraSPARC machines for this - except perhaps for some initial database munging - so most likely we just used a desktop. The SWI preprocessing, surely, was done on John's desktop of that day.] It should be fairly reproducible on moderate-grade hardware of the present day without huge issue; having plenty of memory and available storage will of course be very helpful.

I think you had asked, too, about an implementation of PCA -- or perhaps it was SVD? For PCA, there have been numerous implementations developed for R users recently - Google to the rescue there. There has also been some recent research linking PCA and k-means explicitly - http://ranger.uta.edu/~chqding/papers/KmeansPCA1.pdf , which I have only just now found and which seems quite fascinating. [The underlying matrix operations are quite similar, so it is good to have someone discussing that in a formalized way....] That stuff is right up my alley, and I'm going to want to dig back into that myself. ;)

I hope this helps? I am happy to discuss offline or backchannel, so as not to clutter the list up with discussion of past implementation choices for getting work done ;) [Similarly, I am sure that John or Sarah would be quite pleased to talk about these issues. There is a lot of ground involved... and we're glad to see someone else thinking about those same points of interest.]

best,
--elijah

> From: Social Networks Discussion Forum [mailto:[log in to unmask]] On Behalf Of Semenov Alexander
> Sent: Sunday, October 03, 2010 12:34 PM
> To: [log in to unmask]
> Subject: PCA for testing homophily hypothesis?
>
> Hello everybody,
>
> I have a question regarding the following article: "The Social Semantics of LiveJournal FOAF: Structure and Change from 2004 to 2005"
> (http://www.scribd.com/doc/353326/The-Social-Semantics-of-LiveJournal-FOAF-Structure-and-Change-from-2004-to-2005).
> I've already addressed it to the authors, but unfortunately they didn't respond.
>
> In short, they used Principal Component Analysis (PCA) to correlate the top 500 users with the top 500 interests from a sample of LJ users. Their results demonstrated that there is no correlation between them. As far as I understand, it means that people's choice of friends is independent of their interests. So, can we say that this means a lack of homophily in this set of bloggers? And, more generally, can we use PCA to make such an inference?
>
> This summer I was at the Essex Summer School and took 2 advanced courses in SNA, but as far as I understood it is resource demanding to run any homophily tests on a set of 1200 nodes (not to mention ergm). Am I right?
>
> Thanks in advance.
>
> Best regards,
> Alexander.
>
> --
> Alexander Semenov.
> MA student
> Faculty of Sociology
> Moscow School of Social and Economic Sciences (MSSES)
> http://www.msses.ru/English/index.html
>
> Graduate Student in Sociology at
> State University - Higher School of Economics
> http://www.hse.ru/eng
> _____________________________________________________________________
> SOCNET is a service of INSNA, the professional association for social network researchers (http://www.insna.org). To unsubscribe, send an email message to [log in to unmask] containing the line UNSUBSCRIBE SOCNET in the body of the message.