***** To join INSNA, visit http://www.insna.org *****

Carl,

First of all, the workings of REGE are not altogether clear, and you may be attributing more accuracy to the results than is really there. In particular, the use of three iterations is left over from the days when computing these values was very slow. You may get rather different results if you increase the number of iterations beyond three.

However, let us assume that what you have done is correct and the partitions do indeed reflect regular equivalence classes. At the first stage you do not give any information about the relationships between the groups you find and the rest of the network. From a structural point of view these positions must be significant, yet at the same time you indicate that they are marginal.

Suppose this were a friendship network and the values of the links represented strength of friendship. Suppose one group has weak links to an individual and another group has, say, no links to that individual, and suppose further that the groups have stronger internal ties, and stronger ties to each other (across the groups), than to the outsider. REGE will then focus on the structural properties of the outsider and place the outsider in a single group before looking at the differences between the two groups. It will then find the two groups because of their different relationships with the outsider. But these may not really be two groups, since only weak links to the outsider distinguish them. Since REGE does not rank strength but looks for similarity of ties, it has formed the groups on very weak evidence.

In this case it is quite legitimate to remove the outsider and look for the structure that represents the patterning without the excluded individual. In other words, what you are doing is completely justifiable provided the groups you remove are really marginal and not just inconvenient. In essence, you need to determine whether these nodes are really peripheral. If they are, you are OK. If they are not, you really should not do this.
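One rough way to check whether nodes are really peripheral is to look at each node's share of the total flow, in the spirit of the coverage figures quoted in the original message below. The following is a minimal sketch in Python with NumPy, not the procedure used in the analysis. It assumes the flows sit in a square matrix where entry (i, j) is the flow from i to j. The function names, the toy matrix and the 1% threshold are all illustrative assumptions.

```python
import numpy as np


def flow_shares(flows):
    """Return each node's share of total flow, counting in-flows and out-flows.

    Every tie is counted once for its sender and once for its receiver,
    so dividing by twice the total makes the shares sum to 1.
    """
    flows = np.asarray(flows, dtype=float)
    total = flows.sum()
    if total == 0:
        raise ValueError("the flow matrix contains no flows")
    involvement = flows.sum(axis=1) + flows.sum(axis=0)  # out-flow + in-flow
    return involvement / (2.0 * total)


def peripheral_nodes(flows, threshold=0.01):
    """Indices of nodes whose share of total flow is below `threshold`.

    The default threshold is an arbitrary illustrative value.
    """
    return np.flatnonzero(flow_shares(flows) < threshold)


def drop_nodes(flows, nodes):
    """Return the flow matrix with the given rows and columns removed."""
    flows = np.asarray(flows, dtype=float)
    keep = np.setdiff1d(np.arange(flows.shape[0]), nodes)
    return flows[np.ix_(keep, keep)]


# Toy example: three strongly connected nodes and one weakly tied outsider.
toy = np.array([
    [0, 50, 40, 1],
    [45, 0, 60, 0],
    [30, 55, 0, 0],
    [0, 1, 0, 0],
])
marginal = peripheral_nodes(toy)
reduced = drop_nodes(toy, marginal)
print(marginal, reduced.shape)
```

A small share is only one piece of evidence. Whether the nodes are structurally marginal still has to be judged from their ties to the rest of the network, and the reduced matrix can then be passed to the equivalence analysis for comparison with the full one.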
Martin

Martin Everett

-----Original Message-----
From: Social Networks Discussion Forum [mailto:[log in to unmask]] On Behalf Of Carl Nordlund
Sent: 22 May 2007 15:29
To: [log in to unmask]
Subject: Arbitrary removal of nodes in reg eq-analysis?

Hi all,

I have a little dilemma here which I guess others before me have confronted. Being self-taught in everything SNA, I pose my question to this email list, hoping for some tutoring on the subject!

I'm currently doing a regular equivalence analysis of energy flows (energy content in four fuel commodities) between the countries of the world. The data is valued and directional, with quite a large span among the flow values. I am using the REGE algorithm in the Ucinet package with 3 iterations, and selecting the number of partitions based on an ANOVA density check for different numbers of partitions (as used in Luczkovich et al.).

When using 99 countries in my dataset, I get an optimal split at 11 partitions (i.e. positions containing role-equivalent actors). Two of these are singleton positions, i.e. each contains a single country, and two positions contain only two countries each. All these 6 countries are fairly small and uninteresting, covering only 0.27% of total world population, 0.04% of total world GDP, and 0.03% of total flow values in the dataset.

Thus, what I would like to do is remove these 6 countries from my dataset and repeat the analysis with only 93 countries. When doing so, I get an optimal number of positions at 8, the two smallest of these positions containing 3 and 4 countries respectively.
I find that this:
1) is much easier to analyze;
2) is much easier to visualize (as a reduced/image graph);
3) gives a higher resolution (more partitions) for the positions containing the bulk of countries; and
4) removes countries that I feel could "disturb" the REGE algorithm in finding the major positions, countries that may be unique but are not very significant with respect to their coverage (as given by share of total flow values and attribute measures such as population and GDP).

However: how on earth can I justify this? Can I simply argue that "well, first I included these 6 countries, but as these countries resulted in 4 unique positions containing only these countries, I chose to remove these countries from the dataset and try without them - they are so small and insignificant anyhow..."? I could probably find some criteria for removing these based on their attributes, net degrees or similar, but that would not be very scientifically honest now, would it?

How have other people handled analyses that yield a bunch of trivial and singleton positions, i.e. positions that contain only 1-2 actors that are of fairly minor importance anyway? Suggestions?

(And sorry for using this email list as a classroom here - I have nowhere else to turn to...)

Yours,
Carl

--
Carl Nordlund, BA, PhD student
carl.nordlund(at)humecol.lu.se
Human Ecology Division, Lund university
www.humecol.lu.se

_____________________________________________________________________
SOCNET is a service of INSNA, the professional association for social network researchers (http://www.insna.org). To unsubscribe, send an email message to [log in to unmask] containing the line UNSUBSCRIBE SOCNET in the body of the message.