I am very near the end of the process of doing precisely this.
I have used a relational database system (FoxPro) to store and
clean the data. In my case the matching process has had the
great advantage of most respondents giving the names of their
network alters. It is unclear to me whether that is so in your case,
but obviously that makes an enormous difference. The other
big difference is that your ego-networks are size 4 whereas I have
networks that tend to be size 10-30, and for some people are
even larger than that. I am not sure the following method would
work so well with smaller network size, but it may. Even so, it is
a great deal of work.
One has to determine whether any two persons with the same
name are in fact the same person (e.g. there might be a dozen
Robert Smiths, or there might be Sr., Jr., III or a single name) and
whether persons with similar names might in fact be the same person
(e.g. nicknames, misspellings, married vs. maiden names, etc.).
I handled this in the following manner:
1) wrote and ran a SQL routine that scored every pair of names
for the degree to which they matched, and created a new database
of dyadic name pairs that were possible matches. Since I have about
13K people, this list of possible matching pairs was upwards of 40K.
2) sorted the database of persons by firstname and scanned through
surnames of people with same or very similar firstnames, then sorted
the database by surname and did the same thing for firstnames.
Found a few new possible matches -- particularly because my matching
algorithm was more likely to miss misspellings for very short names --
and added these possibles to the database of matching pairs.
3) Made a list of all nicknames and common alternate names that would
be difficult for my matching algorithm to pick up (e.g. Peggy for Margaret)
and scanned (visually!) every nicknamed person for possible matches
that were again added to the database of possible matches.
4) I wrote and ran another SQL routine that checked each pair for
common network relations with alters or as alters of other people and
made a database of these where each record was the id of each pair
in the dyad, a particular kind of network relationship, and a third person
in the network who was part of that relationship. These relationships
were selected both to show when two persons could not be the same
and also to suggest when they might be the same. For example if BOTH
people were listed as alters by a single other person, then they are NOT
the same person. If one lists person A as a friend and person A lists the
other as a friend, it is likely they are the same person and the difference
is just some variation in how the person listed themselves and how
person A listed them (e.g. a misspelling or a nickname).
5) by linking the database of these network patterns with the database
of possible matches, and having the gender and age information (and
often other information) in front of me at the same time, I was able to
go through the very long list and quickly assess each pair with the
following triage: (1) clearly different people, (2) extremely likely to be
the same person, (3) unsure. Category 3 was relatively small, amounting
to only a few hundred pairs.
6) I then had to go through category 3 on a case by case basis and
explore their network relationships "by hand" but with databases open.
I often had to re-contact respondents or other informants in the village
to confirm whether or not these were the same persons.
7) ultimately there remains a small percentage of cases that cannot be
resolved with certainty.
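
For readers who want to try something similar, steps 1 and 4 above could be
sketched roughly as follows in Python. This is only an illustration under
stated assumptions, not the original FoxPro/SQL routines: the similarity
measure (difflib's ratio), the 0.8 threshold, the evidence labels, and the
function names candidate_pairs and network_evidence are all hypothetical
choices made for the example.

```python
from difflib import SequenceMatcher
from itertools import combinations


def name_score(a, b):
    """Similarity in [0, 1] between two names, ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def candidate_pairs(people, threshold=0.8):
    """people: dict of id -> name.
    Return (id1, id2, score) for every pair scoring at or above threshold.
    The threshold is an assumed value, not one taken from the study."""
    pairs = []
    for (i, a), (j, b) in combinations(sorted(people.items()), 2):
        score = name_score(a, b)
        if score >= threshold:
            pairs.append((i, j, score))
    return pairs


def network_evidence(i, j, nominations):
    """nominations: iterable of (ego, alter) id pairs.
    Return records (i, j, kind, third_person) describing network
    patterns that bear on whether i and j are the same person."""
    alters_of = {}
    for ego, alter in nominations:
        alters_of.setdefault(ego, set()).add(alter)

    evidence = []
    # Both listed as alters by one ego: they cannot be the same person.
    for ego in sorted(alters_of):
        if i in alters_of[ego] and j in alters_of[ego]:
            evidence.append((i, j, "different", ego))
    # One lists a third person t, and t lists the other: possibly the same.
    for a, b in ((i, j), (j, i)):
        for t in sorted(alters_of.get(a, ())):
            if b in alters_of.get(t, ()):
                evidence.append((i, j, "same?", t))
    return evidence
```

The evidence records would then be reviewed next to the candidate pairs, along
with age and gender, as in step 5. Scoring every pair is quadratic in the
number of people, so with around 13K records one would probably first group
names (for example by first letter of surname) before comparing.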
Very shortly I will have some idea of how successful I have been using this
strategy to translate ego-centric networks into a "full" community network.
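
Once pairs have been judged to be the same person, the translation into a
community network amounts to collapsing matched records onto one id and
rewriting the nominations. A minimal sketch in Python, assuming integer record
ids and a list of confirmed same-person pairs; the names merge_matches and
community_edges are hypothetical, and union-find is simply one convenient way
to do the merging, not necessarily how it was done here:

```python
def merge_matches(ids, same_pairs):
    """Map every record id to a canonical person id using union-find."""
    parent = {i: i for i in ids}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in same_pairs:
        ra, rb = find(a), find(b)
        if ra != rb:
            # Keep the smaller id as the canonical one for the merged person.
            parent[max(ra, rb)] = min(ra, rb)
    return {i: find(i) for i in ids}


def community_edges(nominations, canon):
    """Rewrite (ego, alter) nominations onto canonical ids,
    dropping self-ties and duplicate ties created by the merge."""
    edges = set()
    for ego, alter in nominations:
        e, a = canon[ego], canon[alter]
        if e != a:
            edges.add((e, a))
    return edges
```

For example, if records 2 and 3 are confirmed as one person, then
community_edges([(1, 2), (4, 3)], merge_matches([1, 2, 3, 4], [(2, 3)]))
yields ties from persons 1 and 4 to the single merged person 2. Unresolved
pairs (step 7) are left as separate people in this sketch.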
----- Original Message -----
From: "Susan Watkins" <[log in to unmask]>
To: <[log in to unmask]>
Sent: Wednesday, November 07, 2001 7:28 PM
> Does anyone know if it is possible to construct (or approximate) complete
> network data in a village using information about ego-centric networks
> (size 4) that have been collected for all individuals in this village? It
> seems that there should be a matching algorithm that fills in the nxn
> matrix of relations among village members (where n is the number of
> respondents in the village) on the basis of the reported ego-centric
> networks and the characteristics of network partners (e.g. schooling,
> number of children, etc; unfortunately, no exact identification of
> network partners is available in the data). In principle (and perhaps
> easily in practice, depending on software) this information could be
> sufficient to estimate complete network matrices (or structural
> characteristics of these village matrices, such as density). Since it is
> clear that the village matrices are not completely identified by the
> information provided by ego-centric networks, the analysis needs to
> incorporate the uncertainty in the estimation of village matrices.
> Has anybody tried to translate from egocentric to full community networks?
> Is there software for this?
> Susan Watkins