***** To join INSNA, visit http://www.insna.org *****
Inspired by the amazing work done by the people at the Berkman Center,
I've been thinking about how to gather blogosphere data, i.e. how to
create a (national) network dataset in which each node is a blog and
each directed edge carries the number of external links from one blog
to another. I have started working on a PHP script that recursively
crawls a website, checks for external links, and builds a dataset. This
of course has to be combined with a check on the nationality of the
blog (comparing its IP address with national IP ranges and/or running a
language analysis on a sample text).
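To make the idea concrete, here is a minimal sketch of the link-counting
and IP-range steps. It is written in Python rather than PHP, it is not
the script mentioned above, and it only shows one possible approach. The
names blog_id, external_targets, build_edges and in_national_ranges are
made up for illustration, a blog is assumed to be identified by its host
name, and the IP ranges in the example are documentation placeholders,
not real national allocations. Fetching the pages and the recursive
crawl itself are left out.

```python
import ipaddress
from collections import Counter
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkParser(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def blog_id(url):
    """Hypothetical node id: lowercased host name without a leading 'www.'."""
    host = (urlparse(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def external_targets(page_url, html):
    """Return the blog ids linked from one page, skipping links to the same blog."""
    parser = LinkParser()
    parser.feed(html)
    source = blog_id(page_url)
    targets = []
    for href in parser.hrefs:
        absolute = urljoin(page_url, href)  # resolve relative links
        if urlparse(absolute).scheme not in ("http", "https"):
            continue  # ignore mailto:, javascript:, etc.
        target = blog_id(absolute)
        if target and target != source:
            targets.append(target)
    return targets


def build_edges(pages):
    """pages: iterable of (url, html). Returns Counter {(source, target): count}."""
    edges = Counter()
    for url, html in pages:
        source = blog_id(url)
        for target in external_targets(url, html):
            edges[(source, target)] += 1
    return edges


def in_national_ranges(ip, ranges):
    """True if the IP address falls inside any of the given CIDR ranges."""
    address = ipaddress.ip_address(ip)
    return any(address in ipaddress.ip_network(r) for r in ranges)


# Example with two already-downloaded pages (placeholder hosts).
pages = [
    ("http://a.example/post1",
     '<a href="http://www.b.example/x">b</a> <a href="/about">me</a> '
     '<a href="http://b.example/y">b again</a>'),
    ("http://b.example/", '<a href="http://a.example/">a</a>'),
]
edges = build_edges(pages)

# Placeholder ranges (reserved documentation blocks), NOT real national ranges.
example_ranges = ["192.0.2.0/24", "198.51.100.0/24"]
```

With these two sample pages, the edge from a.example to b.example gets
weight 2 and the edge back gets weight 1; the relative link to /about is
ignored because it stays within the same blog. Treating each host name
as one blog is a simplification that breaks down for shared hosting
services, where many blogs live under one domain.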
But perhaps I'm trying to reinvent the wheel. Is there any suitable
web crawling software that can do the trick? As I understand it,
the consulting firm Morningside Analytics helped the Berkman group with
their mapping. Judging by the rather large dataset, I assume that they
used some sort of web crawler. Does anyone know anything more about this?
Carl Nordlund, BA, PhD student
Human Ecology Division, Lund university
SOCNET is a service of INSNA, the professional association for social
network researchers (http://www.insna.org).