We are happy to announce the public availability of a substantial 
collection of blog data for research purposes.  The data is being made 
available by Intelliseek/BlogPulse in conjunction with the 3rd Annual 
Workshop on the Weblogging Ecosystem.  A DVD containing full text from 
nearly 1 million blogs can be requested by filling out the form at the 
workshop homepage:

The release comprises a complete set of weblog posts for three weeks in 
July 2005 (on the order of 10M posts from 1M weblogs). This data set was 
selected because it spans a period during which an event of global 
significance occurred, namely the London bombings.  The data set 
includes the full content of the posts plus metadata in an easy-to-parse 
XML format. The metadata fields include: date of posting, time of 
posting, author name, title of the post, weblog URL, permalink, 
tags/categories, and outlinks classified by type.
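
For those planning to work with the XML, a record carrying the metadata 
fields listed above could be read along the following lines. This is 
only a sketch: the element names (post, weblog_url, outlink, and so on), 
the parse_post helper, and the sample record itself are assumptions made 
up for illustration, since the actual schema is not described in this 
announcement.

```python
import xml.etree.ElementTree as ET

# Hypothetical record layout. The element and attribute names are
# assumptions for illustration, not the schema of the released data.
SAMPLE = """
<post>
  <date>2005-07-07</date>
  <time>09:15:00</time>
  <author>example_author</author>
  <title>Example title</title>
  <weblog_url>http://blog.example.com/</weblog_url>
  <permalink>http://blog.example.com/2005/07/example.html</permalink>
  <tags><tag>news</tag><tag>london</tag></tags>
  <outlinks>
    <outlink type="blog">http://other.example.org/post</outlink>
    <outlink type="news">http://news.example.net/story</outlink>
  </outlinks>
  <content>Full text of the post goes here.</content>
</post>
"""


def parse_post(xml_text):
    """Turn one (hypothetical) <post> record into a plain dict."""
    root = ET.fromstring(xml_text.strip())

    def text(name):
        # Return the stripped text of a child element, or "" if absent.
        node = root.find(name)
        return node.text.strip() if node is not None and node.text else ""

    return {
        "date": text("date"),
        "time": text("time"),
        "author": text("author"),
        "title": text("title"),
        "weblog_url": text("weblog_url"),
        "permalink": text("permalink"),
        "tags": [t.text for t in root.findall("tags/tag")],
        # Each outlink is kept as a (type, url) pair.
        "outlinks": [(o.get("type"), o.text)
                     for o in root.findall("outlinks/outlink")],
        "content": text("content"),
    }


post = parse_post(SAMPLE)
```

Keeping outlinks as (type, url) pairs makes it straightforward to build 
a link graph later, for example for social network analysis.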

Much of the research interest in weblogs involves the analysis of large 
quantities of data. As part of this workshop, we are very excited to 
provide a data set to the research community. The aim is to focus the 
various views and analyses of the blogosphere on a common body of data. 
This will provide a unique opportunity to compare different views of the 
blogosphere and to stimulate discussion and collaboration.

Researchers are welcome to concentrate on whatever aspects of the data 
they are interested in.  Possible topics include:
-  Topic detection and tracking
-  Relation of blog data to other media
-  Social network analysis
-  Qualitative analysis of small-scale interactions
-  Sentiment detection
-  Search tools
-  Detection of spam blogs
-  Correlation of weblog events to "real-world" data (e.g. the stock market)
-  Clustering and ontology creation
-  Measures of influence
-  Visualization and mapping of the blogosphere

Please note that we welcome any submissions to the workshop, not just 
those making use of the data.  Feel free to contact the committee with 
any questions you may have.

Eytan Adar, University of Washington
Natalie Glance, Intelliseek & BlogPulse
Matthew Hurst, Intelliseek & BlogPulse

SOCNET is a service of INSNA, the professional association for social
network researchers.