* Thursday, June 28th at 1pm *

On the campus of Michigan State University, in East Lansing, Michigan

Interaction in Online Communities
From Data Collection to Research Results
A workshop at the Third Annual Communities and Technologies Conference 2007

Workshop weblog:

Conference web site:

To Register:

Featured Panelists:

        Juan Carlos Barahona, MIT
        Brian Butler, University of Pittsburgh
        Dan Cosley, Cornell University
        Hank Green, UIUC/NCSA
        Matthew Hurst, Microsoft
        Dan Huttenlocher, Cornell University
        Bob Kraut, CMU
        Cliff Lampe, Michigan State University
        Cameron Marlow, Yahoo Research
        Paul Resnick, U Michigan

Workshop description:

The scope and complexity of the data from online communities provide
unprecedented insight into how social interaction unfolds in real groups.
Rich longitudinal data on the content and structure of social interaction is
now available, but only if researchers can effectively extract, organize and
make sense of it. This workshop will focus on methods, tools, and techniques
for overcoming the challenges associated with each stage of processing
large-scale interaction data from online communities. Processing includes:

       * Data collection: scraping and saving raw data from online communities
       * Data management: parsing the data into a format that can be queried
       * Dataset and sample construction: extracting subsets of the data
which can be processed by analytical tools
       * Analysis: analyzing the data and producing results

Panelists will describe and demonstrate some or all of these methods in the
context of their research. The workshop will emphasize generally applicable
techniques that participants could apply to other projects. The workshop
will include discussions on:

       1. Methods employed to overcome specific issues with the data during
a particular project, which other researchers might be able to use in their
own work.

       2. General approaches to parsing, managing, and analyzing large-scale
data, which the presenter has found useful in a variety of settings or for a
general class of data.

Participants with experience in this area of research are encouraged to
discuss their own work, either challenges they have overcome which might
help other participants, or challenges that they are facing and would like
to discuss. We would like this workshop to help researchers at all levels
get a sense of how to apply the tools and techniques available for analyzing
this type of data to their own research.

Those who are just getting started are asked to bring their questions.
Panelists and other participants will have a variety of good answers!


Thomas M. Lento is a PhD candidate in the Department of Sociology at Cornell
University, and a contract researcher with the Community Technologies
Group at Microsoft Research. His research interests focus on social network
topologies, diffusion, contagion, and the spread of rumor in online
networks, particularly weblog and threaded discussion networks. His recent
work examines the effect of social network position on retention in a
weblogging system.

Howard T. Welser is a Postdoctoral Researcher in the Institute for Social
Sciences at Cornell University, and will be, effective September 2007,
Assistant Professor of Sociology at Ohio University. His research
investigates how micro-level processes generate collective outcomes, with
application to status achievement in avocations, development of institutions
and social roles, the emergence of cooperation, and network structure in
computer-mediated interaction. His recent work has focused on the
intersection of participation and network structure in online discussion
groups, blogs, and Wikipedia.

Eric Gleave is a sociology graduate student at the University of Washington.
His research projects include developing network methods, examining the
demographic and structural bases of early modern revolts, conducting
simulation studies of cooperation and corruption, and discerning social
roles in online discussion spaces.

Marc A. Smith is a Research Sociologist at Microsoft Research specializing
in the social organization of online communities. He leads the Community
Technologies Group at MSR. He is the co-editor of Communities in Cyberspace
(Routledge), a collection of essays exploring the ways identity,
interaction, and social order develop in online groups. Smith's research
focuses on the ways group dynamics change when they take place in social
cyberspaces. Many groups in cyberspace produce public goods and organize
themselves in the form of a commons (for related papers see:

