***** To join INSNA, visit http://www.insna.org *****
Mining of large social networks from Twitter using Python
Day: Tuesday 8:00 am - 11:00 am
Workshop Length: 1 session (3 hours)
Attendance Limit: 21-30
This workshop focuses on how to use the Python programming language
for the purpose of mining certain social networking services through
their APIs. API stands for Application Programming Interface which,
roughly speaking, is a set of routines that a social networking
service provides so that users can access specific chunks of the data
it hosts. Social media nowadays provide easily accessed sources of big
data, and those with relational features supply examples of large
empirical social networks that represent structures of action in the
corresponding media. We use Python for these data mining tasks because
it offers a number of advantages in data gathering, data manipulation,
data visualization and analysis.
Social media allow content created in one place to be dynamically
posted and updated on the web. For instance, content (including texts,
photos and videos) can be embedded, dynamically posted and shared
together with certain user information (profile). Since the Twitter
API is more open when it comes to sharing information, and given the
existing restrictions in the Facebook and LinkedIn APIs, we are going
to focus here just on data mining from Twitter.
The main mining tools for Twitter are its REST API and its Streaming
API. Through the REST API methods, users may access and interact with
core Twitter data (such as update timelines, status data and user
information) and Twitter Search data. Through the Streaming API, users
may access tweets in real time as they happen. In all cases, retrieved
data are in JSON format. JSON stands for JavaScript Object Notation,
an open standard format that uses human-readable text to transmit data
objects consisting of attribute-value pairs.
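As a rough illustration of this attribute-value structure, the sketch
below parses a hand-written, heavily abbreviated tweet-like JSON string
with Python's standard json module. The field names follow the layout
of tweet objects in version 1.1 of the REST API as we understand it;
the values are invented sample data, not retrieved output.

```python
import json

# Invented, heavily abbreviated sample; a real tweet object has many more fields.
raw = '''
{
  "id_str": "1",
  "text": "Workshop on mining #networks with #python",
  "user": {"screen_name": "alice"},
  "entities": {"hashtags": [{"text": "networks"}, {"text": "python"}]}
}
'''

tweet = json.loads(raw)  # JSON object -> nested Python dict

# Attribute-value pairs are read with ordinary dictionary lookups.
author = tweet["user"]["screen_name"]
hashtags = [h["text"] for h in tweet["entities"]["hashtags"]]

print(author, hashtags)
```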
Twitter users access the API after authorizing through OAuth, an
authorization protocol that allows users to approve an application to
act on their behalf without sharing their password. Once authorized,
users may employ
different API methods for accessing information on tweets (including
the occurrence of hashtags, search terms, embedded media etc.), users,
following relationships (friends, followers), retweets, etc.
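Lists of followers or friends can be long and, as far as we are aware,
version 1.1 of the REST API returns them one page at a time, each page
carrying a "next_cursor" value that is 0 on the last page. The sketch
below walks such a cursored listing. Note that fetch_page is a
hypothetical stand-in for an authenticated request, not a function of
any library, and that the pages and ids are invented so the example
runs offline.

```python
def collect_ids(fetch_page, screen_name):
    """Walk a cursored listing and return all ids in order.

    fetch_page is a hypothetical callable (screen_name, cursor) -> dict
    standing in for an authenticated API request.
    """
    ids = []
    cursor = -1  # -1 asks for the first page
    while cursor != 0:  # a next_cursor of 0 marks the last page
        page = fetch_page(screen_name, cursor)
        ids.extend(page["ids"])
        cursor = page["next_cursor"]
    return ids


# Offline stand-in for the network call, with invented ids.
FAKE_PAGES = {
    -1: {"ids": [11, 12], "next_cursor": 7},
    7: {"ids": [13], "next_cursor": 0},
}


def fake_fetch(screen_name, cursor):
    return FAKE_PAGES[cursor]


follower_ids = collect_ids(fake_fetch, "alice")
print(follower_ids)
```

In real use the stand-in would be replaced by an OAuth-signed request,
and rate limits would have to be respected between pages.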
From the social network analysis point of view, an example of what we
may obtain is a three-level network composed of three partial
networks extracted from Twitter data. The lower level is the network
of retweets among Twitter users, the middle level is the network of
following relationships (among a group of Twitter users) and the upper
level is the network of co-occurrences among hashtags (or other
searched topics) included in the tweets sent by these users.
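A minimal sketch of how the three levels might be assembled is given
below, using only the Python standard library. The record layout (the
keys "user", "retweet_of" and "hashtags") is our own simplification of
already-retrieved data, and all names and values are invented.

```python
from collections import Counter
from itertools import combinations

# Invented sample records, already reduced to the fields needed here.
tweets = [
    {"user": "alice", "retweet_of": None, "hashtags": ["sna", "python"]},
    {"user": "bob", "retweet_of": "alice", "hashtags": ["sna", "python"]},
    {"user": "carol", "retweet_of": "alice", "hashtags": ["python", "twitter"]},
]
# (follower, followed) pairs among the same users, also invented.
follows = [("bob", "alice"), ("carol", "alice"), ("alice", "bob")]

# Lower level: directed, weighted retweet edges (retweeter -> original author).
retweet_edges = Counter(
    (t["user"], t["retweet_of"]) for t in tweets if t["retweet_of"]
)

# Middle level: directed following edges.
follow_edges = set(follows)

# Upper level: undirected, weighted hashtag co-occurrence edges.
# Sorting each pair gives one key per unordered pair of hashtags.
hashtag_edges = Counter()
for t in tweets:
    for pair in combinations(sorted(set(t["hashtags"])), 2):
        hashtag_edges[pair] += 1

print(dict(retweet_edges))
print(sorted(follow_edges))
print(dict(hashtag_edges))
```

The three edge collections share the same set of users (lower and
middle levels) or are derived from the same tweets (upper level), which
is what ties the levels together.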
Prerequisites: Very elementary familiarity with Python.
Submitting Instructors: Moses Boudourides and Sergios Lenis
Institution: University of Patras
Email: [log in to unmask]
_____________________________________________________________________
SOCNET is a service of INSNA, the professional association for social
network researchers (http://www.insna.org). To unsubscribe, send
an email message to [log in to unmask] containing the line
UNSUBSCRIBE SOCNET in the body of the message.