*****  To join INSNA, visit http://www.insna.org  *****

Also try Uberlink's hyperlinks analytics tool: http://www.uberlink.com/has

It has a user-friendly interface and can take multiple seed links.
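
If you end up scripting it after all, as suggested further down the thread, here is a rough sketch in Python of a crawler that takes several seed links and writes a source/target edge list to a .csv. It is only an illustration under my own assumptions, not a tested tool: the names LinkParser, fetch, crawl and write_edges are made up for this example, it uses only the standard library, and it ignores robots.txt, rate limiting and non-UTF-8 pages.

```python
import csv
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin
from urllib.request import urlopen


class LinkParser(HTMLParser):
    """Collects the href value of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def fetch(url):
    """Download a page and return its HTML as text (needs network access)."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def crawl(seeds, max_pages=50, fetch_page=fetch):
    """Breadth-first crawl from several seeds; returns (source, target) edges."""
    queue = deque(seeds)
    seen = set(seeds)
    edges = []
    pages_fetched = 0
    while queue and pages_fetched < max_pages:
        source = queue.popleft()
        try:
            html = fetch_page(source)
        except Exception:
            continue  # skip pages that fail to load
        pages_fetched += 1
        parser = LinkParser()
        parser.feed(html)
        for href in parser.hrefs:
            # Resolve relative links and drop "#fragment" parts.
            target, _ = urldefrag(urljoin(source, href))
            if not target.startswith(("http://", "https://")):
                continue  # ignore mailto:, javascript:, and similar links
            edges.append((source, target))
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return edges


def write_edges(edges, path):
    """Write the edge list as a two-column CSV with a header row."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["source", "target"])
        writer.writerows(edges)
```

Calling write_edges(crawl(list_of_seed_urls), "edges.csv") should give a file that most network packages can import as an edge list. max_pages caps how many pages are downloaded, so raise it with care.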

Thanks,

On Friday, 3 June 2016, Moses Boudourides <[log in to unmask]>
wrote:

> I think the easiest way is through Ruby. The way to do it depends on
> what you're looking for.
>
> If you're just interested in getting pages' content, the simplest way
> is through the open-uri functions
> (http://ruby-doc.org/stdlib-2.2.2/libdoc/open-uri/rdoc/OpenURI.html).
>
> If you want to parse content, there are several options using Ruby
> gems, like the following:
>
> * Nokogiri, which I guess is the most popular
> (http://railscasts.com/episodes/190-screen-scraping-with-nokogiri)
> * Mechanize, which is built on top of Nokogiri
> (http://railscasts.com/episodes/191-mechanize)
> * Hpricot (https://rubygems.org/gems/hpricot)
> * Or just do screen scraping with ScrAPI
> (https://rubygems.org/gems/scrapi); see also the ScrAPI RailsCast
> (http://railscasts.com/episodes/173-screen-scraping-with-scrapi).
>
> --Moses
>
> On Thu, Jun 2, 2016 at 7:14 PM, Juergen Pfeffer <[log in to unmask]>
> wrote:
> > Not the answer you were hoping for, but instead of playing around with
> > tools with limitations, I’d recommend finding a CS undergrad. This can be
> > done in 1-2 hours with about 15 lines of Python code.
> >
> > Best,
> >
> > Jürgen (former CS undergrad)
> >
> > From: Jennifer Lawlor
> > Sent: Thursday, June 2, 2016 5:59 PM
> > To: [log in to unmask]
> > Subject: [SOCNET] Web Crawler Recommendations
> >
> > Hi all,
> >
> > I'm working on a project involving hyperlink networks and I'm looking for
> > some software tools. Can anyone recommend software for web crawling that
> > takes multiple seed links and can output data in a universal format
> > (e.g., a .csv)? I'm hoping to avoid writing the code for a crawler from
> > scratch, so any advice you can offer about pre-existing software would be
> > really helpful!
> >
> > Best,
> > Jennifer Lawlor
> >
> > --
> >
> > Jennifer Lawlor, MA
> > Graduate Student, Ecological-Community Psychology
> > Michigan State University
> > E-mail: [log in to unmask]
> >
> > _____________________________________________________________________
> > SOCNET is a service of INSNA, the professional association for social
> > network researchers (http://www.insna.org). To unsubscribe, send an email
> > message to [log in to unmask] containing the line UNSUBSCRIBE SOCNET in
> > the body of the message.
>


-- 

Gohar Feroz Khan, PhD
Assistant Professor
Department of Business Administration
Keimyung University, Daegu, South Korea.

Email: [log in to unmask]; Ph: 82-53-580-6371

----------
Check out my new book on social media analytics
<http://7layersanalytics.com/>!
-----------
Please consider submitting your work to the social media analytics track at
PACIS 2016 <http://www.pacis2016.org/Page/Index/71>.
-----------
Social Identities: || Blog <http://gfkhan.wordpress.com/> || Twitter
<https://twitter.com/gfkhan> || LinkedIn
<https://www.linkedin.com/pub/gohar-feroz-khan/7/62b/42> || Research Centre
<http://centreforsocialtech.com/>||
