*****  To join INSNA, visit http://www.insna.org  *****

Hi George,

Thank you for pointing out these publications. They are really 
interesting and we will compare our results. While reading through the 
publications I found the dataset you mentioned, containing around 15 billion
hyperlinks, but I do not really understand why our dataset, which includes
128 billion hyperlinks, should be the smaller one. Maybe you can also point out
where I can get the dataset you used for your analyses, so I can dig a 
little bit deeper into this dataset as well.

Thanks a lot for your help,
Robert

On 18.11.2013 19:03, George Barnett wrote:
> Robert,
>    You're wrong.  Han Woo Park & I have published a series of papers 
> with over 15 billion hyperlinks.  Also, you might want to look at a
> paper by Barnett, et al., in Social Network Analysis and Mining.
>
> Park, H.W., Barnett, G.A. & Chung, C.J. (2011). Structural Changes in 
> the Global Hyperlink Network 2003-2009, /Global Networks/, 11(4), 522-544.
>
> Barnett, G.A., Chung, C.J., & Park, H.W. (2011). Uncovering 
> transnational hyperlink patterns and web-mediated contents: A new 
> approach based on cracking .com domain, /SSCORE (Social Science
> Computer Review)/, 29(3), 369-384.
>
>
> Barnett, G.A., & Park, H.W. (2012). Examining the International 
> Internet Using Multiple Measures: New methods for measuring the 
> communication base of globalized cyberspace. /Quality and Quantity/.
> DOI 10.1007/s11135-012-9787-z
>
>
> Barnett, G.A., Ruiz, J., Hammond, J., & Xin, Z. (2013). An examination 
> of the relationship between international telecommunication networks, 
> terrorism and global news coverage. /Social Network Analysis and Mining/.
> DOI 10.1007/s13278-013-0117-9
>
>
> George
>
>
> George A. Barnett, Ph.D.
>
> Professor & Chair
>
> Department of Communication
>
> University of California, Davis
>
> Davis, CA 95616 USA
>
> On Sun, Nov 17, 2013 at 11:18 PM, Robert Meusel <[log in to unmask]> wrote:
>
>     Hi all,
>
>     the Web Data Commons team is happy to announce the publication of
>     a new large hyperlink graph.
>
>     The graph has been extracted from the Common Crawl 2012 web corpus
>     [1] and covers 3.5 billion web pages and 128 billion hyperlinks
>     between these pages. To the best of our knowledge, the graph is
>     the largest hyperlink graph that is available to the public.
>
>     The graph can be downloaded in various formats from
>
>     http://webdatacommons.org/hyperlinkgraph
>
>     We provide initial statistics about the topology of the graph at
>
>     http://webdatacommons.org/hyperlinkgraph/topology.html
>
>     We hope that the graph will be useful for researchers who develop
>
>     ·Search algorithms that rank results based on the hyperlinks
>     between pages.
>
>     ·SPAM detection methods which identify networks of web pages that
>     are published in order to trick search engines.
>
>     ·Graph analysis algorithms and want to use the hyperlink graph for
>     testing the scalability and performance of their tools.
>
>     ·Web Science researchers who want to analyze the linking patterns
>     within specific topical domains in order to identify the social
>     mechanisms that govern these domains.
>
>     We want to thank the Common Crawl project for providing their
>     great web crawl and thus enabling the creation of the WDC
>     Hyperlink Graph.
>
>     The creation of the WDC Hyperlink Graph was supported by the EU
>     research project PlanetData and by Amazon Web Services. We thank
>     our sponsors a lot.
>
>     Best Regards,
>
>     Chris, Oliver & Robert
>
>     [1] http://commoncrawl.org/
>
>     _____________________________________________________________________
>     SOCNET is a service of INSNA, the professional association for
>     social network researchers (http://www.insna.org). To unsubscribe,
>     send an email message to [log in to unmask] containing the line
>     UNSUBSCRIBE SOCNET in the body of the message.
>
>

-- 
Robert Meusel
Chair of Information Systems V
Web-based Systems Group
Universität Mannheim
B6, 26, Room C1.04
D-68159 Mannheim
Phone: +49 621 181 2648
Mail: [log in to unmask]
Web: dws.informatik.uni-mannheim.de