*****  To join INSNA, visit http://www.insna.org  *****

Robert,

Thank you very much for the link.

For those of you who can't wait to see the network parameters, I collected
some descriptive statistics.
The network has 838,105 nodes and 1,208,613 edges. There are 11,816
connected components, of which the largest one (the GCC) has 730,601 nodes
and 1,104,306 edges. The component sizes follow a roughly power-law
distribution with an exponent of -0.026.
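
For readers who want to reproduce counts of this kind, here is a minimal
sketch of how the node, edge, and component numbers could be obtained from an
edge list. It uses a plain breadth-first search from the standard library;
the toy edge list and the function name `connected_components` are
assumptions for illustration, not the code used to produce the figures above.

```python
from collections import defaultdict, deque

def connected_components(edges):
    """Return a list of node sets, one per connected component (BFS)."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    seen, components = set(), []
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        comp, queue = {start}, deque([start])
        while queue:
            node = queue.popleft()
            for nbr in adj[node]:
                if nbr not in seen:
                    seen.add(nbr)
                    comp.add(nbr)
                    queue.append(nbr)
        components.append(comp)
    return components

# Hypothetical toy edge list, NOT the Panama Papers data.
toy_edges = [("a", "b"), ("b", "c"), ("c", "a"),
             ("d", "e"), ("f", "g"), ("g", "h")]

components = connected_components(toy_edges)
gcc = max(components, key=len)  # largest (giant) connected component
gcc_edges = [(u, v) for u, v in toy_edges if u in gcc and v in gcc]
n_nodes = len({node for edge in toy_edges for node in edge})

print("nodes:", n_nodes, "edges:", len(toy_edges))
print("components:", len(components))
print("GCC nodes:", len(gcc), "GCC edges:", len(gcc_edges))
```

Note that a node with no edges never appears in an edge list, so isolated
nodes would have to be added separately from the node files.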

The GCC has an excellent community structure: 2,009 communities, modularity
m~0.92. The largest community consists of 106,184 nodes.
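
As a reminder of what the modularity figure measures, here is a small sketch
of the standard Newman modularity of a given partition, Q = sum over
communities c of [L_c/m - (d_c/2m)^2], where m is the number of edges, L_c
the number of edges inside c, and d_c the total degree of c. The toy graph
and partition are assumptions for illustration; the community detection
method behind the numbers above is not stated, and this sketch only scores a
partition that is already given.

```python
from collections import defaultdict

def modularity(edges, community_of):
    """Newman modularity of a partition of an undirected, unweighted graph."""
    m = len(edges)
    internal = defaultdict(int)    # edges with both ends in community c
    degree_sum = defaultdict(int)  # total degree of the nodes in community c
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree_sum[cu] += 1
        degree_sum[cv] += 1
        if cu == cv:
            internal[cu] += 1
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
               for c in degree_sum)

# Hypothetical toy graph: two triangles joined by a single bridge edge.
toy_edges = [(1, 2), (2, 3), (3, 1),
             (4, 5), (5, 6), (6, 4),
             (3, 4)]
toy_partition = {1: "A", 2: "A", 3: "A", 4: "B", 5: "B", 6: "B"}

q = modularity(toy_edges, toy_partition)
print(round(q, 4))
```

A value close to 1 means that almost all edges fall inside communities rather
than between them.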

One of the communities contains references to Sergey Roldugin, Russian
cellist and close friend of the Russian president. The SVG file showing the
structure of that community can be found at
https://github.com/dzinoviev/PanamaPapers/blob/master/cc32.svg.  (Right
click on the 'Raw' button and select "Save link as...". I believe most
modern Web browsers are capable of displaying SVG files directly.) Node
sizes represent degrees, and colors represent second-level communities.
Numeric node names represent "THE BEARER." (Apparently an anonymous
official.)

If anyone is interested, I can upload Python files for reading the dataset
into the repository.
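
In the meantime, a minimal sketch of such a reader might look like the
following. It is not the code from the repository. The column names `node_1`
and `node_2` are assumptions about the layout of the edge file and may need
adjusting, and the sample rows are made up so that the example runs without
the real data.

```python
import csv
import io
from collections import defaultdict

def read_edge_list(handle, source_col="node_1", target_col="node_2"):
    """Read an edge-list CSV into an undirected adjacency map."""
    adj = defaultdict(set)
    for row in csv.DictReader(handle):
        u, v = row[source_col], row[target_col]
        adj[u].add(v)
        adj[v].add(u)
    return adj

# Made-up sample rows standing in for the real edge file.
sample = io.StringIO(
    "node_1,rel_type,node_2\n"
    "1,officer_of,2\n"
    "1,officer_of,3\n"
    "4,intermediary_of,3\n"
)

adj = read_edge_list(sample)
degrees = {node: len(nbrs) for node, nbrs in adj.items()}
print(len(adj), "nodes;", sum(degrees.values()) // 2, "edges")
```

With the real data, the `io.StringIO` object would be replaced by a file
opened with `open(path, newline="", encoding="utf-8")`, where `path` points
to the downloaded edge file.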

On Mon, May 9, 2016 at 3:50 PM, Robert Marriott <[log in to unmask]> wrote:

> Valdis Krebs mentioned the Panama Papers a couple of weeks ago. Well, the
> full dataset of the Panama Papers, as well as a related offshore entity
> investigation, has just been released by the ICIJ.  I'm not affiliated
> with the releasing organizations in any way, but there's likely to be a
> feeding frenzy on this, so I wanted to be sure the listserv received
> immediate notice once the data became available.
>
> The dataset is already up online in a graphical network format for the
> casual use of the public and media. More interesting for our purposes, the
> whole enchilada is available in multiple CSV formats.
>
> https://www.occrp.org/en/panamapapers/database
>
> A word of warning: it is truly enormous; the edge list file is beyond the
> row limit for Excel. Also bear in mind that the dataset is subject to some
> form of GPL-style open data licensing that I'm not familiar with. I
> recommend checking that information before you rev up your preferred
> analysis tool.
>
> In the event that folks haven't been following the news, the Panama Papers
> represent a massive leak of offshore corporate entity information (~320,000
> entities) from the Panamanian law firm Mossack Fonseca. Offshore entities
> of the sort included in the leak can have legitimate or legal purposes, but
> they are primarily seen as a means of tax avoidance or evasion, as well as
> a venue of organized criminal activity. This is particularly the case with
> firms like the one targeted in the leak. Until today, released information
> from the leak had been more selective, but individual disclosures had
> already brought down the Icelandic PM. This seems like a hot potato, but
> the sheer size of the leak is making it difficult for the responsible
> organizations, or the press, to process.
>
> Regards,
>
> Robert Marriott
>
> Penn State University
>
> _____________________________________________________________________
> SOCNET is a service of INSNA, the professional association for social
> network researchers (http://www.insna.org). To unsubscribe, send an email
> message to [log in to unmask] containing the line UNSUBSCRIBE SOCNET
> in the body of the message.




-- 
Dmitry Zinoviev
Professor of Computer Science
Suffolk University, Boston, MA 02114
