***** To join INSNA, visit http://www.insna.org *****
I've also been spending some time with the Panama Papers dataset.
However, as far as the network structure that could be extracted
from this dataset is concerned, it's not yet clear to me which
relations should be used for that purpose. The relational part of the
dataset is the file all_edges.csv, which contains 3 columns: the first
and the third columns contain node_ids and the second refers to the
following five types of relations that associate a node of the first
column with the corresponding node of the third column:
'intermediary_of', 'officer_of', 'registered_address', 'similar',
'underlying'. Apparently only the fourth type ('similar') is symmetric
(undirected), while the other four types are clearly directed.
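
A minimal sketch of how the three-column edge list described above might be read and tallied by relation type. The columns are read by position (node id, relation type, node id); the single header row, the sample rows and the node ids are made up for illustration and are not taken from all_edges.csv. Treating only 'similar' as symmetric follows the description above and is itself an assumption about the data.

```python
import csv
import io
from collections import Counter

# Relation types treated as undirected (assumption based on the description above).
SYMMETRIC_TYPES = {"similar"}


def read_typed_edges(handle, has_header=True):
    """Yield (source, relation, target) triples from a 3-column CSV stream."""
    reader = csv.reader(handle)
    if has_header:
        next(reader, None)  # skip the header row, assumed to be present
    for row in reader:
        if len(row) < 3:
            continue  # ignore malformed or empty rows
        source, relation, target = (field.strip() for field in row[:3])
        yield source, relation, target


def canonical_edge(source, relation, target):
    """Order the endpoints of symmetric relations so (a, b) and (b, a) coincide."""
    if relation in SYMMETRIC_TYPES and target < source:
        source, target = target, source
    return source, relation, target


# Made-up sample standing in for the real file.
sample = io.StringIO(
    "node_a,relation,node_b\n"
    "o1,officer_of,e1\n"
    "i1,intermediary_of,e1\n"
    "e1,registered_address,a1\n"
    "o1,similar,o2\n"
    "o2,similar,o1\n"
)

edges = {canonical_edge(*edge) for edge in read_typed_edges(sample)}
type_counts = Counter(relation for _, relation, _ in edges)
```

For the real file one would pass an open file handle instead of the in-memory sample; the per-type counts give a first impression of how the 1265690 edges are distributed over the five relation types.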
Given that the total number of relations (edges) is very high
(1265690), I was wondering what sort of aggregation among relation
types might reduce the complexity of the Panama Papers network.
I would appreciate it if someone were willing to share ideas about a
meaningful aggregation scheme for relations. Of course, one could
skip relational aggregation altogether and treat the network as
a multilayer (multiplex) one, although the size and complexity
of the Panama Papers network appear to make that rather prohibitive.
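
To make the two options above concrete, here is a rough sketch of (a) keeping one layer per relation type, as in a multiplex view, and (b) collapsing selected relation types into a single undirected graph whose edge weight is the number of distinct relation types linking a pair. This is only one possible aggregation scheme, not a recommendation; the function names, the `keep` parameter and the sample triples are hypothetical.

```python
from collections import defaultdict


def split_into_layers(edges):
    """Multiplex view: map each relation type to its own set of (source, target) pairs."""
    layers = defaultdict(set)
    for source, relation, target in edges:
        layers[relation].add((source, target))
    return dict(layers)


def collapse_relations(edges, keep=None):
    """Aggregate view: map each unordered node pair to the relation types linking it.

    `keep` optionally restricts the aggregation to a subset of relation types.
    Direction is discarded here, which is a deliberate simplification.
    """
    aggregated = defaultdict(set)
    for source, relation, target in edges:
        if keep is not None and relation not in keep:
            continue
        if source == target:
            continue  # self-loops carry no pairwise information in this view
        pair = tuple(sorted((source, target)))
        aggregated[pair].add(relation)
    return dict(aggregated)


# Made-up triples in the (source, relation, target) form of the edge list.
sample_edges = [
    ("o1", "officer_of", "e1"),
    ("i1", "intermediary_of", "e1"),
    ("o1", "intermediary_of", "e1"),
    ("e1", "registered_address", "a1"),
]

layers = split_into_layers(sample_edges)
aggregated = collapse_relations(sample_edges, keep={"officer_of", "intermediary_of"})
weights = {pair: len(relations) for pair, relations in aggregated.items()}
```

Dropping 'registered_address' via `keep` is just an example of how one might test whether a given relation type adds structure or mostly noise.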
I should add that, following Dmitry Zinoviev's original work on this
dataset, I can currently run a number of computations and
visualizations for parts of the network (I'm using Python's NetworkX
and Lightning-Python for interactive visualizations). For instance,
motivated by what Dmitry is doing, I've managed to analyze the
ego-networks whose egos are nodes of a certain type
(officers, intermediaries, addresses, entities) associated with a
particular country and connected to alters through a
certain relationship type.
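
The ego-network selection described above could be sketched roughly as follows. This is not the code I actually run with NetworkX; it is a plain-Python illustration. The `node_info` lookup (node id to type and country), the country codes and all sample ids are hypothetical, since the edge list itself carries no node attributes.

```python
def ego_network(edges, node_info, ego_type, country, relation_types, alter_types=None):
    """Return the edges that join an ego (given type and country) to an allowed alter.

    edges          -- iterable of (source, relation, target) triples
    node_info      -- dict: node id -> {"type": ..., "country": ...} (hypothetical layout)
    relation_types -- set of relation types to keep
    alter_types    -- optional set of node types allowed as alters (None = any)
    """
    def is_ego(node):
        info = node_info.get(node, {})
        return info.get("type") == ego_type and info.get("country") == country

    def is_alter(node):
        if alter_types is None:
            return True
        return node_info.get(node, {}).get("type") in alter_types

    selected = set()
    for source, relation, target in edges:
        if relation not in relation_types:
            continue
        if (is_ego(source) and is_alter(target)) or (is_ego(target) and is_alter(source)):
            selected.add((source, relation, target))
    return selected


# Made-up nodes and edges for illustration only.
node_info = {
    "o1": {"type": "officer", "country": "GRC"},
    "o2": {"type": "officer", "country": "PAN"},
    "e1": {"type": "entity", "country": "VGB"},
    "e2": {"type": "entity", "country": "PAN"},
    "a1": {"type": "address", "country": "GRC"},
    "i1": {"type": "intermediary", "country": "CHE"},
}
sample_edges = [
    ("o1", "officer_of", "e1"),
    ("o2", "officer_of", "e2"),
    ("o1", "registered_address", "a1"),
    ("i1", "intermediary_of", "e1"),
]

ego = ego_network(
    sample_edges,
    node_info,
    ego_type="officer",
    country="GRC",
    relation_types={"officer_of", "registered_address"},
    alter_types={"entity", "address"},
)
```

Restricting `relation_types` or `alter_types` (for example, leaving out addresses) is the knob that corresponds to the aggregation choice discussed in this message.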
As an example, this is the (symmetric) network in which the egos
are Greek officers and the alters are international entities and
addresses (aggregated over all types of relations-edges):
(This is just an example: I can produce such ego-centric networks for
any country in the Panama Papers data.)
Admittedly, I'm not pleased with the aggregation of relations I'm
doing here (perhaps the inclusion of addresses was redundant too) and,
thus, I would ask for your ideas, comments or suggestions.
On Mon, May 9, 2016 at 10:50 PM, Robert Marriott <[log in to unmask]> wrote:
> Valdis Krebs mentioned the Panama Papers a couple weeks ago. Well, the full
> dataset of the Panama Papers, as well as a related offshore entity
> investigation, have just been released by the ICIJ. I'm not affiliated with
> the releasing organizations in any way, but there's likely to be a feeding
> frenzy on this, so I wanted to be sure the listserv received immediate
> notice once the data became available.
> The dataset is already up online in a graphical network format for the
> casual use of the public and media. More interesting for our purposes, the
> whole enchilada is available in multiple CSV formats.
> A word of warning, it is truly enormous; the edgelist file is beyond the
> line limit for Excel. Also bear in mind that the dataset is subject to some
> form of GPL-style open data licensing that I'm not familiar with; I
> recommend checking that information before you rev up your preferred
> analysis tool.
> In the event that folks haven't been following the news, the Panama Papers
> represent a massive leak of offshore corporate entity information (~320,000
> entities) from the Panamanian law firm Mossack Fonseca. Offshore entities of
> the sort included in the leak can have legitimate or legal purposes, but
> they are primarily seen as a means of tax avoidance or evasion, as well as a
> venue of organized criminal activity. This is particularly the case with
> firms like the one targeted in the leak. Until today, released information
> from the leak had been more selective, but individual disclosures had
> already brought down the Icelandic PM. This seems like a hot potato, but the
> sheer size of the leak is making it difficult for the responsible
> organizations, or the press, to process.
> Robert Marriott
> Penn State University
> _____________________________________________________________________
> SOCNET is a service of INSNA, the professional association for social network
> researchers (http://www.insna.org). To unsubscribe, send an email message to
> [log in to unmask] containing the line UNSUBSCRIBE SOCNET in the body of
> the message.