*****  To join INSNA, visit http://www.insna.org  *****

John--

Determining which, if any, distribution is the best description of the
data is a question that is best answered by doing some sort of model
comparison.  Broadly speaking, there are two types of model
comparison: frequentist goodness-of-fit tests and Bayesian model
selection.

A very common example of a frequentist goodness-of-fit test is the
Kolmogorov-Smirnov test (see Numerical Recipes or
http://en.wikipedia.org/wiki/Kolmogorov-Smirnov).  The output of such
a test is a p-value, which is usually used to either reject or not
reject a particular model at a certain significance level.  For
example, you might be interested in determining whether the degree
distribution for the power-grid is well described by a power-law
distribution, in which case you might find that the p-value given by
the Kolmogorov-Smirnov test is p=0.0000000001 (obviously making this
up) and you would reject the power-law distribution as a candidate
model for the degree distribution of the power-grid.  An important
caveat is that standard goodness-of-fit tests assume that you already
know the model parameters from some other source.  If you use the
data, in any way, to infer the model parameters, you must use Monte
Carlo hypothesis testing to compute a p-value (e.g.
http://arxiv.org/abs/0706.1062).
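
To make the Monte Carlo step concrete, here is a rough sketch in Python
(NumPy and SciPy).  It is only an illustration, not the exact procedure
from the paper above.  It assumes a continuous exponential model whose
scale is estimated from the data, whereas degrees are integers, so
treat the continuous model as a simplification.  The function name
ks_exponential_mc and the defaults n_sim and seed are my own
hypothetical choices.

```python
import numpy as np
from scipy import stats


def ks_exponential_mc(data, n_sim=500, seed=0):
    """Monte Carlo KS p-value for an exponential fit.

    Assumption (illustrative only): the candidate model is a continuous
    exponential whose scale is estimated from the data itself.
    """
    rng = np.random.default_rng(seed)
    data = np.asarray(data, dtype=float)
    n = data.size

    # Maximum-likelihood estimate of the exponential scale (the mean).
    scale_hat = data.mean()

    # KS distance between the data and the fitted model.
    d_obs = stats.kstest(data, "expon", args=(0, scale_hat)).statistic

    d_sim = np.empty(n_sim)
    for i in range(n_sim):
        # Draw a synthetic data set from the fitted model ...
        synth = rng.exponential(scale_hat, size=n)
        # ... and refit it, exactly as was done for the real data.
        d_sim[i] = stats.kstest(
            synth, "expon", args=(0, synth.mean())
        ).statistic

    # Fraction of synthetic data sets that fit at least as badly.
    p_value = float(np.mean(d_sim >= d_obs))
    return d_obs, p_value
```

Calling ks_exponential_mc(degrees) returns the observed KS distance and
the Monte Carlo p-value.  A small p-value means the fitted exponential
is a poor description of the data.  The refit inside the loop is the
whole point: it accounts for the fact that the parameter was estimated
from the same data being tested.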

Another approach is to use Bayesian model selection.  The advantage of
Bayesian hypothesis testing is that you can say things like "model A
is 10000 times more likely than model B".  For example, you might be
interested in determining whether a power-law distribution or an
exponential distribution is a better description of the power-grid
degree distribution, in which case you might say that the exponential
distribution is 1000 times more likely than the power-law distribution
(obviously making this up).  A great resource for Bayesian data
analysis techniques is Daniel Sivia's primer "Data Analysis: A
Bayesian Tutorial".
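
For a feel of what such a comparison involves, here is a minimal
sketch in Python of a Bayes factor between an exponential and a
power-law model.  Everything specific in it is an assumption made for
illustration: the data are treated as continuous, the lower cutoff
x_min is taken as known, each model has one parameter with a uniform
prior over an arbitrary range, and the evidence integrals are
approximated by a simple sum on a grid.  The function names are
hypothetical.  Bayes factors can be sensitive to the choice of priors,
so the prior ranges below should not be read as recommendations.

```python
import numpy as np
from scipy.special import logsumexp


def log_evidence(log_likelihood, grid):
    """Log marginal likelihood under a uniform prior over the grid range,
    approximated by a Riemann sum on an evenly spaced grid."""
    log_like = np.array([log_likelihood(theta) for theta in grid])
    step = grid[1] - grid[0]
    prior_width = grid[-1] - grid[0]
    return logsumexp(log_like) + np.log(step) - np.log(prior_width)


def log_bayes_factor_exp_vs_powerlaw(x, x_min=1.0):
    """Log Bayes factor, exponential over power law, for data x >= x_min.

    Positive values favor the exponential, negative the power law.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    sum_excess = np.sum(x - x_min)
    sum_log = np.sum(np.log(x / x_min))

    def ll_exponential(rate):
        # p(x) = rate * exp(-rate * (x - x_min))
        return n * np.log(rate) - rate * sum_excess

    def ll_power_law(alpha):
        # p(x) = (alpha - 1) / x_min * (x / x_min) ** (-alpha)
        return n * np.log((alpha - 1.0) / x_min) - alpha * sum_log

    # Prior ranges are arbitrary illustrative choices.
    rate_grid = np.linspace(0.01, 10.0, 2000)
    alpha_grid = np.linspace(1.01, 10.0, 2000)

    return (log_evidence(ll_exponential, rate_grid)
            - log_evidence(ll_power_law, alpha_grid))
```

If the two models are given equal prior probability, then
np.exp(log_bayes_factor_exp_vs_powerlaw(degrees)) is the "N times more
likely" number described above.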

I hope that you find this helpful.  Please feel free to contact me if
you have any further questions.

Cheers.
Dean

On Mon, Feb 23, 2009 at 6:36 PM, John McCreery <[log in to unmask]> wrote:
> *****  To join INSNA, visit http://www.insna.org  *****
>
> As I prepare my presentation for Sunbelt, I am checking the networks my data
> reveal for the kinds of properties that I read about in sources that include
> Newman, Barabási, and Watts (2006), The Structure and Dynamics of Networks.
>
>   1. Giant components?  Check
>   2. Giant bicomponents that approach the size of the giant components?
>   Check?
>   3. Right-skewed degree distributions? Check
>
> But then I read, in the introduction to Chapter 3, a discussion of Amaral,
> et al. (2000), a paper that examines five networks and discovers that none
> have power-law degree distributions.
>
> Instead, all of them are right-skewed but with non-power-law distributions:
>> the power grid and air traffic networks have exponential distributions, the
>> high school and Mormon networks have Gaussian distributions, and the movie
>> actor network has an exponentially truncated power-law distribution.
>
>
> Here my mathematical ignorance blocks further understanding. I want to know
> how to do the calculations to determine which kind of distributions best fit
> my data. My rapid survey of Wikipedia articles on power laws, O
> descriptions, that sort of thing, leaves me with the impression that this is
> a black art; but, I suspect, I am missing something.
>
> Can anyone here direct me to a curve-fitting for dummies primer that will
> shed some light on my problem or some smart person who already knows how to
> do this sort of thing?
>
> Your help will be greatly appreciated.
>
> John McCreery
> The Word Works, Ltd., Yokohama, JAPAN
> Tel. +81-45-314-9324
> [log in to unmask]
> http://www.wordworks.jp/
>
> _____________________________________________________________________
> SOCNET is a service of INSNA, the professional association for social
> network researchers (http://www.insna.org). To unsubscribe, send
> an email message to [log in to unmask] containing the line
> UNSUBSCRIBE SOCNET in the body of the message.
>



-- 
-----------------------------------------------------------------------
R. Dean Malmgren
Ph.D. Candidate
Chemical & Biological Engineering Department
Northwestern University
2145 Sheridan Road, Room E136
Evanston, IL 60208

E-mail:   [log in to unmask]
Phone:    +1 847 491 7231
Fax:      +1 847 491 7070
-----------------------------------------------------------------------