SOCNET@LISTS.UFL.EDU

Subject: routines for co-word analysis further extended
From: Loet Leydesdorff <[log in to unmask]>
Reply-To: Loet Leydesdorff <[log in to unmask]>
Date: Sun, 12 Oct 2008 17:19:32 +0200

***** To join INSNA, visit http://www.insna.org *****

Dear colleagues:

The routines ti.exe (at http://www.leydesdorff.net/software/ti/index.htm) and fulltext.exe (at http://www.leydesdorff.net/software/fulltext/index.htm) now additionally provide as output a file "words.dbf" (readable in Excel), which contains the following summations for each word:

1. A variable named "Chi_Sq", which provides the chi-square contributions for each of the variables (that is, words); these are defined for word(i) as Σ(i) χ^2 = (Observed(ij) - Expected(ij))^2 / Expected(ij). In other words, this is the sum, over the column for the variable, of the contributions in each row (Mogoutov et al., 2008);

2. A variable named "ObsExp", which provides the sum of the absolute values |Observed - Expected| for the word as a variable, summed over the column;

3. A variable named "TfIdf", which uses Salton & McGill's (1983: 63) TermFrequency-InverseDocumentFrequency measure (but without Salton's additional + 1; Magerman et al., 2007), defined as follows: WEIGHT(ik) = FREQ(ik) * [log2 (n) - log2 (DOCFREQ(k))]. This function assigns a high degree of importance to terms occurring in only a few documents in the collection;

4. The word frequency within the set.

These statistics provide the researcher with opportunities to refine the list of words to be considered.

References:

Magerman, T., Van Looy, B., & Song, X. (2007). Exploring the feasibility and accuracy of Latent Semantic Analysis based text mining techniques to detect similarity between patent documents and scientific publications. Paper presented at the 6th Triple Helix Conference, 16-19 May 2007, Singapore.

Mogoutov, A., Cambrosio, A., Keating, P., & Mustar, P. (2008). Biomedical innovation at the laboratory, clinical and commercial interface: A new method for mapping research projects, publications and patents in the field of microarrays. Journal of Informetrics (In print); doi:10.1016/j.joi.2008.06.005.
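For readers who want to check the four statistics by hand, the following is a minimal sketch in Python. It is not the code of ti.exe or fulltext.exe. It rests on several assumptions that the message does not state: the input is a document-by-word frequency matrix (rows are documents, columns are words); expected values are the usual contingency-table expectations, row total * column total / grand total; and the per-word "TfIdf" is WEIGHT(ik) summed over all documents i. The function name word_statistics and the dictionary keys are chosen for illustration only.

```python
import math


def word_statistics(matrix):
    """Per-word summary statistics for a document-by-word frequency matrix.

    matrix[i][j] is the frequency of word j in document i
    (assumption: rows are documents, columns are words).
    Returns one dict per word (column).
    """
    n_docs = len(matrix)
    n_words = len(matrix[0]) if n_docs else 0
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(row[j] for row in matrix) for j in range(n_words)]
    grand_total = sum(row_totals)
    if grand_total == 0:
        raise ValueError("matrix contains no word occurrences")

    stats = []
    for j in range(n_words):
        chi_sq = 0.0
        obs_exp = 0.0
        for i in range(n_docs):
            # Assumed expectation under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / grand_total
            if expected == 0:
                continue  # empty row or column: no contribution
            observed = matrix[i][j]
            chi_sq += (observed - expected) ** 2 / expected
            obs_exp += abs(observed - expected)

        # DOCFREQ(k): number of documents in which the word occurs.
        doc_freq = sum(1 for row in matrix if row[j] > 0)
        # log2(n) - log2(DOCFREQ(k)), without the additional + 1.
        idf = math.log2(n_docs) - math.log2(doc_freq) if doc_freq else 0.0
        # Summing FREQ(ik) * idf over documents gives total frequency * idf.
        tf_idf = col_totals[j] * idf

        stats.append({
            "Chi_Sq": chi_sq,
            "ObsExp": obs_exp,
            "TfIdf": tf_idf,
            "Freq": col_totals[j],
        })
    return stats
```

Calling word_statistics([[2, 0], [0, 2]]) returns one dictionary per word. Under the assumptions above, a word that occurs in every document receives a TfIdf of zero, which matches the remark that the measure favours terms occurring in only a few documents.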
________________________________
Loet Leydesdorff
Amsterdam School of Communications Research (ASCoR)
Kloveniersburgwal 48, 1012 CX Amsterdam.
Tel. +31-20-525 6598; fax: +31-842239111
[log in to unmask]
_____________________________________________________________________
SOCNET is a service of INSNA, the professional association for social network researchers (http://www.insna.org).
