R-USERS-L Archives


R-USERS-L@LISTS.UFL.EDU


Subject: Re: R question
From: "El Rouby,Nihal M" <[log in to unmask]>
Reply-To: UF R Users List <[log in to unmask]>
Date: Thu, 12 Jul 2018 12:27:52 +0000
Content-Type: text/plain

Thanks Ben and Geraldine-

Your suggestions helped me a lot, and the code Ben suggested worked for me.

Sorry for the lack of clarity of my data.

Best,
Nihal

On 7/11/18, 10:21 PM, "UF R Users List on behalf of Toh,Kok Ben" <[log in to unmask] on behalf of [log in to unmask]> wrote:

    Hi Nihal,
    
    It is not possible to read your example data in this format. But I think the problems here are:
    1. dplyr, and R in general, is awkward with spaces or other special characters in column names. Use 'colnames(d1) <- ...' to change them to names with no spaces.
    2. You don't refer to a column using $ notation inside dplyr functions, e.g. no 'd1$XXX'; use the bare column name instead.
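A minimal sketch of both points, assuming a toy data frame that reuses the column names from the original question. The values and the replacement names (rec_id, genotype, med) are made up for illustration and are not from the original data:

```r
library(dplyr)

# Toy data with the original (non-syntactic) column names; values are illustrative only.
d1 <- data.frame(
  `Record ID` = c(3, 3, 13, 13, 28),
  `CYP2C19 Genotype` = c("*1/*1", "*1/*1", "*1/*17", "*1/*17", "*1/*1"),
  `Med Order Display Name` = c("drug A", "drug A", "drug B", "drug B", "drug A"),
  check.names = FALSE  # keep the spaces so the rename step has something to fix
)

# Point 1: replace the names with ones that have no spaces (hypothetical names).
colnames(d1) <- c("rec_id", "genotype", "med")

# Point 2: use bare column names inside dplyr verbs, not d1$...
counts <- d1 %>%
  group_by(med, genotype) %>%
  summarise(n = n_distinct(rec_id))  # unique individuals per med-genotype group

print(counts)
```

Here n_distinct(rec_id) counts each individual once per group, so repeated rows for the same record do not inflate the count.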
    
    I've generated a fake dataset here:
    df <- data.frame(med = rep(c('A', 'B', 'C'), each=10), 
                     gen = rep(1:3, 10), 
                     rec = c(rep(1:2, 10), rep(3, 10)),
                     val = runif(30))
    
    And here's how you can get the count of unique records for each combination of "med" and "gen":
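    library(dplyr)  # provides %>%, group_by(), distinct(), summarise(), n()
    df %>% 
      group_by(med, gen) %>%   # one group per med-gen combination
      distinct(rec) %>%        # keep each record once within its group
      summarise(n())           # count the remaining rows per group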
    
    Note that I would recommend using 'summarise(n=n())' instead of 'summarise(n())', because the latter names the count column 'n()', and special characters in a column name are not much fun to work with. You can also use 'tally()' instead of 'summarise(n())'; it produces the same counts.
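As a rough check of the note above, the following sketch rebuilds the fake dataset (without the unused val column) and compares three ways of getting the count. The n_distinct() variant is an addition not mentioned in the message; it skips the distinct() step:

```r
library(dplyr)

# Same fake dataset as in the message, minus the random 'val' column.
df <- data.frame(med = rep(c("A", "B", "C"), each = 10),
                 gen = rep(1:3, 10),
                 rec = c(rep(1:2, 10), rep(3, 10)))

# Named count column, as recommended in the message.
by_summarise <- df %>%
  group_by(med, gen) %>%
  distinct(rec) %>%
  summarise(n = n())

# tally() counts rows per group and also names the column 'n'.
by_tally <- df %>%
  group_by(med, gen) %>%
  distinct(rec) %>%
  tally()

# Alternative (not from the message): count unique records directly.
by_n_distinct <- df %>%
  group_by(med, gen) %>%
  summarise(n = n_distinct(rec))

print(by_summarise)
```

All three results should hold the same counts in a column called n, one row per med-gen combination.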
    
    You can also use the tapply function to achieve a slightly different result:
    with(df, tapply(rec, list(med, gen), function (x) length(unique(x))))
    
    This would give you a table in which each row is a level of med, each column is a level of gen, and each cell is the count of unique rec values for that row-column (med-gen) combination.
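For comparison, here is a sketch of the tapply call next to a base R cross-tabulation of the unique rows. The table() variant is an alternative not taken from the message; one difference is that tapply leaves NA in cells for combinations that never occur, while table() reports 0:

```r
# Same fake dataset as in the message, minus the random 'val' column.
df <- data.frame(med = rep(c("A", "B", "C"), each = 10),
                 gen = rep(1:3, 10),
                 rec = c(rep(1:2, 10), rep(3, 10)))

# Rows are levels of med, columns are levels of gen,
# cells are the number of unique rec values.
wide_tapply <- with(df, tapply(rec, list(med, gen),
                               function(x) length(unique(x))))

# Alternative: drop repeated (med, gen, rec) rows, then cross-tabulate.
u <- unique(df[, c("med", "gen", "rec")])
wide_table <- table(u$med, u$gen)

print(wide_tapply)
print(wide_table)
```

In this dataset every med-gen combination occurs, so the two tables hold the same numbers.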
    
    Cheers,
    Ben
    
    
    -----Original Message-----
    From: UF R Users List <[log in to unmask]> On Behalf Of El Rouby,Nihal M
    Sent: Wednesday, July 11, 2018 4:59 PM
    To: [log in to unmask]
    Subject: R question
    
    Dear All-

    I have data with genotype and medication exposure on repeated dates. I'm trying to tabulate the counts of the genotypes for unique individuals in each medication group. I tried several pieces of code to summarize the data by genotype and medication, but with no luck.

    I used summarize and group_by from dplyr:

    output<-d1 %>%
    group_by(d1$`Med Order Display Name`,d1$`CYP2C19 Genotype`) %>% distinct(d1$`Record ID`)%>%summarise(n())

    Another code I tried:

    with(d1, tapply(d1$`Med Order Display Name`, d1$`CYP2C19 Genotype`, FUN = function(x) length(unique(x))))

    I appreciate your input on a direction I should take.
    
    My example data is:

    Record ID | CYP2C19 Genotype | Med Order Display Name
    3         | *1/*1            | pantoprazole (PROTONIX) injection 40 mg
    3         | *1/*1            | pantoprazole (PROTONIX) injection 40 mg
    3         | *1/*1            | pantoprazole (PROTONIX) EC tablet 40 mg
    13        | *1/*17           | pantoprazole (PROTONIX) 40 MG Tablet Delayed Release
    13        | *1/*17           | pantoprazole (PROTONIX) 40 MG Tablet Delayed Release
    13        | *1/*17           | pantoprazole (PROTONIX) 40 MG tablet
    13        | *1/*17           | pantoprazole (PROTONIX) 40 MG tablet
    28        | *1/*1            | esomeprazole (NexIUM) capsule 20 mg
    28        | *1/*1            | pantoprazole (PROTONIX) EC tablet 40 mg
    28        | *1/*1            | pantoprazole (PROTONIX) 40 MG tablet
    28        | *1/*1            | esomeprazole (NexIUM) capsule 40 mg
    52        | *1/*1            | NEXIUM 40 MG Capsule Delayed Release
    52        | *1/*1            | NEXIUM 40 MG Capsule Delayed Release
    52        | *1/*1            | esomeprazole (NexIUM) 40 MG Capsule Delayed Release
    52        | *1/*1            | NEXIUM 40 MG PO Capsule Delayed Release
    
    
    
    I hope I can get output like this, with one column per medication:

    pantoprazole (PROTONIX) injection 40 mg
    pantoprazole (PROTONIX) EC tablet 40 mg
    pantoprazole (PROTONIX) 40 MG Tablet Delayed Release
    pantoprazole (PROTONIX) 40 MG tablet
    esomeprazole (NexIUM) capsule 20 mg
    NEXIUM 40 MG Capsule Delayed Release
    NEXIUM 40 MG PO Capsule Delayed Release
    esomeprazole (NexIUM) 40 MG Capsule Delayed Release

    and one row per genotype (xx marks a count to be filled in):

    *1/*2         xx  xx  xx  xx
    *1/*17        xx  xx  xx  xx
    *1/*2         xx  xx  xx  xx
    inconclusive  xx  xx  xx  xx
    *2/*2         xx  xx  xx  xx
    *17/*17       xx  xx  xx  xx
    
    This list strives to be beginner friendly.  However, we still ask that you
    PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
    and provide commented, minimal, self-contained, reproducible code.
    
