rstatisticsbioinformaticsbioconductorlimma

Design matrix for edge R analysis?


I would like to compare Q method against L method, and I have considered 2 different contrasts (at the end), but I am not sure which on is correct?

There are 2 different methods (Q and L) and from each one there are 2 biological replicates (L4,L6-L8 and Q3,Q5-Q7), and 2 technical replications from each biological replicate. as below:

design

                    biological_replicate   method
 L4_rep1                              L4       L
 L4_rep2                              L4       L
 L6_L8_rep1                        L6_L8       L
 L6_L8_rep2                        L6_L8       L
 Q3_rep1                              Q3       Q
 Q3_rep2                              Q3       Q
 Q5_Q7_rep1                        Q5_Q7       Q
 Q5_Q7_rep2                        Q5_Q7       Q                       

design$biological_replicate <- factor(design$biological_replicate, levels =  c("L4","L6_L8", "Q3", "Q5_Q7"))

design$method <- factor(design$method, levels = c("L", "Q"))

Group <- factor(paste(design$biological_replicate,design$method,sep="."))

design<- cbind(design,Group)

                    biological_replicate method   Group
L4_rep1                              L4      L    L4.L
L4_rep2                              L4      L    L4.L
L6_L8_rep1                        L6_L8      L L6_L8.L
L6_L8_rep2                        L6_L8      L L6_L8.L
Q3_rep1                              Q3      Q    Q3.Q
Q3_rep2                              Q3      Q    Q3.Q
Q5_Q7_rep1                        Q5_Q7      Q Q5_Q7.Q
Q5_Q7_rep2                        Q5_Q7      Q Q5_Q7.Q

design.matrix <- model.matrix(~0+Group,design)

colnames(design.matrix) <- levels(Group)

design.matrix

L4.L L6_L8.L Q3.Q Q5_Q7.Q
L4_rep1       1       0    0       0
L4_rep2       1       0    0       0
L6_L8_rep1    0       1    0       0
L6_L8_rep2    0       1    0       0
Q3_rep1       0       0    1       0
Q3_rep2       0       0    1       0
Q5_Q7_rep1    0       0    0       1
Q5_Q7_rep2    0       0    0       1
attr(,"assign")
[1] 1 1 1 1
attr(,"contrasts")
attr(,"contrasts")$Group
[1] "contr.treatment"

my.contrasts_1 <- makeContrasts(QvsL = (Q3.Q+Q5_Q7.Q)/2-(L4.L+L6_L8.L)/2, levels = design.matrix)

my.contrasts_2 <- makeContrasts(QvsL = (Q3.Q+Q5_Q7.Q)-(L4.L+L6_L8.L), levels = design.matrix)

Solution

  • First of all, technical replicates need to be summed for differential expression analysis. You simply add both counts together.

    Once you have summed the technical replicates, your data should look like this:

    Sample <- c("L4", "L6_L8", "Q3", "Q5_Q7")
    Method <- c("L", "L", "Q", "Q")
    df <- data.frame(Sample, Method)
    
      Sample Method
    1     L4      L
    2  L6_L8      L
    3     Q3      Q
    4  Q5_Q7      Q
    

    Designing the matrix is now simple:

    design.matrix <- model.matrix(~0 + Method)
    colnames(design.matrix) <- c("L","Q")
      MethodL MethodQ
    1       1       0
    2       1       0
    3       0       1
    4       0       1
    

    The contrast also becomes simple to generate:

    my_contrast <- makeContrasts(L-Q, levels = design.matrix)
    
    Contrasts
    Levels L - Q
         L     1
         Q    -1
    

    This isn't part of your question, but you have the absolute minimum number of biological replicates, assuming the above table is representative of your actual data. Do not treat technical replicates as biological replicates. You will increase your type 1 error.