rbioinformaticsbioconductorrna-seqlimma

Limma to Compare Bulk RNA Seq using makeContrasts and eBayes


After a day of googling, I've decided that it'd be better to ask the question here.

So the experiment is I have bulk RNA seq data from 3 patients: A, B, C. And their RNA seq data is obtained for pre-treatment, treatment cycle 1, treatment cycle 2, treatment cycle 3.

So in total I have 12 samples of bulk RNA seq:

I want to get a differential gene list between different cycles (i.e. cycle 3 to pretreatment, cycle 3 to cycle 2) using model.matrix(), lmFit(), makeContrasts(), contrasts.fit(), eBayes(), all of which are in the limma package.

Here is my minimal working example.

library(limma)
# Already normalized expression set: rows are genes, columns are the 12 samples
normalized_expression <- matrix(data=sample(1:100), nrow=10, ncol=12)
colnames(normalized_expression) <- c("A.PreTreat", "A.Cycle1", "A.Cycle2", "A.Cycle3", "B.PreTreat", "B.Cycle1", "B.Cycle2", "B.Cycle3", "C.PreTreat", "C.Cycle1", "C.Cycle2", "C.Cycle3")

patient_and_treatment <- factor(colnames(normalized_expression), levels = colnames(normalized_expression))

design.matrix <- model.matrix(~0 + patient_and_treatment)
colnames(design.matrix) <- patient_and_treatment
fit <- lmFit(normalized_expression, design.matrix)

# I want to get a contrast matrix to get differential genes between cycle 3 treatment and pre-treatment in all patients
contrast.matrix <- makeContrasts("A.Cycle3+B.Cycle3+C.Cycle3-A.PreTreat-B.PreTreat-C.PreTreat",
                                 levels = levels(patient_and_treatment))

# Outputs Error of no residual degree of freedom
fit2 <- eBayes( contrasts.fit( fit, contrast.matrix ) )

# Want to run but cannot
summary(decideTests(fit2))

So far I am stuck on no residual degree of freedom error.

I am not even sure if this is the statistically right way in limma to address my question of getting differential gene list between cycle 3 treatment to pre-treatment in all patients.

Any help will be greatly appreciated.

Thanks!


Solution

  • You cannot have 1 observation per group, this makes the regression meaningless as you are fitting each data point to itself.

    Briefly, what you are looking for is common effects observed across all patients, for say Cycle3 compared to PreTreat and so on, set up the model like this:

    library(limma)
    
    metadata = data.frame(
    Patient=gsub("[.][^ ]*","",colnames(normalized_expression)),
    Treatment=gsub("^[A-Z][.]*","",colnames(normalized_expression))
    )
    
       Patient    Treatment
    1        A PreTreat
    2        A   Cycle1
    3        A   Cycle2
    4        A   Cycle3
    5        B PreTreat
    6        B   Cycle1
    7        B   Cycle2
    8        B   Cycle3
    9        C PreTreat
    10       C   Cycle1
    11       C   Cycle2
    12       C   Cycle3
    

    Now specify the model matrix, the Patient term is to account for differences in starting levels between Patients:

    design.matrix <- model.matrix(~0 + Treatment+Patient,data=metadata)
    fit <- lmFit(normalized_expression, design.matrix)
    contrast.matrix <- makeContrasts(TreatmentCycle3-TreatmentPreTreat,
    TreatmentCycle1-TreatmentPreTreat,levels=design.matrix)
    fit2 = contrasts.fit(fit, contrast.matrix)
    fit2 = eBayes(fit2)
    

    You can check that the coefficients give you what you wanted:

    fit2$coefficients
           Contrasts
            TreatmentCycle3 - TreatmentPreTreat
       [1,]                           -3.666667
       [2,]                          -13.666667
       [3,]                            1.666667
       [4,]                          -40.666667
       [5,]                           12.000000
       [6,]                          -46.000000
       [7,]                          -32.000000
       [8,]                            4.666667
       [9,]                           11.333333
      [10,]                            5.666667
           Contrasts
            TreatmentCycle1 - TreatmentPreTreat
       [1,]                           -11.33333
       [2,]                           -19.33333
       [3,]                           -27.33333
       [4,]                           -42.33333
       [5,]                            27.33333
       [6,]                           -32.66667
       [7,]                           -33.00000
       [8,]                           -30.66667
       [9,]                            46.00000
      [10,]                            17.33333