Chapter 5 DE Analysis
5.1 Generate DE results
The top_level_groups
argument can be filled manually e.g. c(“ARC,” “LIV”)
Otherwise, use a quick function to get all possible options and parse that:
- Get all unique entries in Tissue_Region and assign to variable
<- seq_data$Tissue_Region %>% unique()
groups_for_DE
#View output to check
print(groups_for_DE)
## [1] "ARC" "LHA" "VMH" "ABO" "DUO" "RUM" "LIV"
- Notice we use
top_level_groups = groups_for_DE
. Other arguments are also specific to this dataset. To see details of all of these arguments type?auto_generate_DE_results
into the Console (at bottom of screen).
This is the main step where DESeq2 is used and can take a while depending on how many samples there are.
This function will iterate over each level of top_level_groups
, and will automatically detect pairwise comparisons within each top_level_groups
. This is determined based on the column data and is specified by the names given to the funciton. See ?auto_generate_DE_results
for specific details.
It also is set to export a lot of output files to the directory "./Outputs/"
. This can be turned off.
<-
DE_out auto_generate_DE_results(se_data = seq_data,
top_level_groups = groups_for_DE,
top_level_colname = Tissue_Region,
sample_colname = sample_names,
samples_to_remove = NA,
DESeq2_formula_design = ~Treatment,
rowSums_filter = 10, #for dds filtering
results_contrast_factor = Treatment,
results_combinations = NA,
use_IHW_filtering = TRUE,
alpha = 0.05,
gene_annotations = gene_annot,
export_tables = TRUE,
export_dir = "./Outputs/",
whole_data_normalisation = FALSE)
##
##
## ******************* Start of - ARC *******************
## renaming the first element in assays to 'counts'
## Warning in DESeq2::DESeqDataSet(se_data0, design = DESeq2_formula_design): some
## variables in design formula are characters, converting to factors
## Beginning DESeq analysis...
## estimating size factors
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## final dispersion estimates
## fitting model and testing
## Completed DESeq analysis.
## Plotting cooks distance...
## Cooks distance plot complete.
## Generating pairwise DESeq2 results...
## [1] "Contrast levels: LR, MAL, LAL"
##
##
## log2 fold change (MLE): Treatment LR vs MAL
##
## out of 19758 with nonzero total read count
## adjusted p-value < 0.05
## LFC > 0 (up) : 1, 0.0051%
## LFC < 0 (down) : 7, 0.035%
## outliers [1] : 36, 0.18%
## [1] see 'cooksCutoff' argument of ?results
## see metadata(res)$ihwResult on hypothesis weighting
##
##
## log2 fold change (MLE): Treatment LR vs LAL
##
## out of 19758 with nonzero total read count
## adjusted p-value < 0.05
## LFC > 0 (up) : 44, 0.22%
## LFC < 0 (down) : 80, 0.4%
## outliers [1] : 36, 0.18%
## [1] see 'cooksCutoff' argument of ?results
## see metadata(res)$ihwResult on hypothesis weighting
##
##
## log2 fold change (MLE): Treatment MAL vs LAL
##
## out of 19758 with nonzero total read count
## adjusted p-value < 0.05
## LFC > 0 (up) : 55, 0.28%
## LFC < 0 (down) : 124, 0.63%
## outliers [1] : 36, 0.18%
## [1] see 'cooksCutoff' argument of ?results
## see metadata(res)$ihwResult on hypothesis weighting
## Results generated.
## Plotting MA plots (default DESeq2 style for QC)...
## Finished MA plots.
## Plotting p value histograms (for QC)...
## Finished plotting p value histograms.
## ./Outputs/ Directory exists
## Normalised tables exported to the sub-directory: ./Outputs/
## PIF calculations - Preparing normalised data...
## PIF calculations - Preparing normalised data...COMPLETE
## PIF calculations - Calculating PIF data...
## PIF_LRvsMAL
## PIF_LRvsLAL
## PIF_MALvsLAL
## PIF calculations - Calculating PIF data...COMPLETE
## ./Outputs/ Directory exists
## PIF tables exported to an .xlsx file in the sub-directory: ./Outputs/
## PIF calculations - END
## ./Outputs/ Directory exists
## DE tables exported to an .xlsx file in the sub-directory: ./Outputs/
## Preparing data for output...
## List output succesfully generated.
##
##
## ******************* END of - ARC *******************
##
##
## ******************* Start of - LHA *******************
## renaming the first element in assays to 'counts'
## Warning in DESeq2::DESeqDataSet(se_data0, design = DESeq2_formula_design): some
## variables in design formula are characters, converting to factors
## Beginning DESeq analysis...
## estimating size factors
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## final dispersion estimates
## fitting model and testing
## Completed DESeq analysis.
## Plotting cooks distance...
## Cooks distance plot complete.
## Generating pairwise DESeq2 results...
## [1] "Contrast levels: LR, MAL, LAL"
##
##
## log2 fold change (MLE): Treatment LR vs MAL
##
## out of 18718 with nonzero total read count
## adjusted p-value < 0.05
## LFC > 0 (up) : 2, 0.011%
## LFC < 0 (down) : 2, 0.011%
## outliers [1] : 104, 0.56%
## [1] see 'cooksCutoff' argument of ?results
## see metadata(res)$ihwResult on hypothesis weighting
##
##
## log2 fold change (MLE): Treatment LR vs LAL
##
## out of 18718 with nonzero total read count
## adjusted p-value < 0.05
## LFC > 0 (up) : 0, 0%
## LFC < 0 (down) : 0, 0%
## outliers [1] : 104, 0.56%
## [1] see 'cooksCutoff' argument of ?results
## see metadata(res)$ihwResult on hypothesis weighting
##
##
## log2 fold change (MLE): Treatment MAL vs LAL
##
## out of 18718 with nonzero total read count
## adjusted p-value < 0.05
## LFC > 0 (up) : 1, 0.0053%
## LFC < 0 (down) : 1, 0.0053%
## outliers [1] : 104, 0.56%
## [1] see 'cooksCutoff' argument of ?results
## see metadata(res)$ihwResult on hypothesis weighting
## Results generated.
## Plotting MA plots (default DESeq2 style for QC)...
## Finished MA plots.
## Plotting p value histograms (for QC)...
## Finished plotting p value histograms.
## ./Outputs/ Directory exists
## Normalised tables exported to the sub-directory: ./Outputs/
## PIF calculations - Preparing normalised data...
## PIF calculations - Preparing normalised data...COMPLETE
## PIF calculations - Calculating PIF data...
## PIF_LRvsMAL
## PIF_LRvsLAL
## PIF_MALvsLAL
## PIF calculations - Calculating PIF data...COMPLETE
## ./Outputs/ Directory exists
## PIF tables exported to an .xlsx file in the sub-directory: ./Outputs/
## PIF calculations - END
## ./Outputs/ Directory exists
## DE tables exported to an .xlsx file in the sub-directory: ./Outputs/
## Preparing data for output...
## List output succesfully generated.
##
##
## ******************* END of - LHA *******************
##
##
## ******************* Start of - VMH *******************
## renaming the first element in assays to 'counts'
## Warning in DESeq2::DESeqDataSet(se_data0, design = DESeq2_formula_design): some
## variables in design formula are characters, converting to factors
## Beginning DESeq analysis...
## estimating size factors
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## final dispersion estimates
## fitting model and testing
## Completed DESeq analysis.
## Plotting cooks distance...
## Cooks distance plot complete.
## Generating pairwise DESeq2 results...
## [1] "Contrast levels: LR, MAL, LAL"
##
##
## log2 fold change (MLE): Treatment LR vs MAL
##
## out of 19185 with nonzero total read count
## adjusted p-value < 0.05
## LFC > 0 (up) : 0, 0%
## LFC < 0 (down) : 1, 0.0052%
## outliers [1] : 40, 0.21%
## [1] see 'cooksCutoff' argument of ?results
## see metadata(res)$ihwResult on hypothesis weighting
##
##
## log2 fold change (MLE): Treatment LR vs LAL
##
## out of 19185 with nonzero total read count
## adjusted p-value < 0.05
## LFC > 0 (up) : 4, 0.021%
## LFC < 0 (down) : 7, 0.036%
## outliers [1] : 40, 0.21%
## [1] see 'cooksCutoff' argument of ?results
## see metadata(res)$ihwResult on hypothesis weighting
##
##
## log2 fold change (MLE): Treatment MAL vs LAL
##
## out of 19185 with nonzero total read count
## adjusted p-value < 0.05
## LFC > 0 (up) : 1, 0.0052%
## LFC < 0 (down) : 4, 0.021%
## outliers [1] : 40, 0.21%
## [1] see 'cooksCutoff' argument of ?results
## see metadata(res)$ihwResult on hypothesis weighting
## Results generated.
## Plotting MA plots (default DESeq2 style for QC)...
## Finished MA plots.
## Plotting p value histograms (for QC)...
## Finished plotting p value histograms.
## ./Outputs/ Directory exists
## Normalised tables exported to the sub-directory: ./Outputs/
## PIF calculations - Preparing normalised data...
## PIF calculations - Preparing normalised data...COMPLETE
## PIF calculations - Calculating PIF data...
## PIF_LRvsMAL
## PIF_LRvsLAL
## PIF_MALvsLAL
## PIF calculations - Calculating PIF data...COMPLETE
## ./Outputs/ Directory exists
## PIF tables exported to an .xlsx file in the sub-directory: ./Outputs/
## PIF calculations - END
## ./Outputs/ Directory exists
## DE tables exported to an .xlsx file in the sub-directory: ./Outputs/
## Preparing data for output...
## List output succesfully generated.
##
##
## ******************* END of - VMH *******************
##
##
## ******************* Start of - ABO *******************
## renaming the first element in assays to 'counts'
## Warning in DESeq2::DESeqDataSet(se_data0, design = DESeq2_formula_design): some
## variables in design formula are characters, converting to factors
## Beginning DESeq analysis...
## estimating size factors
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## final dispersion estimates
## fitting model and testing
## Completed DESeq analysis.
## Plotting cooks distance...
## Cooks distance plot complete.
## Generating pairwise DESeq2 results...
## [1] "Contrast levels: LR, MAL, LAL"
##
##
## log2 fold change (MLE): Treatment LR vs MAL
##
## out of 18134 with nonzero total read count
## adjusted p-value < 0.05
## LFC > 0 (up) : 2, 0.011%
## LFC < 0 (down) : 0, 0%
## outliers [1] : 230, 1.3%
## [1] see 'cooksCutoff' argument of ?results
## see metadata(res)$ihwResult on hypothesis weighting
##
##
## log2 fold change (MLE): Treatment LR vs LAL
##
## out of 18134 with nonzero total read count
## adjusted p-value < 0.05
## LFC > 0 (up) : 9, 0.05%
## LFC < 0 (down) : 10, 0.055%
## outliers [1] : 230, 1.3%
## [1] see 'cooksCutoff' argument of ?results
## see metadata(res)$ihwResult on hypothesis weighting
##
##
## log2 fold change (MLE): Treatment MAL vs LAL
##
## out of 18134 with nonzero total read count
## adjusted p-value < 0.05
## LFC > 0 (up) : 13, 0.072%
## LFC < 0 (down) : 8, 0.044%
## outliers [1] : 230, 1.3%
## [1] see 'cooksCutoff' argument of ?results
## see metadata(res)$ihwResult on hypothesis weighting
## Results generated.
## Plotting MA plots (default DESeq2 style for QC)...
## Finished MA plots.
## Plotting p value histograms (for QC)...
## Finished plotting p value histograms.
## ./Outputs/ Directory exists
## Normalised tables exported to the sub-directory: ./Outputs/
## PIF calculations - Preparing normalised data...
## PIF calculations - Preparing normalised data...COMPLETE
## PIF calculations - Calculating PIF data...
## PIF_LRvsMAL
## PIF_LRvsLAL
## PIF_MALvsLAL
## PIF calculations - Calculating PIF data...COMPLETE
## ./Outputs/ Directory exists
## PIF tables exported to an .xlsx file in the sub-directory: ./Outputs/
## PIF calculations - END
## ./Outputs/ Directory exists
## DE tables exported to an .xlsx file in the sub-directory: ./Outputs/
## Preparing data for output...
## List output succesfully generated.
##
##
## ******************* END of - ABO *******************
##
##
## ******************* Start of - DUO *******************
## renaming the first element in assays to 'counts'
## Warning in DESeq2::DESeqDataSet(se_data0, design = DESeq2_formula_design): some
## variables in design formula are characters, converting to factors
## Beginning DESeq analysis...
## estimating size factors
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## final dispersion estimates
## fitting model and testing
## Completed DESeq analysis.
## Plotting cooks distance...
## Cooks distance plot complete.
## Generating pairwise DESeq2 results...
## [1] "Contrast levels: LR, MAL, LAL"
##
##
## log2 fold change (MLE): Treatment LR vs MAL
##
## out of 18958 with nonzero total read count
## adjusted p-value < 0.05
## LFC > 0 (up) : 466, 2.5%
## LFC < 0 (down) : 323, 1.7%
## outliers [1] : 59, 0.31%
## [1] see 'cooksCutoff' argument of ?results
## see metadata(res)$ihwResult on hypothesis weighting
##
##
## log2 fold change (MLE): Treatment LR vs LAL
##
## out of 18958 with nonzero total read count
## adjusted p-value < 0.05
## LFC > 0 (up) : 36, 0.19%
## LFC < 0 (down) : 39, 0.21%
## outliers [1] : 59, 0.31%
## [1] see 'cooksCutoff' argument of ?results
## see metadata(res)$ihwResult on hypothesis weighting
##
##
## log2 fold change (MLE): Treatment MAL vs LAL
##
## out of 18958 with nonzero total read count
## adjusted p-value < 0.05
## LFC > 0 (up) : 64, 0.34%
## LFC < 0 (down) : 146, 0.77%
## outliers [1] : 59, 0.31%
## [1] see 'cooksCutoff' argument of ?results
## see metadata(res)$ihwResult on hypothesis weighting
## Results generated.
## Plotting MA plots (default DESeq2 style for QC)...
## Finished MA plots.
## Plotting p value histograms (for QC)...
## Finished plotting p value histograms.
## ./Outputs/ Directory exists
## Normalised tables exported to the sub-directory: ./Outputs/
## PIF calculations - Preparing normalised data...
## PIF calculations - Preparing normalised data...COMPLETE
## PIF calculations - Calculating PIF data...
## PIF_LRvsMAL
## PIF_LRvsLAL
## PIF_MALvsLAL
## PIF calculations - Calculating PIF data...COMPLETE
## ./Outputs/ Directory exists
## PIF tables exported to an .xlsx file in the sub-directory: ./Outputs/
## PIF calculations - END
## ./Outputs/ Directory exists
## DE tables exported to an .xlsx file in the sub-directory: ./Outputs/
## Preparing data for output...
## List output succesfully generated.
##
##
## ******************* END of - DUO *******************
##
##
## ******************* Start of - RUM *******************
## renaming the first element in assays to 'counts'
## Warning in DESeq2::DESeqDataSet(se_data0, design = DESeq2_formula_design): some
## variables in design formula are characters, converting to factors
## Beginning DESeq analysis...
## estimating size factors
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## final dispersion estimates
## fitting model and testing
## Completed DESeq analysis.
## Plotting cooks distance...
## Cooks distance plot complete.
## Generating pairwise DESeq2 results...
## [1] "Contrast levels: LR, MAL, LAL"
##
##
## log2 fold change (MLE): Treatment LR vs MAL
##
## out of 17655 with nonzero total read count
## adjusted p-value < 0.05
## LFC > 0 (up) : 207, 1.2%
## LFC < 0 (down) : 280, 1.6%
## outliers [1] : 139, 0.79%
## [1] see 'cooksCutoff' argument of ?results
## see metadata(res)$ihwResult on hypothesis weighting
##
##
## log2 fold change (MLE): Treatment LR vs LAL
##
## out of 17655 with nonzero total read count
## adjusted p-value < 0.05
## LFC > 0 (up) : 25, 0.14%
## LFC < 0 (down) : 18, 0.1%
## outliers [1] : 139, 0.79%
## [1] see 'cooksCutoff' argument of ?results
## see metadata(res)$ihwResult on hypothesis weighting
##
##
## log2 fold change (MLE): Treatment MAL vs LAL
##
## out of 17655 with nonzero total read count
## adjusted p-value < 0.05
## LFC > 0 (up) : 155, 0.88%
## LFC < 0 (down) : 205, 1.2%
## outliers [1] : 139, 0.79%
## [1] see 'cooksCutoff' argument of ?results
## see metadata(res)$ihwResult on hypothesis weighting
## Results generated.
## Plotting MA plots (default DESeq2 style for QC)...
## Finished MA plots.
## Plotting p value histograms (for QC)...
## Finished plotting p value histograms.
## ./Outputs/ Directory exists
## Normalised tables exported to the sub-directory: ./Outputs/
## PIF calculations - Preparing normalised data...
## PIF calculations - Preparing normalised data...COMPLETE
## PIF calculations - Calculating PIF data...
## PIF_LRvsMAL
## PIF_LRvsLAL
## PIF_MALvsLAL
## PIF calculations - Calculating PIF data...COMPLETE
## ./Outputs/ Directory exists
## PIF tables exported to an .xlsx file in the sub-directory: ./Outputs/
## PIF calculations - END
## ./Outputs/ Directory exists
## DE tables exported to an .xlsx file in the sub-directory: ./Outputs/
## Preparing data for output...
## List output succesfully generated.
##
##
## ******************* END of - RUM *******************
##
##
## ******************* Start of - LIV *******************
## renaming the first element in assays to 'counts'
## Warning in DESeq2::DESeqDataSet(se_data0, design = DESeq2_formula_design): some
## variables in design formula are characters, converting to factors
## Beginning DESeq analysis...
## estimating size factors
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## final dispersion estimates
## fitting model and testing
## Completed DESeq analysis.
## Plotting cooks distance...
## Cooks distance plot complete.
## Generating pairwise DESeq2 results...
## [1] "Contrast levels: LR, MAL, LAL"
##
##
## log2 fold change (MLE): Treatment LR vs MAL
##
## out of 17765 with nonzero total read count
## adjusted p-value < 0.05
## LFC > 0 (up) : 28, 0.16%
## LFC < 0 (down) : 34, 0.19%
## outliers [1] : 43, 0.24%
## [1] see 'cooksCutoff' argument of ?results
## see metadata(res)$ihwResult on hypothesis weighting
##
##
## log2 fold change (MLE): Treatment LR vs LAL
##
## out of 17765 with nonzero total read count
## adjusted p-value < 0.05
## LFC > 0 (up) : 99, 0.56%
## LFC < 0 (down) : 189, 1.1%
## outliers [1] : 43, 0.24%
## [1] see 'cooksCutoff' argument of ?results
## see metadata(res)$ihwResult on hypothesis weighting
##
##
## log2 fold change (MLE): Treatment MAL vs LAL
##
## out of 17765 with nonzero total read count
## adjusted p-value < 0.05
## LFC > 0 (up) : 248, 1.4%
## LFC < 0 (down) : 468, 2.6%
## outliers [1] : 43, 0.24%
## [1] see 'cooksCutoff' argument of ?results
## see metadata(res)$ihwResult on hypothesis weighting
## Results generated.
## Plotting MA plots (default DESeq2 style for QC)...
## Finished MA plots.
## Plotting p value histograms (for QC)...
## Finished plotting p value histograms.
## ./Outputs/ Directory exists
## Normalised tables exported to the sub-directory: ./Outputs/
## PIF calculations - Preparing normalised data...
## PIF calculations - Preparing normalised data...COMPLETE
## PIF calculations - Calculating PIF data...
## PIF_LRvsMAL
## PIF_LRvsLAL
## PIF_MALvsLAL
## PIF calculations - Calculating PIF data...COMPLETE
## ./Outputs/ Directory exists
## PIF tables exported to an .xlsx file in the sub-directory: ./Outputs/
## PIF calculations - END
## ./Outputs/ Directory exists
## DE tables exported to an .xlsx file in the sub-directory: ./Outputs/
## Preparing data for output...
## List output succesfully generated.
##
##
## ******************* END of - LIV *******************
#View sructure of DE_out object
str(DE_out, max.level = 2)
## List of 7
## $ ARC_DESeq2_Output:List of 8
## ..$ dds_wald_object :Formal class 'DESeqDataSet' [package "DESeq2"] with 8 slots
## ..$ boxplot_cooks_distance:List of 9
## .. ..- attr(*, "class")= chr [1:2] "gg" "ggplot"
## ..$ DESeq2_res_object :List of 3
## ..$ pairwise_plots :List of 2
## ..$ overall_plots :List of 3
## ..$ normalised_data :List of 2
## ..$ PIF :List of 3
## ..$ DE_by_PIF_df :List of 3
## $ LHA_DESeq2_Output:List of 8
## ..$ dds_wald_object :Formal class 'DESeqDataSet' [package "DESeq2"] with 8 slots
## ..$ boxplot_cooks_distance:List of 9
## .. ..- attr(*, "class")= chr [1:2] "gg" "ggplot"
## ..$ DESeq2_res_object :List of 3
## ..$ pairwise_plots :List of 2
## ..$ overall_plots :List of 3
## ..$ normalised_data :List of 2
## ..$ PIF :List of 3
## ..$ DE_by_PIF_df :List of 3
## $ VMH_DESeq2_Output:List of 8
## ..$ dds_wald_object :Formal class 'DESeqDataSet' [package "DESeq2"] with 8 slots
## ..$ boxplot_cooks_distance:List of 9
## .. ..- attr(*, "class")= chr [1:2] "gg" "ggplot"
## ..$ DESeq2_res_object :List of 3
## ..$ pairwise_plots :List of 2
## ..$ overall_plots :List of 3
## ..$ normalised_data :List of 2
## ..$ PIF :List of 3
## ..$ DE_by_PIF_df :List of 3
## $ ABO_DESeq2_Output:List of 8
## ..$ dds_wald_object :Formal class 'DESeqDataSet' [package "DESeq2"] with 8 slots
## ..$ boxplot_cooks_distance:List of 9
## .. ..- attr(*, "class")= chr [1:2] "gg" "ggplot"
## ..$ DESeq2_res_object :List of 3
## ..$ pairwise_plots :List of 2
## ..$ overall_plots :List of 3
## ..$ normalised_data :List of 2
## ..$ PIF :List of 3
## ..$ DE_by_PIF_df :List of 3
## $ DUO_DESeq2_Output:List of 8
## ..$ dds_wald_object :Formal class 'DESeqDataSet' [package "DESeq2"] with 8 slots
## ..$ boxplot_cooks_distance:List of 9
## .. ..- attr(*, "class")= chr [1:2] "gg" "ggplot"
## ..$ DESeq2_res_object :List of 3
## ..$ pairwise_plots :List of 2
## ..$ overall_plots :List of 3
## ..$ normalised_data :List of 2
## ..$ PIF :List of 3
## ..$ DE_by_PIF_df :List of 3
## $ RUM_DESeq2_Output:List of 8
## ..$ dds_wald_object :Formal class 'DESeqDataSet' [package "DESeq2"] with 8 slots
## ..$ boxplot_cooks_distance:List of 9
## .. ..- attr(*, "class")= chr [1:2] "gg" "ggplot"
## ..$ DESeq2_res_object :List of 3
## ..$ pairwise_plots :List of 2
## ..$ overall_plots :List of 3
## ..$ normalised_data :List of 2
## ..$ PIF :List of 3
## ..$ DE_by_PIF_df :List of 3
## $ LIV_DESeq2_Output:List of 8
## ..$ dds_wald_object :Formal class 'DESeqDataSet' [package "DESeq2"] with 8 slots
## ..$ boxplot_cooks_distance:List of 9
## .. ..- attr(*, "class")= chr [1:2] "gg" "ggplot"
## ..$ DESeq2_res_object :List of 3
## ..$ pairwise_plots :List of 2
## ..$ overall_plots :List of 3
## ..$ normalised_data :List of 2
## ..$ PIF :List of 3
## ..$ DE_by_PIF_df :List of 3