Chapter 9 Whole dataset normalisation

After the DE analysis is complete, it is important to normalise the data across the entire data set for downstream co-expression analysis (PCIT) and other analyses.

This uses the same auto_generate_DE_results() function but with whole_dataset_normalisation = TRUE. The example below shows the minimum information the function needs to run.

full_norm <- 
  auto_generate_DE_results(se_data = seq_data, 
                           top_level_colname = Tissue_Region, 
                           sample_colname = sample_names,
                           samples_to_remove = NA,
                           DESeq2_formula_design = ~Treatment,
                           gene_annotations = gene_annot,
                           export_tables = TRUE,
                           export_dir = "./outputs/",
                           whole_data_normalisation = TRUE)
## Whole data normalisation selected. No pairwise results will be generated.
## renaming the first element in assays to 'counts'
## Warning in DESeq2::DESeqDataSet(se_data0, design = DESeq2_formula_design): some
## variables in design formula are characters, converting to factors
## Beginning DESeq analysis...
## estimating size factors
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## final dispersion estimates
## fitting model and testing
## -- replacing outliers and refitting for 398 genes
## -- DESeq argument 'minReplicatesForReplace' = 6 
## -- original counts are preserved in counts(dds)
## estimating dispersions
## fitting model and testing
## Completed DESeq analysis.
## Plotting cooks distance...
## Cooks distance plot complete.
## ./outputs/ Directory exists
## Normalised tables exported to the sub-directory: ./outputs/

## Preparing data for output...
## List output succesfully generated.
## 
## 
##  ******************* END *******************

9.1 View Cooks distance boxplot for whole dataset

boxplots2 <- 
  GET_boxplot_cooks_distance(auto_DE_output = full_norm)
## $Whole_data_normalisation_output