DiffBind Analysis

Using DBA_DESEQ2 as the differential analysis method

Sample sheet used for the analysis

Sample  Factor  bamReads                                                 PeakCaller
LK_A1   atac    alignment/LK_A1/atac/LK_A1.atac.sorted.dup.filtered.bam  macs
LK_A2   atac    alignment/LK_A2/atac/LK_A2.atac.sorted.dup.filtered.bam  macs
LK_B1   atac    alignment/LK_B1/atac/LK_B1.atac.sorted.dup.filtered.bam  macs
LK_B2   atac    alignment/LK_B2/atac/LK_B2.atac.sorted.dup.filtered.bam  macs

The table below summarizes the samples and their MACS2 peak files: the number of peaks in each peakset (the Intervals column), the total number of unique peaks after merging overlapping ones (in the first line), and the dimensions of the default binding matrix.

## 4 Samples, 5581 sites in matrix (7213 total):
##           ID Factor Condition Intervals
## 1 LK_A1_atac   atac         1     40264
## 2 LK_A2_atac   atac         1     41439
## 3 LK_B1_atac   atac         2     32443
## 4 LK_B2_atac   atac         2     30751
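
The idea of merging overlapping peaks across samples can be illustrated with a minimal sketch. The intervals and sample names below are hypothetical toy values, not data from this report, and the function `merge_overlapping` is an illustrative helper; DiffBind's own merging and site filtering are more involved than this.

```python
# Toy peak intervals (start, end) on one chromosome.
# Hypothetical values, not taken from the MACS2 peak files in this report.
sample_peaks = {
    "sample_1": [(100, 200), (500, 650)],
    "sample_2": [(150, 260), (900, 1000)],
}


def merge_overlapping(intervals):
    """Merge intervals that overlap into single, non-overlapping intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Overlaps the previous merged interval: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


# Pool the peaks from every sample, then merge the overlapping ones.
all_peaks = [p for peaks in sample_peaks.values() for p in peaks]
merged_peaks = merge_overlapping(all_peaks)
print(len(all_peaks), "peaks before merging,", len(merged_peaks), "after")
```

In this toy case the first peak of each sample overlaps, so the two are counted as one unique site after merging.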

Using the data from the peak calls, a correlation heatmap can be generated which gives an initial clustering of the samples using the cross-correlations of each row of the binding matrix.

Fig 1: Correlation heatmap, using occupancy (peak caller score) data
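
A heatmap of this kind is built from pairwise correlations between the samples' score vectors. The sketch below assumes Pearson correlation and uses hypothetical scores and sample names; it is meant only to show the shape of the computation, not to reproduce the figure.

```python
from math import sqrt

# Hypothetical occupancy scores: one list per sample, one entry per site.
scores = {
    "sample_A1": [5.0, 3.0, 0.0, 8.0],
    "sample_A2": [6.0, 2.0, 1.0, 9.0],
    "sample_B1": [1.0, 7.0, 6.0, 2.0],
}


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Correlation of every sample against every other sample.
names = list(scores)
corr = {(a, b): pearson(scores[a], scores[b]) for a in names for b in names}
for a in names:
    print(a, [round(corr[(a, b)], 2) for b in names])
```

Samples whose scores rise and fall together across sites get a correlation close to 1 and end up next to each other when the matrix is clustered.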

The next step is to calculate a binding matrix with scores based on read counts for every sample (affinity scores), rather than confidence scores for only those peaks called in a specific sample (occupancy scores). Two additional columns are added to the table above in this step. The first shows the total number of aligned reads for each sample (the "Full" library sizes). The second is labeled FRiP, which stands for Fraction of Reads in Peaks. This is the proportion of reads for that sample that overlap a peak in the consensus peak set, and can be used to indicate which samples show more enrichment overall.

## 4 Samples, 5581 sites in matrix:
##           ID Factor Condition   Reads FRiP
## 1 LK_A1_atac   atac         1 4547370 0.20
## 2 LK_A2_atac   atac         1 5115440 0.19
## 3 LK_B1_atac   atac         2 6728834 0.23
## 4 LK_B2_atac   atac         2 6777428 0.24
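
As a rough illustration of the FRiP definition, the sketch below counts reads that fall inside a consensus peak and divides by the total number of reads. The peak intervals and read positions are hypothetical, and each read is reduced to a single position for simplicity.

```python
# Hypothetical consensus peaks and read positions on a single chromosome.
consensus_peaks = [(100, 260), (500, 650)]
read_positions = [50, 120, 180, 300, 510, 640, 700, 800, 255, 900]


def in_any_peak(pos, peaks):
    """True if the position lies inside at least one peak interval."""
    return any(start <= pos <= end for start, end in peaks)


reads_in_peaks = sum(in_any_peak(pos, consensus_peaks) for pos in read_positions)
frip = reads_in_peaks / len(read_positions)
print(f"{reads_in_peaks} of {len(read_positions)} reads in peaks, FRiP = {frip:.2f}")
```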

For each sample, multiplying the value in the Reads column by the corresponding FRiP value yields the number of reads that overlap a consensus peak.

LibReads  FRiP  PeakReads
 4547370  0.20     909474
 5115440  0.19     971934
 6728834  0.23    1547632
 6777428  0.24    1626583
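
The arithmetic can be checked with a short sketch using the Reads and FRiP values printed above. Because FRiP is shown rounded to two decimals, the products are approximations of the true overlap counts; rounding to the nearest integer is an assumption made here.

```python
# (sample ID, Reads, FRiP) as printed in the count summary of this report.
samples = [
    ("LK_A1_atac", 4547370, 0.20),
    ("LK_A2_atac", 5115440, 0.19),
    ("LK_B1_atac", 6728834, 0.23),
    ("LK_B2_atac", 6777428, 0.24),
]

# Reads overlapping a consensus peak = library size x fraction of reads in peaks.
peak_reads = {sid: round(reads * frip) for sid, reads, frip in samples}
for sid, n in peak_reads.items():
    print(sid, n)
```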

A new correlation heatmap can then be generated from the count scores.

Fig 2.1: Correlation heatmap, using affinity (read count) data

Fig 2.2: PCA plot using affinity (read count) data for all sites

Fig 3.1: Correlation heatmap, using only significantly differentially bound sites

Fig 3.2: PCA plot using affinity data for only differentially bound sites

Fig 4: Venn diagram of Gain vs Loss differentially bound sites

Fig 5: MA plot of Resistant-Responsive contrast. Sites identified as significantly differentially bound shown in red

Fig 6: Volcano plot of Resistant-Responsive contrast. Sites identified as significantly differentially bound shown in red

The table below displays the first 6 rows of the differential analysis results using DBA_DESEQ2.

      seqnames     start       end  width  strand       Conc     Conc_2     Conc_1       Fold
3115  chr19     33332210  33332610    401  *        9.507697  10.316706   7.496067   2.801036
280   chr19      8640229   8640629    401  *       10.229897  10.951458   8.719586   2.223765
54    chr19       296462    296862    401  *       10.032127  10.743355   8.569387   2.164132
201   chr19      6590766   6591166    401  *        9.722901   8.119049  10.463614  -2.328558
3109  chr19     33302426  33302826    401  *       11.843420  12.291582  11.189838   1.097924
716   chr19     14661429  14661829    401  *        9.630914  10.281728   8.413142   1.856010
## [1] "complted differential binding analysis"
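
One way to read the results table is to split the sites by the sign of Fold. The sketch below uses the six rows shown above. Treating positive Fold as the group with higher Conc_2 is an observation from these rows rather than a documented rule, and the variable names are illustrative. Note that Fold is close to, but not identical to, Conc_2 minus Conc_1 in these rows.

```python
# (row ID, Conc_2, Conc_1, Fold) for the six rows shown in the results table.
rows = [
    (3115, 10.316706, 7.496067, 2.801036),
    (280, 10.951458, 8.719586, 2.223765),
    (54, 10.743355, 8.569387, 2.164132),
    (201, 8.119049, 10.463614, -2.328558),
    (3109, 12.291582, 11.189838, 1.097924),
    (716, 10.281728, 8.413142, 1.856010),
]

# Split the sites by the direction of the fold change.
higher_in_2 = [rid for rid, c2, c1, fold in rows if fold > 0]
higher_in_1 = [rid for rid, c2, c1, fold in rows if fold < 0]

# Check that the sign of Fold agrees with the sign of Conc_2 - Conc_1.
sign_agrees = all((c2 - c1 > 0) == (fold > 0) for _, c2, c1, fold in rows)

print("Higher in condition 2:", higher_in_2)
print("Higher in condition 1:", higher_in_1)
print("Fold sign matches Conc_2 - Conc_1:", sign_agrees)
```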