DiffBind Analysis

Using DBA_DESEQ2 as the differential analysis method

Sample sheet used for the analysis

Sample <chr>	Factor <chr>	bamReads <chr>	PeakCaller <chr>
LK_A1	atac	alignment/LK_A1/atac/LK_A1.atac.sorted.dup.filtered.bam	macs
LK_A2	atac	alignment/LK_A2/atac/LK_A2.atac.sorted.dup.filtered.bam	macs
LK_B1	atac	alignment/LK_B1/atac/LK_B1.atac.sorted.dup.filtered.bam	macs
LK_B2	atac	alignment/LK_B2/atac/LK_B2.atac.sorted.dup.filtered.bam	macs

The table below shows information about the samples and their MACS2 peak files: the number of peaks in each peakset, the total number of unique peaks after merging overlapping ones (in the first line), and the dimensions of the default binding matrix.

## 4 Samples, 5581 sites in matrix (7213 total):
##           ID Factor Condition Intervals
## 1 LK_A1_atac   atac         1     40264
## 2 LK_A2_atac   atac         1     41439
## 3 LK_B1_atac   atac         2     32443
## 4 LK_B2_atac   atac         2     30751

Using the data from the peak calls, a correlation heatmap can be generated which gives an initial clustering of the samples using the cross-correlations of each row of the binding matrix.

Fig 1: Correlation heatmap, using occupancy (peak caller score) data
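
As a rough illustration of what such a heatmap encodes, the sketch below computes pairwise Pearson correlations between per-sample score vectors. The sample names `S1` to `S3` and their scores are made up for illustration and are not taken from this report. The choice of Pearson correlation is also an assumption, so this is not necessarily how DiffBind computes its heatmap.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length score vectors."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical toy data: one list per sample, one score per consensus site.
# These values are invented for illustration only.
toy_scores = {
    "S1": [5.0, 0.0, 3.0, 8.0, 1.0],
    "S2": [4.0, 1.0, 3.5, 7.0, 0.0],
    "S3": [0.0, 6.0, 1.0, 2.0, 7.0],
}

samples = list(toy_scores)

# Sample-by-sample correlation matrix, stored as a dict keyed by sample pair.
corr = {
    (a, b): pearson(toy_scores[a], toy_scores[b])
    for a in samples
    for b in samples
}

# Print the matrix; samples with similar score profiles correlate highly.
for a in samples:
    print(a, " ".join(f"{corr[(a, b)]:6.2f}" for b in samples))
```

In this toy data, S1 and S2 have similar profiles and would cluster together, while S3 would fall in a separate branch.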

The next step is to calculate a binding matrix with scores based on read counts for every sample (affinity scores), rather than confidence scores for only those peaks called in a specific sample (occupancy scores). Two additional columns are added to the table above in this step. The first shows the total number of aligned reads for each sample (the “Full” library sizes). The second is labeled FRiP, which stands for Fraction of Reads in Peaks. This is the proportion of reads for that sample that overlap a peak in the consensus peak set, and can be used to indicate which samples show more enrichment overall.

## 4 Samples, 5581 sites in matrix:
##           ID Factor Condition   Reads FRiP
## 1 LK_A1_atac   atac         1 4547370 0.20
## 2 LK_A2_atac   atac         1 5115440 0.19
## 3 LK_B1_atac   atac         2 6728834 0.23
## 4 LK_B2_atac   atac         2 6777428 0.24

For each sample, multiplying the value in the Reads column by the corresponding FRiP value yields the number of reads that overlap a consensus peak.

LibReads <dbl>	FRiP <dbl>	PeakReads <dbl>
4547370	0.20	909474
5115440	0.19	971934
6728834	0.23	1547632
6777428	0.24	1626583
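
The PeakReads column can be reproduced with a short calculation. The sketch below uses the Reads and FRiP values printed above; the variable names are illustrative. Because the displayed FRiP values are rounded to two decimals, the products only approximate the true number of reads in peaks.

```python
# (sample ID, total aligned reads, FRiP) as printed in the report tables.
rows = [
    ("LK_A1_atac", 4547370, 0.20),
    ("LK_A2_atac", 5115440, 0.19),
    ("LK_B1_atac", 6728834, 0.23),
    ("LK_B2_atac", 6777428, 0.24),
]

# PeakReads = Reads * FRiP, rounded to the nearest whole read.
peak_reads = {sample: round(reads * frip) for sample, reads, frip in rows}

for sample, count in peak_reads.items():
    print(f"{sample}\t{count}")
```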

New correlation heatmap based on the count scores

Fig 2.1: Correlation heatmap, using affinity (read count) data

Fig 2.2: PCA plot using affinity (read count) data for all sites

Fig 3.1: Correlation heatmap, using only significantly differentially bound sites

Fig 3.2: PCA plot using affinity data for only differentially bound sites

Fig 4: Venn diagram of Gain vs Loss differentially bound sites

Fig 5: MA plot of Resistant-Responsive contrast. Sites identified as significantly differentially bound shown in red

Fig 6: Volcano plot of Resistant-Responsive contrast. Sites identified as significantly differentially bound shown in red

The table below displays the first 6 rows of the differential analysis results using DBA_DESEQ2

	seqnames <fct>	start <int>	end <int>	width <int>	strand <fct>	Conc <dbl>	Conc_2 <dbl>	Conc_1 <dbl>	Fold <dbl>
3115	chr19	33332210	33332610	401	*	9.507697	10.316706	7.496067	2.801036
280	chr19	8640229	8640629	401	*	10.229897	10.951458	8.719586	2.223765
54	chr19	296462	296862	401	*	10.032127	10.743355	8.569387	2.164132
201	chr19	6590766	6591166	401	*	9.722901	8.119049	10.463614	-2.328558
3109	chr19	33302426	33302826	401	*	11.843420	12.291582	11.189838	1.097924
716	chr19	14661429	14661829	401	*	9.630914	10.281728	8.413142	1.856010
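
The sign of the Fold column gives the direction of change at each site. The sketch below splits the six rows shown above by that sign. In these rows a positive Fold coincides with Conc_2 being larger than Conc_1, and the code checks this. Which direction the report labels "Gain" and which "Loss" depends on how the contrast was set up, so the sketch uses the neutral names `higher_in_2` and `higher_in_1`; those names are illustrative assumptions.

```python
# (seqname, start, end, Conc_2, Conc_1, Fold) copied from the rows above.
rows = [
    ("chr19", 33332210, 33332610, 10.316706, 7.496067, 2.801036),
    ("chr19", 8640229, 8640629, 10.951458, 8.719586, 2.223765),
    ("chr19", 296462, 296862, 10.743355, 8.569387, 2.164132),
    ("chr19", 6590766, 6591166, 8.119049, 10.463614, -2.328558),
    ("chr19", 33302426, 33302826, 12.291582, 11.189838, 1.097924),
    ("chr19", 14661429, 14661829, 10.281728, 8.413142, 1.856010),
]

# Split sites by the sign of the reported fold change.
higher_in_2 = [r for r in rows if r[5] > 0]
higher_in_1 = [r for r in rows if r[5] < 0]

# Sanity check: the sign of Fold agrees with the sign of Conc_2 - Conc_1.
sign_consistent = all((r[3] - r[4] > 0) == (r[5] > 0) for r in rows)

print(f"positive Fold: {len(higher_in_2)} sites")
print(f"negative Fold: {len(higher_in_1)} sites")
print(f"sign matches Conc_2 - Conc_1: {sign_consistent}")
```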

## [1] "complted differential binding analysis"

The report for differential binding analysis comparing A_vs_B

This automated report was generated by the C3G GenPipes pipeline on 2022-07-17
