From: Transcriptomic but not genomic variability confers phenotype of breast cancer stem cells

Identification and investigation of potential breast cancer stem cell (BCSC)-associated mutation hotspots. a Ascending trend in the percentage of the aldehyde dehydrogenase (ALDH)-positive cell population across samples from the breast cancer cell line MDA-MB-231. b The invasion ability of enriched spheres was analyzed by transwell invasion assay. ***P < 0.001, two-tailed Student’s t-test. Error bars represent mean ± standard deviation (SD). c Expression levels of cancer stem cell markers [nanog homeobox (NANOG) and SRY (sex determining region Y)-box 2 (SOX2)] were assessed by real-time quantitative PCR in both enriched spheres (SP) and monolayer parental cells (2D). ***P < 0.001, two-tailed Student’s t-test. Error bars represent mean ± SD. d Two-dimensional histogram plots, generated with the R package “plotly”, compare the variant allele frequency (VAF) between each pair of samples. The VAF of most single nucleotide variant (SNV) sites across the whole genome is similar between samples. e One hotspot region on chromosome 7, highlighted with a yellow bar, is shown as an example. First, potential SNV sites were ordered from the first to the last variant on chromosome 7 and colored according to P values. The distance between each mutation and the one prior to it (the inter-SNV distance) is plotted on the vertical axis (rainfall plot). P values were calculated from an exponential distribution. Additionally, the number of potential SNV sites in each bin was visualized in the University of California Santa Cruz Genome Browser (GB), with the whole chromosome divided into 10,000 equal bins. Next, the hotspots in parental cells (2D) and in derived spheres of the fourth generation (SP4) were displayed in GB using a sliding window approach, in which the window was shifted one base at a time along the chromosome from start to end and the SNV density and VAF level were calculated in each 1000 bp window.
f Targeted deep DNA sequencing comparing VAF between each pair of samples revealed no difference from 2D to SP4 (left and middle). R² was determined by regression analysis. Cor denotes the Pearson correlation coefficient. The dotted line represents the diagonal. Sanger sequencing validated part of the targeted deep DNA sequencing results (right).
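The quantities described for panel e (inter-SNV distance, an exponential-distribution P value, and SNV density in a 1000 bp window) can be illustrated with a minimal Python sketch. The caption does not give the exact formula, so the P value here is an assumption: the lower-tail probability of an exponential distribution whose rate is the chromosome-wide SNV density. All function and parameter names are hypothetical and are not taken from the authors' pipeline.

```python
import math
from bisect import bisect_left


def inter_snv_distances(positions):
    """Distance from each SNV to the one before it, after sorting by position."""
    pos = sorted(positions)
    return [b - a for a, b in zip(pos, pos[1:])]


def exponential_p_values(distances, chrom_length, n_snvs):
    """Assumed model: P(D <= d) = 1 - exp(-rate * d), with
    rate = n_snvs / chrom_length (the average SNV density).
    Small values flag SNVs that sit unusually close to their predecessor."""
    rate = n_snvs / chrom_length
    return [1.0 - math.exp(-rate * d) for d in distances]


def snvs_in_window(positions, start, window=1000):
    """Count SNVs in the half-open window [start, start + window)."""
    pos = sorted(positions)
    return bisect_left(pos, start + window) - bisect_left(pos, start)


def sliding_window_density(positions, chrom_length, window=1000, step=1):
    """Yield (window_start, SNV count) while shifting the window by `step` bases.
    step=1 mirrors the one-base shift described in the caption."""
    pos = sorted(positions)
    for start in range(0, max(chrom_length - window, 0) + 1, step):
        count = bisect_left(pos, start + window) - bisect_left(pos, start)
        yield start, count
```

Calling `inter_snv_distances` on the sorted SNV positions of one chromosome gives the vertical-axis values of a rainfall plot. A one-base step over a whole chromosome is slow in pure Python, so a larger `step` or a vectorized implementation would be a practical substitute for exploratory use.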

