Series GSE123604
Status Public on Dec 12, 2018
Title A novel computational complete deconvolution method using RNA-seq data
Organism Homo sapiens
Experiment type Expression profiling by high throughput sequencing
Summary The cell type composition of many biological tissues varies widely across samples. Such sample heterogeneity hampers efforts to probe the role of each cell type in the tissue microenvironment. Current approaches that address this issue have drawbacks. Cell sorting and single-cell experimental techniques disrupt in situ interactions and alter the physiological status of cells in tissues. Computational methods are flexible and promising, but they often estimate either sample-specific proportions of each cell type or cell-type-specific gene expression profiles, not both, because they require the other as input. We introduce CDSeq, a computational complete deconvolution method that estimates both sample-specific proportions of each cell type and cell-type-specific gene expression profiles simultaneously using only bulk RNA-seq data. We assessed the method's performance on several synthetic and experimental mixtures of varied but known cell-type composition and compared it with two state-of-the-art deconvolution methods on the same mixtures. The results showed that CDSeq can estimate both sample-specific proportions of each component cell type and cell-type-specific gene expression profiles with high accuracy. CDSeq holds promise for computationally deciphering complex mixtures of cell types, each with differing expression profiles, using RNA-seq data measured in bulk tissue.
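
To make the quantities in the summary concrete, the sketch below illustrates the general mixture assumption behind deconvolution: a bulk expression profile is treated as a proportion-weighted sum of cell-type-specific profiles. This is not the CDSeq algorithm. The function name mix_profiles is hypothetical, and every expression value and mixing fraction is made up for illustration, not taken from GSE123604.

```python
# Illustrative linear mixing model (NOT the CDSeq algorithm):
# bulk expression = sum over cell types of
#   (proportion of that cell type) x (that cell type's expression profile).
# All numbers below are invented for illustration.


def mix_profiles(profiles, proportions):
    """Return the bulk per-gene profile for one mixed sample.

    profiles: dict mapping cell type -> list of per-gene expression values
    proportions: dict mapping cell type -> mixing fraction (must sum to 1)
    """
    total = sum(proportions.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError(f"proportions sum to {total}, expected 1")
    n_genes = len(next(iter(profiles.values())))
    bulk = [0.0] * n_genes
    for cell_type, fraction in proportions.items():
        for gene_index, value in enumerate(profiles[cell_type]):
            bulk[gene_index] += fraction * value
    return bulk


# Hypothetical expression values for three genes in each pure cell line.
profiles = {
    "Namalwa": [10.0, 0.0, 5.0],
    "Hs343T": [0.0, 20.0, 5.0],
    "hTERT-HME1": [4.0, 4.0, 5.0],
    "MCF7": [0.0, 0.0, 45.0],
}
# Hypothetical mixing fractions for one mixed sample.
proportions = {"Namalwa": 0.1, "Hs343T": 0.2, "hTERT-HME1": 0.3, "MCF7": 0.4}

bulk = mix_profiles(profiles, proportions)
print(bulk)
```

A complete deconvolution method works in the opposite direction: given only many such bulk profiles, it estimates both the proportions and the cell-type-specific profiles.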
 
Overall design In brief, total mRNA was prepared from Namalwa (Burkitt's lymphoma), Hs343T (fibroblast line derived from a mammary gland adenocarcinoma), hTERT-HME1 (normal mammary epithelial cells immortalized with hTERT), and MCF7 (estrogen receptor positive breast cancer cell line). mRNA samples were diluted to 100 ng/μl and mixed in different proportions (Supplementary Table 2). Global mRNA abundance of the four pure cell lines and of the mixed RNA samples was profiled by RNA sequencing. Sequencing libraries were prepared using the TruSeq RNA sample preparation kit v2 (Illumina). 75-bp single-end sequencing was performed on the NextSeq sequencer (Illumina). After obtaining the fastq data, we first trimmed adapter sequences with cutadapt (version 1.12), then mapped reads to the genome using STAR (version 020201), and finally used featureCounts (version 1.5.1) to generate raw read counts as the input for our algorithm.
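
The three processing steps named in the overall design (adapter trimming, alignment, read counting) can be sketched as command lines assembled in Python. This is a minimal sketch under stated assumptions, not the submitters' actual invocation: the record does not give the options, adapter sequence, genome index, or annotation that were used. The function name build_pipeline_commands, all file names, and the adapter string are hypothetical placeholders, and only basic, widely documented flags of each tool appear.

```python
import shlex


def build_pipeline_commands(sample, fastq, adapter, genome_dir, gtf, threads=4):
    """Build (but do not run) one command per processing step.

    Every argument is a placeholder chosen by the caller; none of these
    values comes from the GEO record.
    """
    trimmed = f"{sample}.trimmed.fastq"
    star_prefix = f"{sample}."
    # With default output settings, STAR writes <prefix>Aligned.out.sam.
    aligned = f"{star_prefix}Aligned.out.sam"
    counts = f"{sample}.counts.txt"
    return [
        # Step 1: trim a 3' adapter from single-end reads.
        ["cutadapt", "-a", adapter, "-o", trimmed, fastq],
        # Step 2: map the trimmed reads against a prebuilt STAR genome index.
        ["STAR", "--runThreadN", str(threads), "--genomeDir", genome_dir,
         "--readFilesIn", trimmed, "--outFileNamePrefix", star_prefix],
        # Step 3: count reads per gene using a GTF annotation.
        ["featureCounts", "-T", str(threads), "-a", gtf, "-o", counts, aligned],
    ]


commands = build_pipeline_commands(
    sample="sample1",                # hypothetical sample name
    fastq="sample1.fastq",           # hypothetical input file
    adapter="AGATCGGAAGAGC",         # placeholder adapter sequence
    genome_dir="star_index",         # hypothetical STAR index directory
    gtf="genes.gtf",                 # hypothetical annotation file
)
for command in commands:
    print(shlex.join(command))
```

Building each command as an argument list keeps it safe to pass to subprocess.run without shell quoting problems; printing first makes it easy to review the commands before running anything.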
 
Contributor(s) Kang K, Meng Q, Shats I, Umbach D, Li M, Li Y, Li X, Li L
Citation(s) 31790389
Submission date Dec 11, 2018
Last update date May 13, 2021
Contact name Kai Kang
E-mail(s) [email protected]
Organization name MIT
Department CSAIL
Street address 32 Vassar Street
City Cambridge
State/province MA
ZIP/Postal code 02139
Country USA
 
Platforms (1)
GPL18573 Illumina NextSeq 500 (Homo sapiens)
Samples (40)
GSM3507820 Tumor-MCF7_pure_1
GSM3507821 CAFs-Hs_343.T_pure_1
GSM3507822 Normal_breast-hMECs-hTERT_pure_1
Relations
BioProject PRJNA509361
SRA SRP173265

Download family Format
SOFT formatted family file(s) SOFT
MINiML formatted family file(s) MINiML
Series Matrix File(s) TXT

Supplementary file Size Download File type/resource
GSE123604_RAW.tar 3.3 Gb (http)(custom) TAR (of BW)
GSE123604_all_samples_per_gene_counts.txt.gz 1.8 Mb (ftp)(http) TXT
SRA Run Selector
Raw data are available in SRA
Processed data provided as supplementary file
Processed data are available on Series record
