GEO Accession viewer

Sample GSM4586169

Status

Public on Aug 24, 2020

Title

scRCAT-seq_single_hESC_13

Sample type

SRA

Source name

hESC

Organism

Homo sapiens

Characteristics

strain: H9
tissue: hESC

Extracted molecule

polyA RNA

Extraction protocol

DRG, oocytes, hESC, HEK293T, ARPE, mESC, and hESC-derived organoids were dissected and dissociated into single cells. hESC, HEK293T, ARPE, mESC, and hESC-derived organoid cells were each dissociated separately and then mixed together.
The scRCAT-seq library was prepared according to the scRCAT-seq protocol.
The Smart-seq2 library was prepared according to the Smart-seq2 protocol.
The Iso-Seq library was prepared according to the Isoform Sequencing protocol.

Library strategy

RNA-Seq

Library source

transcriptomic

Library selection

cDNA

Instrument model

HiSeq X Ten

Description

applying single cell method to hESC
hESCsccatUMI_5.bed.gz, hESCsccatUMI_3.bed.gz

Data processing

Illumina CASAVA 1.7 software was used for basecalling.
5' and 3' sequences were extracted from paired-end sequences using Perl, Python and shell scripts.
5' and 3' sequences were aligned to mm10 using STAR (2.6.1a) with parameters (--outFilterMultimapNmax 1 --outFilterScoreMinOverLread 0.6 --outFilterMatchNminOverLread 0.6).
Peaks were called with CAGEr (v 1.24.0).
For Smart-seq2 data, sequenced reads were trimmed for adaptor sequence using cutadapt (v 1.18) and aligned to mm10 using STAR with the parameters described above.
For Iso-seq data, sequence data were processed using the SMRT Link 5.0 software. Circular consensus sequences (CCS) were generated from subread BAM files with parameters: min_length 200, max_drop_fraction 0.8, no_polish TRUE, min_zscore -9999, min_passes 1, min_predicted_accuracy 0.8, max_length 18000. CCS BAM files were classified into full-length and non-full-length reads using the pbclassify.py script with parameters: ignore polyA false, minSeq Length 200. The resulting non-full-length and full-length FASTA files were then fed into the cluster step, which performs isoform-level clustering (ICE), followed by final Arrow polishing with parameters: hq_quiver_min_accuracy 0.99, bin_by_primer false, bin_size_kb 1, qv_trim_5p 100, qv_trim_3p 30.
Additional nucleotide errors in consensus reads were corrected using the Illumina RNA-seq data with the software LoRDEC.
Consensus reads were aligned to the reference genome using GMAP with parameters --no-chimeras --cross-species --expand-offsets 1 -B 5 -K 50000 -f samse -n 1.
Gene structure analysis was performed using the TAPIS pipeline.
Genome_build: hg38 and mm10
Supplementary_files_format_and_content: [scRCAT-seq] *.bed files report the TSS/TES positions.
Supplementary_files_format_and_content: [Iso-seq] *.gtf files report the isoforms detected by Iso-seq.
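
The STAR alignment step described above can be sketched as a small Python helper that assembles the command line. This is an illustration only, not the submitters' script: the three filter options and their values are taken from the record, while the function name `build_star_command`, the index directory, the FASTQ name and the output prefix are hypothetical placeholders.

```python
import shlex


def build_star_command(genome_dir, reads, out_prefix):
    """Return a STAR argument list using the filters quoted in the record.

    genome_dir, reads and out_prefix are placeholders supplied by the caller;
    only the three --outFilter* values come from the record itself.
    """
    return [
        "STAR",
        "--genomeDir", genome_dir,
        "--readFilesIn", *reads,
        # Keep only uniquely mapped reads.
        "--outFilterMultimapNmax", "1",
        # Require score and matched bases to cover at least 60% of read length.
        "--outFilterScoreMinOverLread", "0.6",
        "--outFilterMatchNminOverLread", "0.6",
        "--outFileNamePrefix", out_prefix,
    ]


# Hypothetical paths, for illustration only.
cmd = build_star_command("star_index/", ["sample_5prime.fastq"], "sample_5prime.")
print(shlex.join(cmd))
```

Building the command as a list rather than a single string keeps it safe to pass to `subprocess.run` without shell quoting issues; `shlex.join` is used here only to display it.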

Submission date

Jun 01, 2020

Last update date

Aug 29, 2020

Contact name

Jiawei Zhong

Organization name

Karolinska Institutet

Department

Department of Medicine, Huddinge

Lab

Mikael Rydén & Niklas Mejhert lab

Street address

Blickagången 16, Flemingsberg

City

Stockholm

ZIP/Postal code

14182

Country

Sweden

Platform ID

GPL20795

Series (1)

GSE134311

scRCAT-seq: simultaneously profile RNA TSS and TES at single-cell level

Relations

BioSample

SAMN15077061

SRA

SRX8450069

Supplementary file	Size	Download	File type/resource
GSM4586169_hESC_8N_H15_3.bed.gz	576.2 Kb	(ftp)(http)	BED
GSM4586169_hESC_8N_H15_5.bed.gz	148.4 Kb	(ftp)(http)	BED
SRA Run Selector
Raw data are available in SRA
Processed data provided as supplementary file
Processed data are available on Series record
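
The supplementary BED files listed above can be read with a few lines of Python once downloaded. The sketch below assumes only what the BED format guarantees, namely that the first three tab-separated columns are chromosome, start and end; the record does not document any further columns, so they are ignored here. The names `BedInterval` and `read_bed` are hypothetical.

```python
import csv
import gzip
from typing import Iterator, NamedTuple


class BedInterval(NamedTuple):
    chrom: str
    start: int  # 0-based, inclusive (BED convention)
    end: int    # exclusive (BED convention)


def read_bed(path: str) -> Iterator[BedInterval]:
    """Yield the first three columns of each record in a gzipped BED file."""
    with gzip.open(path, "rt", newline="") as handle:
        for row in csv.reader(handle, delimiter="\t"):
            # Skip blank lines and optional header lines.
            if not row or row[0].startswith(("#", "track", "browser")):
                continue
            yield BedInterval(row[0], int(row[1]), int(row[2]))
```

For example, `list(read_bed("GSM4586169_hESC_8N_H15_5.bed.gz"))` would load the intervals from the 5' file, assuming it has been downloaded to the working directory.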