GEO Accession viewer

Sample GSM5031743

Status

Public on Jan 31, 2021

Title

Mix2_2

Sample type

SRA

Source name

Myeloma cell lines (1200 cells)

Organism

Homo sapiens

Characteristics

cell type: Myeloma cell lines
sample: sample6
cell_number: 1200

Extracted molecule

polyA RNA

Extraction protocol

Samples were processed using the Drop-seq DolomiteBio Nadia encapsulator system.
For nanopore sequencing, cDNA was amplified with 25 SMART PCR reactions and sequencing libraries were prepared using the Oxford Nanopore LSK-109 library preparation kit.

Library strategy

RNA-Seq

Library source

transcriptomic

Library selection

cDNA

Instrument model

PromethION

Description

Myeloma cells

Data processing

We basecalled the raw fast5 data with Guppy (v) from Oxford Nanopore Technologies in GPU mode (guppy_basecaller --compress-fastq -c dna_r9.4.1_450bps_hac.cfg -x "cuda:1") on a GTX 1080 Ti graphics card.
For each read we identified the barcode and UMI sequence by searching for the polyA region and the flanking regions before and after the barcode/UMI. Accurately sequenced barcodes were identified based on their dual nucleotide complementarity. These unambiguous barcodes were then used as a guide to error correct the ambiguous barcodes in a second-pass correction: we performed fuzzy searching using a Levenshtein distance of 4 (unless otherwise stated in the figure legend) and replaced each ambiguous barcode with the matching unambiguous sequence.
A whitelist of barcodes was then generated using UMI-tools whitelist (umi_tools whitelist --bc-pattern=CCCCCCCCCCCCCCCCCCCCCCCCNNNNNNNNNNNNNNNN --set-cell-number=1000) [3]. This whitelist was used to assess the ratio of cells to read counts and as an input for UMI-tools extract. Next, the barcode and UMI sequence of each read were extracted and placed within the read2 header using UMI-tools extract (umi_tools extract --bc-pattern=CCCCCCCCCCCCCCCCCCCCCCCCNNNNNNNNNNNNNNNN --whitelist=whitelist.txt).
Reads were then aligned to the reference transcriptome for human hg38 and mouse mm10 using minimap2 [10] (-ax splice -uf --MD --sam-hit-only --junc-bed). The resulting sam file was converted to a bam file, then sorted and indexed using samtools [11]. The transcript name was then added as an XT tag within the bam file using pysam.
Finally, UMI-tools count (umi_tools count --per-gene --gene-tag=XT --per-cell --double-barcode) was used to count features per cell, and the output was converted to Matrix Market format. We modified UMI-tools count to handle the double nucleotide UMIs as defined below.
This count matrix was then used as an input into the standard Seurat pipeline.
Genome_build: hg38
Supplementary_files_format_and_content: mtx
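
The two-pass barcode correction described above can be illustrated with a minimal Python sketch. It is not the submitters' code. It assumes that "dual nucleotide complementarity" means every base is written twice in a row (for example AACCGG), so a barcode counts as unambiguous when each consecutive pair of bases is identical. The function names is_unambiguous, levenshtein and correct_barcode are hypothetical, and rejecting ties between equally close candidates is a design choice of this sketch rather than something the record states.

```python
from typing import Iterable, Optional


def is_unambiguous(barcode: str) -> bool:
    """Return True if every consecutive pair of bases is identical (e.g. 'AACCGG').

    Assumption: this is how the dual nucleotide check is interpreted here.
    """
    if len(barcode) == 0 or len(barcode) % 2 != 0:
        return False
    return all(barcode[i] == barcode[i + 1] for i in range(0, len(barcode), 2))


def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (ca != cb)))  # substitution or match
        prev = curr
    return prev[-1]


def correct_barcode(barcode: str,
                    trusted: Iterable[str],
                    max_distance: int = 4) -> Optional[str]:
    """Map an ambiguous barcode to its closest trusted barcode.

    Returns None when no trusted barcode lies within max_distance, or when
    two trusted barcodes are equally close (hypothetical tie rule).
    """
    best, best_d, tie = None, max_distance + 1, False
    for candidate in trusted:
        d = levenshtein(barcode, candidate)
        if d < best_d:
            best, best_d, tie = candidate, d, False
        elif d == best_d and d <= max_distance:
            tie = True
    return None if tie else best
```

In use, the first pass would collect every barcode for which is_unambiguous returns True, and the second pass would call correct_barcode on the remaining barcodes with max_distance=4, the Levenshtein distance quoted above.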

Submission date

Jan 22, 2021

Last update date

Feb 01, 2021

Contact name

Adam Cribbs

E-mail(s)

[email protected]

Organization name

University of Oxford

Department

NDORMS

Street address

Windmill Road

City

Oxford

ZIP/Postal code

OX3 7LD

Country

United Kingdom

Platform ID

GPL26167

Series (1)

GSE162053

High throughput error correction using dual nucleotide dimer blocks allows direct single-cell nanopore transcriptome sequencing

Relations

BioSample

SAMN17496940

SRA

SRX9920602

Supplementary file	Size	Download	File type/resource
GSM5031743_Mix2_2_genes.barcodes.txt.gz	11.9 Kb	(ftp)(http)	TXT
GSM5031743_Mix2_2_genes.genes.txt.gz	17.0 Kb	(ftp)(http)	TXT
GSM5031743_Mix2_2_genes.mtx.gz	2.3 Mb	(ftp)(http)	MTX
Raw data are available in SRA
Processed data provided as supplementary file
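
For readers who want to inspect the processed files without extra dependencies, the following is a minimal sketch of reading a gzipped Matrix Market coordinate file with the Python standard library. The helper names read_mtx and read_labels are hypothetical, and whether rows correspond to genes or to cell barcodes in this sample's matrix is not stated in the record, so it should be checked against the lengths of the two label files.

```python
import gzip
from typing import Dict, List, Tuple


def read_labels(path: str) -> List[str]:
    """Read one label per line from a gzipped text file, skipping blank lines."""
    with gzip.open(path, "rt") as handle:
        return [line.rstrip("\n") for line in handle if line.strip()]


def read_mtx(path: str) -> Tuple[int, int, Dict[Tuple[int, int], float]]:
    """Parse a gzipped Matrix Market *coordinate* file.

    Returns (n_rows, n_cols, entries), where entries maps 0-based
    (row, col) pairs to values. Only entries with an explicit value are
    supported; 'pattern' matrices are outside the scope of this sketch.
    """
    with gzip.open(path, "rt") as handle:
        header = handle.readline()
        if not header.startswith("%%MatrixMarket matrix coordinate"):
            raise ValueError("not a Matrix Market coordinate file")
        line = handle.readline()
        while line.startswith("%"):          # skip comment lines
            line = handle.readline()
        n_rows, n_cols, _n_entries = (int(x) for x in line.split())
        entries: Dict[Tuple[int, int], float] = {}
        for line in handle:
            if not line.strip():
                continue
            row, col, value = line.split()[:3]
            # Matrix Market indices are 1-based; convert to 0-based.
            entries[(int(row) - 1, int(col) - 1)] = float(value)
    return n_rows, n_cols, entries
```

After downloading the three files listed above, read_mtx("GSM5031743_Mix2_2_genes.mtx.gz") returns the matrix dimensions, which can be compared with the number of labels returned by read_labels for the genes and barcodes files to determine the orientation.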