GEO Accession viewer

Sample GSM4468590

Status

Public on May 28, 2020

Title

ctrl replicate 1 Dataset#3

Sample type

SRA

Source name

leaves

Organism

Nicotiana benthamiana

Characteristics

construction: ctrl
biological replicate: replicate 1
dataset: Dataset#3
reference sequence: GFP

Treatment protocol

Different URT1 constructions were co-infiltrated with a GFP reporter mRNA and P19 (silencing inhibitor) in leaves.

Growth protocol

Infiltrated leaves of Nicotiana benthamiana plants grown on soil for 4 weeks under cycles of 16 hr light (22 °C) / 8 hr darkness (18 °C).

Extracted molecule

total RNA

Extraction protocol

Total RNA was extracted from flowers with TRI Reagent® (Molecular Research Center) according to the manufacturer’s instructions.
The 3’RACEseq protocol is based on the ligation of a primer at the 3’ end of RNA and the subsequent targeted PCR amplification of amplicons suitable for Illumina sequencing.

Library strategy

RNA-Seq

Library source

transcriptomic

Library selection

RACE

Instrument model

Illumina MiSeq

Description

Processed_data_dataset3.txt

Data processing

After initial data processing by the MiSeq Control Software v2.5 (Illumina), base calls were retrieved and further analysed by a suite of homemade Python scripts (v2.7) using the biopython (v1.63) and regex (v2.4) libraries.
The data processing pipeline was adapted from Sikorska et al. (2017). Reads with low-quality bases (≤Q10) within the 15-base random sequence of read 2, or within the 30 bases downstream of the delimiter sequence, were filtered out.
Sequences with identical nucleotides in the 15-base random sequence were deduplicated.
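
The quality filter and deduplication described above can be illustrated with a minimal Python sketch. It is not the submitters' script. It assumes that quality scores are already decoded to integers, that the 15-base random sequence sits at the very start of read 2, and that the first read seen for each random sequence is the one kept; the function names are hypothetical.

```python
def passes_quality(quals, start, length, min_q=10):
    """Return True if every base in quals[start:start+length] is above min_q.

    A window that runs past the end of the read is rejected.
    """
    window = quals[start:start + length]
    return len(window) == length and all(q > min_q for q in window)


def deduplicate(reads, random_len=15):
    """Keep the first read seen for each distinct random sequence.

    `reads` is an iterable of (read_id, sequence) pairs; the random sequence
    is assumed (not stated in the record) to be the first `random_len` bases.
    """
    seen = set()
    kept = []
    for read_id, seq in reads:
        random_seq = seq[:random_len]
        if random_seq not in seen:
            seen.add(random_seq)
            kept.append((read_id, seq))
    return kept
```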
Next, 20-nucleotide sequences corresponding to nucleotides of the transcript were searched for in reads 1 to identify the corresponding target mRNAs. One mismatch was tolerated. Matched reads 1 and their corresponding reads 2 were extracted and annotated.
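
As an illustration of a search that tolerates one mismatch, the sketch below slides a probe along a read and counts mismatches in each window. The record says the original scripts used the regex library, so this plain-Python version is a stand-in rather than the authors' code, and `find_with_mismatches` is a hypothetical name.

```python
def find_with_mismatches(read, probe, max_mismatches=1):
    """Return the first 0-based offset where probe matches read with at most
    max_mismatches substitutions, or -1 if there is no such offset."""
    n = len(probe)
    for i in range(len(read) - n + 1):
        # Count positions where the window and the probe disagree.
        mismatches = sum(1 for a, b in zip(read[i:i + n], probe) if a != b)
        if mismatches <= max_mismatches:
            return i
    return -1
```

Only substitutions are considered here; insertions and deletions are not handled.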
Reads 2 that contained the delimiter sequence were selected and subsequently trimmed of their random and delimiter sequences.
Then, the analysis was divided into two steps.
The aim of the first step was to identify the position of mRNA 3’ extremities and to detect untemplated nucleotides. To do this, the 30-nucleotide sequences downstream of the read 2 delimiter sequence were mapped to the corresponding reference sequence, which runs from the first nucleotide of the transcript that maps to the forward PCR2 primer to the end of the mRNA. Up to four mismatches were tolerated, with the exception of the first five nucleotides downstream of the mapping site, which had to map perfectly. To map the 3’ end position of reads 2 with untemplated tails, the sequences of the unmatched reads 2 were successively trimmed from their 3’ end, one nucleotide at a time, until they could be mapped to the reference sequence or until a maximum of 30 nucleotides had been removed. For each successfully mapped read 2, untemplated nucleotides at the 3’ end were extracted.
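
The successive trimming in this first step can be sketched as follows. This is a simplified illustration with several assumptions: the sequence is already in mRNA orientation (5’ to 3’), the window must match the reference exactly (the pipeline described above tolerates up to four mismatches), and the function name and return convention are hypothetical.

```python
def map_three_prime_end(seq, reference, window=30, max_trim=30):
    """Trim seq one nucleotide at a time from its 3' end until the last
    `window` nucleotides of what remains are found in `reference`.

    Returns (end, tail), where `end` is the number of reference nucleotides
    up to and including the last templated one, and `tail` is the trimmed
    untemplated sequence. Returns None if nothing maps within max_trim.
    """
    for trimmed in range(max_trim + 1):
        body = seq[:len(seq) - trimmed]
        if len(body) < window:
            break  # not enough sequence left to fill the window
        pos = reference.find(body[-window:])  # exact match only (simplification)
        if pos != -1:
            return pos + window, seq[len(seq) - trimmed:]
    return None
```

Note that an untemplated nucleotide that happens to match the next reference nucleotide is counted as templated by this approach.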
The aim of the second step was to analyze long mRNA poly(A) tails. Sequencing of long homopolymeric stretches causes a rapid decrease in sequencing quality, making it impossible to map exactly the 3’ end of mRNAs with long poly(A) tails. We thus looked for long T stretches of at least 10 Ts in the reads 2 that failed to map to the reference sequence. Poly(A) tails were searched for with the constraint that they must begin within the first 30 cycles, which means that the maximal length of the added 3’ end modification is limited to 29 nucleotides.
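
A minimal sketch of the T-stretch search in this second step is given below, using Python's standard `re` module. It assumes that positions are counted on read 2 after removal of the random and delimiter sequences, and that the stretch must be uninterrupted (no tolerance for sequencing errors); `find_poly_a` is a hypothetical name.

```python
import re

# At least 10 consecutive Ts: read 2 is the reverse complement of the mRNA,
# so a poly(A) tail appears as a run of Ts.
T_STRETCH = re.compile(r"T{10,}")


def find_poly_a(trimmed_read2, max_start=30):
    """Return (start, length) of the first T stretch if it begins within the
    first `max_start` positions (0-based start < max_start), else None."""
    match = T_STRETCH.search(trimmed_read2)
    if match is None or match.start() >= max_start:
        return None
    return match.start(), match.end() - match.start()
```

With a 0-based start of at most 29, the sequence preceding the T stretch is at most 29 nucleotides long, which matches the limit stated above.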
Finally, results from steps 1 and 2 were compiled and 3’ extensions were analyzed.
Supplementary_files_format_and_content: One processed data file is given for each of the three datasets generated in individual MiSeq runs. It includes the processed data for each construction and biological replicate. Each line corresponds to one individual read.
Supplementary_files_format_and_content: For each read, we indicate the read ID (read.ID), the gene AGI (Gene), poly(A) tail sequence (polyA), poly(A) tail length (polyA.size), non-A extension sequence (modification), length of the non-A extension (modification.size), A + non-A tail sequence (extension), A + non-A tail size (extension.size), tag (classification) indicating the category of the tail, the 15N random sequence used for deduplication (random), the biological replicate (rep) and the construction (construction).
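
As a hedged example of how such a file might be read, the sketch below computes the mean poly(A) tail length per construction and replicate. The column names are those listed above; the tab separator, the presence of a header line, and the example values are assumptions, and `mean_tail_length` is a hypothetical helper.

```python
import csv
import io
from collections import defaultdict


def mean_tail_length(handle, delimiter="\t"):
    """Return {(construction, rep): mean polyA.size} from an open text file."""
    totals = defaultdict(lambda: [0, 0])  # key -> [sum of sizes, read count]
    for row in csv.DictReader(handle, delimiter=delimiter):
        key = (row["construction"], row["rep"])
        totals[key][0] += int(row["polyA.size"])
        totals[key][1] += 1
    return {key: total / count for key, (total, count) in totals.items()}


# Made-up rows for illustration only; the real file has more columns.
EXAMPLE = (
    "read.ID\tpolyA.size\trep\tconstruction\n"
    "r1\t20\t1\tctrl\n"
    "r2\t30\t1\tctrl\n"
    "r3\t12\t2\tctrl\n"
)

means = mean_tail_length(io.StringIO(EXAMPLE))
```

For the real data, the same function could be called on an open handle to Processed_data_dataset3.txt, provided the separator assumption holds.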

Submission date

Apr 09, 2020

Last update date

May 28, 2020

Contact name

Dominique Gagliardi

E-mail(s)

[email protected]

Organization name

CNRS

Department

IBMP

Street address

12, rue du General Zimmer

City

Strasbourg

ZIP/Postal code

67084

Country

France

Platform ID

GPL22072

Series (2)

GSE148409	Ectopic expression of URT1 remodels poly(A) tail profiles in Nicotiana benthamiana
GSE148449	Molecular connection between the TUTase URT1 and decapping activators

Relations

BioSample

SAMN14569660

SRA

SRX8092650

Supplementary data files not provided

Raw data are available in SRA

Processed data are available on Series record