GEO Accession viewer

Sample GSM3146279

Status

Public on May 20, 2018

Title

ribosome footprints_ltm_untr

Sample type

SRA

Source name

A549 cells

Organism

Homo sapiens

Characteristics

cell line: A549
cell type: human lung adenocarcinoma epithelial cells

Treatment protocol

At the start of treatment, samples treated with IFN-b were given media containing IFN-b at 20 U/ml; treatment with IFN lasted 10 hrs in total. 5 hrs after IFN-b treatment was started, samples were infected with influenza virus, and viral infection proceeded for 5 hrs. For Ribo-Seq + LTM samples, the last 30 min of treatment occurred in the presence of 5 μM

Growth protocol

Cells were seeded 16 hrs prior to the start of the assay to be 70% confluent at treatment, using two 15 cm plates per sample.

Extracted molecule

total RNA

Extraction protocol

Samples were flash frozen in liquid nitrogen and lysed.
A portion of the lysate was used to generate RNA-Seq libraries using the NEBNext Ultra Directional RNA Library Prep Kit after depleting ribosomal RNA using the Ribozero Gold kit. For ribosome profiling, we performed nuclease footprinting treatment by adding RNase I. We collected ribosome-protected mRNA fragments using MicroSpin S-400 HR Columns, and purified RNA from the flow-through for size selection. We gel-purified ribosome-protected fragments with lengths between 26 and 34 nt using RNA oligo size markers. We used polyA tailing instead of linker ligation, then performed reverse transcription, circularization of cDNA, and PCR amplification of cDNA. Libraries were sequenced on an Illumina HiSeq 2500 in 50 bp single-end mode.

Library strategy

OTHER

Library source

transcriptomic

Library selection

other

Instrument model

Illumina HiSeq 2500

Description

Ribo-Seq + LTM sample, untreated
ribosome footprints

Data processing

Library strategy: Ribo-Seq
[Pre-processing and alignment] For Ribo-Seq libraries, we trimmed polyA tails using cutadapt 1.12 with parameters --adapter=AAAAAAAAAA --length=40 --minimum-length=25 to retain trimmed reads that were between 25 nt and 40 nt in length. For RNA-Seq libraries, we first reverse complemented the read to obtain the sense orientation using fastx_reverse_complement 0.0.13 with parameters --Q33, and then trimmed reads to 40 nt. To remove rRNA, we discarded trimmed reads aligning to four human rRNA sequences (28S: NR_003287.2, 18S: NR_003286.2, 5.8S: NR_003285.2, and 5S: NR_023363.1) from the hg38 genome. The alignment was done using bowtie 1.1.1 with parameters --seedlen=23 --threads=8. We aligned the remaining non-rRNA reads to human transcripts (Gencode v24) using rsem 1.2.31 with parameters --output-genome-bam --sort-bam-by-coordinate. We also aligned the non-rRNA reads to an influenza genome containing both low and high CTG NP using the above rsem parameters plus the extra options --seed-length 21 --bowtie-n 3.
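The Ribo-Seq trimming step described above can be illustrated with a minimal Python sketch. It is not a substitute for cutadapt: it only matches the 10-nt polyA adapter exactly, whereas cutadapt tolerates mismatches and partial adapter matches at the 3' end. The function name `trim_polya` and the constants are hypothetical names chosen for this illustration.

```python
from typing import Optional

# Values taken from the cutadapt parameters quoted in the record.
ADAPTER = "A" * 10   # --adapter=AAAAAAAAAA
MAX_LEN = 40         # --length=40
MIN_LEN = 25         # --minimum-length=25


def trim_polya(read: str) -> Optional[str]:
    """Simplified emulation of the polyA trimming and length filter.

    Assumption: exact adapter matching only (cutadapt itself allows errors).
    Returns the trimmed read, or None if it is shorter than MIN_LEN.
    """
    idx = read.find(ADAPTER)
    if idx != -1:
        # Drop the polyA stretch and everything 3' of it.
        read = read[:idx]
    # Shorten to at most MAX_LEN nucleotides, after adapter removal.
    read = read[:MAX_LEN]
    if len(read) < MIN_LEN:
        return None
    return read
```

For example, a 30-nt footprint followed by 15 A's is returned as the 30-nt footprint, while a 20-nt footprint followed by a polyA tail is discarded.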
[Calculating transcript coverage] The posterior probability score from rsem (ZW field in the BAM output) was used to calculate coverage for reads on transcripts. For influenza reads, we considered only reads that map to the positive sense of influenza transcripts. We performed variable trimming based on fragment length: reads that were between 25 and 32 nucleotides in length were assigned to a P-site at 14 nucleotides from the 5’ end of the read, and reads between 33 and 39 nucleotides were assigned to a P-site at 15 nucleotides from the 5’ end of the read. For human transcripts, a set of nonredundant protein-coding transcripts was compiled by using the Gencode v24 annotations and applying the following criteria: the corresponding CDS must be part of the Consensus CDS (CCDS) project and must be labeled as a principal transcript by APPRIS. For each gene, we selected the lowest numbered CCDS ID, and for each CCDS, we selected the lowest numbered transcript ID.
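The length-dependent P-site assignment and the weighting by posterior probability can be sketched as follows. This is an illustration under stated assumptions, not the submitters' code: the function names are hypothetical, alignments are represented as plain `(start, length, weight)` tuples rather than BAM records, the offset is treated as a 0-based shift from the read's 5' position, and reads outside the 25 to 39 nt range are skipped because the record does not say how they were handled.

```python
from typing import List, Optional, Tuple


def psite_offset(read_length: int) -> Optional[int]:
    """Offset of the P-site from the 5' end of the read, per the record."""
    if 25 <= read_length <= 32:
        return 14
    if 33 <= read_length <= 39:
        return 15
    return None  # assumption: other lengths are not assigned a P-site


def psite_coverage(
    alignments: List[Tuple[int, int, float]], transcript_length: int
) -> List[float]:
    """Accumulate weighted P-site coverage along one transcript.

    Each alignment is (start, read_length, weight), where start is the
    0-based transcript position of the read's 5' end and weight stands in
    for the rsem posterior probability (the ZW tag in the BAM output).
    """
    coverage = [0.0] * transcript_length
    for start, read_length, weight in alignments:
        offset = psite_offset(read_length)
        if offset is None:
            continue
        pos = start + offset
        if 0 <= pos < transcript_length:
            coverage[pos] += weight
    return coverage
```

Weighting each read by its posterior probability means that a read mapping ambiguously to several transcripts contributes fractionally to each, rather than being counted once per alignment.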
[Calling translation initiation sites] We used a zero-truncated negative binomial distribution (ZTNB) to statistically model the background distribution of Ribo-Seq and Ribo-Seq + LTM counts in transcripts with more than 50 positions with non-zero counts. We first added the Ribo-Seq and the Ribo-Seq + LTM read counts of the two neighboring positions to each position in the genome (referred to as pooled counts below). Candidate start sites were identified based on the following criteria: For influenza transcripts, the ZTNB-based P-value for the Ribo-Seq + LTM pooled count at that location must be <0.01 and 1000-fold higher than the P-value of the Ribo-Seq pooled counts at the same location, or must have an absolute value less than 10^-7. For host transcripts, we required the Ribo-Seq + LTM P-value to be only 100-fold higher than the P-value of the Ribo-Seq pooled counts. Additionally, for host transcripts, we required that the read counts be greater than an absolute threshold across all transcripts. This threshold was estimated by requiring P<0.05 in a ZTNB model fit to the bottom 99% of all non-zero P-site Ribo-Seq + LTM pooled counts across all transcripts. Only the highest pooled count within each 30 nt window was called as a candidate TIS. From the called TIS, we assigned the identity of the start codon by looking at a window -1 to +1 nucleotides from the TIS peak and assigning the start codon based on the following hierarchy: ATG, CTG, TTG, GTG, ATA, ATC, ATT, AAG, ACG, AGG, and other. If there were multiple near-cognate codons in the window, the codon was assigned based on the order in the above list.
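Two of the steps described above, count pooling and start codon assignment, are simple enough to sketch in Python. The sketch is illustrative only and makes assumptions the record does not confirm: the function names are hypothetical, positions at transcript ends are pooled with the neighbors that exist, and the "-1 to +1" window is read as the set of candidate codon start positions around the peak. The ZTNB fit and the P-value thresholds are not reproduced here.

```python
from typing import List, Tuple

# Start codon hierarchy as listed in the record, highest priority first.
HIERARCHY = ["ATG", "CTG", "TTG", "GTG", "ATA", "ATC", "ATT", "AAG", "ACG", "AGG"]
RANK = {codon: i for i, codon in enumerate(HIERARCHY)}


def pooled_counts(counts: List[int]) -> List[int]:
    """Add the counts of the two neighboring positions to each position."""
    n = len(counts)
    pooled = []
    for i in range(n):
        left = counts[i - 1] if i > 0 else 0
        right = counts[i + 1] if i + 1 < n else 0
        pooled.append(left + counts[i] + right)
    return pooled


def assign_start_codon(sequence: str, peak: int) -> Tuple[str, int]:
    """Pick the highest-ranked codon starting at peak-1, peak, or peak+1.

    Returns (codon, start_position); ("other", peak) if no listed codon
    starts within the window.
    """
    best = None
    for pos in (peak - 1, peak, peak + 1):
        if pos < 0:
            continue
        codon = sequence[pos:pos + 3].upper()
        if codon in RANK and (best is None or RANK[codon] < RANK[best[0]]):
            best = (codon, pos)
    return best if best is not None else ("other", peak)
```

With this reading of the window, a peak at position 2 of `CATTGC` sees both ATT (position 1) and TTG (position 2), and TTG is chosen because it comes earlier in the hierarchy.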
Genome_build: hg38
Supplementary_files_format_and_content: Processed CSV data file is output from RSEM and includes host genscript gene id, transcript id(s), gene length, posterior probability score (expected count), transcripts per million (TPM), and fragments per kilobase per million mapped reads (FPKM).
Processed TSV data files contain all host or influenza translation initiation sites called in any sample and information about each translation initiation site, including transcript, transcript position, distance from annotated translation initiation site, codon, peptide length, and reading frame. The host file also contains the raw Ribo-Seq and Ribo-Seq + LTM counts and statistics from the ZTNB model.

Submission date

May 18, 2018

Last update date

May 21, 2018

Contact name

Rasi Subramaniam

E-mail(s)

[email protected]

Organization name

Fred Hutch

Lab

Subramaniam

Street address

1100 Fairview Ave N

City

Seattle

ZIP/Postal code

98109

Country

USA

Platform ID

GPL16791

Series (1)

GSE114636

Comprehensive profiling of translation initiation in influenza-virus infected cells

Relations

BioSample

SAMN09223524

SRA

SRX4099349

Supplementary file	Size	Download	File type/resource
GSM3146279_ltm_untr_gencode.genes.results.txt.gz	1.5 Mb	(ftp)(http)	TXT
Raw data are available in SRA
Processed data provided as supplementary file
Processed data are available on Series record