Sample GSM3146279
Status Public on May 20, 2018
Title ribosome footprints_ltm_untr
Sample type SRA
 
Source name A549 cells
Organism Homo sapiens
Characteristics cell line: A549
cell type: human lung adenocarcinoma epithelial cells
Treatment protocol Samples treated with IFN-b received media containing IFN-b at 20 U/ml at the start of treatment, for 10 hrs in total. Five hrs after the start of IFN-b treatment, samples were infected with influenza virus, and infection proceeded for 5 hrs. For Ribo-Seq + LTM samples, the last 30 min of treatment occurred in the presence of 5 μM
Growth protocol Cells were seeded 16 hrs prior to the start of the assay to be 70% confluent at treatment, using two 15 cm plates per sample.
Extracted molecule total RNA
Extraction protocol Samples were flash frozen in liquid nitrogen and lysed.
A portion of the lysate was used to generate RNA-Seq libraries using the NEBNext Ultra Directional RNA Library Prep Kit after depleting ribosomal RNA with the Ribozero Gold kit. For ribosome profiling, we performed nuclease footprinting by adding RNase I. We collected ribosome-protected mRNA fragments using MicroSpin S-400 HR Columns, and purified RNA from the flow-through for size selection. We gel-purified ribosome-protected fragments between 26 and 34 nt in length using RNA oligo size markers. We used polyA tailing instead of linker ligation, then performed reverse transcription, circularization of cDNA, and PCR amplification of cDNA. Libraries were sequenced on an Illumina HiSeq 2500 in 50 bp single-end mode.
 
Library strategy OTHER
Library source transcriptomic
Library selection other
Instrument model Illumina HiSeq 2500
 
Description Ribo-Seq + LTM sample, untreated
ribosome footprints
Data processing Library strategy: Ribo-Seq
[Pre-processing and alignment] For Ribo-Seq libraries, we trimmed polyA tails using cutadapt 1.12 with parameters --adapter=AAAAAAAAAA --length=40 --minimum-length=25 to retain trimmed reads between 25 nt and 40 nt in length. For RNA-Seq libraries, we first reverse complemented each read to obtain the sense orientation using fastx_reverse_complement 0.0.13 with parameters --Q33, and then trimmed reads to 40 nt. To remove rRNA, we discarded trimmed reads aligning to four human rRNA sequences (28S: NR_003287.2, 18S: NR_003286.2, 5.8S: NR_003285.2, and 5S: NR_023363.1) from the hg38 genome. The alignment was done using bowtie 1.1.1 with parameters --seedlen=23 --threads=8. We aligned the remaining non-rRNA reads to human transcripts (Gencode v24) using rsem 1.2.31 with parameters --output-genome-bam --sort-bam-by-coordinate. We also aligned the non-rRNA reads to an influenza genome containing both low and high CTG NP using the above rsem parameters plus the extra options --seed-length 21 --bowtie-n 3.
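The Ribo-Seq trimming and length filter described above can be illustrated with a minimal Python sketch. This is not cutadapt itself: it assumes an exact match to the 10-A adapter, whereas cutadapt also tolerates mismatches and partial adapter matches at the read end. The function name trim_polya and its arguments are hypothetical.

```python
from typing import Optional


def trim_polya(read: str, adapter: str = "A" * 10,
               max_len: int = 40, min_len: int = 25) -> Optional[str]:
    """Simplified stand-in for the polyA trimming step (assumption: exact adapter match).

    1. Cut the read at the first exact occurrence of the adapter.
    2. Shorten the remainder to at most max_len nucleotides.
    3. Discard (return None) reads shorter than min_len.
    """
    cut = read.find(adapter)
    if cut != -1:
        read = read[:cut]          # drop the polyA tail and everything after it
    read = read[:max_len]          # cap the length at 40 nt by default
    return read if len(read) >= min_len else None


# A 28 nt footprint followed by a polyA tail is kept.
kept = trim_polya("ACGT" * 7 + "A" * 22)
# A 20 nt fragment followed by a polyA tail is too short and is dropped.
dropped = trim_polya("ACGT" * 5 + "A" * 30)
# A 48 nt read without a polyA tail is truncated to 40 nt.
truncated = trim_polya("ACGT" * 12)
```

With the default settings, only reads of 25 to 40 nt survive, matching the length range stated in the record.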
[Calculating transcript coverage] The posterior probability score from rsem (ZW field in the BAM output) was used to calculate coverage for reads on transcripts. For influenza reads, we considered only reads that map to the positive sense of influenza transcripts. We performed variable trimming based on fragment length: reads between 25 and 32 nucleotides in length were assigned to a P-site at 14 nucleotides from the 5’ end of the read, and reads between 33 and 39 nucleotides were assigned to a P-site at 15 nucleotides from the 5’ end of the read. For human transcripts, a set of nonredundant protein-coding transcripts was compiled from the Gencode v24 annotations by applying the following criteria: the corresponding CDS must be part of the Consensus CDS (CCDS) project and must be labeled as a principal transcript by APPRIS. For each gene, we selected the lowest numbered CCDS ID, and for each CCDS, we selected the lowest numbered transcript ID.
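The length-dependent P-site assignment and posterior-weighted coverage can be sketched as follows. This is an illustration only: the function names and the input tuple layout are hypothetical, and it assumes the P-site coordinate is the 5’ start position plus the offset, with the rsem ZW value already extracted as a float weight.

```python
from collections import defaultdict
from typing import Dict, Iterable, Optional, Tuple


def p_site_offset(read_length: int) -> Optional[int]:
    """Return the P-site offset from the 5' end, or None if the length is out of range."""
    if 25 <= read_length <= 32:
        return 14
    if 33 <= read_length <= 39:
        return 15
    return None


def weighted_p_site_coverage(
    alignments: Iterable[Tuple[str, int, int, float]],
) -> Dict[Tuple[str, int], float]:
    """Accumulate posterior-weighted P-site coverage.

    Each alignment is (transcript_id, five_prime_pos, read_length, weight),
    where weight stands in for the rsem ZW posterior probability.
    """
    coverage: Dict[Tuple[str, int], float] = defaultdict(float)
    for transcript_id, five_prime_pos, read_length, weight in alignments:
        offset = p_site_offset(read_length)
        if offset is None:
            continue  # read length outside 25-39 nt: not assigned to a P-site
        coverage[(transcript_id, five_prime_pos + offset)] += weight
    return dict(coverage)


example = [
    ("tx1", 100, 28, 1.0),  # 28 nt read: offset 14, P-site at 114
    ("tx1", 99, 34, 0.5),   # 34 nt read: offset 15, P-site at 114
    ("tx1", 10, 40, 1.0),   # 40 nt read: skipped
]
coverage = weighted_p_site_coverage(example)
```

Weighting by the posterior means a read that rsem splits across several transcripts contributes fractionally to each, rather than being counted once per alignment.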
[Calling translation initiation sites] We used a zero-truncated negative binomial distribution (ZTNB) to statistically model the background distribution of Ribo-Seq and Ribo-Seq + LTM counts in transcripts with more than 50 positions with non-zero counts. We first added the Ribo-Seq and the Ribo-Seq + LTM read counts of the two neighboring positions to each position in the genome (referred to as pooled counts below). Candidate start sites were identified based on the following criteria: For influenza transcripts, the ZTNB-based P-value for the Ribo-Seq + LTM pooled count at that location must be <0.01 and 1000-fold higher than the P-value of the Ribo-Seq pooled counts at the same location, or must have an absolute value less than 10^-7. For host transcripts, we required the Ribo-Seq + LTM P-value to be only 100-fold higher than the P-value of the Ribo-Seq pooled counts. Additionally, for host transcripts, we required that the read counts be greater than an absolute threshold across all transcripts. This threshold was estimated by requiring P<0.05 in a ZTNB model fit to the bottom 99% of all non-zero P-site Ribo-Seq + LTM pooled counts across all transcripts. Only the highest pooled count within each 30 nt window was called as a candidate TIS. From the called TIS, we assigned the identity of the start codon by looking at a window -1 to +1 nucleotides from the TIS peak and assigning the start codon based on the following hierarchy: ATG, CTG, TTG, GTG, ATA, ATC, ATT, AAG, ACG, AGG, and other. If there were multiple near-cognate codons in the window, the codon was assigned based on the order in the above list.
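Two of the steps above, count pooling and start codon assignment, can be sketched in Python. This is a hedged illustration, not the submitters' code: it assumes "the two neighboring positions" means one position on each side, treats positions beyond the transcript ends as zero, and uses 0-based coordinates. The names pooled_counts and assign_start_codon are hypothetical. The ZTNB fit and the P-value thresholds are not covered here.

```python
from typing import List, Sequence

# Start codon hierarchy as listed in the record; anything else is "other".
HIERARCHY = ["ATG", "CTG", "TTG", "GTG", "ATA", "ATC", "ATT", "AAG", "ACG", "AGG"]


def pooled_counts(counts: Sequence[float]) -> List[float]:
    """Add the counts of the left and right neighbor to each position."""
    n = len(counts)
    pooled = []
    for i in range(n):
        left = counts[i - 1] if i > 0 else 0
        right = counts[i + 1] if i + 1 < n else 0
        pooled.append(left + counts[i] + right)
    return pooled


def assign_start_codon(seq: str, peak: int) -> str:
    """Pick the highest-ranked codon starting at peak-1, peak, or peak+1."""
    best, best_rank = "other", len(HIERARCHY)
    for shift in (-1, 0, 1):
        start = peak + shift
        if start < 0 or start + 3 > len(seq):
            continue  # codon would run off the sequence
        codon = seq[start:start + 3].upper()
        if codon in HIERARCHY:
            rank = HIERARCHY.index(codon)
            if rank < best_rank:
                best, best_rank = codon, rank
    return best


pooled = pooled_counts([0, 2, 5, 1, 0])
codon_a = assign_start_codon("ATATG", 1)    # ATA at peak-1, ATG at peak+1
codon_b = assign_start_codon("CTGTG", 1)    # CTG at peak-1, GTG at peak+1
codon_c = assign_start_codon("CCCCCC", 2)   # no listed codon in the window
```

In the first example both ATA and ATG fall inside the window, and ATG is chosen because it ranks first in the hierarchy.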
Genome_build: hg38
Supplementary_files_format_and_content: Processed CSV data file is output from RSEM and includes host Gencode gene id, transcript id(s), gene length, posterior probability score (expected count), transcripts per million (TPM), and fragments per kilobase of transcript per million mapped reads (FPKM).
Processed TSV data files contain all host or influenza translation initiation sites called in any sample, with information about each site including transcript, transcript position, distance from the annotated translation initiation site, codon, peptide length, and reading frame. The host file also contains the raw Ribo-Seq and Ribo-Seq + LTM counts and statistics from the ZTNB model.
 
Submission date May 18, 2018
Last update date May 21, 2018
Contact name Rasi Subramaniam
E-mail(s) [email protected]
Organization name Fred Hutch
Lab Subramaniam
Street address 1100 Fairview Ave N
City Seattle
ZIP/Postal code 98109
Country USA
 
Platform ID GPL16791
Series (1)
GSE114636 Comprehensive profiling of translation initiation in influenza-virus infected cells
Relations
BioSample SAMN09223524
SRA SRX4099349

Supplementary file Size Download File type/resource
GSM3146279_ltm_untr_gencode.genes.results.txt.gz 1.5 Mb (ftp)(http) TXT
Raw data are available in SRA
Processed data provided as supplementary file
Processed data are available on Series record
