GEO Accession viewer

Series GSE31363

Status

Public on Aug 12, 2011

Title

Transcription Factor Binding Sites by Epitope-Tag ChIP-seq from ENCODE/University of Chicago

Project

ENCODE

Organism

Homo sapiens

Experiment type

Genome binding/occupancy profiling by high throughput sequencing

Summary

This data was generated by ENCODE. If you have questions about the data, contact the submitting laboratory directly: Kevin White (Principal Investigator), Subhradip Karmakar (Project Lead), Nick Bild (Data Analyst), Alina Choudhury (Laboratory Technician), Marc Domanus (Sequencing Technician at Argonne National Lab). If you have questions about the Genome Browser track associated with this data, contact ENCODE.

This ENCODE track maps human transcription factor binding sites genome-wide using second-generation massively parallel sequencing. The mapping uses transcription factors expressed as GFP-tagged fusion proteins after BAC (bacterial artificial chromosome) recombineering.
The U. of Chicago and Max Planck Institute (Dresden) pipeline generates recombineered (recombination-mediated genetic engineering) BACs for the production of cell lines or animals that express fusion proteins from epitope-tagged transgenes.

For data usage terms and conditions, please refer to http://www.genome.gov/27528022 and http://www.genome.gov/Pages/Research/ENCODE/ENCODEDataReleasePolicyFinal2008.pdf

Overall design

Cells were grown according to the approved ENCODE cell culture protocols (http://hgwdev/ENCODE/protocols/cell).
Recombineering strategy: To facilitate high-throughput production of the transgenic constructs, the program BACFinder (Crowe, Rana et al. 2002) automatically selects the most suitable BAC clone for any given human gene and generates the sets of PCR primers required for tagging and verification (Poser, Sarov et al. 2008). Recombineering is used to insert tagging cassettes at either the N or C terminus of the protein. The N-terminal cassette has a dual eukaryotic-prokaryotic promoter (PGK-gb2) driving a neomycin-kanamycin resistance gene within an artificial intron inside the tag coding sequence. The selection cassette is flanked by two loxP sites and can be permanently removed by Cre recombinase-mediated excision. The C-terminal cassette contains the sequence encoding the tag followed by an internal ribosome entry site (IRES) in front of the neomycin resistance gene. In addition, a short bacterial promoter (Gb3) drives the expression of the neomycin-kanamycin resistance gene in E. coli. The tagging cassettes, each carrying 50-nucleotide homology arms introduced by PCR, are inserted into the BAC by recombineering, either behind the start codon (for the N-terminal tag) or in front of the stop codon (for the C-terminal tag) of the gene. E. coli cells that have successfully recombined the cassette are selected for kanamycin resistance in liquid culture. Each saturated culture from a specific recombineering reaction is derived from 10-200 independent recombination events. When two independent clones per reaction were checked by PCR through the tag insertion point, 97% (85/88) yielded a PCR product of the expected size. Most of the clones that failed to grow were missing the targeted genomic region. An estimated 10% of the BACs used are chimeric, rearranged or wrongly mapped. Thus, initial results indicate that the necessary recombineering steps can be carried out with high fidelity.
The White lab produced all epitope-tagged transcription and chromatin factor BACs, as well as the genome-wide ChIP data and analysis. An application of this approach to the analysis of closely related paralogs (RARa and RARg) yielded transcription factors, chromatin factors, cell lines, ChIP-chip data and ChIP-seq data (Hua, Kittler et al. Cell 2009). Such paralogous transcription factors often cannot otherwise be distinguished by antibodies.
Sample Preparation: ChIP DNA from each sample is sheared to ~800 bp using a nebulizer. The ends of the DNA are polished, and two unique adapters are ligated to the fragments. Ligated fragments of 150-200 bp are isolated by gel extraction and amplified using limited cycles of PCR.
Sequencing System: Illumina GAIIx and HiSeq next-generation sequencers produced all ChIP-seq data.
Processing and Analysis Software: Raw sequencing reads are aligned using Bowtie version 0.12.5 (Langmead et al. 2009). The "-m 1" parameter is applied to suppress alignments of reads that map more than once in the genome. Reads are aligned to the UCSC hg19 assembly. Wiggle format signal files are generated with SPP 2.7.1 for R 2.7.1. MACS 1.3.7 is used to call peaks. The MACS parameters used vary by experiment. The White lab used goat anti-GFP antibody to perform ChIP in untagged K562 cells as a background control. The test IP was performed in the same way as the background control. Results are expressed as values of the test normalized to the background.
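
The alignment step described above can be sketched as a command line assembled in Python. This is only an illustration under stated assumptions, not the submitting lab's actual invocation: the record confirms Bowtie and the "-m 1" parameter, while the index basename (hg19), the file names, and the "-q" and "-S" options are assumed here for the example.

```python
import shlex


def bowtie_unique_command(index_basename, fastq_path, sam_path):
    """Build an argv list for a Bowtie 1 run that reports only uniquely mapping reads."""
    return [
        "bowtie",
        "-q",        # reads are in FASTQ format (assumption, not stated in the record)
        "-m", "1",   # from the record: suppress reads with more than one reportable alignment
        "-S",        # write SAM output (assumption, not stated in the record)
        index_basename,  # hypothetical name of a prebuilt hg19 Bowtie index
        fastq_path,      # hypothetical input file
        sam_path,        # hypothetical output file
    ]


cmd = bowtie_unique_command("hg19", "sample.fastq", "sample.sam")
print(shlex.join(cmd))
```

The command is only built and printed, not executed, so the sketch can be inspected without Bowtie installed; passing the list to subprocess.run would launch the aligner.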

Web link

http://genome.ucsc.edu/cgi-bin/hgTrackUi?db=hg19&g=wgEncodeUchicagoTfbs

Contributor(s)

White K, Karmakar S, Bild N, Choudhury A, Domanus M

Citation missing

BioProject

PRJNA63447

Submission date

Aug 12, 2011

Last update date

May 15, 2019

Contact name

ENCODE DCC

E-mail(s)

[email protected]

Organization name

ENCODE DCC

Street address

300 Pasteur Dr

City

Stanford

State/province

ZIP/Postal code

94305-5120

Country

USA

Platforms (1)

GPL9115

Illumina Genome Analyzer II (Homo sapiens)

Samples (12)

GSM777637	UChicago_ChipSeq_K562_eGFP-NR4A1_Control_eGFP-NR4A1
GSM777638	UChicago_ChipSeq_K562_eGFP-JunB_Control_eGFP-JunB
GSM777639	UChicago_ChipSeq_K562_eGFP-JunD_Control_eGFP-JunD
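
The sample rows above pair a GSM accession with an underscore-delimited title. A minimal sketch of splitting such a row follows. The field names (lab, assay, cell_line, tagged_factor) are an interpretation of the visible titles and are not defined by the record, and the trailing "Control_..." part of the title is left unparsed because its meaning is not stated.

```python
from typing import NamedTuple


class SampleRow(NamedTuple):
    accession: str
    lab: str            # assumed meaning of the first title field
    assay: str          # assumed meaning of the second title field
    cell_line: str      # assumed meaning of the third title field
    tagged_factor: str  # assumed meaning of the fourth title field


def parse_sample_row(line: str) -> SampleRow:
    """Split a tab-separated 'accession<TAB>title' row into labeled fields."""
    accession, title = line.rstrip("\n").split("\t", 1)
    fields = title.split("_")
    if len(fields) < 4:
        raise ValueError(f"unexpected sample title: {title!r}")
    lab, assay, cell_line, tagged_factor = fields[:4]
    return SampleRow(accession, lab, assay, cell_line, tagged_factor)


row = parse_sample_row("GSM777638\tUChicago_ChipSeq_K562_eGFP-JunB_Control_eGFP-JunB")
print(row.accession, row.cell_line, row.tagged_factor)
```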

Relations

SRA

SRP007860

Download family	Format
SOFT formatted family file(s)	SOFT
MINiML formatted family file(s)	MINiML
Series Matrix File(s)	TXT

Supplementary file	Size	Download	File type/resource
GSE31363_RAW.tar	1.6 Gb	(http)(custom)	TAR (of BIGWIG, NARROWPEAK)
GSE31363_run_info.txt.gz	855 b	(ftp)(http)	TXT
Raw data are available in SRA
Processed data provided as supplementary file
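
The supplementary archive is listed as containing NARROWPEAK files. As a rough sketch, one line of the standard 10-column ENCODE narrowPeak layout could be parsed as below. The example line is invented for illustration and is not taken from this series, and the files in the archive have not been checked against this layout here.

```python
from typing import NamedTuple, Optional


class NarrowPeak(NamedTuple):
    chrom: str
    start: int           # 0-based start of the peak region
    end: int             # end of the peak region (exclusive)
    name: str
    score: int
    strand: str
    signal_value: float
    p_value: float       # -log10 p-value, or -1 if not assigned
    q_value: float       # -log10 q-value, or -1 if not assigned
    summit_offset: int   # summit position relative to start, or -1 if not called

    def summit(self) -> Optional[int]:
        """Return the absolute summit coordinate, or None when no summit is recorded."""
        if self.summit_offset == -1:
            return None
        return self.start + self.summit_offset


def parse_narrowpeak_line(line: str) -> NarrowPeak:
    """Parse one tab-separated narrowPeak record."""
    f = line.rstrip("\n").split("\t")
    if len(f) != 10:
        raise ValueError(f"expected 10 columns, found {len(f)}")
    return NarrowPeak(f[0], int(f[1]), int(f[2]), f[3], int(f[4]), f[5],
                      float(f[6]), float(f[7]), float(f[8]), int(f[9]))


# Invented example record, for illustration only.
peak = parse_narrowpeak_line("chr1\t1000\t1200\tpeak_1\t500\t.\t12.5\t8.0\t-1\t90")
print(peak.chrom, peak.start, peak.end, peak.summit())
```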