Status |
Public on Aug 20, 2015 |
Title |
GM12878_HIC_1 |
Sample type |
SRA |
|
|
Source name |
Lymphoblastoid Cell Line
|
Organism |
Homo sapiens |
Characteristics |
cell line: GM12878
|
Treatment protocol |
Cross-linking was performed in 1% formaldehyde for 10 minutes at room temperature, followed by quenching with glycine at a final concentration of 125 mM.
|
Growth protocol |
LCLs were grown to a density of 0.6-0.8 x 10^6 cells/mL in RPMI 1640 with 15% fetal bovine serum and 1% PenStrep.
|
Extracted molecule |
genomic DNA |
Extraction protocol |
25 million GM12878 cells were cross-linked and the chromatin was digested with HindIII. DNA overhangs were biotinylated and proximity-ligated under dilute conditions to favor ligation of fragments in three-dimensional proximity. DNA was then sheared, and biotinylated fragments were enriched with streptavidin beads and prepared for Illumina sequencing.
For ChIP-Seq, nuclear lysates were sonicated using a Branson 250 Sonifier (power setting 2, 100% duty cycle for 7 x 30-s intervals). Clarified lysates corresponding to 20 million cells were treated with 1-5 µg of antibody coupled to Protein G Dynabeads (Invitrogen #10003D, New York). The protein-DNA complexes were washed with RIPA buffer and eluted in 1% SDS TE at 65°C. ChIP DNA sequencing libraries were generated according to the Illumina TruSeq DNA Sample Preparation Kit instructions (Illumina Part # FC-121-2001, San Diego, CA).
|
|
|
Library strategy |
OTHER |
Library source |
genomic |
Library selection |
other |
Instrument model |
Illumina HiSeq 2000 |
|
|
Description |
HiC.GM12878.correlations.txt.gz library strategy: HiC
|
Data processing |
Personal genomes were created by adding the SNPs of each individual into hg19. SNPs were obtained from the 1000 Genomes Project or imputed using SHAPEIT and IMPUTE2.
For ChIP-seq, reads were aligned to personal genomes with BWA 0.6.1 (options -q 20, -t 4 and the rest set to defaults).
For ChIP-seq, peaks were called using MACS2 on merged replicates, subsampled to 50 million single reads. Genome-wide signal tracks (bigWig) were generated using wiggler, also on the subsampled data.
For HiC, we aligned reads using HiCUP, which aligns reads to an in-silico HindIII-digested genome in the hg19 assembly. Our HiC interaction analysis is based on restriction-fragment-level resolution. We obtained the interaction count for each pair of fragments. To estimate the proximity between two restriction fragments A and B, we calculated their co-variance as the fraction of fragments that interact with both A and B, normalized by the number of fragments interacting only with A or only with B.
The processed data file contains unique interactions, specified with the following entries:
i = id of first interaction fragment
j = id of second interaction fragment
x = number of shared interaction partners between fragments i and j
cor = proportion of interaction partners of i and j that are shared between i and j
pos_i = midpoint of fragment i
pos_j = midpoint of fragment j
pair_id = unique identifier for the interaction
The file contains only interactions with cor > 0.2.
For ChIA-PET, data analysis was carried out using software developed in-house, Mango (paper submitted). PETs were trimmed to remove linker sequences, and only PETs with the same linker sequence at both ends were kept for further processing. The resulting reads were aligned to the genome using the Bowtie software suite (Langmead et al., 2009). Duplicate reads, which may be due to PCR duplication, were removed.
MACS2 was used to call binding peaks, which were subsequently used as anchor regions for the detection of interactions in the next step (Zhang et al., 2008). The probability of observing a PET linking any two peaks was modeled as a function of both genomic distance and the read depth of each peak. Using this model, statistical confidence estimates were assigned to interactions. The resulting P-values were corrected for multiple hypothesis testing using the Benjamini-Hochberg method and filtered to a user-defined false discovery rate (FDR).
The processed files are in bedpe format and contain the following entries:
chrom1: chromosome of anchor 1
start1: start position of anchor 1
end1: end position of anchor 1
chrom2: chromosome of anchor 2
start2: start position of anchor 2
end2: end position of anchor 2
name: unique ID for each interaction
peak1: number of mid-range PETs in anchor 1
peak2: number of mid-range PETs in anchor 2
PETs: number of PETs linking anchor 1 to anchor 2
distance: distance between anchor 1 and anchor 2
P_IAB_distance: probability of observing a PET with the distance that separates these two anchors
P_combos_distance: probability of observing two anchors separated by the distance that separates these two anchors
P_IAB_depth: probability of observing a PET linking two anchors with the same read depths as these two anchors
P_combos_depth: probability of observing a pair of loci with the same read depths as these two anchors
p_binom: binomial probability of observing a single PET linking these two loci
P: P-value of the interaction (calculated using the binomial distribution)
Q: P-value after Benjamini-Hochberg correction (over all possible pairs of loci, not just the ones linked by >= 1 PET)
Genome_build: hg19
Supplementary_files_format_and_content: Peaks are in the narrowPeak format. bigWig files were generated using wiggler.
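The HiC proximity score described above (x = number of shared interaction partners, cor = proportion of partners that are shared, file filtered at cor > 0.2) can be illustrated with a small sketch. This is not the submitters' code. It assumes that cor is computed as shared partners divided by the union of both fragments' partners (a Jaccard-style reading of the description), and that the two fragments themselves are excluded from each other's partner sets. The function name shared_partner_scores and its input layout are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def shared_partner_scores(contacts, min_cor=0.2):
    """Score fragment pairs by the share of interaction partners they have in common.

    contacts: iterable of (fragment_id, fragment_id) pairs observed in the HiC data.
    Returns a list of dicts with keys i, j, x, cor for pairs with cor > min_cor.
    ASSUMPTION: cor = |shared partners| / |union of partners| (Jaccard-style).
    """
    # Build the set of interaction partners for every fragment.
    partners = defaultdict(set)
    for a, b in contacts:
        if a == b:
            continue  # ignore self-contacts
        partners[a].add(b)
        partners[b].add(a)

    # Only fragments that share at least one partner can have cor > 0,
    # so candidate pairs are drawn from each fragment's partner set.
    candidate_pairs = set()
    for neighbours in partners.values():
        candidate_pairs.update(combinations(sorted(neighbours), 2))

    rows = []
    for i, j in sorted(candidate_pairs):
        # ASSUMPTION: i and j do not count as each other's partners here.
        p_i = partners[i] - {j}
        p_j = partners[j] - {i}
        shared = len(p_i & p_j)
        cor = shared / len(p_i | p_j)
        if cor > min_cor:
            rows.append({"i": i, "j": j, "x": shared, "cor": cor})
    return rows


if __name__ == "__main__":
    # Toy fragment ids, not taken from the real data set.
    toy_contacts = [(1, 3), (2, 3), (1, 4), (2, 4), (1, 5)]
    for row in shared_partner_scores(toy_contacts):
        print(row)
```

The sketch enumerates only pairs that share a partner, which keeps the candidate set far smaller than all fragment pairs. A genome-wide run at restriction-fragment resolution would still need a more memory-conscious implementation.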
|
|
|
Submission date |
Feb 12, 2015 |
Last update date |
May 15, 2019 |
Contact name |
Fabian Grubert |
E-mail(s) |
[email protected]
|
Organization name |
Stanford University
|
Street address |
300 Pasteur Drive
|
City |
Stanford |
ZIP/Postal code |
94305 |
Country |
USA |
|
|
Platform ID |
GPL11154 |
Series (1) |
GSE62742 |
Genetic Control of Chromatin States in Humans Involves Local and Distal Chromosomal Interactions |
|
Relations |
Reanalyzed by |
GSE85977 |
Reanalyzed by |
GSE115407 |
BioSample |
SAMN03342393 |
SRA |
SRX876306 |
Supplementary data files not provided |
SRA Run Selector |
Processed data are available on Series record |
Raw data are available in SRA |