Sample GSM1419086
Status Public on Feb 18, 2015
Title Hi-C, Liver STL011, replicate one
Sample type SRA
 
Source name Liver STL011
Organism Homo sapiens
Characteristics tissue: Liver
Treatment protocol None
Growth protocol Tissue samples were obtained from deceased donors at the time of organ procurement at the Barnes-Jewish Hospital (St. Louis, USA). Samples were flash frozen and pulverized prior to formaldehyde cross-linking. Research consent was obtained from the donors' families, and this study was approved by Mid-American Transplant Services.
Extracted molecule genomic DNA
Extraction protocol Hi-C experiments were conducted using HindIII as previously described (Lieberman-Aiden, E. et al. Comprehensive mapping of long-range interactions reveals folding principles of the human genome. Science 326, 289-93 (2009)).
Sequencing libraries were constructed as described in the same publication (Lieberman-Aiden, E. et al., 2009).
 
Library strategy OTHER
Library source genomic
Library selection other
Instrument model Illumina HiSeq 2000
 
Description Sample 4
Data processing library strategy: HaploSeq
fastq: Illumina's HiSeq Control Software
Hi-C reads were aligned to the hg18 (human) genome. Any bases in the genome that were genotyped as SNPs in the individual genome were masked to "N" in order to reduce reference-bias mapping artifacts. Hi-C reads were aligned as single-end reads using Novoalign. After mapping was finished, read pairs were reconstructed from the single reads using an in-house pipeline. Unmapped reads were filtered out and PCR duplicate reads were removed.
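The SNP masking step described above can be illustrated with a minimal Python sketch. This is not the submitters' pipeline: the function name `mask_snps`, the use of 1-based coordinates, and the toy sequence are assumptions made for illustration only.

```python
def mask_snps(sequence, snp_positions):
    """Return a copy of `sequence` with each SNP position replaced by 'N'.

    Positions are assumed to be 1-based, as in VCF files (an assumption;
    the record does not state the coordinate convention used).
    """
    bases = list(sequence)
    for pos in snp_positions:
        if not 1 <= pos <= len(bases):
            raise ValueError(f"SNP position {pos} is outside the sequence")
        bases[pos - 1] = "N"  # convert 1-based position to 0-based index
    return "".join(bases)


# Toy example: mask positions 2 and 5 of an 8-base sequence.
masked = mask_snps("ACGTACGT", [2, 5])
print(masked)
```

Because both alleles of a masked SNP mismatch the reference equally, reads carrying either allele are expected to align with the same penalty, which is the rationale for masking given in the record.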
Haplotypes were generated from the final aligned bam file using the HapCUT algorithm. The details of HapCUT are described previously (Bansal and Bafna, Bioinformatics 24, i153-159, 2008).
The final processed haplotypes were generated after removing local biases in the following three steps. First, alignment biases were removed by aligning simulated reads spanning the surrounding variant locations. If the difference between alleles exceeded 5%, the variant locus was considered subject to an inherent mapping bias. Second, we removed alleles located in copy number variable regions and allelically biased copy number variable regions by comparing the coverage between the two alleles based on WGS data. Any variants with coverage more than three standard deviations above the mean of each haplotype were excluded. Any variants showing biased WGS coverage between the two alleles were also excluded (binomial test, p-value 0.05 after Benjamini correction). Lastly, we removed variants erroneously called as heterozygous during genotyping. We calculated the probability that each heterozygous variant was actually homozygous from the likelihood of observing the coverage on each allele in the whole genome sequencing data. Only heterozygous SNPs with an FDR of less than 0.5% were included in downstream analysis.
HaploSeq generates two haplotypes for each chromosome, one for the maternal allele and one for the paternal allele. One allele is named P1 (parent1) and the other is named P2 (parent2), since we do not have information regarding the parent of origin in each donor genome. For chr9, haplotypes are generated for each chromosome arm. The chrX haplotypes of STL002 were generated independently, based on the hg19 genome build.
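The second filtering step (coverage outliers and allelic coverage bias) could be sketched roughly as follows. This is an illustrative reading of the description, not the submitters' code. All function names are hypothetical. The sketch assumes an exact two-sided binomial test against an expected allele ratio of 0.5, reads "Benjamini correction" as the Benjamini-Hochberg procedure, and treats an adjusted p-value below 0.05 as evidence of bias. The allele counts are invented toy data.

```python
import math
from statistics import mean, pstdev


def binom_two_sided_p(k, n):
    """Exact two-sided binomial p-value for k successes in n trials, p = 0.5."""
    if n == 0:
        return 1.0
    lo = min(k, n - k)
    tail = sum(math.comb(n, i) for i in range(lo + 1)) / 2 ** n
    return min(1.0, 2 * tail)  # symmetric null, so double the smaller tail


def bh_adjust(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def keep_variants(allele_counts, alpha=0.05, n_sd=3.0):
    """Return one boolean per variant: True if it passes both filters.

    allele_counts: list of (P1 coverage, P2 coverage) pairs from WGS.
    """
    p1 = [a for a, _ in allele_counts]
    p2 = [b for _, b in allele_counts]
    # Assumed interpretation: one cutoff per haplotype, mean + 3 SD.
    cut1 = mean(p1) + n_sd * pstdev(p1)
    cut2 = mean(p2) + n_sd * pstdev(p2)
    qvals = bh_adjust([binom_two_sided_p(a, a + b) for a, b in allele_counts])
    return [
        a <= cut1 and b <= cut2 and q >= alpha
        for (a, b), q in zip(allele_counts, qvals)
    ]


# Toy data: 20 balanced variants, one high-coverage outlier, one biased variant.
counts = [(15, 15)] * 20 + [(300, 300), (29, 1)]
keep = keep_variants(counts)
print(keep)
```

In this toy input the (300, 300) variant is dropped by the coverage cutoff and the (29, 1) variant by the allelic bias test, while the balanced variants are retained.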
Genome_build: hg18
Supplementary_files_format_and_content: The processed haplotypes for the individual genome ("*_haps.vcf") are available in VCF format.
 
Submission date Jun 24, 2014
Last update date Feb 22, 2021
Contact name Inkyung Jung
E-mail(s) [email protected]
Organization name KAIST
Department Biological Sciences
Street address KAIST, 291 Daehak-ro, Yuseong-gu
City Daejeon
ZIP/Postal code 34141
Country South Korea
 
Platform ID GPL11154
Series (1)
GSE58752 Integrative analysis of haplotype-resolved epigenomes across human tissues
Relations
Reanalyzed by GSE87112
Reanalyzed by GSE167200
BioSample SAMN02898031
SRA SRX641267

Supplementary file Size Download File type/resource
GSM1419086_STL011_haps.vcf.gz 12.4 Mb (ftp)(http) VCF
Raw data are available in SRA
Processed data provided as supplementary file
