GEO Accession viewer

Sample GSM1419086

Status

Public on Feb 18, 2015

Title

Hi-C, Liver STL011, replicate one

Sample type

SRA

Source name

Liver STL011

Organism

Homo sapiens

Characteristics

tissue: Liver

Treatment protocol

None

Growth protocol

Tissue samples were obtained from deceased donors at the time of organ procurement at Barnes-Jewish Hospital (St. Louis, USA). Samples were flash-frozen and pulverized prior to formaldehyde cross-linking. Research consent was obtained from the donors' families, and the study was approved by Mid-American Transplant Services.

Extracted molecule

genomic DNA

Extraction protocol

Hi-C experiments were conducted using HindIII as described previously (Lieberman-Aiden, E. et al. Comprehensive mapping of long-range interactions reveals folding principles of the human genome. Science 326, 289-93 (2009)).
Sequencing libraries were constructed as described in the same publication (Lieberman-Aiden, E. et al. Science 326, 289-93 (2009)).

Library strategy

OTHER

Library source

genomic

Library selection

other

Instrument model

Illumina HiSeq 2000

Description

Sample 4

Data processing

library strategy: HaploSeq
fastq: Illumina's HiSeq Control Software
Hi-C reads were aligned to the hg18 (human) genome. Any bases in the genome that were genotyped as SNPs in the individual genome were masked to “N” in order to reduce reference-bias mapping artifacts. Hi-C reads were aligned as single-end reads using Novoalign. After mapping was finished, read pairs were reconstructed from the single reads using an in-house pipeline. Unmapped reads were filtered out and PCR duplicate reads were removed.
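The SNP masking step described above can be illustrated with a minimal Python sketch. It is not the submitters' pipeline: the function name `mask_snps`, the use of 1-based positions, and the toy sequence are assumptions made only for illustration.

```python
from typing import Iterable


def mask_snps(sequence: str, snp_positions: Iterable[int]) -> str:
    """Return a copy of `sequence` with each 1-based SNP position set to 'N'.

    Hypothetical helper: positions outside the sequence are ignored.
    """
    bases = list(sequence)
    for pos in snp_positions:
        if 1 <= pos <= len(bases):
            bases[pos - 1] = "N"  # convert 1-based position to 0-based index
    return "".join(bases)


if __name__ == "__main__":
    # Toy reference with SNPs genotyped at positions 2 and 5.
    print(mask_snps("ACGTACGT", [2, 5]))  # ANGTNCGT
```

Masking both alleles to the same character means a read carrying either allele incurs the same mismatch penalty at that site, which is the stated motivation for the step.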
Haplotypes were generated from the final aligned bam file using the HapCUT algorithm. The details of HapCUT are described previously (Bansal and Bafna, Bioinformatics 24, i153-159, 2008).
The final processed haplotypes were generated after removing local biases in the following three steps. First, alignment biases were removed by aligning simulated reads spanning the region surrounding each variant location. If there was more than a 5% difference between alleles, the variant locus was considered subject to an inherent mapping bias. Second, we removed alleles located in copy number variable regions and in allele-biased copy number variable regions by comparing the coverage between the two alleles based on WGS data. Any variant with coverage more than three standard deviations above the mean of each haplotype was excluded. Any variant showing biased WGS coverage between the two alleles was also excluded (binomial test p-value 0.05 after Benjamini correction). Lastly, we removed variants erroneously called as heterozygous during genotyping. We calculated the probability that each heterozygous variant was actually homozygous from the likelihood of observing the coverage on each allele in the whole-genome sequencing data. Only heterozygous SNPs with an FDR of less than 0.5% were included in downstream analysis.
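The allelic-coverage check in the second step can be sketched as follows. This is an illustration under stated assumptions, not the submitters' code: it assumes an exact two-sided binomial test against an expected allele ratio of 0.5, the Benjamini-Hochberg procedure as the "Benjamini correction", and exclusion of variants whose adjusted p-value falls below 0.05. All function names are hypothetical.

```python
from math import comb


def binom_two_sided_p(k: int, n: int, p: float = 0.5) -> float:
    """Exact two-sided binomial p-value: total probability of all outcomes
    that are no more likely than the observed count k."""
    if n == 0:
        return 1.0
    probs = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    threshold = probs[k] * (1 + 1e-7)  # small tolerance for float ties
    return min(1.0, sum(q for q in probs if q <= threshold))


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def balanced_variants(allele_counts, alpha=0.05):
    """Return indices of variants whose allele coverage is not significantly
    biased (adjusted p-value >= alpha). alpha=0.05 is taken from the text;
    the direction of the cutoff is an assumption."""
    pvals = [binom_two_sided_p(a, a + b) for a, b in allele_counts]
    adjusted = benjamini_hochberg(pvals)
    return [i for i, q in enumerate(adjusted) if q >= alpha]


if __name__ == "__main__":
    # (reads on allele 1, reads on allele 2) for three toy variants
    counts = [(15, 15), (30, 0), (14, 16)]
    print(balanced_variants(counts))  # [0, 2]
```

The second toy variant, with all 30 reads on one allele, is the only one dropped; the other two stay because their coverage is close to the 1:1 ratio expected for a heterozygous site.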
HaploSeq generates two haplotypes for each chromosome, one for the maternal allele and one for the paternal allele. Because we do not have parent-of-origin information for any donor genome, one allele is named P1 (parent1) and the other P2 (parent2). For chr9, haplotypes were generated separately for each chromosome arm. The chrX haplotypes for STL002 were generated independently based on the hg19 genome build.
Genome_build: hg18
Supplementary_files_format_and_content: The processed haplotypes for the individual genome ("*_haps.vcf") are available in VCF format.
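For readers who want to inspect the haplotype file, the sketch below reads one VCF data line and returns the two phased alleles. It assumes the records follow the standard VCF layout with a `GT` field and a phased separator (`|`) in the first sample column, and that the first listed allele corresponds to P1 and the second to P2. Neither assumption is confirmed by this record, and `phased_alleles` is a hypothetical helper.

```python
def phased_alleles(vcf_line: str):
    """Return (chrom, pos, first_allele, second_allele) for a phased diploid
    record, or None for headers, unphased calls, or malformed lines."""
    if vcf_line.startswith("#"):
        return None  # header or metadata line
    fields = vcf_line.rstrip("\n").split("\t")
    if len(fields) < 10:
        return None  # no FORMAT/sample columns
    chrom, pos, ref, alt = fields[0], int(fields[1]), fields[3], fields[4]
    keys = fields[8].split(":")
    values = fields[9].split(":")
    if "GT" not in keys:
        return None
    genotype = values[keys.index("GT")]
    if "|" not in genotype:
        return None  # '/' separator means the call is unphased
    alleles = [ref] + alt.split(",")
    try:
        first, second = (alleles[int(i)] for i in genotype.split("|"))
    except (ValueError, IndexError):
        return None  # missing allele ('.'), non-diploid, or bad index
    return chrom, pos, first, second


if __name__ == "__main__":
    line = "chr1\t1000\t.\tA\tG\t.\tPASS\t.\tGT\t0|1"
    print(phased_alleles(line))  # ('chr1', 1000, 'A', 'G')
```

To read the compressed supplementary file directly, the lines could be taken from `gzip.open(path, "rt")` and passed to the function one at a time.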

Submission date

Jun 24, 2014

Last update date

Feb 22, 2021

Contact name

Inkyung Jung

E-mail(s)

[email protected]

Organization name

KAIST

Department

Biological Sciences

Street address

KAIST, 291 Daehak-ro, Yuseong-gu

City

Daejeon

ZIP/Postal code

34141

Country

South Korea

Platform ID

GPL11154

Series (1)

GSE58752

Integrative analysis of haplotype-resolved epigenomes across human tissues

Relations

Reanalyzed by

BioSample

SRA

Supplementary file	Size	Download	File type/resource
GSM1419086_STL011_haps.vcf.gz	12.4 Mb	(ftp)(http)	VCF
Raw data are available in SRA
Processed data provided as supplementary file