Sample GSM1245589
Status Public on Sep 03, 2014
Title Genomic DNA of neurodevelopmental disorder patient CD8
Sample type SRA
 
Source name mononuclear blood cells
Organism Homo sapiens
Characteristics tissue type: non affected tissue
tissue: blood
Treatment protocol no treatment
Growth protocol primary tissue, no growth conditions
Extracted molecule genomic DNA
Extraction protocol Genomic DNA from whole blood was isolated using the QIAamp DNA Blood Midi Kit (Qiagen) according to the manufacturer's recommendations.
DNA-PET libraries were constructed for each sample (six patients and two family members) as follows. Genomic DNA was hydrosheared to 7-12 kb fragments. Long Mate Paired (LMP) cap adaptors were ligated to the hydrosheared and end-repaired DNA fragments. The cap adaptor-ligated DNA fragments were separated by agarose gel electrophoresis, recovered using the QIAEX II Gel Extraction Kit (QIAGEN), and circularized with a biotinylated adaptor that connects the cap adaptors at both ends of the DNA fragments. Because the cap adaptors lack 5’ phosphate groups, circularization left a nick on each strand. Both nicks were translated outwards by >50 bp into the circularized genomic DNA fragment by DNA polymerase I (NEB). The nick-translated constructs were then digested with T7 exonuclease and S1 nuclease (NEB) to release paired-end tag (PET) library constructs. These constructs were ligated with SOLiD sequencing adaptors P1 and P2 (Life Technologies) and amplified using 2x HF Phusion Master Mix (Finnzymes OY) for sequencing. High-throughput sequencing of the 2x50 bp libraries was performed on SOLiD sequencers (v3plus and v4) according to the manufacturer’s recommendations (Life Technologies). Sample CD15 was sequenced at 2x35 bp. Sequence tags were mapped to the human reference sequence (NCBI Build 36) and paired using the SOLiD System Analysis Pipeline Tool Bioscope, allowing up to 12 color code mismatches per 50 bp tag and up to 4 color code mismatches per 35 bp tag. For samples CD5 and CD8, two DNA size fractions were merged for library construction, which reduced the sensitivity for identifying small deletions.
 
Library strategy OTHER
Library source genomic
Library selection other
Instrument model AB SOLiD System 3.0
 
Data processing The SOLiD paired tags, designated R3 and F3, were mapped individually to the reference sequence (NCBI Build 36) in color space by the ABI SOLiD pipeline Bioscope v1.0.1 (Applied Biosystems). Contigs of the reference sequence with unresolved location (random_chr) and alternative MHC haplotypes were excluded from the reference for mapping because their high sequence similarity to other sequences in the reference caused ambiguous mapping.
Pairing and rescuing were performed according to the ABI SOLiD pipeline to generate paired-end tags (PETs). Based on the ABI SOLiD pairing report, we further separated all PETs into concordant PETs (cPETs) and discordant PETs (dPETs).
cPETs were defined as those where both tags mapped to the same chromosome and the same strand, in the correct 5’ to 3’ order, and within the expected span range. The span range was determined by a procedure based on the gradient of the insert size distribution. PETs that failed the cPET criteria were classified as dPETs. These were further split into five distinct categories: (i) two tags mapped to different chromosomes; (ii) two tags mapped to the same chromosome but to different strands; (iii) two tags mapped to the same chromosome but in the wrong order (5’ downstream of 3’); (iv) two tags mapped to the same chromosome and the same strand in the correct order, but with a span distance larger than 1.1x the maximum library size; (v) two tags mapped to the same chromosome and the same strand in the correct order, but with a span distance smaller than the minimum library size. Category (v) was excluded from further analysis.
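The classification rules above can be sketched in a few lines of Python. This is an illustrative sketch only, not the pipeline's code: the `Tag` type, the `classify_pet` function, and the `min_span`/`max_span` parameters are hypothetical names, and two points are assumptions rather than statements from the record. First, on the '-' strand the 5’ tag is assumed to lie at the larger coordinate. Second, spans between the maximum library size and 1.1x that size are not assigned to any category in the description, so the sketch reports them as "unassigned".

```python
from typing import NamedTuple


class Tag(NamedTuple):
    chrom: int   # chromosome number (23 = X, 24 = Y, 25 = M)
    strand: str  # '+' or '-'
    pos: int     # position 1 of the tag alignment


def classify_pet(five: Tag, three: Tag, min_span: int, max_span: int) -> str:
    """Return 'cPET', a dPET category 'i'..'v', or 'unassigned'."""
    if five.chrom != three.chrom:
        return "i"    # tags on different chromosomes
    if five.strand != three.strand:
        return "ii"   # same chromosome, different strands
    # Assumption: on the '-' strand the 5' tag has the larger coordinate.
    if five.strand == "+":
        span = three.pos - five.pos
    else:
        span = five.pos - three.pos
    if span < 0:
        return "iii"  # wrong order (5' downstream of 3')
    if span > 1.1 * max_span:
        return "iv"   # span larger than 1.1x the maximum library size
    if span < min_span:
        return "v"    # span smaller than the minimum library size
    if span > max_span:
        return "unassigned"  # not covered by the description (assumption)
    return "cPET"
```

For example, with a hypothetical library of 5,000-10,000 bp, `classify_pet(Tag(1, "+", 1000), Tag(1, "+", 20000), 5000, 10000)` falls into category (iv).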
A form of ‘normalization’ was carried out to convert PETs from one strand to the other, so that all dPETs were treated as having come from a single strand. For each discordant mapping, both the original mapping and its reverse complement mapping were considered. One of the two mappings was selected according to the following hierarchy of preferences: (i) the mapping with the lower chromosome number for the 5’ tag was chosen; (ii) if both chromosomes were the same, the mapping in which the 5’ tag mapped to the ‘+’ strand was chosen; (iii) if no such mapping existed, the mapping in which the 5’ tag had the smaller coordinate was chosen.
To cluster different dPETs that span the same fusion point, the following procedure was applied. The mapping locations of the 5’ and 3’ tags of a given dPET were extended by the maximum insert size of the respective genomic library in both directions, creating 5’ and 3’ windows. If the 5’ and 3’ tags of a second dPET mapped within the 5’ and 3’ windows of the first dPET, the two PETs were defined as a cluster of size 2, and the 5’ and 3’ windows were adjusted so that they contained the tag extensions (by the maximum library size) of the second dPET. dPETs whose 5’ and 3’ tags subsequently mapped within the 5’ and 3’ windows, respectively, were assigned to this cluster, and the windows were adjusted if necessary. The number of dPETs clustering together around a fusion point was represented by the cluster size. The genomic region covered by the 5’ tags of a cluster was defined as the 5’ anchor, and the genomic region covered by the 3’ tags of a cluster was defined as the 3’ anchor.
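A minimal greedy version of this windowing procedure might look as follows. It is a sketch under stated assumptions, not the authors' implementation: the `Cluster` class and `cluster_dpets` function are hypothetical, dPETs are assumed to be already normalized, a dPET is only allowed to join a cluster whose chromosomes and strands match (an assumption), and the result of a single pass like this depends on input order.

```python
from dataclasses import dataclass, field
from typing import Iterable, List, Tuple

# (chrom5, strand5, pos5, chrom3, strand3, pos3)
DPet = Tuple[int, str, int, int, str, int]


@dataclass
class Cluster:
    key: tuple        # (chrom5, strand5, chrom3, strand3); matching is an assumption
    win5: List[int]   # [start, end] of the 5' window
    win3: List[int]   # [start, end] of the 3' window
    pos5: List[int] = field(default_factory=list)
    pos3: List[int] = field(default_factory=list)

    @property
    def size(self) -> int:
        return len(self.pos5)

    @property
    def anchor5(self) -> Tuple[int, int]:
        return (min(self.pos5), max(self.pos5))  # region covered by 5' tags

    @property
    def anchor3(self) -> Tuple[int, int]:
        return (min(self.pos3), max(self.pos3))  # region covered by 3' tags


def cluster_dpets(dpets: Iterable[DPet], max_insert: int) -> List[Cluster]:
    clusters: List[Cluster] = []
    for c5, s5, p5, c3, s3, p3 in dpets:
        key = (c5, s5, c3, s3)
        for cl in clusters:
            if (cl.key == key
                    and cl.win5[0] <= p5 <= cl.win5[1]
                    and cl.win3[0] <= p3 <= cl.win3[1]):
                # Grow both windows to contain the new tag extensions.
                cl.win5 = [min(cl.win5[0], p5 - max_insert),
                           max(cl.win5[1], p5 + max_insert)]
                cl.win3 = [min(cl.win3[0], p3 - max_insert),
                           max(cl.win3[1], p3 + max_insert)]
                cl.pos5.append(p5)
                cl.pos3.append(p3)
                break
        else:
            # No existing cluster fits: start a new one around this dPET.
            clusters.append(Cluster(key,
                                    [p5 - max_insert, p5 + max_insert],
                                    [p3 - max_insert, p3 + max_insert],
                                    [p5], [p3]))
    return clusters
```

The `size`, `anchor5`, and `anchor3` properties correspond to the cluster size and the 5’ and 3’ anchors defined above.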
Genome_build: hg18
Supplementary_files_format_and_content: Column 1: cluster ID; Column 2: chromosome of the 5’ anchor mapping location (chr. 23 = chr. X; chr. 24 = chr. Y; chr. 25 = chr. M); Column 3: strand of the 5’ anchor mapping location; Column 4: start coordinate of the 5’ anchor mapping location; Column 5: end coordinate of the 5’ anchor mapping location; Column 6: distance between start and end of the 5’ anchor mapping location; Column 7: chromosome of the 3’ anchor mapping location; Column 8: strand of the 3’ anchor mapping location; Column 9: start coordinate of the 3’ anchor mapping location; Column 10: end coordinate of the 3’ anchor mapping location; Column 11: distance between start and end of the 3’ anchor mapping location; Column 12: dPET cluster size (number of PETs connecting both sides) based on non-redundant PETs; Column 13: dPET cluster size including redundant reads, which are likely derived from the same PCR product. For all coordinates, only position 1 of the alignment of each 50 bp or 35 bp sequence tag was considered.
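To make the column layout concrete, the following sketch parses one row of the clusters file into named fields. The record does not state the delimiter, the header layout, or the strand encoding, so the code assumes whitespace-delimited rows of exactly 13 fields, keeps the strand as an opaque string, and accepts chromosome codes with or without a "chr" prefix. `ClusterRow`, `chrom_label`, and `parse_cluster_line` are hypothetical names.

```python
from typing import NamedTuple

# Numeric chromosome codes given in the column description.
CHROM_NAMES = {23: "X", 24: "Y", 25: "M"}


class ClusterRow(NamedTuple):
    cluster_id: str
    chrom5: str
    strand5: str
    start5: int
    end5: int
    length5: int
    chrom3: str
    strand3: str
    start3: int
    end3: int
    length3: int
    size_nonredundant: int  # column 12
    size_redundant: int     # column 13


def chrom_label(code: str) -> str:
    """Map a numeric chromosome code to a label such as 'chrX'."""
    if code.lower().startswith("chr"):
        code = code[3:]
    number = int(code)
    return "chr" + CHROM_NAMES.get(number, str(number))


def parse_cluster_line(line: str) -> ClusterRow:
    """Parse one data row; assumes 13 whitespace-separated fields."""
    f = line.split()
    if len(f) != 13:
        raise ValueError(f"expected 13 columns, found {len(f)}")
    return ClusterRow(
        f[0], chrom_label(f[1]), f[2], int(f[3]), int(f[4]), int(f[5]),
        chrom_label(f[6]), f[7], int(f[8]), int(f[9]), int(f[10]),
        int(f[11]), int(f[12]),
    )
```

If the file has a header row or a different delimiter, the row would need to be skipped or split accordingly before calling `parse_cluster_line`; the file itself can be opened with `gzip.open(path, "rt")` from the standard library.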
 
Submission date Oct 18, 2013
Last update date May 15, 2019
Contact name Axel HILLMER
E-mail(s) ahillmer@uni-koeln.de
Organization name University of Cologne
Department Institute of Pathology
Street address Kerpener Str. 62
City Cologne
ZIP/Postal code 50937
Country Germany
 
Platform ID GPL9442
Series (1)
GSE51430 Genome sequencing coupled with iPSC technology identifies GTDC1 as a novel candidate gene involved in Neurodevelopmental Disorders
Relations
BioSample SAMN02378340
SRA SRX365506

Supplementary file Size Download File type/resource
GSM1245589_hillmer_DHB004.clusters.txt.gz 53.6 Kb (ftp)(http) TXT
Raw data are available in SRA
Processed data provided as supplementary file
