Sample GSM3038848
Status Public on Jun 20, 2018
Title Chimpanzee_RepliSeq
Sample type SRA
 
Source name Chimpanzee lymphocyte
Organism Pan troglodytes
Characteristics cell type: lymphocyte
Growth protocol GM12878 growth conditions are described at https://data.4dnucleome.org/biosources/4DNSRH17RFKR/ and https://data.4dnucleome.org/biosamples/4DNBS3I5U7BY/. For the lymphocyte cells of chimpanzee, orangutan, gibbon and green monkey, all suspensions (lymphocytes) were cultured using RPMI, 10% FBS, 1% P/S, 1% L-glut. Cultures were maintained in T25 flasks at 37˚C, 5% CO2 and passaged to maintain adequate confluency.
Extracted molecule genomic DNA
Extraction protocol https://data.4dnucleome.org/protocols/5e017160-ad8b-49c1-a129-ccbd29966b6d/
 
Library strategy OTHER
Library source genomic
Library selection other
Instrument model Illumina HiSeq 2500
 
Description Files named *.24k.norm.bw contain imputed, smoothed and scaled processed data. human.R1.6k.bw is an intermediate processed file with non-imputed, non-smoothed and non-scaled data. Human has two replicates, R1 and R2. Raw files with IDs 1108 and 1208 are for R2. Raw files with IDs 0908 and 1008 are for R1. For human, only R2 is used in the study and has imputed, smoothed and scaled processed data.
Data processing Library strategy: RepliSeq
Quality control of the FASTQ files was performed, with low-quality reads trimmed, using FastQC.
The adapter sequences were removed using FASTX-Toolkit with parameters fastx_clipper -Q 33 -v -n -l 10. Reads were mapped with Bowtie2 to the genome of the corresponding species: hg19 for human, and panTro4, ponAbe2, nomLeu3 and chlSab2 for chimpanzee, orangutan, gibbon and green monkey, respectively.
SAM to BAM format conversion, sorting of the BAM file and removal of duplicate reads were performed using Samtools. The BAM file was converted to a BED file using bamToBed.
The reference genome hg19 was divided into 6 kb bins, and each bin was mapped from the human genome to the genome of each of the other primate species using liftOver to obtain its orthologous region in that species, with the command: liftOver oldFile map.chain newFile unMapped -minMatch=0.5
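The binning step can be sketched as follows. This is a minimal illustration, not the submitters' code: the function name `make_bins`, the `chrom_sizes` dictionary and the choice to keep a shorter final bin at each chromosome end are all assumptions. It produces BED-style (0-based, half-open) intervals of the kind that could be written to the `oldFile` passed to liftOver.

```python
def make_bins(chrom_sizes, bin_size=6000):
    """Split each chromosome into consecutive fixed-size bins.

    chrom_sizes: dict mapping chromosome name to its length in bp
                 (hypothetical input; real sizes would come from hg19).
    Returns a list of (chrom, start, end) tuples in BED coordinates.
    The last bin of a chromosome may be shorter than bin_size
    (an assumption; the record does not say how ends were handled).
    """
    bins = []
    for chrom, length in chrom_sizes.items():
        for start in range(0, length, bin_size):
            end = min(start + bin_size, length)
            bins.append((chrom, start, end))
    return bins


if __name__ == "__main__":
    # Toy chromosome length, for illustration only.
    for chrom, start, end in make_bins({"chrToy": 15000}):
        print(f"{chrom}\t{start}\t{end}")
```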
Read count in each given genome window (orthologous region) was calculated and normalized by the total genome-wide read count of the replication timing (RT) early or late fraction, respectively.
The base-2 logarithm of the ratio of read count per million reads between the early and late fractions of RT within a genome window was calculated as the original RT signal of this window.
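The normalization and log ratio described above can be written out as a short sketch. The function name `rt_signal` is hypothetical, the totals are taken as the sum over the supplied windows, and treating windows with a zero count in either fraction as missing (`None`) is an assumption, since the record does not state how such windows were handled.

```python
import math


def rt_signal(early_counts, late_counts):
    """Per-window log2(early RPM / late RPM).

    early_counts, late_counts: read counts per window, same length,
    for the early and late fractions respectively.
    Windows with a zero count in either fraction are returned as None
    (an assumption, not taken from the record).
    """
    early_total = sum(early_counts)
    late_total = sum(late_counts)
    signal = []
    for e, l in zip(early_counts, late_counts):
        if e == 0 or l == 0:
            signal.append(None)
            continue
        early_rpm = e / early_total * 1e6  # reads per million, early fraction
        late_rpm = l / late_total * 1e6    # reads per million, late fraction
        signal.append(math.log2(early_rpm / late_rpm))
    return signal


if __name__ == "__main__":
    # Toy counts: positive values indicate early replication, negative late.
    print(rt_signal([10, 30, 0], [20, 20, 5]))
```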
Missing data within gaps smaller than 48 kb were imputed using nearest-neighbor imputation.
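One way to read this imputation rule on a track of 6 kb bins is sketched below. The function name `impute_nearest`, the use of `None` for missing bins, the restriction to gaps flanked by data on both sides, and breaking distance ties toward the left neighbor are all assumptions made for illustration; the record only states the gap threshold and the method name.

```python
def impute_nearest(signal, bin_size=6000, max_gap=48000):
    """Fill short runs of missing bins with the nearest observed value.

    signal: list of floats, with None marking missing bins.
    A run of missing bins is filled only if its total length in bp is
    smaller than max_gap and it has observed bins on both sides
    (the both-sides requirement is an assumption).
    """
    out = list(signal)
    n = len(signal)
    i = 0
    while i < n:
        if signal[i] is not None:
            i += 1
            continue
        # Find the end of this run of missing bins: [i, j).
        j = i
        while j < n and signal[j] is None:
            j += 1
        gap_bp = (j - i) * bin_size
        if i > 0 and j < n and gap_bp < max_gap:
            for k in range(i, j):
                left_dist = k - (i - 1)
                right_dist = j - k
                # Ties go to the left neighbor (an arbitrary choice).
                out[k] = signal[i - 1] if left_dist <= right_dist else signal[j]
        i = j
    return out


if __name__ == "__main__":
    print(impute_nearest([1.0, None, None, None, 4.0]))
```

With the default 6 kb bins, a run of 7 missing bins (42 kb) would be filled, while a run of 8 (48 kb) would be left as missing.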
RT signals of each species were smoothed with wavelet smoothing using HMMSeg with window size of 24kb.
Orthologous regions across the five species where RT signals of all five species are available were extracted. RT signals in these regions were scaled to [0,5] for early RT and [-5,0] for late RT for each species.
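The record does not give the scaling formula. A minimal sketch of one plausible reading is shown below, in which positive (early) values are divided by the largest positive value and negative (late) values by the magnitude of the most negative value, each then multiplied by 5. The function name `scale_rt` and this linear, per-sign scaling are assumptions, not the submitters' documented method.

```python
def scale_rt(signal, bound=5.0):
    """Linearly rescale one species' RT signal to [-bound, bound].

    Positive (early) values map to (0, bound], negative (late) values
    map to [-bound, 0), and zeros stay at 0. The per-sign linear
    scaling is an assumption made for illustration.
    """
    max_pos = max((v for v in signal if v > 0), default=0.0)
    min_neg = min((v for v in signal if v < 0), default=0.0)
    scaled = []
    for v in signal:
        if v > 0:
            scaled.append(bound * v / max_pos)
        elif v < 0:
            scaled.append(-bound * v / min_neg)
        else:
            scaled.append(0.0)
    return scaled


if __name__ == "__main__":
    print(scale_rt([2.0, 1.0, 0.0, -1.0, -4.0]))
```

Scaling each sign separately keeps zero fixed, so the early/late boundary in the log2 ratio is preserved after rescaling.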
Genome_build: hg19
Genome_build: panTro4
Genome_build: ponAbe2
Genome_build: nomLeu3
Genome_build: chlSab2
Supplementary_files_format_and_content: bigWig files were generated in which the value column represents the RT signal (log2 ratio of read count per million reads between the early and late fractions of RT within the orthologous region). The region in each file is the aligned region in the reference genome (hg19). human.R1.6k.bw is an intermediate processed file for human replicate 1, for which the RT signals were non-imputed, non-smoothed and non-scaled.
 
Submission date Mar 12, 2018
Last update date Jun 20, 2018
Contact name Yang Yang
E-mail(s) yy3@andrew.cmu.edu
Organization name Carnegie Mellon University
Department Computational Biology
Street address 5000 Forbes Ave
City Pittsburgh
State/province PA
ZIP/Postal code 15213
Country USA
 
Platform ID GPL19148
Series (1)
GSE111733 Continuous-trait probabilistic model for comparing multi-species functional genomic data
Relations
BioSample SAMN08688287
SRA SRX3783218

Supplementary file Size Download File type/resource
GSM3038848_chimp.s24k.norm.bw 4.1 Mb (ftp)(http) BW
Raw data are available in SRA
Processed data provided as supplementary file
