Series GSE97211
Status Public on Apr 01, 2017
Title High-confidence Coding and Noncoding Transcriptome Maps
Sample organisms Homo sapiens; Mus musculus
Experiment type Expression profiling by high throughput sequencing
Third-party reanalysis
Summary The advent of high-throughput RNA sequencing (RNA-seq) has led to the discovery of unprecedentedly large transcriptomes encoded by eukaryotic genomes. However, transcriptome maps remain incomplete, partly because most were reconstructed from RNA-seq reads that lack orientation information (known as unstranded reads) and certain boundary information. Methods that expand the usability of unstranded RNA-seq data by predetermining the orientation of the reads and precisely determining the boundaries of assembled transcripts could significantly improve the quality of the resulting transcriptome maps. Here, we present a high-performing transcriptome assembly pipeline, called CAFE, that significantly improves original assemblies built from stranded and/or unstranded RNA-seq data by orienting unstranded reads using maximum likelihood estimation and by integrating information about transcription start sites and cleavage and polyadenylation sites. Applied to large-scale transcriptomic data comprising 230 billion RNA-seq reads from the ENCODE and Human BodyMap projects, The Cancer Genome Atlas, and GTEx, CAFE enabled us to predict the directions of about 220 billion unstranded reads, which led to the construction of more accurate transcriptome maps, comparable to the manually curated map, and a comprehensive lncRNA catalogue that includes thousands of novel lncRNAs. Our pipeline should help not only to build comprehensive, precise transcriptome maps from complex genomes but also to expand the universe of non-coding genomes.
 
Overall design The directions of unstranded reads (from ENCODE, Human BodyMap Projects, GTEx and TCGA, as well as from HeLa and mES cells) were predicted using k-order Markov chain (kMC) models, generating reads with predicted directions (RPDs), which were then used to assemble transcriptome maps (BIGTranscriptome). Those transcriptome maps were next used to quantify the RPDs.

The following site hosts the BAM files for this study:
http://big.hanyang.ac.kr/downloads/datasets/RPDs/
The individual paths to each Sample's BAM files are listed in the attached metadata sheet.
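
The direction-prediction step named in the overall design (k-order Markov chain models with a maximum likelihood decision) can be illustrated with a minimal Python sketch. This is not the CAFE implementation: the record does not state how the models were trained or scored, so the training input (sense-strand sequences), the add-one pseudocount, and all function names below are assumptions made purely for illustration.

```python
import math
from collections import defaultdict

# Translation table for complementing DNA bases (other symbols pass through).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def train_kmc(sequences, k, pseudocount=1.0):
    """Count (k-mer context -> next base) transitions in sense-strand sequences.

    Assumption: the model is trained on sequences of known orientation.
    """
    counts = defaultdict(lambda: defaultdict(float))
    for seq in sequences:
        for i in range(len(seq) - k):
            counts[seq[i:i + k]][seq[i + k]] += 1.0
    return {"k": k, "counts": counts, "pseudocount": pseudocount}


def log_likelihood(model, seq):
    """Log-likelihood of seq under the k-order chain, with smoothed transitions."""
    k = model["k"]
    pc = model["pseudocount"]
    total_ll = 0.0
    for i in range(len(seq) - k):
        context = model["counts"].get(seq[i:i + k], {})
        denom = sum(context.values()) + 4 * pc  # four possible next bases
        total_ll += math.log((context.get(seq[i + k], 0.0) + pc) / denom)
    return total_ll


def predict_direction(model, read):
    """Pick the orientation with the higher likelihood under the sense model."""
    forward = log_likelihood(model, read)
    reverse = log_likelihood(model, reverse_complement(read))
    return "+" if forward >= reverse else "-"
```

In this sketch a read is labelled `+` when it is more likely under the sense-strand model than its reverse complement is, and `-` otherwise. A real pipeline would likely also need to handle ambiguous bases, paired-end mates, and ties.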
 
Contributor(s) Nam J, You B, Yoon S
Citation(s) 28396519
Submission date Mar 29, 2017
Last update date Jul 01, 2017
Contact name Jin-Wu Nam
E-mail(s) jwnam@hanyang.ac.kr
Phone +82 2-2220-2428
Organization name Hanyang University
Department Department of Life Science
Street address Seongdong-Gu Hangdang-dong
City Seoul
ZIP/Postal code 133-791
Country South Korea
 
This SubSeries is part of SuperSeries:
GSE97212 The Project for High-Confidence Coding and Noncoding Transcriptome Maps
Relations
Reanalysis of GSM958728
Reanalysis of GSM958742
Reanalysis of GSM958747
Reanalysis of GSM958748
Reanalysis of GSM958733
Reanalysis of GSM958743
Reanalysis of GSM958735
Reanalysis of GSM958732
Reanalysis of GSM958744
Reanalysis of GSM958734
Reanalysis of GSM958729
Reanalysis of GSM958745
Reanalysis of GSM958736
Reanalysis of GSM958746
Reanalysis of GSM759491
Reanalysis of GSM759490
Reanalysis of GSM759493
Reanalysis of GSM759492
Reanalysis of GSM759495
Reanalysis of GSM759494
Reanalysis of GSM759497
Reanalysis of GSM759496
Reanalysis of GSM759499
Reanalysis of GSM759498
Reanalysis of GSM759501
Reanalysis of GSM759500
Reanalysis of GSM759503
Reanalysis of GSM759502
Reanalysis of GSM759505
Reanalysis of GSM759504
Reanalysis of GSM759507
Reanalysis of GSM759506
Reanalysis of GSM759509
Reanalysis of GSM759508
Reanalysis of GSM759511
Reanalysis of GSM759510
Reanalysis of GSM759513
Reanalysis of GSM759512
Reanalysis of GSM759515
Reanalysis of GSM759514
Reanalysis of GSM759517
Reanalysis of GSM759516
Reanalysis of GSM759519
Reanalysis of GSM759518
Reanalysis of GSM759521
Reanalysis of GSM759520
Reanalysis of GSM591659
Reanalysis of GSM2254467
Reanalysis of GSM591682
BioProject PRJNA381218

Download family Format
SOFT formatted family file(s) SOFT
MINiML formatted family file(s) MINiML
Series Matrix File(s) TXT

Supplementary file Size Download File type/resource
GSE97211_BIGTranscriptome-TS_Cerebellum.gtf.gz 38.2 Mb (ftp)(http) GTF
GSE97211_BIGTranscriptome-TS_Cortex.gtf.gz 36.0 Mb (ftp)(http) GTF
GSE97211_BIGTranscriptome-TS_ESCA.gtf.gz 41.6 Mb (ftp)(http) GTF
GSE97211_BIGTranscriptome-TS_Esophagus.gtf.gz 43.9 Mb (ftp)(http) GTF
GSE97211_BIGTranscriptome-TS_FrontalCortex.gtf.gz 35.3 Mb (ftp)(http) GTF
GSE97211_BIGTranscriptome-TS_HNSC.gtf.gz 36.2 Mb (ftp)(http) GTF
GSE97211_BIGTranscriptome-TS_Hippocampus.gtf.gz 34.9 Mb (ftp)(http) GTF
GSE97211_BIGTranscriptome-TS_Hypothalamus.gtf.gz 35.6 Mb (ftp)(http) GTF
GSE97211_BIGTranscriptome-TS_Kidney.gtf.gz 33.9 Mb (ftp)(http) GTF
GSE97211_BIGTranscriptome-TS_LIHC.gtf.gz 39.9 Mb (ftp)(http) GTF
GSE97211_BIGTranscriptome-TS_LUAD.gtf.gz 36.5 Mb (ftp)(http) GTF
GSE97211_BIGTranscriptome-TS_LUSC.gtf.gz 35.3 Mb (ftp)(http) GTF
GSE97211_BIGTranscriptome-TS_Liver.gtf.gz 36.9 Mb (ftp)(http) GTF
GSE97211_BIGTranscriptome-TS_Lung.gtf.gz 42.2 Mb (ftp)(http) GTF
GSE97211_BIGTranscriptome-TS_Ovary.gtf.gz 37.0 Mb (ftp)(http) GTF
GSE97211_BIGTranscriptome-TS_Pancreas.gtf.gz 41.5 Mb (ftp)(http) GTF
GSE97211_BIGTranscriptome-TS_Prostate.gtf.gz 38.2 Mb (ftp)(http) GTF
GSE97211_BIGTranscriptome-TS_Testis.gtf.gz 41.9 Mb (ftp)(http) GTF
GSE97211_BIGTranscriptome-TS_WholeBlood.gtf.gz 38.5 Mb (ftp)(http) GTF
GSE97211_BIGTranscriptome.gtf.gz 19.0 Mb (ftp)(http) GTF
GSE97211_BIGTranscriptome_lncRNA_catalog.gtf.gz 1.1 Mb (ftp)(http) GTF
GSE97211_ENCODE_GM12878_2x75_200_Rep1.assigned.count.txt.gz 243.0 Kb (ftp)(http) TXT
GSE97211_ENCODE_GM12878_2x75_200_Rep2.assigned.count.txt.gz 245.4 Kb (ftp)(http) TXT
GSE97211_ENCODE_GM12878_2x75_400_Rep2.assigned.count.txt.gz 233.4 Kb (ftp)(http) TXT
GSE97211_ENCODE_GM12891_2x75_200_Rep1.assigned.count.txt.gz 242.9 Kb (ftp)(http) TXT
GSE97211_ENCODE_GM12891_2x75_200_Rep2.assigned.count.txt.gz 239.6 Kb (ftp)(http) TXT
GSE97211_ENCODE_GM12892_2x75_200_Rep1.assigned.count.txt.gz 243.4 Kb (ftp)(http) TXT
GSE97211_ENCODE_GM12892_2x75_200_Rep2.assigned.count.txt.gz 240.0 Kb (ftp)(http) TXT
GSE97211_ENCODE_GM12892_2x75_200_Rep3.assigned.count.txt.gz 239.9 Kb (ftp)(http) TXT
GSE97211_ENCODE_H1-hESC_2x75_200_Rep1.assigned.count.txt.gz 245.3 Kb (ftp)(http) TXT
GSE97211_ENCODE_H1-hESC_2x75_200_Rep2.assigned.count.txt.gz 245.4 Kb (ftp)(http) TXT
GSE97211_ENCODE_H1-hESC_2x75_200_Rep3.assigned.count.txt.gz 245.0 Kb (ftp)(http) TXT
GSE97211_ENCODE_H1-hESC_2x75_200_Rep4.assigned.count.txt.gz 241.7 Kb (ftp)(http) TXT
GSE97211_ENCODE_H1-hESC_2x75_400_Rep1.assigned.count.txt.gz 238.6 Kb (ftp)(http) TXT
GSE97211_ENCODE_HSMM_2x75_200_Rep1.assigned.count.txt.gz 243.4 Kb (ftp)(http) TXT
GSE97211_ENCODE_HSMM_2x75_200_Rep2.assigned.count.txt.gz 243.8 Kb (ftp)(http) TXT
GSE97211_ENCODE_HUVEC_2x75_200_Rep1.assigned.count.txt.gz 239.4 Kb (ftp)(http) TXT
GSE97211_ENCODE_HUVEC_2x75_200_Rep2.assigned.count.txt.gz 237.7 Kb (ftp)(http) TXT
GSE97211_ENCODE_HeLa-S3_2x75_200_Rep1.assigned.count.txt.gz 238.0 Kb (ftp)(http) TXT
GSE97211_ENCODE_HeLa-S3_2x75_200_Rep2.assigned.count.txt.gz 241.2 Kb (ftp)(http) TXT
GSE97211_ENCODE_HepG2_2x75_200_Rep1.assigned.count.txt.gz 240.8 Kb (ftp)(http) TXT
GSE97211_ENCODE_HepG2_2x75_200_Rep2.assigned.count.txt.gz 241.1 Kb (ftp)(http) TXT
GSE97211_ENCODE_K562_2x75_200_Rep1.assigned.count.txt.gz 245.7 Kb (ftp)(http) TXT
GSE97211_ENCODE_K562_2x75_200_Rep2.assigned.count.txt.gz 246.2 Kb (ftp)(http) TXT
GSE97211_ENCODE_MCF-7_2x75_200_Rep1.assigned.count.txt.gz 246.8 Kb (ftp)(http) TXT
GSE97211_ENCODE_MCF-7_2x75_200_Rep2.assigned.count.txt.gz 242.0 Kb (ftp)(http) TXT
GSE97211_ENCODE_MCF-7_2x75_200_Rep3.assigned.count.txt.gz 245.4 Kb (ftp)(http) TXT
GSE97211_ENCODE_NHEK_2x75_200_Rep1.assigned.count.txt.gz 240.9 Kb (ftp)(http) TXT
GSE97211_ENCODE_NHEK_2x75_200_Rep2.assigned.count.txt.gz 241.9 Kb (ftp)(http) TXT
GSE97211_ENCODE_NHLF_2x75_200_Rep1.assigned.count.txt.gz 242.2 Kb (ftp)(http) TXT
GSE97211_ENCODE_NHLF_2x75_200_Rep2.assigned.count.txt.gz 240.3 Kb (ftp)(http) TXT
GSE97211_GTEx_Brain_cerebellum.assigned.count.tar.gz 33.7 Mb (ftp)(http) TAR
GSE97211_GTEx_Brain_cortex.assigned.count.tar.gz 30.5 Mb (ftp)(http) TAR
GSE97211_GTEx_Brain_frontal_cortex.assigned.count.tar.gz 26.9 Mb (ftp)(http) TAR
GSE97211_GTEx_Brain_hippocampus.assigned.count.tar.gz 23.7 Mb (ftp)(http) TAR
GSE97211_GTEx_Brain_hypothalamus.assigned.count.tar.gz 24.1 Mb (ftp)(http) TAR
GSE97211_GTEx_Esophagus_Mucosa.assigned.count.tar.gz 77.3 Mb (ftp)(http) TAR
GSE97211_GTEx_Kidney.assigned.count.tar.gz 8.6 Mb (ftp)(http) TAR
GSE97211_GTEx_Liver.assigned.count.tar.gz 30.9 Mb (ftp)(http) TAR
GSE97211_GTEx_Lung.assigned.count.tar.gz 85.4 Mb (ftp)(http) TAR
GSE97211_GTEx_Ovary.assigned.count.tar.gz 25.2 Mb (ftp)(http) TAR
GSE97211_GTEx_Pancreas.assigned.count.tar.gz 45.3 Mb (ftp)(http) TAR
GSE97211_GTEx_Prostate.assigned.count.tar.gz 28.4 Mb (ftp)(http) TAR
GSE97211_GTEx_Testis.assigned.count.tar.gz 49.6 Mb (ftp)(http) TAR
GSE97211_GTEx_Whole_blood.assigned.count.tar.gz 100.6 Mb (ftp)(http) TAR
GSE97211_HeLa-S3_paired_end.assigned.count.txt.gz 95.2 Kb (ftp)(http) TXT
GSE97211_HeLa-S3_transcripts.gtf.gz 2.8 Mb (ftp)(http) GTF
GSE97211_Human_BodyMap_Adipose_paired_end.assigned.count.txt.gz 246.4 Kb (ftp)(http) TXT
GSE97211_Human_BodyMap_Adipose_single_end.assigned.count.txt.gz 245.7 Kb (ftp)(http) TXT
GSE97211_Human_BodyMap_Adrenal_paired_end.assigned.count.txt.gz 250.9 Kb (ftp)(http) TXT
GSE97211_Human_BodyMap_Adrenal_single_end.assigned.count.txt.gz 250.4 Kb (ftp)(http) TXT
GSE97211_Human_BodyMap_Brain_paired_end.assigned.count.txt.gz 250.9 Kb (ftp)(http) TXT
GSE97211_Human_BodyMap_Brain_single_end.assigned.count.txt.gz 249.2 Kb (ftp)(http) TXT
GSE97211_Human_BodyMap_Breast_paired_end.assigned.count.txt.gz 249.1 Kb (ftp)(http) TXT
GSE97211_Human_BodyMap_Breast_single_end.assigned.count.txt.gz 248.7 Kb (ftp)(http) TXT
GSE97211_Human_BodyMap_Colon_paired_end.assigned.count.txt.gz 246.9 Kb (ftp)(http) TXT
GSE97211_Human_BodyMap_Colon_single_end.assigned.count.txt.gz 246.2 Kb (ftp)(http) TXT
GSE97211_Human_BodyMap_Heart_paired_end.assigned.count.txt.gz 245.5 Kb (ftp)(http) TXT
GSE97211_Human_BodyMap_Heart_single_end.assigned.count.txt.gz 244.2 Kb (ftp)(http) TXT
GSE97211_Human_BodyMap_Kidney_paired_end.assigned.count.txt.gz 249.1 Kb (ftp)(http) TXT
GSE97211_Human_BodyMap_Kidney_single_end.assigned.count.txt.gz 248.4 Kb (ftp)(http) TXT
GSE97211_Human_BodyMap_Liver_paired_end.assigned.count.txt.gz 242.3 Kb (ftp)(http) TXT
GSE97211_Human_BodyMap_Liver_single_end.assigned.count.txt.gz 241.4 Kb (ftp)(http) TXT
GSE97211_Human_BodyMap_Lung_paired_end.assigned.count.txt.gz 248.4 Kb (ftp)(http) TXT
GSE97211_Human_BodyMap_Lung_single_end.assigned.count.txt.gz 248.1 Kb (ftp)(http) TXT
GSE97211_Human_BodyMap_LymphNode_paired_end.assigned.count.txt.gz 248.7 Kb (ftp)(http) TXT
GSE97211_Human_BodyMap_LymphNode_single_end.assigned.count.txt.gz 248.2 Kb (ftp)(http) TXT
GSE97211_Human_BodyMap_Ovary_paired_end.assigned.count.txt.gz 251.6 Kb (ftp)(http) TXT
GSE97211_Human_BodyMap_Ovary_single_end.assigned.count.txt.gz 251.2 Kb (ftp)(http) TXT
GSE97211_Human_BodyMap_Prostate_paired_end.assigned.count.txt.gz 250.9 Kb (ftp)(http) TXT
GSE97211_Human_BodyMap_Prostate_single_end.assigned.count.txt.gz 250.6 Kb (ftp)(http) TXT
GSE97211_Human_BodyMap_SkeletalMuscle_paired_end.assigned.count.txt.gz 239.8 Kb (ftp)(http) TXT
GSE97211_Human_BodyMap_SkeletalMuscle_single_end.assigned.count.txt.gz 239.5 Kb (ftp)(http) TXT
GSE97211_Human_BodyMap_Testis_paired_end.assigned.count.txt.gz 259.9 Kb (ftp)(http) TXT
GSE97211_Human_BodyMap_Testis_single_end.assigned.count.txt.gz 259.2 Kb (ftp)(http) TXT
GSE97211_Human_BodyMap_Thyroid_paired_end.assigned.count.txt.gz 251.4 Kb (ftp)(http) TXT
GSE97211_Human_BodyMap_Thyroid_single_end.assigned.count.txt.gz 250.7 Kb (ftp)(http) TXT
GSE97211_Human_BodyMap_WhiteBloodCell_paired_end.assigned.count.txt.gz 243.1 Kb (ftp)(http) TXT
GSE97211_Human_BodyMap_WhiteBloodCell_single_end.assigned.count.txt.gz 242.6 Kb (ftp)(http) TXT
GSE97211_TCGA_Esophagus_Adenocarcinoma_NOS.assigned.count.tar.gz 24.0 Mb (ftp)(http) TAR
GSE97211_TCGA_Esophagus_Squamous_Cell_Carcinoma.assigned.count.tar.gz 23.8 Mb (ftp)(http) TAR
GSE97211_TCGA_Head_and_Neck_Squamous_Cell_Carcinoma.assigned.count.tar.gz 131.9 Mb (ftp)(http) TAR
GSE97211_TCGA_Liver_Hepatocellular_Carcinoma.assigned.count.tar.gz 98.5 Mb (ftp)(http) TAR
GSE97211_TCGA_Lung_Adenocarcinoma.assigned.count.tar.gz 143.4 Mb (ftp)(http) TAR
GSE97211_TCGA_Lung_Squamous_Cell_Carcinoma.assigned.count.tar.gz 133.1 Mb (ftp)(http) TAR
GSE97211_mES_paired_end.assigned.count.txt.gz 93.8 Kb (ftp)(http) TXT
GSE97211_sample_meta_paths_to_rpds_files.xls.gz 26.2 Kb (ftp)(http) XLS
Processed data are available on Series record
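
As a rough guide to working with the GTF supplementary files listed above, the following Python sketch tallies distinct transcripts per strand. It assumes only the standard nine-column, tab-separated GTF layout with a `transcript_id "..."` attribute. The function names are hypothetical, and the content of the BIGTranscriptome files has not been inspected here.

```python
import gzip
import re
from collections import Counter

# Matches the transcript_id attribute in the ninth GTF column.
TRANSCRIPT_ID = re.compile(r'transcript_id "([^"]+)"')


def transcripts_by_strand(lines):
    """Count distinct transcript IDs per strand from an iterable of GTF lines."""
    strand_of = {}
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment/header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # not a well-formed GTF record
        match = TRANSCRIPT_ID.search(fields[8])
        if match:
            # Record the strand (column 7) the first time a transcript is seen.
            strand_of.setdefault(match.group(1), fields[6])
    return Counter(strand_of.values())


def transcripts_by_strand_from_file(path):
    """Same tally, reading a gzip-compressed GTF file from disk."""
    with gzip.open(path, "rt") as handle:
        return transcripts_by_strand(handle)
```

For example, `transcripts_by_strand_from_file("GSE97211_BIGTranscriptome.gtf.gz")` would return a `Counter` keyed by strand symbol, assuming the file has been downloaded to the working directory.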
