GEO Accession viewer

Sample GSM1378957

Status

Public on May 01, 2017

Title

rhesus dsRNA-seq

Sample type

SRA

Source name

brain tissue

Organism

Macaca mulatta

Characteristics

library strategy: dsRNA-seq
enzyme treatment: ssRNase in vitro
ngs platform: Illumina Hi-Seq2000 50 nt strand-specific
tissue: brain

Treatment protocol

For all the in vivo dsRNA-seq and ssRNA-seq libraries, a 37% formaldehyde solution (Sigma, St. Louis, MO) was added dropwise, with mixing, directly to cell culture dishes containing 90% confluent cells to a final concentration of 1%, and the cells were incubated at room temperature for 10 minutes to cross-link the RNAs with proteins. The cross-linking reaction was quenched with 1 M glycine (Sigma, St. Louis, MO) at a final concentration of 125 mM for 5 minutes with mixing. The cells were then washed twice with ice-cold PBS and collected.

Growth protocol

For in vivo HEK293T dsRNA-seq and ssRNA-seq libraries, HEK293T cells were seeded in 15-cm standard Corning tissue-culture-treated dishes (Sigma, St. Louis, MO) and grown to 90% confluence (approximately 18 million cells) in DMEM medium (Life Technologies, San Diego, CA) supplemented with L-glutamine, 4.5 g/L D-glucose, 10% fetal bovine serum (FBS; Atlanta Biologics, Atlanta, GA) and Pen/Strep (Fisher Scientific, Waltham, MA).

Extracted molecule

total RNA

Extraction protocol

For all the in vitro dsRNA-seq and ssRNA-seq libraries, RNAs were treated with ssRNase (RNaseOne) and dsRNase (RNase V1), respectively. For mRNA-seq libraries, poly A+ RNAs were isolated from total RNA using the Dynabeads mRNA direct kit (Ambion, Austin, TX). For smRNA-seq libraries, RNAs were prepared as previously described in F. Li et al., The Plant Cell 24, 4346. Briefly, total RNA was run on a 15% TBE-Urea gel, and gel slices containing 15-35 nt RNAs were isolated and eluted from the acrylamide by mixing with 0.3 M NaCl for 4 hours at room temperature.
The dsRNA-seq, ssRNA-seq, mRNA-seq and smRNA-seq libraries were constructed as previously described in F. Li et al., The Plant Cell 24, 4346. The 3' and 5' adapters, reverse transcriptase primer, 5' PCR primer, and 3' PCR primers matching the TruSeq small RNA sequencing kit (Illumina, San Diego, CA) were synthesized and HPLC purified by IDT (Coralville, IA).

Library strategy

OTHER

Library source

transcriptomic

Library selection

other

Instrument model

Illumina HiSeq 2000

Description

dsRNA (double-stranded RNA)
RNA is treated with ssRNase (RNaseOne) in vitro
rhesus_ensGene_mRNA_structScore.txt
rhesus_ensGene_mRNA_struct_hotspot_anno.txt

Data processing

Quality control of the reads. The quality scores of the raw reads were checked for their average, minimum, maximum and standard deviation to make sure that all libraries were of good and comparable quality. The correct labeling of 3'-adapters and multiplex indices was also checked to confirm the identity of the libraries.
Trimming of 3'-adapters. Reads were trimmed to remove 3'-adapters using the Cutadapt (v1.0) program, requiring ≤10% mismatches and ≥10 nt of aligned bases. Reads ≤15 nt after trimming were discarded before all subsequent analyses. Untrimmed reads were kept separately from trimmed reads and processed independently in the following steps.
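The trimming criteria above can be illustrated with a short Python sketch. It is not Cutadapt: it counts substitutions only (no indels), and the function names, the example adapter sequence and the choice to apply the length cutoff to trimmed reads only are assumptions made for this illustration.

```python
# Example adapter sequence (an assumption; substitute the 3' adapter actually used).
EXAMPLE_ADAPTER = "TGGAATTCTCGGGTGCCAAGG"


def trim_3p_adapter(read, adapter, max_error_rate=0.10, min_overlap=10):
    """Return (sequence, was_trimmed) for the leftmost acceptable adapter match.

    A match needs at least `min_overlap` aligned bases and at most
    `max_error_rate` * overlap substitutions (indels are ignored here).
    """
    for start in range(len(read)):
        overlap = min(len(adapter), len(read) - start)
        if overlap < min_overlap:
            break  # every later start has an even shorter overlap
        window = read[start:start + overlap]
        mismatches = sum(a != b for a, b in zip(window, adapter))
        if mismatches <= max_error_rate * overlap:
            return read[:start], True
    return read, False


def trim_and_filter(reads, adapter=EXAMPLE_ADAPTER, min_length=16):
    """Split reads into (trimmed, untrimmed); drop trimmed reads of 15 nt or less."""
    trimmed, untrimmed = [], []
    for read in reads:
        seq, was_trimmed = trim_3p_adapter(read, adapter)
        if was_trimmed:
            if len(seq) >= min_length:
                trimmed.append(seq)
        else:
            untrimmed.append(read)
    return trimmed, untrimmed
```

The two returned lists correspond to the trimmed and untrimmed read sets that are processed independently in the later steps.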
Reducing to NR-tags. Both trimmed and untrimmed reads were reduced to non-redundant tags (NR-tags) to save processing time and storage space. The clone abundance of each NR-tag was retained and used in all subsequent abundance calculations in this study; the terms "reads" and "NR-tags" are therefore used interchangeably hereafter.
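A minimal Python sketch of this reduction, assuming reads are plain sequence strings (the function name is hypothetical and not taken from the original pipeline):

```python
from collections import Counter


def collapse_to_nr_tags(reads):
    """Map each distinct read sequence (NR-tag) to its clone abundance."""
    return Counter(reads)


nr_tags = collapse_to_nr_tags(["ACGTACGT", "ACGTACGT", "TTGGCCAA"])
for sequence, clone_abundance in nr_tags.items():
    print(sequence, clone_abundance)
```

Each distinct sequence then needs to be aligned only once, while its count is carried along for the abundance calculations.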
Mapping to primate genomes. NR-tags (reads) were mapped with the Bowtie (v0.12.9) program to the UCSC hg19 genome for human and to the UCSC rheMac2 genome for rhesus and cynomolgus. Note that cynomolgus only has a draft genome release and no corresponding gene model annotations, so we used the rhesus reference genome and genome annotation for all cynomolgus analyses in this study. The Bowtie mapping parameters were tuned to ensure that all alignments with ≤6% mismatches in the first 34 nt (the seed) and ≤8% mismatches in the whole read were reported, and up to 100 random mapping hits were allowed. Note that for GMUCT reads, up to 10 random hits were allowed. Also note that for all cynomolgus libraries, the mismatch criteria were 2 percentage points looser (8% for the seed and 10% for the entire read), because these reads were mapped to a slightly divergent genome instead of their own. All mapping results were post-filtered with in-house programs to guarantee the mapping criteria listed above.
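The post-filtering criteria can be sketched in Python as follows. This is an illustration only, not the in-house program: the function name and the input representation (0-based mismatch positions within the read) are assumptions.

```python
def passes_mismatch_criteria(mismatch_positions, read_length,
                             seed_length=34, max_seed_pct=6, max_read_pct=8):
    """Check the seed and whole-read mismatch limits for one alignment.

    `mismatch_positions` holds 0-based offsets of mismatches in the read.
    Integer arithmetic avoids floating-point rounding at the thresholds.
    """
    seed_len = min(seed_length, read_length)
    seed_mismatches = sum(1 for pos in mismatch_positions if pos < seed_len)
    total_mismatches = len(mismatch_positions)
    seed_ok = seed_mismatches * 100 <= max_seed_pct * seed_len
    read_ok = total_mismatches * 100 <= max_read_pct * read_length
    return seed_ok and read_ok


# Human and rhesus criteria (6% seed, 8% whole read) on a 50-nt read.
print(passes_mismatch_criteria([0, 10, 40, 45], read_length=50))

# Looser cynomolgus criteria (8% seed, 10% whole read).
print(passes_mismatch_criteria([0, 10, 40, 45, 49], read_length=50,
                               max_seed_pct=8, max_read_pct=10))
```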
For reads with multiple hits on the genomes (multiply mapped reads), a max-diverge filter was also implemented to select only those hits with a mismatch percentage differing by no more than 4% from the best hit of each read. This filter is similar to UCSC BLAT and the Bowtie “--best” mode. All mapping information, such as insert length, clone abundance, and number of locations on the genome, was summarized and loaded into local MySQL databases for further queries.
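One possible reading of the max-diverge filter is sketched below in Python. It assumes that "no more than 4% to the best hit" means a mismatch percentage within 4 percentage points of the read's best hit; the function name and the (location, mismatch percentage) tuple layout are hypothetical.

```python
def max_diverge_filter(hits, max_diverge_pct=4.0):
    """Keep hits whose mismatch percentage is close enough to the best hit.

    `hits` is a list of (location, mismatch_pct) tuples for a single read.
    """
    if not hits:
        return []
    best_pct = min(pct for _, pct in hits)
    return [hit for hit in hits if hit[1] - best_pct <= max_diverge_pct]


hits = [("chr1:1000", 0.0), ("chr5:2500", 2.0), ("chr9:700", 6.0)]
print(max_diverge_filter(hits))
```

Under this reading, a read with a perfect hit keeps only its near-perfect alternative locations, while more divergent hits are dropped.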
Mapping across transcript splicing boundaries. To map across splicing boundaries, we first compiled our own GFF annotation files of transcript exons, including coding mRNAs (RefSeq-mRNA for human, Ensembl-mRNA for rhesus and cynomolgus), all non-coding RNAs (Rfam-annotated structural RNAs for human and Ensembl-ncRNAs for rhesus and cynomolgus), and also lincRNAs for human. The unmapped reads from the previous steps were collected and remapped to the genomes, with these GFF annotations supplied, using the TopHat (v2.0.8) program. The parameters were tuned to meet exactly the same criteria as for the Bowtie-mapped reads. Finally, the TopHat hits (spliced reads) were also passed through the max-diverge filter, summarized, and stored in the local MySQL databases.
Genome_build: UCSC hg19 for human, UCSC rheMac2 for rhesus and cynomolgus libraries
Supplementary_files_format_and_content: The "*_expr.txt" files are tab-delimited text files containing the expression values of all primate mRNA transcripts as raw (weighted) clone numbers, RPM and RPKM values.
Supplementary_files_format_and_content: The "*_structScore.txt" files are tab-delimited text files containing the calculated "structure score" of every nucleotide of all primate mRNA transcripts with sufficient dsRNA-seq and ssRNA-seq coverage, as well as other pertinent information such as their gene length, mRNA length, and (weighted) clone number of the dsRNA-seq and ssRNA-seq reads.
Supplementary_files_format_and_content: The "*_struct_hotspot_anno.txt" files are tab-delimited text files containing annotation for all predicted "structure hotspots", defined as long regions with significantly high structure scores in the three primates. They also contain other pertinent information, such as the class, summit, average and maximum structure score of the hotspots, and RPKM values from mRNA-seq, smRNA-seq and public GMUCT libraries for the entire mRNAs or for just the structure hotspot regions. Information on whether the hotspots are bound by the DGCR8/Drosha microprocessor and whether they are conserved between human and rhesus is also included.
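For reference, the RPM and RPKM values reported in the "*_expr.txt" files are presumably the standard normalizations, sketched below in Python. The record does not give the formulas or the weighting scheme for clone numbers, so the function names and the use of total mapped reads as the denominator are assumptions.

```python
def rpm(clone_count, total_mapped_reads):
    """Reads per million mapped reads."""
    return clone_count * 1e6 / total_mapped_reads


def rpkm(clone_count, total_mapped_reads, transcript_length_nt):
    """Reads per kilobase of transcript per million mapped reads."""
    return clone_count * 1e9 / (total_mapped_reads * transcript_length_nt)


# A 2,000-nt transcript with 100 (weighted) clones in a library of 10 million mapped reads.
print(rpm(100, 10_000_000))
print(rpkm(100, 10_000_000, 2000))
```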

Submission date

May 05, 2014

Last update date

May 15, 2019

Contact name

Qi Zheng

E-mail(s)

e00011027@gmail.com

Phone

1-215-898-0808

Organization name

University of Pennsylvania