Sample GSM999533
Status Public on May 15, 2013
Title PolyA
Sample type SRA
 
Source name human CML cell line K-562 grown in tissue culture (Ambion)
Organism Homo sapiens
Characteristics cell line: K-562
cell type: CML
total rna input: 1000 ng (10 ng polyA+)
rna state: intact
Extracted molecule total RNA
Extraction protocol Library construction method: Oligo (dT) Dynabeads (Invitrogen)
 
Library strategy RNA-Seq
Library source transcriptomic
Library selection cDNA
Instrument model Illumina HiSeq 2000
 
Data processing **********************************************************************

PROCESSED DATA FILE NAME: GSM999533_PolyA.genes.results.txt.gz
PROCESSED DATA FILE NAME: GSM999533_PolyA.isoforms.results.txt.gz
Removed rRNA reads by aligning reads to rRNA transcripts using BWA version 0.6.1 using default parameters
Reads where at least one mate did not align to rRNA were extracted with samtools 0.1.9 using the following commands: samtools view -S -b -f 4 -F 264 rRNA.sam > rRNA.1.bam; samtools view -S -b -f 8 -F 260 rRNA.sam > rRNA.2.bam; samtools view -S -b -f 12 -F 256 rRNA.sam > rRNA.3.bam
Output bam files were merged and reads were extracted using Picard
These reads were aligned to the UCSC transcriptome twice. The first alignment was done to create a set of uniquely mapped reads so that they could be downsampled to a fixed number of reads which are known to map to the transcriptome. This was done using bowtie version 0.12.7 allowing for one hit per read. The following parameters were used: -q --phred33-quals -v 2 -e 99999999 -l 25 -I 1 -X 1000 -p 16 -k 1 -m 200
This alignment was then downsampled to a fixed number of reads using Picard
The downsampled portion of the reads was then aligned to the UCSC transcriptome using bowtie version 0.12.7 allowing for multiple hits per read. This was the second, final alignment. The following parameters were used: bowtie -q --phred33-quals -v 2 -e 99999999 -l 25 -I 1 -X 1000 -p 15 -a -m 200
Transcripts and genes were quantified using RSEM version 1.1.17 run with default settings using the final Bowtie alignment
genome build: UCSC Genome Browser17 knownGene transcript dataset (version 05-Feb-2012)
processed data files format and content: RSEM sample_name.genes.results and sample_name.isoforms.results files. Documentation on these file formats can be found at http://deweylab.biostat.wisc.edu/rsem/rsem-calculate-expression.html
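
The two bowtie runs described in the steps above share most of their parameters. As a minimal sketch (the helper names `parse_flags` and `diff_flags` are hypothetical and not part of the submitters' pipeline), the option strings quoted in this record can be compared to see what changes between the first and the final alignment:

```python
import shlex

# Option strings copied from the record (the leading "bowtie" is omitted).
FIRST = "-q --phred33-quals -v 2 -e 99999999 -l 25 -I 1 -X 1000 -p 16 -k 1 -m 200"
FINAL = "-q --phred33-quals -v 2 -e 99999999 -l 25 -I 1 -X 1000 -p 15 -a -m 200"


def parse_flags(options):
    """Map each option to its value, or to True for options given without a value."""
    tokens = shlex.split(options)
    parsed = {}
    i = 0
    while i < len(tokens):
        # An option takes a value if the next token exists and is not another option.
        has_value = i + 1 < len(tokens) and not tokens[i + 1].startswith("-")
        parsed[tokens[i]] = tokens[i + 1] if has_value else True
        i += 2 if has_value else 1
    return parsed


def diff_flags(a, b):
    """Return {option: (value_in_a, value_in_b)} for options that differ; None means absent."""
    pa, pb = parse_flags(a), parse_flags(b)
    keys = sorted(set(pa) | set(pb))
    return {k: (pa.get(k), pb.get(k)) for k in keys if pa.get(k) != pb.get(k)}


differences = diff_flags(FIRST, FINAL)
for option, (first, final) in differences.items():
    print(option, first, final)
```

Read this way, the runs differ in reporting mode (`-k 1`, report at most one alignment per read, versus `-a`, report all valid alignments) and in the thread count given with `-p`.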

**********************************************************************

PROCESSED DATA FILE NAME: GSE40705_K562_Isoform.txt
PROCESSED DATA FILE NAME: GSE40705_K562_gene.txt
Removed rRNA reads by aligning reads to rRNA transcripts using BWA version 0.6.1 using default parameters
We created a set of reads that did not map to rRNA to be used for further processing. To do this, reads where at least one mate did not align to rRNA were extracted for further use using samtools 0.1.9 using the following commands: samtools view -S -b -f 4 -F 264 rRNA.sam > rRNA.1.bam; samtools view -S -b -f 8 -F 260 rRNA.sam > rRNA.2.bam; samtools view -S -b -f 12 -F 256 rRNA.sam > rRNA.3.bam
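
The three `samtools view` filters above work on SAM flag bits: `-f` keeps a read only if all of the given bits are set, and `-F` drops it if any of the given bits is set. A minimal Python sketch of that rule (the function names and the example flag values are illustrative and not part of the original pipeline) shows which mate-mapping states each filter keeps:

```python
# SAM flag bits relevant to the filters in this record.
READ_UNMAPPED = 0x4   # 4
MATE_UNMAPPED = 0x8   # 8
SECONDARY = 0x100     # 256

# (bits required by -f, bits excluded by -F) for each command in the record.
FILTERS = {
    "rRNA.1.bam": (4, 264),   # read unmapped; mate mapped; not secondary
    "rRNA.2.bam": (8, 260),   # mate unmapped; read mapped; not secondary
    "rRNA.3.bam": (12, 256),  # read and mate both unmapped; not secondary
}


def passes(flag, required, excluded):
    """Mimic the -f/-F rule: all required bits set and no excluded bit set."""
    return (flag & required) == required and (flag & excluded) == 0


def matching_outputs(flag):
    """List the output files whose filter would keep a record with this flag."""
    return [name for name, (req, exc) in FILTERS.items() if passes(flag, req, exc)]


# Illustrative paired-end flag values (assumed for the example).
EXAMPLES = {
    "read unmapped, mate mapped": 69,     # 1 + 4 + 64
    "read mapped, mate unmapped": 137,    # 1 + 8 + 128
    "both mates unmapped": 77,            # 1 + 4 + 8 + 64
    "both mates mapped": 99,              # 1 + 2 + 32 + 64
}

for label, flag in EXAMPLES.items():
    print(label, flag, matching_outputs(flag))
```

Only records from pairs in which both mates aligned to rRNA fall through all three filters, which matches the description of keeping reads where at least one mate did not align.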
Output bam files were merged and reads were extracted using Picard
These reads were aligned to the UCSC transcriptome twice. The first alignment was done to create a set of uniquely mapped reads so that they could be downsampled to a fixed number of reads which are known to map to the transcriptome. This was done using bowtie version 0.12.7 allowing for one hit per read. The following parameters were used: -q --phred33-quals -v 2 -e 99999999 -l 25 -I 1 -X 1000 -p 16 -k 1 -m 200
This alignment was then downsampled to a fixed number of reads using Picard
The downsampled portion of the reads were then aligned to the UCSC transcriptome using bowtie version 0.12.7 allowing for multiple hits per read. This was the second, final alignment. The following parameters were used: bowtie -q --phred33-quals -v 2 -e 99999999 -l 25 -I 1 -X 1000 -p 15 -a -m 200
Transcripts and genes were quantified using RSEM version 1.1.17 run with default settings using the final Bowtie alignment
genome build: UCSC Genome Browser17 knownGene transcript dataset (version 05-Feb-2012)
processed data files format and content: Processed data files are tab-delimited files that consist of the transcripts per million (TPM) expression level values as calculated by RSEM. Results are calculated at the gene (sample_gene.txt) and isoform (sample_isoform.txt) levels. Each file contains TPM values calculated for each of the protocols listed.
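
For orientation, the transcripts-per-million scale can be illustrated with a small normalization sketch. RSEM estimates abundances with its own statistical model, so the function below is not RSEM's implementation. It only shows the scaling, assuming hypothetical per-transcript counts and effective lengths:

```python
def tpm(counts, effective_lengths):
    """Scale length-normalized counts so that they sum to one million."""
    rates = [c / l for c, l in zip(counts, effective_lengths)]
    total = sum(rates)
    return [r / total * 1e6 for r in rates]


# Hypothetical example values, not taken from the dataset.
example_counts = [100.0, 300.0, 600.0]
example_lengths = [1000.0, 1500.0, 3000.0]

values = tpm(example_counts, example_lengths)
print(values)
```

Because every value is expressed relative to the same total of one million, TPM values are comparable across the protocols listed in a file in a way that raw counts are not.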
 
Submission date Sep 07, 2012
Last update date May 15, 2019
Contact name Joshua Z Levin
Organization name Broad Institute of MIT & Harvard
Street address 320 Charles Street
City Cambridge
State/province MA
ZIP/Postal code 02141
Country USA
 
Platform ID GPL11154
Series (1)
GSE40705 Comprehensive comparative analysis of RNA sequencing methods for degraded or low input samples
Relations
SRA SRX185071
BioSample SAMN01163401

Supplementary file Size Download File type/resource
GSM999533_PolyA.genes.results.txt.gz 541.4 Kb (ftp)(http) TXT
GSM999533_PolyA.isoforms.results.txt.gz 937.2 Kb (ftp)(http) TXT
Raw data are available in SRA
Processed data not provided for this record
Processed data are available on Series record
