Sample GSM1528544
Status Public on Dec 03, 2014
Title Sleckman-LH8-110-inv-B5L-P548-K-Ind29-GTTCTCA_S22_L001_R2_001
Sample type SRA
 
Source name Abelson transformed pre-B cells
Organism Mus musculus
Characteristics sequencing run: set 5
cell type: Abelson transformed pre-B cells
genotype/variation: Ligase IV -/- H2AX -/- Line #8 CJ #148 Substrate PCR for de novo DSBs. Sequences are mixed with regular substrate PCR sequences; pull out the sequences of interest using the specific primer sequence.
Treatment protocol Genomic DNA was isolated from Abelson kinase-transformed pre-B cells harboring single double-stranded DNA breaks. The genomic DNA was treated with single-stranded DNA ligase, followed by sodium bisulphite treatment and PCR, as described for HCoDES in the following paper: Yair Dorsett, Yanjiao Zhou, Anthony T. Tubbs, Bo-Ruei Chen, Caitlin Purman, Baeck-Seung Lee, Rosmy George, Andrea L. Bredemeyer, Jiang-yang Zhao, Erica Sodergen, George M. Weinstock, Nathon D. Han, Alejandro Reyes, Eugene M. Oltz, Dale Dorsett, Ziva Misulovin, Jacqueline E. Payton and Barry P. Sleckman (2014), Molec Cell, in press.
Growth protocol Abelson transformed pre-B cells were grown as previously described in the following paper: Bredemeyer, A.L., Sharma, G.G., Huang, C.Y., Helmink, B.A., Walker, L.M., Khor, K.C., Nuskey, B., Sullivan, K.E., Pandita, T.K., Bassing, C.H., and Sleckman, B.P. (2006). ATM stabilizes DNA double-strand-break complexes during V(D)J recombination. Nature 442, 466-470.
Extracted molecule genomic DNA
Extraction protocol DNA extraction was done as described in the following paper: Blondal, T., Thorisdottir, A., Unnsteinsdottir, U., Hjorleifsdottir, S., Aevarsson, A., Ernstsson, S., Fridjonsson, O.H., Skirnisdottir, S., Wheat, J.O., Hermannsdottir, A.G., et al. (2005). Isolation and characterization of a thermostable RNA ligase 1 from a Thermus scotoductus bacteriophage TS2126 with good single-stranded DNA ligation properties. Nucleic Acids Res 33, 135-142.
Sequencing adaptors for Illumina MiSeq sequencing were added by PCR as described for the HCoDES procedure in the reference above. Amplified PCR products were then pooled in approximately equimolar concentrations for Illumina MiSeq sequencing using 40% Phi-X.
 
Library strategy OTHER
Library source genomic
Library selection other
Instrument model Illumina MiSeq
 
Description Ligase IV -/- H2AX -/- Line #8 CJ #148 Substrate PCR for de novo DSBs. Sequences are mixed with regular substrate PCR sequences; pull out the sequences of interest using the specific primer sequence.
Data processing Paired-end reads from the MiSeq Desktop Sequencer were de-multiplexed by sample index, allowing a maximum of 1 base-pair mismatch within the index.
For samples where the same index was used with different primer sequences, the appropriate sequences were first extracted by primer sequence and then by index sequence.
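The two de-multiplexing rules above can be illustrated with a minimal Python sketch. It is not the pipeline's actual code: the function names, the exact-prefix primer test, and the choice to leave ambiguous reads unassigned are all assumptions made for illustration. Only the 1-mismatch tolerance comes from the record.

```python
def hamming(a, b):
    """Count mismatching positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def assign_sample(index_read, sample_indexes, max_mismatch=1):
    """Return the sample whose index is within max_mismatch of index_read.

    sample_indexes maps a (hypothetical) sample name to its index sequence.
    Reads matching no sample, or more than one, return None (an assumption).
    """
    hits = [
        name
        for name, idx in sample_indexes.items()
        if len(idx) == len(index_read) and hamming(idx, index_read) <= max_mismatch
    ]
    return hits[0] if len(hits) == 1 else None


def has_primer(read, primer):
    """Assumed primer test: the read starts with the exact primer sequence."""
    return read.startswith(primer)


def extract(read, index_read, primer, sample_indexes):
    """Primer-first extraction: check the primer, then resolve the index."""
    if not has_primer(read, primer):
        return None
    return assign_sample(index_read, sample_indexes)
```

For example, with `{"Ind29": "GTTCTCA"}` (the index that appears in this sample's title), an index read of `GTTCTCT` would still be assigned to `Ind29`, while a read with two mismatches would be dropped.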
Adaptor sequences were then clipped from reads 1 and 2 using FLEXBAR version 2.4 (parameters: -m 10 -u 5 -ae RIGHT -r -aa -at 0.5) (Dodt, M., Roehr, J.T., Ahmed, R., and Dieterich, C. (2012). FLEXBAR-Flexible Barcode and Adapter Processing for Next-Generation Sequencing Platforms. Biology (Basel) 1, 895-905).
Paired reads were assembled using FLASH-1.2.7 software (parameters: -m 10 -M 100 -x 0.25) (Magoc, T., and Salzberg, S.L. (2011). FLASH: fast length adjustment of short reads to improve genome assemblies. Bioinformatics 27, 2957-2963).
The assembly parameters were adjusted based on the read length distribution of a particular sample. When a sample was dominated by short reads that resulted in 100% sequence overlap between reads 1 and 2, ~50 bases were removed from the ends of the reads using PrinSeq to prevent mis-assembly (Schmieder, R., and Edwards, R. (2011). Quality control and preprocessing of metagenomic datasets. Bioinformatics 27, 863-864).
The trimmed reads were assembled again by FLASH.
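To make the role of the assembly parameters concrete, the following Python sketch merges one read pair by overlap. It is a simplified stand-in and not FLASH's algorithm: it only borrows the three reported values as a minimum overlap (10), a maximum overlap considered (100), and a maximum mismatch ratio (0.25), and it picks the overlap with the lowest mismatch ratio, preferring longer overlaps on ties. All function names are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Reverse-complement an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def merge_pair(read1, read2, min_overlap=10, max_overlap=100,
               max_mismatch_ratio=0.25):
    """Merge a read pair on the best suffix/prefix overlap, or return None.

    read2 is reverse-complemented so both reads are on the same strand,
    then every overlap length in [min_overlap, max_overlap] is scored.
    """
    r2 = reverse_complement(read2)
    upper = min(max_overlap, len(read1), len(r2))
    best = None  # (mismatch_ratio, -overlap): lower ratio, then longer overlap
    for overlap in range(min_overlap, upper + 1):
        mismatches = sum(a != b for a, b in zip(read1[-overlap:], r2[:overlap]))
        ratio = mismatches / overlap
        if ratio <= max_mismatch_ratio:
            candidate = (ratio, -overlap)
            if best is None or candidate < best:
                best = candidate
    if best is None:
        return None
    overlap = -best[1]
    # Keep read 1 in full and append the non-overlapping tail of read 2.
    return read1 + r2[overlap:]
```

A pair whose reads overlap completely, as described for samples dominated by short fragments, is the case this simple scheme handles poorly, which is consistent with the record's note that such reads were trimmed before re-assembly.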
Individual assembled reads (contigs) were then aligned pairwise to the reference sequence using the MUSCLE aligner as a scheduled batch job using The Genome Institute clusters (parameters: default, reference sequences: for Igkappa GenBank NG_005612.1, for TCRbeta AC117678.16, for recombination substrate refer to Bredemeyer, A.L., Sharma, G.G., Huang, C.Y., Helmink, B.A., Walker, L.M., Khor, K.C., Nuskey, B., Sullivan, K.E., Pandita, T.K., Bassing, C.H., and Sleckman, B.P. (2006) ATM stabilizes DNA double-strand-break complexes during V(D)J recombination. Nature 442, 466-470).
Using customized Perl scripts, alignments with >5bp deletions on either strand (mis-assembled sequences due to poor sequence quality) were removed.
Using customized Perl scripts, alignments without >10 bps of continuous reference sequence (primer dimers) were removed.
Using customized Perl scripts, alignments where assembled sequences appeared to be longer than the full length coding end (usually mis-assembled reads, small deletions near coding end or insertions) were removed.
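The first two filters above can be sketched in Python (the original used custom Perl scripts that are not shown in this record). The sketch assumes each pairwise alignment is available as two equal-length gapped strings, that "deletions" are internal gap runs in the read row, and that terminal gaps are ignored; these representation choices and all names are assumptions. The third filter depends on the coding-end coordinates and is not modelled here.

```python
import re


def longest_internal_gap(aligned_read):
    """Length of the longest gap run inside the read, ignoring terminal gaps."""
    core = aligned_read.strip("-")
    return max((len(run) for run in re.findall(r"-+", core)), default=0)


def longest_match_run(aligned_ref, aligned_read):
    """Length of the longest run of consecutive identical, non-gap columns."""
    best = run = 0
    for ref_base, read_base in zip(aligned_ref, aligned_read):
        if ref_base == read_base and ref_base != "-":
            run += 1
            best = max(best, run)
        else:
            run = 0
    return best


def keep_alignment(aligned_ref, aligned_read, max_deletion=5, min_match_run=10):
    """Apply the deletion and continuous-match filters to one alignment."""
    if len(aligned_ref) != len(aligned_read):
        raise ValueError("alignment rows must have equal length")
    if longest_internal_gap(aligned_read) > max_deletion:
        return False  # >5 bp deletion: treated as mis-assembled
    if longest_match_run(aligned_ref, aligned_read) <= min_match_run:
        return False  # no >10 bp continuous reference match: primer dimer
    return True
```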
Using customized Perl scripts, DNA end structures were then calculated and the structures from triplicate PCR reactions were pooled together into a single file. As a convention for calculation of DNA end structure, nucleotides at the ligation junction that could be assigned to either the 5’ or 3’ end were assigned to the 5’ end, except for the analysis of Cas9 and Eb:ZFN DNA ends, where the nucleotides were assigned to generate blunt and 4-nucleotide 5’ overhangs, respectively, when possible.
Using R, products smaller than 50 base pairs and products with only single reads were removed from further analysis.
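The record states that this step was done in R; the equivalent logic is shown here as a short Python sketch for illustration only. The dictionary keys `length` and `reads` are hypothetical field names, not taken from the original scripts.

```python
def filter_products(products, min_length=50, min_reads=2):
    """Drop products shorter than min_length bp or supported by a single read.

    Each product is a dict with hypothetical keys "length" (bp) and "reads"
    (number of sequencing reads supporting it).
    """
    return [
        p for p in products
        if p["length"] >= min_length and p["reads"] >= min_reads
    ]
```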
Graphs were generated using customized R scripts.
Genome_build: For Igkappa GenBank NG_005612.1, for TCRbeta AC117678.16, for recombination substrate refer to Bredemeyer et al. (Nature 442, 466-470)
Supplementary_files_format_and_content: .txt files reporting: ced (The distance of a particular 5' end from the double-stranded DNA break), overhang (The length of the overhang associated with that particular 5' end. A negative value means a 5' overhang and a positive value means a 3' overhang. A zero value means a blunt end.), and u2..3. (The number of sequencing reads that had that particular DNA end structure (position of 5' end and overhang)).
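The sign convention for the overhang column can be read programmatically as in the Python sketch below. The file layout is an assumption: the sketch supposes whitespace-separated rows of three integers (ced, overhang, read count) with no header, which should be checked against the actual processed files on the Series record. The names `EndStructure`, `parse_row`, and `overhang_type` are hypothetical.

```python
from typing import NamedTuple


class EndStructure(NamedTuple):
    ced: int       # distance of the 5' end from the double-stranded break
    overhang: int  # negative: 5' overhang, positive: 3' overhang, 0: blunt
    reads: int     # number of reads with this end structure


def parse_row(line):
    """Parse one assumed whitespace-separated row: ced, overhang, read count."""
    ced, overhang, reads = line.split()[:3]
    return EndStructure(int(ced), int(overhang), int(reads))


def overhang_type(overhang):
    """Translate the signed overhang length into the record's convention."""
    if overhang < 0:
        return "5' overhang"
    if overhang > 0:
        return "3' overhang"
    return "blunt"
```

Under these assumptions, a row such as `0 -4 120` would describe 120 reads with a 5' end at the break and a 4-nucleotide 5' overhang.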
 
Submission date Oct 21, 2014
Last update date May 15, 2019
Contact name Jacqueline Payton
E-mail(s) jpayton@wustl.edu
Organization name Washington University School of Medicine
Department Pathology and Immunology
Street address 660 S. Euclid Ave
City St. Louis
State/province MO
ZIP/Postal code 63110
Country USA
 
Platform ID GPL16417
Series (1)
GSE62534 Hairpin Capture of DNA End Structures reveals chromosomal DNA end structure with single nucleotide resolution
Relations
BioSample SAMN03121553
SRA SRX736726

Supplementary data files not provided
Raw data are available in SRA
Processed data are available on Series record