GEO Accession viewer

Sample GSM5979748

Status

Public on Oct 04, 2022

Title

FUL-SEP1_Round0

Sample type

SRA

Source name

in vitro generated 40bp dsDNA sequences

Organism

synthetic construct

Characteristics

protein complex interrogated: FUL-SEP1
round of selex enrichment: Round 0

Extracted molecule

genomic DNA

Extraction protocol

SELEX-seq dsDNA libraries were generated from ssDNA sequences by a single-cycle PCR amplification round with complementary primers, essentially as described by Jolma et al. (2010). The dsDNA sequences contain a 40 bp random sequence flanked by barcodes needed for later identification after multiplex sequencing. In addition, the dsDNA sequences contained all features needed for direct sequencing on a HiSeq 2000 sequencer (Illumina). First, purified FUL antibodies resuspended in 1X PBS were coupled to magnetic beads according to the manufacturer's instructions (MyOne, Invitrogen). As for EMSA, proteins were synthesized using the TNT SP6 Quick Coupled Transcription/Translation System (Promega) according to the manufacturer's instructions in a total volume of 20 µl. The binding reaction mix was prepared essentially as for EMSA experiments, with a total volume of 120 µl (Smaczniak et al., 2012b); the mix contained 20 µl in vitro synthesized protein and 50-100 ng dsDNA. The binding mix was incubated on ice for 1 hour, followed by immunoprecipitation using 0.5 mg FUL antibody coupled to magnetic beads (MyOne, Invitrogen) in a thermomixer at 4°C and 700 rpm. After immunoprecipitation, the magnetic beads were washed 5 times with 150 µl binding buffer without salmon-sperm DNA. Bound DNA was eluted in 50 µl 1X TE in a 90°C thermomixer at full speed. Next, the magnetic beads were immobilized and the supernatant was transferred to a new tube. To allow a second round of SELEX, DNA fragments were amplified with 8 to 16 cycles of PCR with SELEX round-specific primers (Jolma et al., 2010). The amplification efficiency was checked on an agarose gel by comparison to a sample of known concentration. The total amplicon was used for the subsequent round of SELEX. Rounds selected for sequencing were gel-extracted after PCR amplification using the MinElute Gel Extraction Kit (Qiagen). Multiple SELEX samples were multiplexed in equimolar amounts, and sequencing was performed on a HiSeq 2000 sequencer (Illumina).

Library strategy

SELEX

Library source

genomic

Library selection

other

Instrument model

Illumina HiSeq 2000

Data processing

Fastq reads that did not pass the CASAVA 1.8 quality filter, or that mapped with no mismatches to the PhiX174 genome using SOAPv2.21, were eliminated.
The remaining sequences were extracted in fasta format and grouped according to library-specific barcodes, allowing no mismatches. Barcodes were removed, leaving the 40 bp sequence libraries used in the data analysis. The 40 bp sequences that were present in a library in an unexpectedly high number (>1000) were eliminated, as were 40 bp reads containing the sequence “TCGTATGCCG”, which is part of the Illumina adapter sequence used for sequencing. Data analysis was performed essentially as described before (Slattery et al., 2011). We based our analysis on 14-mer sequences. Frequencies of k-mer sequences in each round except Round 0 were calculated directly from the data using the function oligonucleotideFrequency from the Bioconductor R package Biostrings. Sequences in Round 0 represent a set of randomly synthesized oligonucleotides, and their complexity did not allow for the direct calculation of 14-mer frequencies. Therefore, the sequence frequency in Round 0 was estimated with a sixth-order Markov model, as proposed before (Slattery et al., 2011). We chose the sixth-order Markov model because, when the model was trained using 75% of the sequencing data, it gave the highest predictive accuracy, measured as the Pearson correlation coefficient between the predicted and observed frequencies in the other 25% of the sequencing data. Relative affinity for each k-mer was calculated as the ratio between its frequency in Round 5 and its frequency in Round 0, raised to the power of ⅕. K-mers with fewer than 10 reads in Round 5 were eliminated. Finally, relative affinities were normalized to a maximum of 1 by dividing by the affinity of the highest-affinity k-mer.
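The processing steps above can be sketched in Python as follows. This is a minimal illustration, not the submitters' pipeline (which used R and Biostrings). It assumes that the sixth-order background model for Round 0 is a sixth-order Markov chain in the sense of Slattery et al. (2011), that k-mers are counted over all overlapping windows of each 40 bp read, and that strands are not merged. All function and parameter names are hypothetical.

```python
from collections import Counter

K = 14                  # k-mer length used in the analysis
ORDER = 6               # order of the Round 0 background model (assumed Markov)
ADAPTER = "TCGTATGCCG"  # adapter fragment named in the record


def clean_reads(reads, max_copies=1000, adapter=ADAPTER):
    """Drop reads seen more than max_copies times or containing the adapter."""
    copies = Counter(reads)
    return [r for r in reads if copies[r] <= max_copies and adapter not in r]


def kmer_counts(reads, k=K):
    """Count every overlapping k-mer across all reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def markov_model(round0_reads, order=ORDER):
    """Return a function giving the modelled Round 0 probability of a k-mer."""
    start = kmer_counts(round0_reads, order)    # counts of the leading context
    ext = kmer_counts(round0_reads, order + 1)  # counts of context + next base
    ctx_total = Counter()
    for word, n in ext.items():
        ctx_total[word[:-1]] += n
    n_start = sum(start.values())

    def prob(kmer):
        if n_start == 0:
            return 0.0
        p = start[kmer[:order]] / n_start
        for i in range(order, len(kmer)):
            ctx = kmer[i - order:i]
            if ctx_total[ctx] == 0:
                return 0.0
            # Conditional probability of the next base given the context.
            p *= ext[ctx + kmer[i]] / ctx_total[ctx]
        return p

    return prob


def relative_affinities(round5_reads, round0_prob, k=K, rounds=5, min_reads=10):
    """(Round 5 frequency / Round 0 frequency) ** (1 / rounds), scaled to max 1."""
    counts = kmer_counts(round5_reads, k)
    total = sum(counts.values())
    affinity = {}
    for kmer, n in counts.items():
        p0 = round0_prob(kmer)
        if n < min_reads or p0 == 0:
            continue  # too few supporting reads, or no background estimate
        affinity[kmer] = ((n / total) / p0) ** (1 / rounds)
    top = max(affinity.values(), default=1.0)
    return {kmer: a / top for kmer, a in affinity.items()}
```

A typical call would chain the pieces: `relative_affinities(clean_reads(r5), markov_model(clean_reads(r0)))`, where `r0` and `r5` are lists of barcode-trimmed 40 bp reads. The 75%/25% train/test comparison used to choose the model order is not shown.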
Supplementary files format and content: CSV file with 4 columns: 14-mer sequence, DNA-binding specificity, SE of the estimate, number of reads supporting this k-mer in Round 5
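A file with this layout could be read along the following lines. This is only a sketch: the record does not say whether the CSV has a header row or what the columns are called, so the field names and the `has_header` switch below are assumptions.

```python
import csv
from typing import Iterable, Iterator, NamedTuple


class KmerRecord(NamedTuple):
    # Field names are hypothetical; the order follows the record's description.
    sequence: str          # 14-mer sequence
    specificity: float     # DNA-binding specificity (relative affinity)
    standard_error: float  # SE of the estimate
    round5_reads: int      # reads supporting this k-mer in Round 5


def parse_kmer_rows(lines: Iterable[str], has_header: bool = False) -> Iterator[KmerRecord]:
    """Yield one KmerRecord per non-empty CSV row, using the first four columns."""
    reader = csv.reader(lines)
    if has_header:
        next(reader, None)  # skip the header row if the file has one
    for row in reader:
        if not row:
            continue
        seq, spec, se, n = row[:4]
        yield KmerRecord(seq, float(spec), float(se), int(n))
```

For a file on disk, pass an open file handle, for example `with open(path, newline="") as fh: records = list(parse_kmer_rows(fh))`, where `path` points to the downloaded supplementary file.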

Submission date

Mar 28, 2022

Last update date

Oct 04, 2022

Contact name

Jose M. Muino

E-mail(s)

jose.muino@hu-berlin.de

Organization name

Humboldt University

Department

Department of Biology