GEO Accession viewer

Sample GSM2747922

Status

Public on Apr 06, 2018

Title

trans input 3

Sample type

SRA

Source name

deep mutational scanning

Organism

synthetic construct

Characteristics

molecule: PCR amplicon

Growth protocol

Yeast was grown in synthetic complete medium lacking histidine (SC -HIS) at 30°C (input samples) and in SC -HIS with 1 M NaCl at 37°C (output samples). Cells were harvested and centrifuged, and the pellets were stored at -20°C until DNA extraction.

Extracted molecule

other

Extraction protocol

The sequencing libraries were constructed by two consecutive PCRs using a method adapted from Levy et al. [35]. The first PCR was designed to amplify the region of interest, i.e. from directly upstream of the first mutated codon in FOS to directly upstream of the first mutated codon in JUN (the two genes are in head-to-tail orientation). The first PCR also added unique molecular identifiers (UMIs) and the first half of the Illumina adapter sequences. Using a small number of cycles in the first PCR limits the number of differently barcoded molecules that derive from the same template molecule. The second PCR then adds the remainder of the Illumina adapter sequences. Plasmid concentrations in the total DNA extractions were first quantified by qPCR using primer pair OGD241-OGD242, which binds in the ori region of the plasmid. For each of the six samples of the cis and trans libraries, 4 and 48 PCRs, respectively, were performed using Q5 Hot Start High-Fidelity DNA Polymerase (New England Biolabs) according to the manufacturer's protocol in 50 μL reactions with 4.2 x 10^7 molecules of plasmid from the DNA extraction, 25 pmol of primers OGD237 and OGD238, a melting temperature of 66°C (previously determined by temperature gradient), and an extension time of 30 sec or 1 min, respectively, for 4 cycles. A total of 2 x 10^9 molecules of plasmid were then used to prepare the sequencing libraries from each sample. Excess primers were removed by adding 2 μL of ExoSAP-IT (Affymetrix) and incubating for 20 min at 37°C, followed by inactivation for 15 min at 80°C. This step was necessary because these 60 nt primers are not efficiently removed by the subsequent column purification step. The 48 PCRs of each sample of the trans library were then pooled and purified using eight Qiagen PCR purification kit (Qiagen) columns per sample. According to the manufacturer's protocol, one column can bind 10 μg of DNA, which corresponds to ~8 x 10^8 genomes.
Eight columns were therefore used to ensure that they were not saturated by the genomic DNA carried over from the DNA extraction. The DNA was eluted in 2 x 50 μL of EB buffer (provided by the manufacturer) and pooled for each sample. The eluted DNA was then split into 24 PCR reactions per sample, which were performed using Kapa HiFi HotStart DNA polymerase (Kapa Biosystems) according to the manufacturer's protocol in 50 μL with 15 pmol of Illumina adapter primers. The reverse primers carried a different index for each of the six samples of the same library. For this PCR step, Kapa was chosen over Q5 because it was less efficient in the first PCR reactions (higher optimal melting temperature) and would thus lead to less re-barcoding of amplicons with new UMIs present on primers carried over from the first PCR. Each sample was loaded on an agarose gel to check for correct amplification. A strong non-specific band of lower size was observed, which gradually disappeared as the number of cycles in the first PCR was increased. However, increasing the number of cycles would increase the probability of producing amplicons with different UMIs that derive from the same original template molecule. The number of cycles in the first PCR was thus kept at four, and the band of correct size was extracted by gel purification from each sample. To this end, the 24 PCRs of each sample were pooled, concentrated on four Qiagen PCR purification kit (Qiagen) columns per sample, and each column was eluted with 2 x 50 μL of EB buffer (provided by the manufacturer). The bands of correct size were then purified on a 2% agarose gel starting from 100 μL of each sample using 10 μL of QIAEX II beads (Qiagen) according to the manufacturer's protocol and eluted in 20 μL of EB buffer. DNA concentration was determined by PicoGreen in triplicate, and the six samples were pooled in equimolar ratio.
The pooled sample was sequenced in a single lane of an Illumina HiSeq2500 with 125 bp paired-end reads at the EMBL Genomics Core Facilities in Heidelberg, Germany.
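
The quantities quoted in the protocol can be cross-checked with a short calculation. The sketch below is only an illustration: the genome size (about 12.1 Mb, haploid S. cerevisiae) and the average mass of a base pair (650 g/mol) are assumptions not stated in this record, and the function names are hypothetical.

```python
# Back-of-the-envelope check of two figures quoted in the extraction protocol.
# Assumptions (NOT from the record): haploid S. cerevisiae genome of ~12.1 Mb
# and an average mass of 650 g/mol per double-stranded base pair.
AVOGADRO = 6.022e23            # molecules per mole
BP_MASS_G_PER_MOL = 650.0      # assumed average mass of one base pair
YEAST_GENOME_BP = 12.1e6       # assumed genome size in base pairs


def total_template_molecules(n_reactions: int, molecules_per_reaction: float) -> float:
    """Total plasmid molecules used as template across all first-round PCRs."""
    return n_reactions * molecules_per_reaction


def genomes_per_mass(mass_g: float, genome_bp: float = YEAST_GENOME_BP) -> float:
    """Number of genome copies contained in a given mass of genomic DNA."""
    genome_mass_g = genome_bp * BP_MASS_G_PER_MOL / AVOGADRO
    return mass_g / genome_mass_g


# trans library: 48 reactions with 4.2 x 10^7 plasmid molecules each
trans_total = total_template_molecules(48, 4.2e7)

# binding capacity of one purification column: 10 micrograms of DNA
column_capacity = genomes_per_mass(10e-6)

print(f"trans template molecules per sample: {trans_total:.2e}")
print(f"genomes bound by one column:         {column_capacity:.2e}")
```

Under these assumptions, both results land close to the rounded values in the text (2 x 10^9 molecules per trans sample and ~8 x 10^8 genomes per column).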

Library strategy

OTHER

Library source

genomic

Library selection

other

Instrument model

Illumina HiSeq 2500

Description

from plasmid library
TableS1.xlsx

Data processing

trans library filtering (samples 1-6): Paired reads for which either of the two variable regions had an average Phred score of 20 or below were discarded. Paired reads were also filtered out if i) they had one or more unresolved bases (Ns) in the variable or UMI regions, ii) they had more than one mutated codon in either gene's variable region, or iii) the mutated codon ended in an A or T (since these were not encoded by the NNS mutagenic primers). (custom Perl script)
cis library (samples 7-12): Paired-end reads that overlapped over the full length of the variable region were assembled using PEAR version 0.9.6 [38] with a p-value threshold for correct assembly of 0.05, a maximum and minimum fragment length of 150 nt, and a minimum overlap size of 100 nt. These parameters force the assembly of sequences of the single expected length. On average, 30% and 20% of the paired-end reads from the three input and three output libraries, respectively, were not assembled and were filtered out. These included reads containing adapter sequences or indels (the latter resulting from the <100% coupling efficiency inherent to DNA synthesis). Additionally, assembled reads with Ns in any part of the sequence were filtered out.
variant calling (custom Perl script)
UMI counting (custom Perl script)
Supplementary_files_format_and_content: Excel file with read counts for each variant in each sample
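
The trans library filtering rules can be sketched in a few lines of Python. This is not the authors' Perl script, only a minimal illustration of the stated criteria. All function names are hypothetical, and it assumes the variable regions have already been extracted from each read, are in frame with the wild-type sequence, and come with integer Phred scores.

```python
from typing import List, Sequence, Tuple

# One entry per gene: (variable-region sequence, Phred scores, wild-type sequence).
Region = Tuple[str, Sequence[int], str]


def mean_phred(quals: Sequence[int]) -> float:
    """Average Phred score over a region."""
    return sum(quals) / len(quals)


def mutated_codons(seq: str, wild_type: str) -> List[str]:
    """Codons of seq that differ from the wild type (same reading frame assumed)."""
    return [
        seq[i:i + 3]
        for i in range(0, len(wild_type), 3)
        if seq[i:i + 3] != wild_type[i:i + 3]
    ]


def passes_trans_filters(regions: Sequence[Region], umi: str) -> bool:
    """Apply the filters described for the trans library to one read pair."""
    if "N" in umi:                              # i) unresolved base in the UMI
        return False
    for seq, quals, wild_type in regions:
        if mean_phred(quals) <= 20:             # low average quality
            return False
        if "N" in seq:                          # i) unresolved base in the variable region
            return False
        changed = mutated_codons(seq, wild_type)
        if len(changed) > 1:                    # ii) more than one mutated codon
            return False
        if any(codon[-1] in "AT" for codon in changed):
            return False                        # iii) not encodable by NNS primers
    return True


# Toy example with a 6 nt "variable region" per gene (illustrative sequences only).
wt = "ATGGCC"
good = [("ATGGCG", [30] * 6, wt), ("ATGGCC", [30] * 6, wt)]
bad_codon = [("ATGGCA", [30] * 6, wt), ("ATGGCC", [30] * 6, wt)]
print(passes_trans_filters(good, "ACGTACGT"))
print(passes_trans_filters(bad_codon, "ACGTACGT"))
```

Checking the last base of each mutated codon is enough for rule iii) because NNS codons can only end in G or C.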

Submission date

Aug 21, 2017

Last update date

May 15, 2019

Contact name

Guillaume Diss

E-mail(s)

guillaume.diss@gmail.com

Organization name

CRG

Street address

C/ Dr. Aiguader, 88